San Francisco's city technology offices are sitting on a problem that sounds mundane but carries a real price tag: thousands of duplicate images embedded in public-facing portals, internal case-management platforms, and agency databases, many of them loaded twice — or more — because of inconsistent upload protocols and aging content-management software. The Department of Technology, which oversees the city's enterprise infrastructure from its offices at 1 South Van Ness Avenue, flagged the issue in its current fiscal-year operational review, calling redundant digital assets one of the leading contributors to unnecessary cloud-storage expenditure.
The timing matters. San Francisco's tech modernization push is accelerating as the city tries to claw back credibility after years of high-profile failures on homelessness data tracking, BART integration projects, and the ill-fated HealthySF digital enrollment portal. Every dollar wasted on duplicate storage is a dollar not available for the tools social workers in the Tenderloin or transit planners at the SFMTA's Potrero Division actually need. Cloud storage costs have risen sharply since 2023, and municipal contracts for services like Microsoft Azure and Amazon Web Services are up for renegotiation within the next 18 months, making the cleanup window narrow.
What the Specialists Are Saying
Digital-records specialists consulted by city-contracted vendors have pointed to three root causes: staff uploading the same file from multiple devices without a deduplication check, legacy CMS platforms at agencies like the Department of Public Health that predate modern hash-verification tools, and the absence of a citywide image taxonomy standard. The San Francisco Public Library's digital collections team, which manages more than 200,000 digitized historical photographs through its San Francisco History Center at the Main Branch on Larkin Street, completed its own deduplication audit in early 2025 and reportedly reduced its active image repository by roughly 18 percent, freeing server capacity without losing a single unique record.
Technologists in the local civic-tech community, including members of Code for San Francisco, a volunteer brigade that meets regularly and has collaborated with the city on open-data projects since 2012, argue the fix is less about new software and more about governance. Establishing a single authoritative image repository — with mandatory duplicate detection at the point of upload — would solve most of the problem before it compounds further. The city's Open Data program, administered through DataSF, already enforces schema standards on tabular datasets; image assets have not received the same treatment.
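To make the idea of duplicate detection at the point of upload concrete, here is a minimal sketch of how such a check could work. It is an illustration only, not the city's or any vendor's implementation: the `ImageRepository` class, its `upload` method, and the file names are hypothetical, and the sketch assumes that "duplicate" means a byte-for-byte identical file, identified by its SHA-256 hash.

```python
import hashlib
from dataclasses import dataclass, field


def content_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class ImageRepository:
    """Hypothetical in-memory repository that rejects byte-identical uploads."""

    # Maps content digest -> name of the first file stored with that content.
    _by_digest: dict = field(default_factory=dict)

    def upload(self, name: str, data: bytes) -> tuple:
        """Store the file unless identical content is already present.

        Returns (stored, canonical_name): stored is False when the upload
        duplicates an existing file, and canonical_name is the name of the
        copy the repository keeps.
        """
        digest = content_digest(data)
        existing = self._by_digest.get(digest)
        if existing is not None:
            return False, existing  # duplicate: point to the existing copy
        self._by_digest[digest] = name
        return True, name


repo = ImageRepository()
print(repo.upload("permit_front.jpg", b"example image bytes"))
print(repo.upload("IMG_0042.jpg", b"example image bytes"))    # same content, new name
print(repo.upload("permit_back.jpg", b"different image bytes"))
```

In this sketch the second upload is rejected even though its file name differs, because the check looks at content rather than names. An exact hash only catches byte-identical files; copies that have been resized, re-encoded, or stripped of metadata would pass as new, so catching those would require a different technique, such as perceptual hashing.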
Budget Stakes and Next Steps
The financial exposure is not trivial. Unmanaged duplicate files can account for 20 to 30 percent of an organization's total stored data, according to industry benchmarks cited in federal digital-records guidance published by the National Archives. For a city the size of San Francisco, which allocated approximately $56 million to technology infrastructure in the current fiscal year, even a 5 percent reduction in redundant storage could translate into measurable savings on annual vendor contracts.
The Controller's Office, working alongside the Department of Technology, is expected to include duplicate-asset remediation as a line item in the FY2026-27 budget proposal due later this fall. Agencies with the highest volume of image uploads — the Department of Building Inspection, which processes permit documentation for every construction project from the Dogpatch to the Richmond District, and the Human Services Agency, which maintains photographic records for benefits and housing-placement cases — are likely to be the first required to comply with any new deduplication mandate.
For residents and advocates watching the city's digital reform efforts, the practical upshot is straightforward: faster portal load times, fewer data-retrieval errors in case management, and a cleaner foundation for the AI-assisted tools the Mayor's Office of Innovation has been piloting since late 2024. City technology staff say a phased deduplication rollout, starting with the highest-volume agencies, could begin as early as October. The public comment period on the Department of Technology's updated digital-asset standards is expected to open before the end of August, with notice posted through the city's standard process at SF.gov.