San Francisco's Department of Technology has been quietly working through one of the most unglamorous problems in municipal government: tens of thousands of duplicate digital images scattered across city servers, department SharePoint drives, and legacy content management systems that date back, in some cases, to the early 2000s. The redundancy isn't trivial. Storage costs, licensing fees, and the staff hours required to manage bloated archives have added up to a measurable drain on budgets already squeezed by the city's well-documented fiscal shortfalls.
The problem didn't appear overnight. It accumulated across two decades of digital expansion in which each city department largely built its own records infrastructure, with little coordination from the center. The Department of Public Health, the Planning Department on Seventh Street, and the San Francisco Municipal Transportation Agency all expanded their digital photo archives independently — capturing everything from building inspection records to Muni route documentation to public-health outreach campaigns. When cloud migration became city policy around 2019, those siloed collections were largely uploaded as-is, duplicates included.
A Paper Trail Through the City's Digital Past
The roots of the current cleanup effort trace to a 2023 audit by the City Controller's Office, which flagged redundant digital assets as a budget efficiency issue across multiple departments. That audit did not publish a single citywide duplicate-image count, but it identified the problem as systemic enough to warrant a coordinated response. The Department of Technology subsequently launched what it called a Digital Asset Consolidation initiative, targeting first the highest-volume offenders: the Office of Economic and Workforce Development, which maintained overlapping photo libraries from years of neighborhood grant programs, and SF Environment, which had accumulated duplicate imagery from outreach campaigns in the Tenderloin, Bayview, and Excelsior districts.
San Francisco's experience mirrors what other large American cities — Chicago, Seattle, and New York — encountered when they accelerated cloud adoption. The difference here is that San Francisco's tech-sector adjacency created an expectation that city IT would be ahead of the curve. It wasn't. Budget cycles prioritized public-facing apps and 311 upgrades while the back-end grew messier. Layoffs in the private tech sector after 2022 actually helped the city for once: the Department of Technology was able to recruit experienced data engineers at salaries that would have been uncompetitive two years earlier, and several of those hires came with direct experience managing image deduplication pipelines at scale.
What the Fix Actually Looks Like
The current remediation work involves a combination of automated hash-matching software and manual review queues handled by staff at the city's Civic Center offices. Hash-matching, a process that computes a digital fingerprint from the contents of each image file, can identify exact duplicates in seconds, because byte-for-byte identical files always produce the same fingerprint. Near-duplicates, such as photos taken seconds apart at the same location or images that differ only in resolution, produce different fingerprints and require human review. That slower manual process is where the work is bottlenecked as of this spring.
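To make the hash-matching step concrete, here is a minimal sketch in Python. It is an illustration of the general technique, not the city's actual software: the choice of SHA-256 and the function names `file_digest` and `find_exact_duplicates` are assumptions made for this example.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large image files are never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root whose contents are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    # Only digests shared by two or more files indicate duplicates.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

A sketch like this only catches exact copies. A resized or re-saved version of the same photo has different bytes and therefore a different digest, which is why near-duplicates end up in the manual review queue.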
The SFMTA alone is estimated to hold image archives spanning more than fifteen years of street and infrastructure documentation. Much of that material was captured as part of Vision Zero compliance records and transit corridor studies, including those along Van Ness Avenue and Geary Boulevard. Consolidating those files without losing the metadata that makes them legally and operationally useful is painstaking work. Deleting the wrong version of an image, such as the one with GPS coordinates attached, could compromise a legal record.
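One way to reduce that risk is to rank the copies in a duplicate group by how much metadata they carry before anything is removed. The sketch below is hypothetical: the `ImageCopy` record, the `"gps"` metadata key, and the ranking rule are assumptions for illustration, and the article's sources do not describe the rule the city actually uses.

```python
from dataclasses import dataclass, field


@dataclass
class ImageCopy:
    """Hypothetical record for one copy of an image in a duplicate group."""
    path: str
    metadata: dict = field(default_factory=dict)


def choose_keeper(copies: list[ImageCopy]) -> tuple[ImageCopy, list[ImageCopy]]:
    """Return (copy to keep, copies to flag for review).

    Assumed ranking: a copy with GPS coordinates wins, then the copy with
    the most metadata fields, then the alphabetically first path as a
    deterministic tie-break.
    """
    if not copies:
        raise ValueError("duplicate group is empty")

    def rank(copy: ImageCopy) -> tuple:
        has_gps = "gps" in copy.metadata
        return (not has_gps, -len(copy.metadata), copy.path)

    ordered = sorted(copies, key=rank)
    # The remaining copies are flagged, not deleted, so a person can confirm.
    return ordered[0], ordered[1:]
```

For example, given one copy with only a timestamp and another with a timestamp plus GPS coordinates, the function keeps the second and returns the first for review.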
For residents and businesses, the practical consequence of a cleaner city image archive is faster permit processing and more reliable public document requests under the Sunshine Ordinance. When Planning Department staff spend less time navigating cluttered shared drives to find a specific property photograph, turnaround times on permit applications improve. The Department of Technology has set an internal target of completing the first phase of deduplication across five major departments by the end of the 2026-27 fiscal year, which closes June 30, 2027. Whether the timeline holds will depend on staffing levels and on whether the current budget negotiations at City Hall produce cuts to the department's operating allocation.