San Francisco's Department of Technology confirmed this week that a months-long audit of city-held digital image archives has uncovered tens of thousands of duplicate files stored across at least a dozen municipal agencies, triggering an emergency cleanup effort that administrators say must be completed before a new unified records platform goes live in the fall. The problem is bigger than anyone anticipated when the project launched in March.
The timing matters. The city is mid-transition to a consolidated cloud-based document management system, a project that falls under the broader San Francisco Digital Services modernization program housed at City Hall. Migrating bloated, redundant data into a new platform doesn't just slow the process — it raises storage costs and creates compliance headaches under California's Public Records Act. Every duplicated image file that crosses into the new system has to be logged, categorized, and potentially disclosed in response to public records requests.
Where the Backlog Is Worst
The heaviest concentrations of duplicate image files have turned up inside the Planning Department on Seventh Street and the Department of Public Works, whose operations extend across facilities from the Civic Center to the Islais Creek maintenance yard in the Bayview. Both departments have relied for years on separate, siloed photo-documentation workflows — field crews uploading site inspection photos through one portal, administrative staff archiving permit images through another — with no automatic deduplication running between them.
The Planning Department alone is estimated to be holding multiple copies of images tied to roughly 4,000 permit cases filed between 2019 and 2024, according to city budget documents reviewed for this article. Each duplicate set must be manually reviewed before deletion to ensure nothing legally significant is lost. Staff from the Controller's Office have been pulled in to assist with the triage, working out of the San Francisco Public Library's administrative annex on Larkin Street.
The Department of Technology has engaged a Civic Center-based data services contractor to run automated hash-matching software across the affected servers — a technique that flags identical or near-identical image files before a human reviewer ever touches them. That contract, awarded under an existing city vendor agreement, is valued at $340,000 for the current fiscal year, which ends June 30, 2027.
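For readers curious how hash-matching works in principle, the sketch below shows the simplest form of the idea: compute a SHA-256 fingerprint for each image file and group files whose fingerprints match. It is an illustration only, not the contractor's software. The function names, the list of file extensions, and the directory layout are all assumptions made for the example. It also catches only byte-for-byte identical files; flagging near-identical images (resized or recompressed copies, for instance) generally calls for a perceptual hash, which is not shown here.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Hypothetical list of image extensions; a real archive may hold others.
IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png", ".tif", ".tiff")


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_duplicate_groups(root: Path, suffixes=IMAGE_SUFFIXES) -> list[list[Path]]:
    """Group image files under `root` that have identical contents.

    Returns one list per set of duplicates; files with no match are omitted.
    Nothing is deleted, so every group can still go to a human reviewer.
    """
    by_digest = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in suffixes:
            by_digest[file_digest(path)].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Calling `find_duplicate_groups(Path("/some/archive"))` on a folder, where the path is a placeholder, returns the sets of matching files without touching them, which fits a workflow in which software does the flagging and staff make the final call on deletion.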
What the City Is Doing to Fix It
The cleanup is unfolding in three phases. The first, wrapping up this week, covers image archives held by Planning and Public Works. Phase two, scheduled for August and September, moves into the Recreation and Parks Department — which maintains photographic records of Golden Gate Park capital projects and Moscone Center renovation work — and the San Francisco Municipal Transportation Agency. SFMTA's records office on Van Ness Avenue is known internally to hold duplicate safety-camera stills from Muni incidents dating back several years.
Phase three, covering smaller departments, runs through October with a hard deadline of November 1, when the new unified platform is set to accept its first live data migration. Missing that date would push the project past the fiscal quarter and require a budget amendment before the Board of Supervisors.
For residents and businesses who file permit applications or interact with city planning processes, the practical upshot is modest but real. Cleaner image records mean faster document retrieval when attorneys, contractors, or neighbors pull files on a property. The Planning Department's online permit portal, accessible at the Seventh Street offices and remotely, has already been slower than usual during the audit period as server loads fluctuate. Staff there have asked applicants to expect processing delays of two to three additional business days on image-heavy submissions through mid-July.
Anyone with a pending permit application or an outstanding public records request that involves photographs should contact the relevant department directly to check status, and should expect the bottleneck to be temporary. The city's stated target is to have all duplicate images flagged, reviewed, and either archived or deleted ahead of the November 1 migration deadline.