San Francisco's Department of Technology has been fielding growing pressure from multiple city agencies to tackle a persistent and expensive problem buried inside public records systems: duplicate images clogging databases, inflating storage costs, and complicating public records requests. The effort, which touches everything from Planning Department permit files to the SFMTA's transit camera archive, has moved from a back-burner IT conversation to a priority agenda item heading into the second half of 2026.
The timing matters. The city is operating under a post-pandemic budget that the Controller's Office projected would face a structural deficit exceeding $800 million over two fiscal years, forcing every department to justify spending. Redundant digital storage is no longer just a technical nuisance—it carries a dollar figure that budget analysts can point to. Procurement records reviewed by The Daily San Francisco show the city renewed contracts for cloud storage infrastructure through at least fiscal year 2027, making the cost of inefficiency visible in line-item form.
What the Agencies Are Actually Facing
The Planning Department, headquartered at 49 South Van Ness Avenue, processes thousands of permit applications annually, each accompanied by multiple image uploads—site photos, architectural drawings, inspection records. Staff familiar with the workflow have described situations where the same photograph appears three, four, or more times in a single case file, the result of applicants resubmitting documents and legacy system behavior that doesn't flag duplicates at upload. The Department of Building Inspection, which shares some backend infrastructure, faces a nearly identical challenge.
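For readers wondering what flagging duplicates at upload could look like, here is a minimal sketch of the common content-hashing approach. It is not a description of any city system: the `CaseFile` class, its `upload` method, and the file names are hypothetical, and only Python's standard `hashlib` module is used.

```python
import hashlib
from typing import Dict, Optional


def content_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(data).hexdigest()


class CaseFile:
    """Hypothetical in-memory stand-in for one permit case file."""

    def __init__(self) -> None:
        # Maps content digest -> name of the first upload with those bytes.
        self._seen: Dict[str, str] = {}

    def upload(self, filename: str, data: bytes) -> Optional[str]:
        """Record a new upload, or return the name of an earlier identical file."""
        digest = content_digest(data)
        if digest in self._seen:
            # Byte-for-byte duplicate: report the original instead of storing again.
            return self._seen[digest]
        self._seen[digest] = filename
        return None


case = CaseFile()
first = case.upload("site_photo.jpg", b"example image bytes")
second = case.upload("site_photo_resubmitted.jpg", b"example image bytes")
print(first, second)
```

A check like this catches only byte-identical files. A photo that has been resized, re-compressed, or re-scanned produces a different digest and would pass through unflagged.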
Over at SFMTA, the problem takes a different shape. The agency's extensive network of Muni cameras and enforcement systems generates an enormous volume of image files daily. Technology officers there have pointed to duplicate frame captures—images taken milliseconds apart and stored as separate files—as a contributor to storage bloat. SFMTA's IT budget for fiscal year 2025-26 was publicly reported at roughly $47 million, a figure that includes infrastructure maintenance and cloud services.
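Frames captured milliseconds apart are rarely byte-identical, so exact file comparison tends to miss them. One widely used family of techniques compares compact "perceptual" fingerprints instead. The sketch below illustrates a simple average hash and is not SFMTA's method. It assumes each frame has already been reduced to a small grayscale grid of equal size, and the `Frame` type, function names, and `max_bits` threshold are all hypothetical.

```python
from typing import List

# Hypothetical representation: a small grayscale frame as rows of 0-255 values.
Frame = List[List[int]]


def average_hash(frame: Frame) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the frame's mean."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Frame, b: Frame, max_bits: int = 2) -> bool:
    """Treat same-sized frames as near-duplicates when few hash bits differ."""
    return hamming_distance(average_hash(a), average_hash(b)) <= max_bits


frame_a = [[10, 200], [220, 15]]
frame_b = [[12, 198], [221, 14]]   # same scene, slight sensor noise
frame_c = [[200, 10], [15, 220]]   # different scene
print(is_near_duplicate(frame_a, frame_b), is_near_duplicate(frame_a, frame_c))
```

The threshold is a policy choice as much as a technical one: set it too loosely and distinct enforcement images could be treated as copies, which is why any real deployment would need review before files are deleted.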
Experts in municipal records management say San Francisco is not alone. Cities that digitized paper archives rapidly during the COVID-19 era, often using emergency procurement, ended up with heterogeneous systems that don't communicate well and lack built-in deduplication. The Government Technology Research Alliance has noted that mid-size and large American cities routinely discover redundancy rates of 15 to 30 percent in unaudited image repositories, a range that public records officers in San Francisco have privately called credible for their own systems.
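To show how a redundancy rate turns into a budget line, here is a back-of-the-envelope sketch. The archive size and per-terabyte price are hypothetical placeholders chosen for easy arithmetic, not figures from city records. Only the 15 and 30 percent rates come from the range cited above.

```python
def redundant_storage_cost(
    total_tb: float, redundancy_rate: float, cost_per_tb_month: float
) -> float:
    """Annual cost of storing the redundant share of an archive."""
    return total_tb * redundancy_rate * cost_per_tb_month * 12


# Hypothetical inputs: a 100 TB archive billed at $20 per terabyte per month.
ARCHIVE_TB = 100.0
PRICE_PER_TB_MONTH = 20.0

low = redundant_storage_cost(ARCHIVE_TB, 0.15, PRICE_PER_TB_MONTH)
high = redundant_storage_cost(ARCHIVE_TB, 0.30, PRICE_PER_TB_MONTH)
print(f"Estimated annual cost of duplicates: ${low:,.0f} to ${high:,.0f}")
```

Under those assumed inputs the duplicates would cost about $3,600 to $7,200 a year. The real figure depends on actual archive sizes and contract pricing, neither of which is stated here.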
Voices Shaping the Conversation
At City Hall, the conversation has reached the desk of the Chief Data Officer, whose office sits within the Department of Technology on Seventh Street. Technology policy advocates connected to organizations like Code for San Francisco, the volunteer civic tech brigade that has operated out of venues including the GitHub headquarters at 88 Colin P. Kelly Jr. Street, have been pushing for open-source deduplication tooling rather than expensive proprietary contracts. Their argument: the underlying algorithms are mature, widely available, and don't require six-figure vendor deals.
Legal observers tracking public records compliance note a separate but related concern. When a California Public Records Act request covers image files and the responsive set includes duplicates, the agency must still review each copy. That review time is billable to the requester under certain cost-recovery rules, and duplicate-heavy systems quietly inflate response costs for journalists, lawyers, and ordinary residents trying to obtain records. The San Francisco City Attorney's Office has not issued formal guidance on the deduplication question as of this writing, but attorneys familiar with CPRA practice say clarity would be welcome.
The Board of Supervisors' Government Audit and Oversight Committee is expected to take up broader digital records efficiency questions in its fall 2026 calendar. Budget and Legislative Analyst staff have the capacity to quantify storage waste in dollar terms if requested, and at least two supervisorial offices have expressed interest in making that request.
For city residents, the practical upshot is straightforward: cleaner databases mean faster, cheaper public records responses and a government IT infrastructure better positioned to absorb the AI-assisted tools that multiple departments are now piloting. The Department of Technology has indicated it will release a digital infrastructure roadmap before the end of calendar year 2026. That document will be the first real test of whether image deduplication moves from internal talking point to funded action.