San Francisco's municipal technology offices are sitting on a problem they can no longer afford to ignore: thousands of duplicate images buried inside city databases, inflating storage costs and creating real bottlenecks for departments trying to modernize their operations in 2026.
The issue has surfaced publicly this spring as the city's Department of Technology pushed forward with its DataSF platform consolidation, a multi-year project aimed at centralizing public records across more than two dozen agencies. Staff working on the migration flagged that duplicated image files (property inspection photos, permit documentation, and infrastructure survey pictures) accounted for a disproportionate share of the storage consumed on shared servers housed at the city's data center on Seventh Street in SoMa.
Why It Matters Now
The timing is pointed. San Francisco's Planning Department has been processing a record volume of permit applications under Mayor Daniel Lurie's housing production push, which inherited pressure from the state's mandate requiring the city to zone for roughly 82,000 new units by 2031. Every permit file includes attached photo documentation. When field inspectors upload images from multiple devices without deduplication protocols in place, the same photo can appear in a case file three or four times over — multiplying storage demand without adding any informational value.
BART's maintenance division faces a parallel headache. The agency's infrastructure teams photograph track conditions, tunnel walls, and station equipment across 131 miles of track. Technicians working out of the Oakland Maintenance Complex and the Daly City Yard have long used separate upload systems that don't talk to one another, meaning a single inspection walk can generate redundant image sets across two or more internal platforms. BART officials have acknowledged the broader IT consolidation challenge in public board meetings this year, though the agency has not released specific figures on storage costs tied to duplicate files.
At City Hall, the budget pressure is concrete. San Francisco's Department of Technology requested additional cloud storage funding in the fiscal year 2025-2026 budget cycle. While the department has not publicly broken out what share of that cost stems from duplicate data, data management specialists who work with municipal systems generally point to image duplication as one of the top three drivers of avoidable cloud storage spend in large government environments.
What Experts and Officials Are Recommending
The conversation has drawn in voices from San Francisco's tech sector, which is undergoing its own reckoning. After two years of layoffs that hollowed out mid-level engineering roles across companies clustered in the Mid-Market corridor and Mission Bay, the AI hiring wave has created a new class of local data infrastructure specialists. Some of those engineers have begun consulting with city agencies on exactly this kind of remediation work.
The San Francisco Digital Services team, which sits under the City Administrator's Office and has worked on projects including the city's 311 app rebuild, has been developing internal guidance on image hashing, a technique that computes a compact fingerprint for each file so that exact or near-exact duplicates can be flagged before they're written to permanent storage. The approach has been piloted in at least one city department this calendar year, though Digital Services has not yet published a formal rollout timeline.
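For readers curious what hash-based duplicate flagging can look like in practice, here is a minimal sketch in Python. It is not the Digital Services guidance, which has not been published; the names `UploadGate`, `content_digest`, `average_hash`, and the `max_distance` threshold are hypothetical, and the near-duplicate check assumes each image has already been decoded and downscaled to a small grayscale grid.

```python
import hashlib


def content_digest(data: bytes) -> str:
    """SHA-256 of the raw file bytes; byte-identical files share a digest."""
    return hashlib.sha256(data).hexdigest()


def average_hash(pixels: list[list[int]]) -> int:
    """Simple perceptual hash: one bit per pixel, set when the pixel is
    brighter than the grid's mean. Assumes `pixels` is a small grayscale
    grid (for example 8x8) produced by an earlier decode/downscale step."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


class UploadGate:
    """Hypothetical pre-storage check: classify each upload before writing it."""

    def __init__(self, max_distance: int = 2):
        self.max_distance = max_distance  # assumed tolerance, tune per dataset
        self.digests: set[str] = set()
        self.hashes: list[int] = []

    def check(self, data: bytes, pixels: list[list[int]]) -> str:
        digest = content_digest(data)
        if digest in self.digests:
            return "exact-duplicate"
        ahash = average_hash(pixels)
        if any(hamming(ahash, h) <= self.max_distance for h in self.hashes):
            # Flagged for review rather than stored in this sketch.
            return "near-duplicate"
        self.digests.add(digest)
        self.hashes.append(ahash)
        return "stored"
```

The two checks catch different cases: the byte-level digest catches the same file uploaded twice, while the perceptual hash is meant to catch a photo that was re-saved or recompressed on a second device. A production system would likely use an indexed lookup instead of the linear scan shown here.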
The Controller's Office, which tracks city IT spending, has previously recommended that departments adopt deduplication tools as part of any major data migration, citing cost-efficiency goals tied to the city's broader cloud-first infrastructure policy adopted in 2023.
For residents and businesses interacting with city permitting — particularly property owners navigating the Planning Department's intake process at 49 South Van Ness Avenue — the practical downstream effect of cleaner databases would be faster case processing times and fewer instances of inspectors being sent to retrieve documentation that was already on file.
The DataSF consolidation project is expected to reach its next major milestone in the third quarter of 2026. Officials say deduplication protocols are on the agenda for that phase, though advocates for city IT modernization have pressed for a firmer public commitment and a defined budget line before the next fiscal year begins in July 2026.