The Daily San Francisco

San Francisco news, every day

How San Francisco's City Agencies Ended Up Drowning in Duplicate Digital Images — and What They're Doing About It

Years of siloed databases, rapid departmental growth, and a pandemic-era scramble to go digital left the city's records systems bloated with redundant files that now cost real money to store and manage.

By San Francisco News Desk · Published 4 July 2026, 11:58 am

San Francisco's municipal technology infrastructure is carrying a problem that quietly accumulated over roughly two decades: thousands of duplicate digital images embedded across city department databases, permit portals, and public-records archives, creating storage overhead that analysts say is both costly and operationally disruptive. The San Francisco Department of Technology, based at 1 South Van Ness Avenue, confirmed earlier this year that a citywide audit of shared digital assets was underway, though specific remediation timelines have not yet been made public.

The issue matters now because the city is in the middle of an aggressive push to digitize permit approvals, housing inspection records, and public benefit applications, work that runs through the SF Planning Department on Corridor Street and through the Department of Building Inspection on Duboce Avenue. Duplicate images in already-strained systems compound storage costs and slow search retrieval, which in turn affects how quickly a contractor can pull a permit or how fast a caseworker can access a client file.

How the Backlog Built Up

The roots of the problem stretch back to the early 2000s, when individual city departments built their own document management systems with little coordination. The SF Public Utilities Commission, the Recreation and Parks Department, and various health agencies each procured separate platforms, and image files — inspection photos, site surveys, license scans — were uploaded repeatedly across systems with no deduplication layer in place. When the city accelerated its migration to cloud infrastructure around 2019 and 2020, many of those legacy files were bulk-transferred without cleanup, effectively copying the redundancy into the new environment.

The COVID-19 pandemic made things worse. Between March 2020 and the end of 2021, city departments scrambled to digitize paper workflows so that staff working remotely could access them. The Department of Public Health, which operates facilities across the city including Zuckerberg San Francisco General Hospital on Potrero Avenue, was among those that rapidly scanned and uploaded large volumes of documents. Speed was prioritized over data hygiene, and duplicate images multiplied.

Cloud storage is not free. Industry pricing from major providers in 2025 placed standard object storage at roughly $0.02 per gigabyte per month — a figure that sounds trivial until a city organization is managing tens of millions of image files, some duplicated three or four times over. For context, the city's annual technology budget for fiscal year 2025-26 was approved at approximately $150 million, and storage inefficiencies represent a real, if hard-to-isolate, line item within that figure. The Department of Technology has not publicly broken out what percentage of that budget goes to redundant data storage.
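To make the arithmetic concrete, here is a rough back-of-the-envelope sketch. Only the $0.02 per gigabyte per month price comes from the figures above; the file count, average file size, and number of copies are hypothetical inputs chosen for illustration, not numbers reported by the city.

```python
def monthly_storage_cost(unique_files, avg_file_mb, copies_per_file, price_per_gb_month):
    """Estimate monthly storage cost in dollars.

    Returns (total_cost, redundant_cost), where redundant_cost is the
    share attributable to copies beyond the first.
    """
    # Treat 1 GB as 1,000 MB to keep the estimate simple.
    unique_gb = unique_files * avg_file_mb / 1000
    total_gb = unique_gb * copies_per_file
    redundant_gb = total_gb - unique_gb
    return total_gb * price_per_gb_month, redundant_gb * price_per_gb_month


# Hypothetical inputs: 20 million distinct images, 3 MB each, stored 3 times.
total, redundant = monthly_storage_cost(
    unique_files=20_000_000,
    avg_file_mb=3.0,
    copies_per_file=3,
    price_per_gb_month=0.02,  # price quoted in the article
)
print(f"Total per month:     ${total:,.2f}")
print(f"Redundant per month: ${redundant:,.2f}")
```

The estimate covers raw storage only. It leaves out request fees, backups, and the staff time and slower searches that duplicates also cause, so it should be read as a floor on the cost under these assumed inputs, not as an audit figure.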

The Push for a Fix

The current remediation effort is being driven in part by pressure from the City Controller's Office, which identified digital asset management as an area ripe for cost savings in its most recent operational efficiency review. The plan, as described in departmental briefings available through the city's public records portal, involves deploying hash-based deduplication tools — software that generates a unique fingerprint for each image file and flags exact or near-exact matches for human review before deletion.
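For readers curious what hash-based deduplication looks like in practice, the sketch below shows one common approach. It is an illustration only, not the city's tooling: the function names are hypothetical, and it uses SHA-256 from Python's standard library, which catches byte-for-byte identical files. Flagging near-exact matches, such as a resized or re-compressed photo, would require a different technique like perceptual hashing, which is not shown here.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_fingerprint(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root whose contents are byte-for-byte identical."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[file_fingerprint(path)].append(path)
    # Keep only fingerprints shared by two or more files.
    # Nothing is deleted here; the groups are meant for human review.
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling `find_exact_duplicates(Path("archive"))` on a hypothetical `archive` folder would return lists of matching files. The sketch only reports matches and leaves deletion to a person, in line with the human review step described in the briefings.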

Several city departments are piloting the approach first. SF Environment, headquartered on Jessie Street in SoMa, and the Office of Economic and Workforce Development on Mission Street are among the early test cases, partly because their image archives are smaller and the risk of accidental deletion lower. The lessons learned there are expected to inform a broader rollout to higher-stakes systems like building permits and health records in 2027.

For residents and businesses, the practical upshot is straightforward: faster online portals, fewer error messages when uploading documents to city sites, and — if the deduplication project delivers on its projections — marginal but real savings redirected toward frontline services. Anyone who has tried to submit an application through SF's online permit system on a busy weekday afternoon knows the system's current limitations. The city's technology team is betting that cleaning up what's already in the pipeline is the fastest path to making it run better.

About this article

Published by The Daily San Francisco

This article was produced by The Daily San Francisco editorial desk and covers news in San Francisco. See our editorial standards for how we use AI.
