San Francisco's public agencies are sitting on tens of thousands of duplicate digital images across government databases, archival servers, and permit-tracking systems — and a growing chorus of city technologists, archivists, and open-government advocates says the redundancy is costing money, slowing access to public records, and undermining the reliability of the city's digital infrastructure.
The problem has been building for years, but it has sharpened lately as the city's Department of Technology pushes a broader cloud-migration initiative and as the Office of the City Clerk works to digitize decades of paper records held at City Hall, at 1 Dr. Carlton B. Goodlett Place. Storage costs for unmanaged, redundant image files have drawn scrutiny from the Budget and Legislative Analyst's office, which reviews departmental IT spending as part of the annual appropriations cycle.
Why This Is Landing on Desks Now
The immediate pressure comes from two directions. First, SFMTA's Muni transit division has been building out a real-time camera monitoring network across its bus and rail lines as part of a safety modernization program. That buildout generates enormous volumes of image data and has exposed the absence of any citywide deduplication standard. Second, the San Francisco Public Library's San Francisco History Center, housed in the Main Library on Larkin Street, is midway through a multi-year effort to digitize its photographic collections and has flagged that inconsistent file-naming conventions across contributing agencies are causing the same images to be ingested more than once, at scale.
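To illustrate why file names alone cannot prevent duplicate ingestion, here is a minimal sketch of exact-duplicate detection by content hash. It is not drawn from any city system: the function names are hypothetical, and it assumes the files are readable on a local file system. It catches only byte-for-byte copies, not resized or re-encoded versions of the same picture.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, Iterable, List


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_exact_duplicates(paths: Iterable[Path]) -> Dict[str, List[Path]]:
    """Group files by content digest and keep only groups with more than one file."""
    groups: Dict[str, List[Path]] = defaultdict(list)
    for path in paths:
        groups[file_digest(path)].append(path)
    return {digest: files for digest, files in groups.items() if len(files) > 1}
```

Because the digest depends only on file contents, two agencies that save the same scan under different names would land in the same group.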
Experts in digital asset management say the city is far from alone. Municipal governments across the country have struggled with the problem since the shift to digital photography in the 2000s, which produced image files that were cheap to create, easy to copy, and rarely audited. But San Francisco's particular challenge is the sheer number of siloed systems (planning, public works, police, the port), each running a separate document management platform with no shared deduplication layer on top.
Representatives of the city's Department of Technology have described the problem in general terms during public IT governance meetings, characterizing it as a known technical debt issue tied to legacy procurement decisions made before cloud storage became standard. No department head has publicly committed to a specific remediation timeline.
What the Experts and Advocates Are Recommending
Digital preservation specialists consulting with the San Francisco Public Library Foundation — a nonprofit that funds programs at all 28 branch libraries — have pushed for the adoption of a perceptual hashing standard, a technique that identifies visually identical or near-identical images without requiring exact file matches. The approach is already used by large media organizations and social platforms to flag redundant assets automatically.
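As a rough illustration of the technique, the sketch below implements an average hash, one of the simplest perceptual hashes, in plain Python. It is a sketch under stated assumptions rather than anything the city or its consultants have specified: the function names, the 8x8 grid, and the five-bit match threshold are all illustrative, and the input is assumed to be an image already decoded into a grid of grayscale values.

```python
from typing import List, Sequence

Gray = Sequence[Sequence[int]]  # rows of grayscale values, 0-255 (assumed input format)


def shrink(pixels: Gray, size: int = 8) -> List[List[float]]:
    """Downsample to a size x size grid by averaging rectangular blocks of pixels."""
    height, width = len(pixels), len(pixels[0])
    if height < size or width < size:
        raise ValueError("image is smaller than the hash grid")
    grid = []
    for r in range(size):
        r0, r1 = r * height // size, (r + 1) * height // size
        row = []
        for c in range(size):
            c0, c1 = c * width // size, (c + 1) * width // size
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            row.append(sum(block) / len(block))
        grid.append(row)
    return grid


def average_hash(pixels: Gray, size: int = 8) -> int:
    """Build a size*size-bit hash: each bit is 1 if its cell is brighter than the mean."""
    cells = [value for row in shrink(pixels, size) for value in row]
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Gray, b: Gray, max_distance: int = 5) -> bool:
    """Treat two images as near-duplicates if their hashes differ in few bits.

    The threshold of 5 is an illustrative assumption, not a recommended standard.
    """
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance
```

Unlike a cryptographic checksum, this hash changes little when an image is slightly brightened or resized, which is what lets it flag near-identical copies. A real deployment would more likely rely on an established imaging library and a more robust hash, with thresholds tuned against the actual collection.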
OpenSF, a civic-tech advocacy group that monitors city data transparency, has argued in public comment sessions that any deduplication project must be paired with a public audit of what images are being retained, how long they are kept, and under what legal authority, particularly for images captured by city surveillance systems. The group notes that the California Public Records Act, reinforced in 2004 when Proposition 59 wrote a right of access to government records into the state constitution, gives residents standing to challenge opaque retention practices.
The price tag for a serious remediation effort is not trivial. Cloud-based deduplication and metadata standardization projects for mid-sized municipal archives have run between $400,000 and $1.2 million depending on scope, according to published procurement records from comparable jurisdictions including Denver and Boston. San Francisco's situation is complicated by the number of departments involved, each of which would require separate negotiation and potentially separate contract vehicles under the city's purchasing rules.
The Board of Supervisors' Government Audit and Oversight Committee has not yet scheduled a formal hearing on the matter, but staff for at least two supervisors have requested briefings from the Department of Technology before the fall budget-implementation cycle begins in September 2026. If those conversations produce a formal directive, departments could be required to submit deduplication plans as part of their annual technology assessments — a process that already touches every major city agency. Without that mandate, advocates say, the problem will keep compounding one file at a time.