The Daily San Francisco

How San Francisco's Public Records Became a Duplicate Image Problem — and Why Nobody Fixed It Sooner

Years of siloed city databases, underfunded IT departments, and rapid-fire agency mergers left San Francisco's digital archives riddled with redundant imagery that now costs real money to store and search.

By San Francisco News Desk · Published 4 July 2026, 11:51 am

Photo: Sons of the American revolution. California society Hubbard, Adolphus S., [from old catalog] comp / Public domain (Wikimedia Commons)

San Francisco's municipal digital infrastructure holds tens of thousands of duplicate photographs — the same pothole, the same shelter entrance on Seventh Street, the same broken Muni escalator at Powell Station — stored multiple times across at least a dozen separate city databases. That redundancy is not a minor housekeeping issue. It is the accumulated consequence of two decades of decisions that prioritized speed of deployment over data discipline, and officials are only now beginning to grapple seriously with the cleanup.

The timing matters. The city's Department of Technology, based on South Van Ness Avenue, is in the middle of a multi-year cloud migration that was accelerated after a 2023 Controller's Office audit found significant inefficiencies in legacy on-premises storage contracts. Migrating bloated, duplicate-heavy archives to cloud infrastructure multiplies costs rather than reducing them, because each redundant image file that travels to a new cloud environment is a line item on a vendor invoice.

How the Duplication Accumulated

The problem did not arrive overnight. Between 2010 and 2020, the city launched or absorbed at least seven major citizen-facing digital platforms — including SF311, the DataSF open data portal, the Department of Public Works asset-management system, and the Planning Department's environmental review archive — each of which ingested photographs independently, with no shared deduplication protocol. When a resident submitted a complaint through SF311 about a sidewalk crack on Valencia Street in the Mission, that image might also appear in a DPW work order, a Public Works inspection record, and a Planning-adjacent streetscape file, all stored separately.

The problem compounded after 2020 when the Department of Homelessness and Supportive Housing, working to document shelter conditions across Navigation Centers including the facility on Division Street, began uploading site photographs to its own SharePoint environment without cross-referencing existing city image libraries. Similar patterns emerged inside the San Francisco Municipal Transportation Agency, where engineering teams photographing track infrastructure at West Portal and Embarcadero stations maintained their own local archives with no automated checks against SFMTA's central repository.

A 2024 report from the Budget and Legislative Analyst's Office — whose findings are a matter of public record — noted that fragmented data governance across city departments was a recurring obstacle to digital modernization. While that report did not address duplicate images specifically, it identified interoperability failures between departmental systems as a structural issue costing the city in both staff hours and storage overhead.

What a Fix Actually Requires

Duplicate image replacement — the process of identifying redundant files, designating a single canonical version, replacing all references to duplicates with pointers to that canonical file, and deleting the rest — is technically straightforward but organizationally complex. It requires department heads to agree on which version of an image is authoritative, a question that sounds trivial until a Planning file and a DPW file contain the same photograph tagged with different metadata, different dates, and different project codes.
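The four steps described above can be illustrated with a minimal Python sketch. It is not the city's tool or any vendor's method: the `ImageRecord` fields, the function names, and the rule that the earliest upload becomes the canonical copy are all assumptions made for illustration. It also detects only byte-identical files, using SHA-256 hashes; copies that have been resized or re-encoded would need a different technique, such as perceptual hashing.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRecord:
    """Hypothetical record; field names are illustrative, not a city schema."""
    path: str       # where the file lives, e.g. "sf311/crack.jpg"
    uploaded: str   # ISO date string, so string comparison sorts by time
    content: bytes  # raw file bytes


def plan_deduplication(records):
    """Group byte-identical images and plan the pointer replacements.

    Returns (redirects, to_delete): redirects maps each duplicate path to
    its canonical path, and to_delete lists the duplicate paths.
    """
    # Step 1: identify redundant files by hashing their contents.
    groups = defaultdict(list)
    for rec in records:
        digest = hashlib.sha256(rec.content).hexdigest()
        groups[digest].append(rec)

    redirects = {}
    for group in groups.values():
        # Step 2: designate a canonical version. Assumed rule: the earliest
        # upload wins, with the path as a deterministic tie-breaker.
        ordered = sorted(group, key=lambda r: (r.uploaded, r.path))
        canonical = ordered[0]
        for dup in ordered[1:]:
            redirects[dup.path] = canonical.path

    # Step 4 is only planned here; nothing is deleted by this sketch.
    to_delete = sorted(redirects)
    return redirects, to_delete


def rewrite_references(references, redirects):
    """Step 3: point every reference at the canonical file."""
    return {ref_id: redirects.get(path, path)
            for ref_id, path in references.items()}


if __name__ == "__main__":
    demo = [
        ImageRecord("sf311/crack.jpg", "2019-03-01", b"same-bytes"),
        ImageRecord("dpw/wo-17.jpg", "2019-03-04", b"same-bytes"),
        ImageRecord("planning/street.jpg", "2019-05-10", b"other-bytes"),
    ]
    redirects, to_delete = plan_deduplication(demo)
    print(redirects)
    print(to_delete)
```

The sketch deliberately stops at a plan rather than deleting anything, since the hard part in practice is the organizational one: the tie-breaking rule in step 2 is exactly the decision department heads would have to agree on, and conflicting metadata on two copies would still need to be reconciled before either is removed.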

The Department of Technology began piloting a deduplication tool in late 2025 on a subset of SF311 image archives, according to publicly available procurement records on the city's supplier portal. That pilot covered roughly 400,000 image files and was contracted to a vendor for an amount the city disclosed in its October 2025 Board of Supervisors budget supplement. Full citywide rollout has not been scheduled.

The stakes are practical. Cloud storage costs are significant at municipal scale, and as the city pushes forward with AI-assisted services, including a pilot chatbot for permit inquiries managed through the SF Digital Services office on Market Street, the quality and cleanliness of the underlying image data directly affect how well those tools perform. Garbage in, garbage out remains as true for a 2026 AI system as it was for the first city databases built in the 1990s.

For residents, the immediate implication is simpler: if you have ever filed the same complaint twice through SF311 because you were not sure the first one registered, there is a reasonable chance two photographs of your problem are sitting in separate folders somewhere on a city server, each adding a few cents per month to San Francisco's growing cloud bill. Multiplied across millions of submissions over many years, those cents add up.

About this article

Published by The Daily San Francisco

This article was produced by The Daily San Francisco editorial desk and covers news in San Francisco. See our editorial standards for how we use AI.
