SF's Duplicate Image Problem: Key Decisions Ahead as City Rethinks Its Visual Archives
San Francisco's public agencies and nonprofits are sitting on thousands of redundant digital images — and the clock is ticking on who pays to fix it.

San Francisco's city departments and major cultural institutions are confronting a mounting backlog of duplicate digital images stored across fragmented servers, a problem that has quietly ballooned and that AI-era tools have only now made cheap enough to detect in bulk. The question now is not whether to clean house, but who decides what gets deleted, who owns the canonical copy, and how much of the Municipal Archives' decades of digitized material survives the cut.
The timing matters because San Francisco is mid-stream on several overlapping technology modernization pushes. The city's Department of Technology, headquartered at 1 Dr. Carlton B. Goodlett Place in Civic Center, has been consolidating legacy storage contracts since 2024. Meanwhile, the San Francisco Public Library's Digital Collections program, based out of the main branch on Larkin Street, has been ingesting tens of thousands of historical photographs, neighborhood survey images, and program documentation — much of it scanned multiple times by different grant-funded projects that never coordinated with each other.
Duplicate image management sounds like a back-office IT headache, but in a city where visual documentation of displacement, street conditions, and neighborhood change feeds directly into homelessness policy reports and housing emergency declarations, the stakes are real. A photograph misfiled or deleted without a proper deaccession review can vanish permanently from the public record.
The San Francisco Arts Commission, which maintains image archives for hundreds of publicly funded murals and installations across the Mission District, the Tenderloin, and Bayview-Hunters Point, has at least three separate cataloging systems in use simultaneously, according to city budget documents reviewed during the 2025-26 fiscal year budget process. Each system has generated its own thumbnail copies, web-resolution exports, and print-quality masters, meaning a single mural photograph can exist in six or more derivative versions spread across incompatible databases.
The cost of cloud storage is not trivial. Commercial object storage rates run roughly $0.023 per gigabyte per month at standard tiers as of mid-2026. For an archive holding several hundred terabytes of image data — a realistic figure for a city department with 15 years of digital accumulation — that translates to tens of thousands of dollars annually just for redundant copies that serve no distinct archival purpose.
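For readers who want to check the arithmetic, here is a rough sketch of the calculation. The $0.023 rate comes from the figure above. The 300-terabyte archive size and the 30 percent redundancy share are illustrative assumptions, not numbers reported by any city department. Under those assumptions, 300,000 gigabytes at $0.023 per month over 12 months comes to $82,800 a year for the whole archive, of which about $24,840 would go to redundant copies.

```python
# Back-of-envelope storage cost estimate.
# RATE_PER_GB_MONTH is the standard-tier rate quoted in the article.
# The archive size and redundancy share used below are assumptions.
GB_PER_TB = 1000  # decimal terabytes, as cloud providers typically bill
RATE_PER_GB_MONTH = 0.023


def annual_storage_cost(terabytes, share=1.0, rate=RATE_PER_GB_MONTH):
    """Yearly cost in dollars of storing `share` of an archive of `terabytes`."""
    gigabytes = terabytes * GB_PER_TB * share
    return gigabytes * rate * 12


# Hypothetical 300 TB archive in which 30% of stored bytes are redundant.
full_cost = annual_storage_cost(300)
redundant_cost = annual_storage_cost(300, share=0.3)

print(f"Whole archive:    ${full_cost:,.0f} per year")
print(f"Redundant copies: ${redundant_cost:,.0f} per year")
```

Changing the assumed size or redundancy share shifts the total proportionally, which is why the real figure depends on audits the departments have not yet published.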
Three decisions are now live and will shape what San Francisco's digital image infrastructure looks like by early 2027. First, the city's Digital Services office is expected to finalize vendor selection for a unified digital asset management platform before the end of September 2026. That contract will determine whether departments are required to migrate into a single deduplication-capable system or are allowed to maintain parallel archives with federated search on top.
Second, the San Francisco Public Library is conducting an internal audit of its Digital Collections holdings this summer, with a working group that includes staff from the San Francisco History Center on the sixth floor of the main library. Librarians there are pressing for a community review process before any bulk deletion occurs, arguing that images flagged as duplicates by automated tools often carry distinct metadata — a different photographer credit, a different acquisition date — that makes them independently valuable for researchers tracing the Western Addition's redevelopment history or documenting the evolution of the Castro neighborhood over decades.
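The librarians' concern can be illustrated with a minimal sketch of how an automated sweep might separate true duplicates from files that only look redundant. This is not the library's or any vendor's actual workflow. The record fields, the `triage_duplicates` function, and the sample data are all hypothetical. The sketch also matches only byte-identical files using a SHA-256 hash, so it would not catch resized or re-scanned versions of the same photograph.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRecord:
    """Hypothetical catalog entry: file bytes plus two metadata fields."""
    record_id: str
    content: bytes
    photographer: str
    acquisition_date: str


def triage_duplicates(records):
    """Group byte-identical files, then split the groups in two:
    those whose metadata also matches, and those needing human review."""
    groups = defaultdict(list)
    for rec in records:
        digest = hashlib.sha256(rec.content).hexdigest()
        groups[digest].append(rec)

    same_metadata, needs_review = [], []
    for recs in groups.values():
        if len(recs) < 2:
            continue  # a file with no byte-identical twin is not a duplicate
        metadata = {(r.photographer, r.acquisition_date) for r in recs}
        ids = [r.record_id for r in recs]
        if len(metadata) == 1:
            same_metadata.append(ids)
        else:
            needs_review.append(ids)  # identical pixels, different provenance
    return same_metadata, needs_review


# Invented sample data for illustration only.
sample = [
    ImageRecord("a", b"scan-1", "Unknown", "1998-04-01"),
    ImageRecord("b", b"scan-1", "Unknown", "1998-04-01"),
    ImageRecord("c", b"scan-2", "Staff", "2003-06-15"),
    ImageRecord("d", b"scan-2", "Donor collection", "2011-09-30"),
    ImageRecord("e", b"scan-3", "Staff", "2015-01-20"),
]
same_metadata, needs_review = triage_duplicates(sample)
print("Identical files and metadata:", same_metadata)
print("Identical files, differing metadata:", needs_review)
```

In this sketch, records "c" and "d" hold the same file but carry different credits and acquisition dates, so they land in the review pile rather than the deletion queue. That is the distinction the working group is asking the city to preserve.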
Third, the Arts Commission is in preliminary talks with the Exploratorium at Pier 15, which has developed its own image deduplication workflow for scientific and educational content, about whether that methodology could be adapted for public art documentation. No formal agreement exists yet.
For residents and researchers who depend on these archives — genealogists working at the History Center, urban planners pulling street-level images from the Planning Department's permit databases, journalists reconstructing the timeline of encampment clearances along Division Street — the practical advice is the same: submit public records requests now for specific image sets you need, before any consolidation process begins. Once a deduplication sweep runs and deletions are authorized, retrieval from backup tapes is slow, expensive, and not guaranteed.
About this article
Published by The Daily San Francisco