San Francisco's Department of Technology is sitting on a problem that has compounded quietly for years: tens of thousands of duplicate digital images spread across municipal servers, bloating storage costs and making it harder for city departments to locate authentic public records. The issue surfaced formally this spring when the city's OpenData program flagged redundant image files numbering in the high five figures during a routine audit of assets tied to the SF Planning Department and the Office of the City Clerk.
The timing matters. San Francisco is in the middle of a broader push to digitize planning records along the Eastern Neighborhoods corridor — areas like the Mission District and Dogpatch — as the city tries to accelerate housing approvals under state-mandated production targets. A cluttered archive slows permit researchers, community groups and journalists trying to trace the planning history of specific parcels. When duplicate images share file names but carry different metadata, the wrong version can surface first, creating errors in official documentation.
What Officials and Experts Are Saying
City Archivist Susan Goldstein's office at the San Francisco History Center, housed inside the Main Branch of the San Francisco Public Library on Larkin Street, has spent the past several months advising the Department of Technology on de-duplication protocols. Her team's concern, according to city records reviewed by The Daily San Francisco, is that automated deletion tools built primarily on pixel-matching algorithms can conflate near-identical images that are actually distinct historical records. Two photographs taken seconds apart at a Planning Commission hearing, for instance, may document two different speakers at the podium.
That caution is shared by archivists at the Bancroft Library at UC Berkeley, who have consulted informally with the city. The professional consensus, reflected in guidelines published by the Society of American Archivists, favors human review of any flagged batch before deletion, particularly for records older than 10 years. The SF Planning Department's digital records stretch back to scanned files from the late 1990s, some of which have no surviving analog counterpart.
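To make the archivists' distinction concrete, here is a minimal Python sketch of a review-first workflow. It is an illustration under stated assumptions, not a description of any tool the city or a vendor uses: the `ImageRecord` type, the file names and the `build_review_queue` function are all hypothetical. It groups only byte-identical files (by SHA-256 digest), so two hearing photos taken seconds apart never land in the same group, and it deletes nothing. Every group goes to a human queue, with records older than 10 years listed first.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass

REVIEW_AGE_YEARS = 10  # age threshold mentioned in the guidance described above


@dataclass(frozen=True)
class ImageRecord:
    name: str   # hypothetical file name
    data: bytes  # raw file contents
    year: int   # year the record was created


def build_review_queue(records, current_year):
    """Group byte-identical files and queue every group for human review.

    Nothing is deleted here. Files that merely look alike have different
    digests, so they are never grouped together.
    """
    by_digest = defaultdict(list)
    for rec in records:
        by_digest[hashlib.sha256(rec.data).hexdigest()].append(rec)

    queue = []
    for group in by_digest.values():
        if len(group) < 2:
            continue  # unique content, nothing to de-duplicate
        names = sorted(r.name for r in group)
        oldest = min(r.year for r in group)
        priority = "high" if current_year - oldest > REVIEW_AGE_YEARS else "standard"
        queue.append((priority, names))

    # High-priority (older) groups first, then alphabetical by file names.
    queue.sort(key=lambda item: (item[0] != "high", item[1]))
    return queue


if __name__ == "__main__":
    sample = [
        ImageRecord("hearing_001.jpg", b"frame-A", 1999),
        ImageRecord("hearing_001_copy.jpg", b"frame-A", 1999),
        ImageRecord("hearing_002.jpg", b"frame-B", 1999),  # near-duplicate, left alone
        ImageRecord("permit.png", b"scan", 2022),
        ImageRecord("permit_copy.png", b"scan", 2022),
    ]
    for priority, names in build_review_queue(sample, current_year=2026):
        print(priority, names)
```

Exact hashing is the conservative choice: it misses re-encoded or resized copies, but it cannot mistake one photograph for another. A pixel-similarity or machine-learning step, if adopted, would feed the same review queue rather than bypass it.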
Tech sector voices are pushing in a different direction. Several firms with offices in SoMa and the Mid-Market corridor have pitched AI-assisted de-duplication tools to the city's procurement office this year, arguing that machine-learning models can distinguish between truly redundant files and near-duplicates with archival value faster and more accurately than manual workflows. The city issued a Request for Information on the topic in March 2026, drawing responses from at least a dozen vendors. No contract has been awarded as of July 4.
Costs, Timelines and the Pressure to Act
Storage is not cheap. San Francisco's Department of Technology spent approximately $4.2 million on cloud and on-premises storage in fiscal year 2025, according to the city's published budget documents. Administrators estimate that duplicate and redundant files account for a meaningful share of that load, though a precise percentage has not been publicly released. Even a 10 to 15 percent reduction in stored data would translate to measurable savings at that spending level.
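For a rough sense of scale, the arithmetic can be sketched in a few lines of Python. This assumes, purely for illustration, that storage cost falls in direct proportion to the data removed. Real bills may not behave that way, since on-premises hardware is largely a fixed cost and cloud pricing is often tiered. The function name is hypothetical.

```python
ANNUAL_STORAGE_SPEND = 4_200_000  # FY2025 figure cited above, in dollars


def estimated_savings(spend_dollars, reduction_percent):
    """Savings in whole dollars, ASSUMING cost scales linearly with data stored."""
    return spend_dollars * reduction_percent // 100


if __name__ == "__main__":
    for pct in (10, 15):
        savings = estimated_savings(ANNUAL_STORAGE_SPEND, pct)
        print(f"{pct}% reduction: ${savings:,}")
    # 10% reduction: $420,000
    # 15% reduction: $630,000
```

Under that linear assumption, the range works out to roughly $420,000 to $630,000 a year. The actual figure depends on the unreleased share of storage that duplicates occupy.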
The Board of Supervisors' Government Audit and Oversight Committee is expected to take up the issue at a hearing tentatively scheduled for late July. Supervisor Myrna Melgar, whose district includes Forest Hill, has been among those asking the Department of Technology for clearer metrics on how the redundancy problem developed and what safeguards would accompany any deletion initiative.
For residents and neighborhood groups trying to research property histories — a common need in rapidly changing areas like Japantown and the Tenderloin — the practical advice from city librarians is straightforward: if you need a specific planning or permit image for an ongoing project, request a certified copy from the Office of the City Clerk now, before any de-duplication work begins. The clerk's office at City Hall, Room 168, can process most image records within five business days. Archivists also recommend cross-referencing requests with the SF Planning Department's online permit portal, which carries its own image repository that will not be subject to the first phase of deletions expected later this year.