An automated deduplication process deployed across San Francisco's digital public archive system has deleted thousands of neighbourhood photographs over the past several months — and the people most affected say they found out only when they went looking for images that no longer existed. The San Francisco Public Library's San Francisco History Center, headquartered at the Civic Center branch on Larkin Street, has received a growing volume of complaints from residents, community groups and local historians since early 2026.
The problem is rooted in how image-management software flags visually similar files. When two photographs register above a certain similarity threshold, the system marks one as a duplicate and removes it from the public-facing catalogue. But critics argue the threshold is too aggressive, and that photographs taken moments apart — or from slightly different angles — document genuinely distinct moments in a neighbourhood's life. In dense, fast-changing cities like San Francisco, that distinction matters enormously.
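To show how a similarity threshold can treat two distinct photographs as one, here is a minimal sketch. The article does not say which algorithm the library's software uses, so the average-hash comparison, the function names, the tiny 4x4 grids and both threshold values are assumptions chosen for illustration only.

```python
from typing import List

# Hypothetical stand-in for a photograph downscaled to a small grayscale grid.
Grid = List[List[int]]


def average_hash(pixels: Grid) -> List[int]:
    """One bit per pixel: 1 if the pixel is brighter than the image mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def similarity(a: Grid, b: Grid) -> float:
    """Fraction of hash bits that match (1.0 means identical hashes)."""
    hash_a, hash_b = average_hash(a), average_hash(b)
    if len(hash_a) != len(hash_b):
        raise ValueError("images must be downscaled to the same size")
    matches = sum(1 for x, y in zip(hash_a, hash_b) if x == y)
    return matches / len(hash_a)


def is_flagged_duplicate(a: Grid, b: Grid, threshold: float) -> bool:
    """Assumed rule: flag the pair when similarity meets the threshold."""
    return similarity(a, b) >= threshold


# Two frames of the same wall at different stages: one small region differs.
frame_early = [
    [200, 200, 200, 200],
    [200, 200, 200, 200],
    [50, 50, 50, 50],
    [50, 50, 50, 50],
]
frame_later = [
    [200, 200, 200, 200],
    [200, 200, 50, 50],
    [50, 50, 50, 50],
    [50, 50, 50, 50],
]

score = similarity(frame_early, frame_later)
aggressive = is_flagged_duplicate(frame_early, frame_later, threshold=0.85)
conservative = is_flagged_duplicate(frame_early, frame_later, threshold=0.95)
```

In this toy case the two frames differ in a single region. The looser threshold flags them as duplicates and the stricter one keeps both, which is the trade-off critics are pointing to.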
Voices from the Mission, Chinatown and Beyond
Residents from several San Francisco neighbourhoods have been particularly vocal. Community members affiliated with the Chinese Historical Society of America, based on Clay Street in the heart of Chinatown, say photographs documenting the 1999 renovation of Portsmouth Square and street-level scenes from the 2003 SARS response period have been affected. Some images existed in only one copy in the public system, uploaded years ago by volunteers who have since moved away or died.
In the Mission District, members of the Precita Eyes Mural Arts Association on 24th Street say they submitted a formal concern to library administrators in March. They had discovered that a series of sequential photographs documenting the painting of a specific Valencia Street mural, images that together showed the work's progression over several weeks, had been collapsed into a single representative file. The other images were gone from the public portal. Precita Eyes has been cataloguing Mission murals since 1977, and volunteers described the loss as the kind that compounds quietly over time, unnoticed until someone actually needs the record.
Residents in the Bayview-Hunters Point neighbourhood flagged a separate issue: photographs submitted through the San Francisco Recreation and Parks Department's community documentation initiative — a program that ran between 2021 and 2023 and invited residents to upload images of parks and open spaces — appear to have been disproportionately affected. The initiative gathered more than 4,200 images across 47 parks. Community members say a significant portion of submissions from Candlestick Point and Gilman Playground are no longer retrievable through the public interface.
What the Archive System Does — and Doesn't — Catch
Image deduplication is standard practice in large digital repositories. The logic is straightforward: storage costs money, redundant files slow systems down, and archives accumulate duplicates over time through repeated uploads. But archivists and digital preservation specialists have long warned that visual similarity is not the same as informational redundancy — especially in historical collections, where metadata, timestamps and context give near-identical images very different evidentiary value.
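The gap between visual similarity and informational redundancy can be sketched as a check that looks at metadata as well as pixels. This is a hypothetical illustration, not the library's method: the `ArchiveImage` fields, the 0.95 threshold and the one-second capture gap are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ArchiveImage:
    # Hypothetical record; a real catalogue would carry more fields.
    item_id: str
    captured_at: datetime
    contributor: str
    caption: str


def is_informationally_redundant(
    a: ArchiveImage,
    b: ArchiveImage,
    visual_similarity: float,
    threshold: float = 0.95,                    # assumed value
    max_gap: timedelta = timedelta(seconds=1),  # assumed value
) -> bool:
    """Treat a pair as redundant only if pixels AND context both agree."""
    if visual_similarity < threshold:
        return False
    same_moment = abs(a.captured_at - b.captured_at) <= max_gap
    same_context = a.contributor == b.contributor and a.caption == b.caption
    return same_moment and same_context


week_one = ArchiveImage("m-001", datetime(2019, 5, 1, 10, 0), "volunteer-a", "Mural, week 1")
week_three = ArchiveImage("m-014", datetime(2019, 5, 15, 10, 0), "volunteer-a", "Mural, week 3")
reupload = ArchiveImage("m-099", datetime(2019, 5, 1, 10, 0), "volunteer-a", "Mural, week 1")

progress_pair = is_informationally_redundant(week_one, week_three, visual_similarity=0.97)
true_duplicate = is_informationally_redundant(week_one, reupload, visual_similarity=1.0)
```

Under these assumptions, two near-identical frames taken two weeks apart are both kept, while a repeated upload of the same photograph is still treated as a duplicate.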
The San Francisco Public Library system holds more than 200,000 digitised items in its historical collections, according to figures published on the library's own digital collections portal. Residents and volunteers have contributed a meaningful share of those items through public submission programs. Without a robust review process before deletion — or a recovery window afterward — community-contributed photographs are at particular risk, because they typically lack the institutional provenance that might trigger a manual review.
The library had not responded to a request for comment by publication time on Saturday. The city's Office of Digital Services, which oversees some of the infrastructure involved, had also not replied.
For residents trying to recover lost images, the practical options right now are limited. Volunteers at the Internet Archive's San Francisco regional scanning project, which has worked with the History Center on past digitisation efforts, have suggested that some images may exist in earlier crawled versions of the library's public portal — though extraction is technically complex and not guaranteed. Community members who uploaded photographs directly and retained local copies are being encouraged to resubmit, this time with detailed metadata, through the library's standard submission form at sfpl.org. Advocates are also urging the Board of Supervisors to request a formal audit of the deduplication parameters before any further automated deletions proceed.