The Daily San Francisco

San Francisco news, every day

SF's Duplicate Image Problem: The Key Decisions That Will Shape the City's Visual Record

As the city's agencies grapple with years of redundant, mislabeled, and duplicated digital imagery across public databases, the choices made in the coming months will determine how San Francisco documents itself for decades.

By San Francisco News Desk · Published 4 July 2026, 11:45 am

San Francisco's public records offices, planning departments, and cultural institutions are sitting on a sprawling mess of duplicated digital images — years of overlapping photographs, mislabeled building permits, and redundant archival scans that have quietly inflated storage costs and muddied official records. The question now is who moves first to fix it, and how.

The problem has become harder to ignore. As the city's Department of Technology works through a broader IT modernization push, auditors reviewing the city's digital asset infrastructure have flagged duplicate imagery as a recurring drag on system performance and a genuine liability for public records transparency. Multiple city departments, including the Planning Department on South Van Ness Avenue, the Office of the Assessor-Recorder at City Hall, and the San Francisco Public Library's History Center on Larkin Street, maintain separate image repositories that overlap significantly but do not communicate with one another.

Why the Timing Matters Now

The urgency is not accidental. San Francisco is in the middle of a housing production push that has flooded the Planning Department with permit applications — the city logged more than 4,500 residential permit submissions in 2025, according to city planning data. Each application typically includes multiple site photographs and condition surveys. With no unified deduplication protocol, staff manually cross-reference images, a time cost that compounds across hundreds of cases every quarter.

At the same time, the AI boom reshaping the city's South of Market tech corridor has put new tools directly in the hands of city technology officers. Machine learning-based deduplication — software that can identify visually identical or near-identical images without human review — is now commercially available at price points accessible to mid-sized municipal governments. The San Francisco Office of Digital Services, which has been piloting AI procurement frameworks since early 2025, is evaluating at least three vendor proposals that include image-matching components, though no contract has been awarded.
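For readers curious what image matching involves, here is a minimal sketch of one long-established technique, an "average hash" fingerprint. It is not drawn from any vendor proposal, and commercial tools are likely to use more sophisticated learned models. The function names, the tiny sample grids, and the one-bit threshold are all illustrative assumptions; real tools typically shrink each image to a small grayscale grid (often 8x8) before this step.

```python
from typing import List

# Assumption: each image is already reduced to a small grayscale grid
# (values 0-255), and both grids being compared have the same dimensions.
Grid = List[List[int]]


def average_hash(pixels: Grid) -> int:
    """Return a bit fingerprint: 1 where a pixel is brighter than the image mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


def looks_like_duplicate(a: Grid, b: Grid, max_differing_bits: int = 1) -> bool:
    """Treat two images as near-duplicates if their fingerprints barely differ.

    The threshold is a hypothetical default for these 4-bit toy grids; a real
    64-bit fingerprint would need a threshold tuned on actual records.
    """
    return hamming_distance(average_hash(a), average_hash(b)) <= max_differing_bits


# Toy data: the second grid is the first with every pixel slightly brighter,
# as might happen when the same scan is re-exported with different settings.
original = [[10, 200], [220, 30]]
brighter_copy = [[20, 210], [230, 40]]
different_scene = [[200, 10], [30, 220]]
```

Here `looks_like_duplicate(original, brighter_copy)` is true and `looks_like_duplicate(original, different_scene)` is false, because the fingerprint depends on the pattern of light and dark rather than on exact pixel values.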

The Assessor-Recorder's office faces particular pressure. Property records at City Hall currently reference photographic attachments that in some cases exist in triplicate across different database tables — a legacy of migrations from older systems in 2018 and again in 2022. Reconciling those records matters practically: title companies, attorneys, and homeowners pulling property history for real estate transactions in neighborhoods like the Outer Sunset, Noe Valley, and the Castro rely on those images to verify condition and ownership continuity.

The Decisions Ahead

Three choices will define what comes next. First, city officials must decide whether to pursue a centralized image repository — a single city-wide digital asset system — or simply require departments to run deduplication protocols independently within their own systems. A centralized approach is faster to query but requires political agreement across fiefdoms that have historically resisted sharing infrastructure.

Second, procurement. The Office of Digital Services is expected to present its AI vendor shortlist to the city's Committee on Information Technology before the end of August 2026. Whether that committee pushes for open-source deduplication tools — cheaper but requiring more internal technical capacity — or a licensed commercial platform will shape the budget ask heading into the next fiscal cycle. San Francisco's Department of Technology operating budget for fiscal year 2025-2026 was set at roughly $140 million, and any new image management contract would likely draw from that envelope.

Third, and most consequential for public accountability, is the question of what happens to images removed from public-facing databases during deduplication. Archivists at the San Francisco History Center have raised a legitimate concern: images that appear to be duplicates may in fact document different moments or conditions. A photograph of a building on Market Street taken in 2017 and an apparently identical photograph taken in 2019 could both be historically significant even if automated systems flag them as redundant. Any deletion protocol needs a human review layer, and staffing that layer takes time and budget.
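The archivists' concern can be expressed as a simple routing rule. The sketch below is an illustration only, not a description of any city system: the `ImageRecord` fields and the `triage_flagged_pair` function are hypothetical, and it assumes each image carries a capture date in its metadata. A pair that looks identical is sent to a person whenever the dates differ or are missing.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ImageRecord:
    image_id: str                 # hypothetical identifier
    captured_on: Optional[date]   # None when the capture date is unknown


def triage_flagged_pair(a: ImageRecord, b: ImageRecord) -> str:
    """Decide what to do with two images already flagged as look-alikes."""
    if a.captured_on is None or b.captured_on is None:
        # Without dates there is no way to rule out two distinct moments.
        return "human_review"
    if a.captured_on != b.captured_on:
        # Same view, different day: possibly two historical records.
        return "human_review"
    return "likely_redundant"


# Invented example mirroring the Market Street case described above.
market_2017 = ImageRecord("market-st-a", date(2017, 5, 1))
market_2019 = ImageRecord("market-st-b", date(2019, 5, 1))
market_2017_copy = ImageRecord("market-st-c", date(2017, 5, 1))
```

Under this rule the 2017 and 2019 photographs go to a reviewer, while the two 2017 records are marked as likely redundant. Even that second label is only a recommendation; nothing in the sketch deletes a file.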

City officials have not yet announced a unified timeline. But with the August committee deadline approaching and housing permit volume showing no signs of slowing, departments that wait for a perfect solution risk making the backlog structurally worse. The smarter path, according to the framework being circulated internally, is to begin with read-only deduplication flagging — marking likely duplicates without deleting anything — and build from there.
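What read-only flagging might look like in practice can be sketched in a few lines. This is an assumption-laden illustration, not the framework circulating internally: the function name and the sample identifiers are invented, and it only catches byte-for-byte copies by comparing SHA-256 digests. It produces a report of matching groups and never modifies or deletes the input.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def flag_exact_duplicates(images: Dict[str, bytes]) -> List[List[str]]:
    """Group image IDs whose file contents are byte-for-byte identical.

    Read-only: the input mapping is never changed and nothing is deleted.
    """
    by_digest: Dict[str, List[str]] = defaultdict(list)
    for image_id, content in images.items():
        digest = hashlib.sha256(content).hexdigest()
        by_digest[digest].append(image_id)
    # Only groups with more than one member are duplicates worth reporting.
    return sorted(sorted(ids) for ids in by_digest.values() if len(ids) > 1)


# Invented sample: the same file stored by two departments, plus one unique file.
sample_images = {
    "planning/site-photo-001": b"same-bytes",
    "assessor/attachment-778": b"same-bytes",
    "library/scan-42": b"other-bytes",
}
duplicate_report = flag_exact_duplicates(sample_images)
```

For the sample data, the report contains a single group listing the planning and assessor copies, and all three records remain in place for a later human decision.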

About this article

Published by The Daily San Francisco

This article was produced by The Daily San Francisco editorial desk and covers news in San Francisco. See our editorial standards for how we use AI.
