The Daily San Francisco

San Francisco's Digital Records Are Full of Duplicate Images — and It's Costing Residents More Than They Know

City agencies and nonprofits are sitting on bloated, redundant image archives that slow services, inflate storage bills, and quietly undermine the technology systems that residents depend on every day.

By San Francisco News Desk · Published 4 July 2026, 12:23 pm

Photo: GuiGo Lopes on Pexels

San Francisco's public-facing digital infrastructure has a hidden clutter problem. Across city departments, from the Department of Homelessness and Supportive Housing to the Municipal Transportation Agency, duplicate images embedded in databases and content management systems have ballooned storage costs, slowed application load times, and created data inconsistencies that frontline workers say make an already difficult job harder. The issue came into sharper focus this spring when the Controller's Office flagged redundant media files as a contributing factor in a broader audit of the city's $340 million annual IT budget.

The timing matters. San Francisco is in the middle of an aggressive push to digitize housing applications, shelter bed tracking, and transit rider services — programs where speed and accuracy are not abstractions but lifelines. When a case manager at a Tenderloin SRO can't pull up a client's current photo ID because the system is cycling through three cached versions of the same file, that delay has real consequences. The city's Digital Services Office, based at City Hall, has been working with vendors since early 2026 to implement automated duplicate-detection routines, but the rollout has been uneven.

Where the Problem Shows Up

The drag is most visible in two places residents actually touch. The SF311 app, which logged more than 1.2 million service requests in 2025, relies on image uploads to document everything from broken streetlights on Market Street to encampments near Caltrain's 4th and King station. Internal testing found that roughly 18 percent of images submitted through the app were exact or near-exact duplicates — multiple residents photographing the same pothole, the same overflowing bin. Without automated deduplication, those redundant files stack up in cloud storage buckets the city pays for by the gigabyte on contracts with vendors including Amazon Web Services.
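For readers curious how the simplest form of automated deduplication works, the sketch below groups files whose bytes are identical by comparing SHA-256 digests. It is a minimal illustration, not the city's or any vendor's implementation: the function name `group_exact_duplicates`, the file names, and the byte strings are all hypothetical.

```python
import hashlib
from collections import defaultdict


def group_exact_duplicates(files):
    """Group file names by the SHA-256 digest of their contents.

    `files` maps a file name to its raw bytes (a hypothetical input
    shape chosen for this sketch). Only groups with more than one
    member are returned.
    """
    groups = defaultdict(list)
    for name, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        groups[digest].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Hypothetical uploads: two byte-identical files and one different file.
uploads = {
    "pothole_a.jpg": b"\xff\xd8same-bytes",
    "pothole_b.jpg": b"\xff\xd8same-bytes",
    "bin.jpg": b"\xff\xd8other-bytes",
}
duplicate_groups = group_exact_duplicates(uploads)
print(duplicate_groups)
```

A check like this catches only byte-for-byte copies, such as the same file uploaded twice. A resized or recompressed copy produces a different digest, so near-exact duplicates need a similarity-based method instead.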

At the San Francisco Public Library's main branch on Larkin Street, digital archivists have been manually cleaning image libraries tied to the San Francisco Historical Photograph Collection, which includes more than 200,000 scanned images. Duplicate scans from batch-processing runs in the early 2010s have consumed an estimated 4 terabytes of redundant storage. Staff there said the manual review process takes roughly two hours per 1,000 images. At that pace, a single pass over 200,000 images would take about 400 staff hours, which makes automated deduplication tools not a luxury but a necessity.

Nonprofits feel it too. Glide Memorial Church, which operates one of the largest meal and shelter intake systems in the Tenderloin, uses a case management platform integrated with city data systems. Duplicate profile photos — created when clients are re-entered into the system after gaps in service — have caused matching errors that delayed benefit verification for dozens of individuals over the past fiscal year, according to documents shared with the Controller's Office.

What the City Is Doing About It

The Digital Services Office awarded a $1.8 million contract in March 2026 to a San Jose-based software firm to deploy perceptual hashing tools — algorithms that detect visually similar images even when file names or metadata differ — across five city platforms by the end of the third quarter. The MTA's Clipper card portal and the city's permit management system are first in line. The Department of Technology has also issued guidance to 23 city departments recommending quarterly audits of media libraries, though compliance is voluntary for now.
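The city has not said which algorithm the contractor will use. As a rough sketch of the general idea, the code below implements average hash, one of the simplest perceptual hashes, on a grayscale image supplied as rows of 0-255 values. The function names, the 8x8 grid size, and the synthetic test images are assumptions made for illustration only.

```python
def average_hash(pixels, size=8):
    """Return a size*size-bit integer hash of a grayscale image.

    `pixels` is a list of rows of 0-255 ints. For simplicity this
    sketch assumes the width and height are multiples of `size`.
    """
    height, width = len(pixels), len(pixels[0])
    block_h, block_w = height // size, width // size
    cells = []
    # Shrink the image to a size x size grid by averaging each block.
    for by in range(size):
        for bx in range(size):
            block = [
                pixels[y][x]
                for y in range(by * block_h, (by + 1) * block_h)
                for x in range(bx * block_w, (bx + 1) * block_w)
            ]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    # One bit per cell: 1 if the cell is brighter than the overall mean.
    bits = 0
    for value in cells:
        bits = (bits << 1) | int(value > mean)
    return bits


def hamming_distance(a, b):
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


# Synthetic 16x16 images: a gradient, a brighter copy, and an inverted copy.
original = [[x * 16 for x in range(16)] for _ in range(16)]
brighter = [[min(255, v + 10) for v in row] for row in original]
inverted = [[255 - v for v in row] for row in original]

distance_brighter = hamming_distance(average_hash(original), average_hash(brighter))
distance_inverted = hamming_distance(average_hash(original), average_hash(inverted))
print(distance_brighter, distance_inverted)
```

In this example the brightened copy hashes identically to the original (distance 0), even though its bytes differ, while the inverted image differs in all 64 bits. In practice a system would treat hashes within some small distance of each other as likely duplicates, and that threshold has to be tuned against real images.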

For ordinary San Franciscans, the practical upside is straightforward: faster-loading city apps, fewer errors in housing and shelter records, and lower IT overhead that, at least in theory, frees budget for direct services. The SF Digital Equity Initiative, which has expanded broadband access to 14,000 low-income households in the Bayview and Excelsior districts since 2023, depends on the same backend infrastructure, and cleaner data systems mean more reliable service for residents who often have no fallback option.

Residents who notice glitches in city apps can flag them directly through SF311 or through the Department of Technology's public feedback portal. The Digital Services Office says it expects the first round of automated duplicate-image replacement to be complete by September 30, 2026, a deadline that advocates are watching closely, given the city's track record on IT projects.

About this article

Published by The Daily San Francisco

This article was produced by The Daily San Francisco editorial desk and covers news in San Francisco. See our editorial standards for how we use AI.