The Daily San Francisco

San Francisco news, every day

SF's Digital Archives Are Full of Duplicate Images. Officials and Experts Say That's a Growing Crisis.

From the Planning Department's permit files to SFMTA's project databases, redundant image data is quietly draining city budgets and slowing government operations.

By San Francisco News Desk · Published 4 July 2026, 11:57 am

Photo: Brett Sayles on Pexels

San Francisco's municipal databases are bloated with hundreds of thousands of duplicate images — redundant permit photos, copied infrastructure scans, and replicated project files that are costing the city real money and slowing the work of agencies that can barely afford delays. Technology officials and data management specialists who work with city departments say the problem has reached a tipping point, and that a reckoning is overdue.

The issue isn't new, but it has become significantly more urgent. As San Francisco pushes forward on emergency housing production measures and SFMTA scrambles to digitize its project documentation, the volume of image files across city systems has ballooned. Agencies that digitized paper records in the early 2020s often did so with no deduplication protocols in place, so the same photograph or scanned document can appear dozens of times across different databases, with each copy occupying server space that taxpayers fund.

What the Experts Are Saying

Data governance specialists who consult with Bay Area municipalities describe duplicate image removal as one of the least glamorous but most consequential infrastructure problems in local government IT. The San Francisco Department of Technology, which oversees citywide digital infrastructure from its offices at 1 South Van Ness Avenue, has identified image redundancy as a target area in its ongoing IT modernization efforts, according to publicly available program documentation.

At the Planning Department, which processes permit applications for thousands of properties across neighborhoods from the Sunset to SoMa, staff have long flagged the problem of redundant property photos piling up in the Accela permitting system. A single parcel inspection can generate multiple near-identical images uploaded by different inspectors with no automated system to catch the overlap. The department handles tens of thousands of permit applications annually, and without deduplication tools, storage costs scale with volume.

Independent technologists working in San Francisco's civic tech community — including volunteers affiliated with Code for San Francisco, which convenes regularly at locations around the Civic Center area — have pointed to the problem in public forums. The organization has previously worked on open-data projects touching city records management, and members have noted that image file sizes, particularly high-resolution infrastructure photos, make redundancy especially expensive compared to text-based duplicates.

The Cost Problem Is Real

Cloud storage is not cheap at government scale. Enterprise cloud contracts for California municipalities have ranged widely, but analysts tracking public sector IT spending note that storage costs for unmanaged image archives can run to hundreds of thousands of dollars annually for a city the size of San Francisco. The city's Department of Technology budget for fiscal year 2025-26 exceeded $130 million, a figure drawn from the city's published budget documents, and storage optimization is increasingly part of the conversation about where savings can be found.

SFMTA, which is managing a significant digital documentation push tied to its Van Ness Bus Rapid Transit corridor and Central Subway operational records, faces the same pressures. Project files for major capital works routinely include thousands of construction-phase photographs, and without systematic duplicate detection, the archives grow faster than anyone cleans them.

The solution, technology managers say, is not complicated in concept: automated hashing tools can compare image files and flag or delete duplicates before they enter a database, and existing platforms used by several other large American cities already include this functionality. The harder problem is retrofitting it into legacy systems that were never designed with deduplication in mind.
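For readers curious what hash-based duplicate detection looks like in practice, the following is a minimal sketch, not a description of any tool the city uses. It assumes images sit as ordinary files under one folder, and the function names, file extensions, and the in-memory set of known hashes are all hypothetical choices made for illustration. Note that a cryptographic hash such as SHA-256 only matches byte-for-byte identical files; near-identical photos of the same parcel would need a different technique, such as perceptual hashing, which is not shown here.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Assumed set of image file types; a real archive may hold others.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        while chunk := handle.read(chunk_size):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_duplicates(root: Path) -> dict[str, list[Path]]:
    """Group image files under root by content hash.

    Only groups with two or more files (exact duplicates) are returned.
    """
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            groups[file_digest(path)].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}


def is_new_upload(data: bytes, seen_digests: set[str]) -> bool:
    """Hypothetical ingest gate: accept an upload only if its bytes are new.

    Returns True and records the digest when the content has not been seen;
    returns False when an identical file was already accepted.
    """
    digest = hashlib.sha256(data).hexdigest()
    if digest in seen_digests:
        return False
    seen_digests.add(digest)
    return True
```

Calling `find_duplicates(Path("archive"))` on an existing folder would list the copies to review, while `is_new_upload` illustrates the "before they enter a database" check. A sketch like this flags duplicates rather than deleting them, since deciding which copy to keep is usually a records-management question and not a technical one.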

For San Francisco residents and businesses who interact with city permitting or transit planning, the practical consequence shows up in slower response times when staff search overloaded databases and in IT budget pressure that competes with service delivery. City officials have not announced a specific citywide deduplication program as of this July 4th holiday weekend, but the Department of Technology's broader digital modernization roadmap — published last year — identifies data quality and storage efficiency as priorities for the coming 18 months. Whether individual agencies move fast enough to act on that before costs compound further will depend largely on how much political attention the problem attracts heading into the next budget cycle.

About this article

Published by The Daily San Francisco

This article was produced by The Daily San Francisco editorial desk and covers news in San Francisco. See our editorial standards for how we use AI.
