The Daily San Francisco

San Francisco news, every day

San Francisco's Digital Records Are Riddled With Duplicate Images — and It's Costing Residents Real Money

City agencies and nonprofits are quietly grappling with redundant photo files bloating databases and slowing down services that tens of thousands of San Franciscans depend on every day.

By San Francisco News Desk · Published 4 July 2026, 12:12 pm

Photo: Marija Piliskic on Pexels

San Francisco's network of public-facing digital databases — from the Planning Department's permit portal on Polk Street to the Department of Public Health's housing inspection records — is carrying a hidden drag on performance: tens of thousands of duplicate image files that inflate storage costs, slow search results, and, in at least one documented case this spring, contributed to a processing delay for affordable housing applications in the Tenderloin.

The problem is not unique to city government. Libraries, nonprofits, and community health clinics across the Bay Area have quietly flagged the same issue to IT contractors in recent months, as the cost of cloud storage continues to climb and the consequences of slow digital infrastructure show up in wait times, failed uploads, and case-management errors that hit the most vulnerable residents hardest.

This matters now because San Francisco is midway through a multi-agency push to digitize decades of paper records — a process accelerated under the city's 2024 Digital Equity Initiative. As that effort scales up, the risk of duplicate image files multiplying across interconnected systems grows with it. A redundant file in one department's archive can propagate through shared databases, compounding the problem every time a record is exported or migrated.

Where the Problem Shows Up

At the San Francisco Public Library's main branch on Larkin Street, staff managing the digitized photograph collection of the San Francisco History Center have dealt with duplicate image ingestion since at least 2023, when a batch import from a legacy hard drive created thousands of near-identical TIFF files. The library's digital team has been working to deduplicate that archive without erasing historically significant variants — a distinction that requires human review, not just an automated script.

Meanwhile, Tenderloin Housing Clinic, which manages supportive housing placements for several hundred formerly unhoused residents, uses a case-management platform that requires photo uploads for client identification records. Staff there have described situations where the same intake photo appears multiple times across a single case file, creating confusion during housing placement reviews. The clinic serves clients across properties from the Civic Center area to the lower Tenderloin, and any delay in case processing has direct consequences for people waiting on placements.

The SF Planning Department's online permit tracker, which logged more than 140,000 permit applications in fiscal year 2024–25 according to department figures, requires applicants to upload site photos. Duplicate uploads — often the result of users re-submitting after a failed connection — are not automatically detected, meaning reviewers sometimes open files containing three or four identical images of the same property, slowing turnaround on permits that community groups and small contractors have complained are already too slow.
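Byte-for-byte re-submissions of this kind are the easiest duplicates to detect. A minimal Python sketch, assuming uploads are available as raw bytes keyed by file name; the function and file names are hypothetical, and nothing in the department's figures indicates the permit tracker works this way:

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def find_exact_duplicates(uploads: Dict[str, bytes]) -> List[List[str]]:
    """Group file names whose contents are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for name, data in uploads.items():
        # Identical bytes always produce the same SHA-256 digest,
        # regardless of what the file is called.
        by_digest[hashlib.sha256(data).hexdigest()].append(name)
    # Keep only the groups that contain more than one file.
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


if __name__ == "__main__":
    # Hypothetical upload batch: the first two entries are the same photo
    # submitted twice after a failed connection.
    batch = {
        "site_front.jpg": b"\xff\xd8 front of property",
        "site_front (1).jpg": b"\xff\xd8 front of property",
        "site_rear.jpg": b"\xff\xd8 rear of property",
    }
    for group in find_exact_duplicates(batch):
        print("Identical files:", ", ".join(group))
```

A check like this only catches exact copies. A photo that has been resized or recompressed between uploads would have different bytes and slip through.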

What Duplication Actually Costs

Cloud storage is cheap in isolation: Amazon Web Services and Google Cloud both list standard storage at a few cents per gigabyte per month. But at institutional scale, the costs accumulate. A single city department storing 500,000 redundant image files, each averaging 4 megabytes, is carrying roughly 2 terabytes of unnecessary data. At enterprise rates that can reach $50 per terabyte per month when backup and transfer fees are included, that comes to about $100 a month, or roughly $1,200 a year, for that one department, a figure that multiplies with every additional department or archive carrying similar bloat. In a city facing a projected $876 million budget deficit through fiscal year 2026–27, city officials have said they are under pressure to recover wasted spending of that kind.
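The arithmetic behind that estimate is easy to check. The short Python sketch below uses only the figures quoted above (500,000 files, 4 megabytes each, $50 per terabyte per month) and assumes decimal units, where 1 terabyte equals 1,000,000 megabytes. The function name is illustrative and not part of any city system.

```python
def wasted_storage_cost(file_count, avg_file_mb, usd_per_tb_month):
    """Return (terabytes, monthly cost, annual cost) for redundant files."""
    # Assumption: decimal units, 1 TB = 1,000,000 MB.
    terabytes = file_count * avg_file_mb / 1_000_000
    monthly = terabytes * usd_per_tb_month
    return terabytes, monthly, monthly * 12


if __name__ == "__main__":
    tb, monthly, annual = wasted_storage_cost(500_000, 4, 50)
    print(f"{tb:.1f} TB redundant, ${monthly:,.0f} per month, ${annual:,.0f} per year")
```

At the quoted ceiling rate, one department's 2 terabytes of duplicates works out to about $100 a month, or $1,200 a year.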

Beyond the dollar figure, duplicate images slow database queries. For a housing inspector trying to pull up a property's photo history on a tablet at a Bayview building site, a bloated database means longer load times and a higher chance of a session timeout — small frustrations that compound into meaningful inefficiencies across a department with hundreds of field staff.

The fix is neither glamorous nor expensive. Deduplication software can scan image libraries using perceptual hashing, a method that compares what images look like rather than their file names or exact bytes, so it catches near-identical copies even after resizing or recompression. Flagged duplicates can then go to human review before anything is deleted. Several city contractors have already proposed phased deduplication audits to departments including the Office of Digital Services at City Hall.

Residents who upload documents to city portals can help by checking for upload confirmation before resubmitting. For nonprofit and community organizations managing their own databases, the San Francisco Department of Technology offers a free digital infrastructure consultation program that covers storage audits as a standard service; details are available through the city's sfgov.org portal.
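For readers curious how perceptual hashing works, here is a minimal sketch of one simple variant, the average hash. It is an illustration only, not the method any city contractor has proposed. To stay free of imaging libraries it assumes each image has already been decoded into a grid of 0 to 255 grayscale values at least 8 pixels on each side; the function names and the distance threshold of 5 are assumptions chosen for the example.

```python
from typing import Dict, List, Tuple

Gray = List[List[int]]  # rows of 0-255 grayscale values


def shrink(pixels: Gray, size: int = 8) -> List[float]:
    """Downsample to size x size cells by averaging blocks of pixels."""
    height, width = len(pixels), len(pixels[0])
    if height < size or width < size:
        raise ValueError("image must be at least size x size pixels")
    cells = []
    for r in range(size):
        r0, r1 = r * height // size, (r + 1) * height // size
        for c in range(size):
            c0, c1 = c * width // size, (c + 1) * width // size
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            cells.append(sum(block) / len(block))
    return cells


def average_hash(pixels: Gray, size: int = 8) -> int:
    """One bit per cell: 1 if the cell is brighter than the image's mean."""
    cells = shrink(pixels, size)
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


def flag_near_duplicates(
    images: Dict[str, Gray], max_distance: int = 5
) -> List[Tuple[str, str]]:
    """Return pairs of names whose hashes differ by at most max_distance bits."""
    hashes = {name: average_hash(px) for name, px in images.items()}
    names = sorted(hashes)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if hamming(hashes[a], hashes[b]) <= max_distance
    ]


if __name__ == "__main__":
    # Synthetic 16 x 16 test images: a left-to-right gradient, a slightly
    # brightened copy of it, and an unrelated top-to-bottom gradient.
    scan = [[x * 16 for x in range(16)] for _ in range(16)]
    brighter = [[min(255, v + 10) for v in row] for row in scan]
    other = [[y * 16 for _ in range(16)] for y in range(16)]
    library = {"scan_a": scan, "scan_a_copy": brighter, "other": other}
    for pair in flag_near_duplicates(library):
        print("Review as possible duplicates:", pair)
```

The sketch only flags candidate pairs and never deletes anything, which matches the human-review step described above. Production tools typically use more robust hashes and tune the threshold against real collections.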

About this article

Published by The Daily San Francisco

This article was produced by The Daily San Francisco editorial desk and covers news in San Francisco. See our editorial standards for how we use AI.
