The Daily San Francisco

SF's Digital Audit Reveals Thousands of Duplicate Images Clogging City Servers — and a Growing Bill

A data cleanup effort across San Francisco's municipal systems has uncovered a surprisingly costly redundancy problem buried in years of unmanaged file storage.

By San Francisco News Desk · Published 4 July 2026, 12:16 pm

Photo: Clément Proust on Pexels

San Francisco's Department of Technology has flagged more than 340,000 duplicate image files spread across municipal servers, a figure that emerged from an internal audit completed in late June and shared with the Board of Supervisors' Budget and Finance Committee last week. The redundant files — many of them scanned permits, inspection photos, and public-facing web assets stored multiple times across overlapping systems — are consuming roughly 18 terabytes of city-managed cloud storage at an estimated annual cost of $220,000 in unnecessary hosting fees.

The timing matters. San Francisco is heading into a budget cycle with a projected general fund shortfall of around $800 million over the next two fiscal years, and city departments are under pressure to find savings wherever they can. Digital housekeeping, long treated as an afterthought in municipal IT planning, has suddenly become a line item worth scrutinizing.

Where the Clutter Accumulates

The problem is not confined to one department. The audit, which reviewed storage systems across 14 city agencies, found that the Planning Department's permit portal contained duplicate image sets for at least 60,000 permit applications, in some cases because applicants re-uploaded identical documents after system timeouts. The portal is used heavily by developers and contractors filing projects in neighborhoods from the Tenderloin to Dogpatch. The Department of Public Works, which manages asset photography for infrastructure projects along corridors including Van Ness Avenue and Cesar Chavez Street, had an estimated 47,000 redundant photos stored across at least three separate repositories.

The San Francisco Public Library's digital archive system, centered at the main branch on Larkin Street in Civic Center, was also flagged, though library officials noted that some apparent duplicates in their collections are intentional preservation redundancies required by archival standards — a distinction the audit's automated scanning tools did not initially recognize. The Department of Technology said it is working with individual agencies to distinguish deliberate redundancy from accidental duplication before files are deleted.

City records show that San Francisco migrated much of its storage infrastructure to hybrid cloud systems between 2019 and 2022 — a transition period during which file-naming conventions were inconsistent across departments, making automated deduplication difficult. That migration coincided with a surge in digital submissions during the COVID-19 pandemic, when in-person permitting stopped and agencies scrambled to accept scanned documents they had never processed at scale before.
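Inconsistent file names are an obstacle only for tools that compare names. A common alternative, sketched below, is to compare file contents by hash. This is an illustration under stated assumptions, not the method the city's audit used, which the source does not describe; the function names `file_digest` and `find_duplicates` are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so large scans are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root whose bytes are identical, ignoring names."""
    by_digest: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    # Keep only digests shared by more than one file.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

This approach catches only byte-identical copies. A document that was rescanned or re-encoded would hash differently, and a matching hash says nothing about whether a copy is a deliberate preservation duplicate, so flagged groups would still need review before anything is deleted.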

What the Numbers Actually Mean

Eighteen terabytes sounds abstract. In practical terms, it is roughly the space taken up by 4.5 million high-resolution photographs of about 4 megabytes each. The $220,000 annual figure is a conservative estimate based on current enterprise cloud storage rates; city technology staff told the Budget and Finance Committee that the actual cost could be higher once internal staff time spent managing the bloated directories is factored in.

Deduplication software capable of processing the city's volume of files runs between $40,000 and $90,000 for an enterprise license, according to publicly available vendor pricing from companies including Veritas and Commvault. If the Department of Technology moves forward with a procurement, the city would likely recover that cost within 18 months through reduced storage fees — though a formal cost-benefit analysis has not yet been published.
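The payback arithmetic behind those figures can be sketched as follows. This is an idealized calculation, not the city's analysis: it assumes the entire $220,000 in annual storage fees disappears as soon as the license is bought, and the function name `payback_months` is hypothetical.

```python
def payback_months(license_cost: float, annual_savings: float) -> float:
    """Months of savings needed to cover a one-time license cost."""
    if annual_savings <= 0:
        raise ValueError("annual_savings must be positive")
    monthly_savings = annual_savings / 12
    return license_cost / monthly_savings


# Figures quoted in the article; everything else is an assumption.
ANNUAL_STORAGE_COST = 220_000

for license_cost in (40_000, 90_000):
    months = payback_months(license_cost, ANNUAL_STORAGE_COST)
    print(f"${license_cost:,} license: about {months:.1f} months")
```

Under these assumptions the license pays for itself in roughly two to five months. The 18-month figure reported to the committee is far more conservative, which is consistent with costs this sketch leaves out, such as implementation time, staff effort, and duplicates that must be retained.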

The audit also found that roughly 12 percent of the duplicate images were publicly accessible through city-run web portals, meaning San Francisco was in some cases serving the same image file from multiple URLs simultaneously, slowing page-load times on sites like SF.gov during peak traffic periods.

The Department of Technology is expected to present a remediation proposal to the full Board of Supervisors by September 2026. Residents and contractors who regularly upload files through city portals, including the SF311 app and the Planning Department's online permit system, will likely see interface changes aimed at preventing duplicate submissions before they happen. The city is also exploring whether a centralized digital asset management system, potentially shared across departments, could prevent the problem from recurring as San Francisco continues expanding its digital services.

About this article

Published by The Daily San Francisco

This article was produced by The Daily San Francisco editorial desk and covers news in San Francisco. See our editorial standards for how we use AI.
