The Daily San Francisco

San Francisco news, every day

SF City Hall's Digital Archive Push Hits a Snag: Thousands of Duplicate Images Clog the System

A citywide effort to digitize public records and planning documents has stalled this week as technicians grapple with a backlog of redundant image files across multiple municipal databases.

By San Francisco News Desk · Published 4 July 2026, 11:58 am

San Francisco's Department of Technology confirmed this week that a long-running project to consolidate the city's digital document archives has been disrupted by a significant duplicate image problem — one that officials say affects tens of thousands of scanned files stored across at least three separate municipal platforms.

The timing is awkward. The city has spent the better part of two years pushing agencies to migrate paper-era records into a centralized system, part of a broader open-government initiative that the Mayor's Office of Civic Innovation has championed since early 2024. The duplicate image issue — where the same scanned permit, planning map, or public health form was uploaded multiple times under different file names — is now slowing retrieval speeds and inflating storage costs on the city's cloud infrastructure contracts.

What Happened This Week

Technicians working with the San Francisco Planning Department's digital records unit identified the core problem on Monday, July 1, when routine quality checks flagged an error rate that sources familiar with the project described as substantially higher than the acceptable threshold. The Planning Department's archive, which covers building permits and environmental impact documents for neighborhoods from the Tenderloin to the Excelsior, reportedly contains duplicate image files running back to at least 2019, when the initial digitization push began under a federal grant program.

The San Francisco Public Library's San Francisco History Center at Larkin Street, which maintains a separate but linked digital collection of civic records, has also been drawn into the review. Library staff were notified mid-week that a subset of images cross-uploaded to the shared portal would be temporarily unavailable while technicians run automated deduplication scripts. The outage affects a portion of the publicly searchable catalog but does not impact the physical archive itself.

The problem is not unique to San Francisco. Cities that accelerated digitization during and after the pandemic years — under pressure from state mandates and federal infrastructure grants — frequently encountered the same issue: multiple departments scanning the same documents independently, without a unified naming protocol or hash-verification system to catch redundant uploads before they compound. But for a city that has marketed itself as a global technology hub, the optics are uncomfortable, particularly as the tech sector here has pivoted hard toward AI-driven workflow automation in 2025 and 2026.
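The hash-verification idea mentioned above can be illustrated with a minimal Python sketch. It is not the city's or its contractor's actual tooling, which the article does not describe; the function names `sha256_of` and `find_duplicates` and the directory layout are assumptions made for illustration. The sketch groups files whose bytes are identical, whatever they are named.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Chunked reads keep memory use flat even for multi-gigabyte scans.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root that have identical contents (hypothetical helper)."""
    by_digest: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest[sha256_of(path)].append(path)
    # Only digests shared by two or more files indicate redundant uploads.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

A check like this only catches byte-for-byte copies of the same file uploaded under different names. Two departments scanning the same paper document independently would produce different bytes, and catching those would need a different approach, such as perceptual image hashing.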

Storage Costs and the Fix

Cloud storage is not cheap at municipal scale. The city's current contract with its primary cloud infrastructure vendor — details of which are publicly available through the Controller's Office procurement portal — runs into the tens of millions of dollars annually when accounting for all departments. Duplicate image files, especially large-format scans of architectural drawings and environmental maps, consume disproportionate storage volume. A single set of high-resolution permit drawings for a mid-rise project on, say, Folsom Street in SoMa can run to several gigabytes; duplicate copies of the same file multiply that cost without adding any informational value.

The Department of Technology has engaged a contractor to run a phased deduplication process. Phase one, expected to wrap by July 18, targets the Planning Department's permit archive — the largest single repository affected. Phase two will address shared files between Planning and the Department of Building Inspection, whose offices at 49 South Van Ness Avenue maintain parallel digital records. A third phase covering the Public Library's cross-linked holdings has no confirmed start date as of Friday.

For members of the public who use the city's online portal to pull historical permits or planning records — a process heavily relied upon by contractors, real estate attorneys, and neighborhood advocacy groups in places like the Mission District and the Richmond — some searches may return incomplete results or temporary error messages through at least mid-July. The Department of Technology's service desk is advising users who need urgent documents to submit direct records requests by email, which staff are processing manually in the interim.

City officials have not publicly stated whether the deduplication work will require a budget amendment or draw on contingency funds. The Controller's Office did not respond to a request for comment before publication. A fuller accounting of the project's cost impact is expected to surface in the department's next quarterly report, due in late August.

About this article

Published by The Daily San Francisco

This article was produced by The Daily San Francisco editorial desk and covers news in San Francisco. See our editorial standards for how we use AI.
