The Daily San Francisco

SF's Digital Records Push Hits a Snag: Officials and Experts Weigh In on the City's Duplicate Image Problem

As San Francisco accelerates its push to digitize public records, a growing chorus of archivists, tech specialists, and city officials is sounding alarms about redundant scanned images clogging government databases.

By San Francisco News Desk · Published 4 July 2026, 12:06 pm

San Francisco's Department of Technology is confronting a problem hiding in plain sight: thousands of duplicate scanned images embedded across city databases are eating up storage, slowing retrieval times, and inflating the cost of the city's ongoing digital records modernization effort. The issue has drawn pointed attention from archivists at the San Francisco History Center, data specialists working with the City Administrator's Office, and outside technology consultants brought in to audit the system earlier this year.

The timing matters. The city is midway through a multi-year initiative to digitize paper records held at City Hall and at satellite offices from the Civic Center complex to the Department of Building Inspection on Duboce Street. Officials had projected the project would cut retrieval costs and make public records more accessible under California's Public Records Act. Instead, a January 2026 internal audit flagged that redundant image files (the same document scanned multiple times and stored without deduplication protocols) were consuming an estimated 30 percent of allocated cloud storage capacity. That figure, cited in materials reviewed by The Daily San Francisco, has prompted an unscheduled mid-project review.

What the Experts Are Saying

Specialists in government records management say San Francisco is not alone, but the scale here is notable. Digital archivists consulting with the San Francisco Public Library's preservation division have pointed to a structural gap: the city's scanning workflow, built out under a contract awarded in late 2023, did not include automated hash-based deduplication — a standard practice that compares file fingerprints to prevent storing identical images twice. Without it, every batch upload from a satellite office risks layering copies on top of copies.
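For readers curious what hash-based deduplication looks like in practice, the following is a minimal sketch, not a description of the city's system or any vendor's product. The `DedupStore` class, the file names, and the sample bytes are all hypothetical and exist only for illustration. It uses SHA-256 from Python's standard library as the file fingerprint.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(data).hexdigest()


class DedupStore:
    """Hypothetical in-memory store that keeps one copy per unique fingerprint."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}  # fingerprint -> stored bytes
        self.names: dict[str, str] = {}    # upload name -> fingerprint

    def ingest(self, name: str, data: bytes) -> bool:
        """Record the upload; return True only if new bytes were stored."""
        digest = fingerprint(data)
        self.names[name] = digest
        if digest in self.blobs:
            # Identical image already stored: keep a reference, not a second copy.
            return False
        self.blobs[digest] = data
        return True


# Illustrative uploads (made-up names and contents).
store = DedupStore()
first = store.ingest("city_hall/permit_0001.tif", b"scan-bytes-A")
second = store.ingest("satellite/permit_0001_copy.tif", b"scan-bytes-A")
third = store.ingest("city_hall/permit_0002.tif", b"scan-bytes-B")
print(first, second, third, len(store.blobs))  # True False True 2
```

One caveat: an exact hash only matches byte-identical files. Two separate scans of the same sheet of paper will generally produce different bytes, so catching those would require a different technique than the one sketched here.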

The San Francisco chapter of ARMA International, a professional association for records managers, held a working session on the issue in May at a Mission District conference facility, where practitioners described the problem as a foreseeable consequence of procurement moving faster than policy. No single city official has been publicly named as responsible for the gap, and the Department of Technology has not released a formal statement attributing fault.

Tech industry observers watching from SoMa and the Mid-Market corridor — where many of the city's data infrastructure vendors are headquartered — note that the fix itself is not technically complicated. Deduplication software licenses from enterprise vendors typically run between $15,000 and $80,000 annually depending on storage scale, according to publicly available vendor pricing sheets. The harder part, practitioners say, is retrofitting the protocol onto a system already in motion without interrupting live records access.

The Practical Stakes for City Services

For San Franciscans trying to pull building permits, trace property records, or access historical documents, the duplicate image backlog creates tangible friction. Staff at the San Francisco Recorder's Office, which handles property and vital records on City Hall's ground floor, have fielded complaints about search latency: cases where a records query takes minutes rather than seconds because the index is scanning across duplicate entries before surfacing the correct file.

The Board of Supervisors' Government Audit and Oversight Committee is expected to take up the issue at its next scheduled session in August 2026. Committee members have requested a remediation timeline from the Department of Technology, along with a cost estimate for retroactive deduplication across the existing archive. That estimate has not yet been delivered publicly.
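As a rough illustration of how the storage side of such an estimate could be approached, the sketch below groups files by SHA-256 fingerprint and counts every copy after the first in each group as reclaimable. The `reclaimable_bytes` function and the sample archive are hypothetical; nothing here reflects the city's actual data or method.

```python
import hashlib
from collections import defaultdict


def reclaimable_bytes(files: dict[str, bytes]) -> tuple[int, int]:
    """Return (total_bytes, bytes_held_by_redundant_copies) for an archive."""
    groups: dict[str, list[int]] = defaultdict(list)
    for data in files.values():
        groups[hashlib.sha256(data).hexdigest()].append(len(data))
    total = sum(len(data) for data in files.values())
    # Within each fingerprint group, every copy after the first is redundant.
    redundant = sum(sum(sizes[1:]) for sizes in groups.values())
    return total, redundant


# Made-up sample: three identical 10-byte scans and one 20-byte scan.
archive = {
    "a.tif": b"0123456789",
    "b.tif": b"0123456789",
    "c.tif": b"0123456789",
    "d.tif": b"x" * 20,
}
total, redundant = reclaimable_bytes(archive)
print(total, redundant, f"{redundant / total:.0%}")  # 50 20 40%
```

A real archive would be read from storage in chunks rather than held in memory, and the dollar cost would depend on pricing details not covered in this report.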

For residents and small businesses relying on city records — contractors pulling permits near the Dogpatch waterfront, nonprofits filing for city grants, legal firms working on property disputes in the Sunset District — the practical advice from records specialists right now is straightforward: if a document request through the city's online portal is returning errors or unusual delays, submit a written backup request directly to the relevant department. Paper-trail redundancy, ironically, remains the safest bet until the digital redundancy is sorted out.

About this article

Published by The Daily San Francisco

This article was produced by The Daily San Francisco editorial desk and covers news in San Francisco. See our editorial standards for how we use AI.
