The Daily San Francisco

San Francisco news, every day

SF City Agencies Push to Purge Duplicate Images From Public Records Systems This Week

A citywide data cleanup effort targeting redundant digital files has picked up speed, with at least three San Francisco departments working to clear backlogs before a mid-July audit deadline.

By San Francisco News Desk · Published 4 July 2026, 11:28 am


San Francisco's Department of Technology moved this week to accelerate a duplicate-image replacement initiative affecting records stored across multiple city agencies, with the bulk of the work concentrated in systems used by the Planning Department on Spear Street and the Department of Building Inspection offices at 49 South Van Ness Avenue. The push comes ahead of a July 15 internal audit that will assess whether city digital archives meet updated data-integrity standards adopted earlier this year.

The effort is not cosmetic. City digital records managers have flagged that redundant image files (duplicate photographs, scanned permit documents, and repeated parcel maps stored in multiple databases at once) inflate storage costs, slow retrieval, and in some documented cases have caused version-control errors that delayed permit processing in the Mission District and SoMa. Those delays have drawn criticism from housing advocates, who argue that any friction in the permit pipeline further depresses San Francisco's already low housing production.

What Happened This Week

On July 1, the Department of Technology circulated an internal memo, obtained and reviewed by The Daily San Francisco, outlining a three-phase deduplication protocol. Phase one, which concluded Friday, involved automated scanning of the city's Oracle-based records environment to flag files with identical checksums. Phase two, running through July 11, involves human review of flagged duplicates before deletion or archival replacement. Phase three is a validation pass timed to finish before the July 15 audit window opens.
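For readers curious what "flagging files with identical checksums" looks like in practice, the following is a minimal sketch, not the city's actual tooling. The memo as described does not name a hash algorithm or a storage layout, so the use of SHA-256, a plain directory tree, and the function names file_checksum and flag_duplicates are all assumptions made for illustration. In keeping with phase two, the sketch only reports groups of matching files and deletes nothing.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_checksum(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks.

    SHA-256 is an assumption; the article does not say which checksum is used.
    """
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def flag_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root whose contents produce the same checksum.

    Only groups of two or more files are returned. Nothing is deleted:
    the groups are candidates for human review, not a removal list.
    """
    by_checksum: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_checksum[file_checksum(path)].append(path)
    return [paths for paths in by_checksum.values() if len(paths) > 1]
```

A checksum match only catches byte-for-byte copies. Two scans of the same drawing saved at different resolutions or in different formats would hash differently and would not be flagged by this approach.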

The work is being coordinated through the city's DataSF program, the open-data office housed within the Department of Technology at City Hall. DataSF has previously run smaller cleanup sprints on datasets tied to the 311 call system and Muni route records, but this week's effort is larger in scope, touching permit-application image archives that date back to the early 2000s. Staff working on the project said the oldest duplicate clusters involve TIFF-format scanned drawings uploaded during the city's initial push to digitize paper permit files. By their estimate, that process generated several hundred thousand redundant files over roughly two decades, though precise counts were not available by press time.

The San Francisco Public Library's Civic Center branch and the Main Library on Larkin Street both maintain independent digital-asset collections that interface with city records, and librarians there have been notified about the deduplication timeline so they can flag any shared assets before deletion. The library system's digital collections team confirmed it is reviewing its own holdings in parallel.

Why It Matters Beyond Filing Cabinets

Storage and records work rarely generates headlines, but the downstream effects in San Francisco are real and measurable. The Planning Department processed roughly 3,400 major permit applications in the first quarter of 2026, according to city figures published in April. Even modest system slowdowns — measured in seconds per file retrieval during peak processing hours — compound across thousands of applications and contribute to the kind of administrative drag that has been cited in multiple independent reviews of the city's housing production bottleneck.

Cloud storage costs are also a factor. San Francisco spends millions annually on enterprise data infrastructure contracts, and city budget documents from the current fiscal year show the Department of Technology's infrastructure line as one of the larger discretionary items in its roughly $130 million operating budget. Eliminating verified duplicate files reduces the billable data footprint on those contracts, though the Department of Technology declined to provide a specific savings projection before the audit concludes.

For residents and contractors who interact with city permit systems — whether filing a garage-conversion application in the Sunset District or pulling historical records for a renovation in the Tenderloin — the practical result should eventually be faster load times and cleaner document histories when searching the city's online permit portal.

Results of the July 15 audit will be presented to the city's Committee on Information Technology, which meets the following week. If the deduplication work clears review, the Department of Technology has indicated it plans to publish updated data-retention guidelines that would require all city departments to run quarterly automated deduplication checks going forward. That would be a shift from the current ad hoc approach, which let the backlog accumulate in the first place.

About this article

Published by The Daily San Francisco

This article was produced by The Daily San Francisco editorial desk and covers news in San Francisco. See our editorial standards for how we use AI.
