The Daily San Francisco

San Francisco news, every day

SF City Agencies Accelerate Push to Purge Duplicate Images From Public Records Systems

A citywide audit this week exposed thousands of redundant files clogging San Francisco's digital archives, prompting a coordinated cleanup effort across multiple departments.

By San Francisco News Desk · Published 4 July 2026, 12:41 pm

Photo: Brett Sayles on Pexels

San Francisco's Department of Technology logged more than 14,000 duplicate image files in municipal records databases this week, triggering an emergency cleanup protocol that touches everything from Planning Department permit archives on South Van Ness Avenue to the public-facing property records portal maintained by the Assessor-Recorder's office on Dr. Carlton B. Goodlett Place. The redundancies, some dating back to a 2019 server migration, have slowed document retrieval times and inflated cloud storage costs for city departments already operating under tight budget constraints.

The timing matters. San Francisco is midway through a broader digital infrastructure overhaul tied to Mayor Daniel Lurie's government efficiency agenda, and his administration inherited a $790 million general fund deficit from its predecessor. Every unnecessary gigabyte of stored data carries a cost, and city technology staff say duplicated image assets (scanned permits, inspection photos, eviction notices) account for a disproportionate share of the bloat. The problem is not unique to San Francisco, but the scale here is amplified by decades of inconsistent scanning practices across 50-plus city departments that never standardized file-naming conventions.

The Department of Technology began a phased deduplication project in January, contracting with a vendor to deploy automated hash-matching software across the city's primary cloud environment. By June 30, crews had cleared redundant files from two systems: the Recreation and Parks Department's internal asset library, which manages images for facilities from Dolores Park in the Mission to the Sunset District's Golden Gate Park maintenance yards, and the SF Public Library's digital collections system, based at Larkin and Fulton streets in Civic Center. This week's audit covered the Planning Department and the Department of Building Inspection, two of the highest-volume imaging operations in city government.

What the Audit Found — and What It's Costing

Building Inspection alone had accumulated roughly 6,200 duplicate inspection photographs, many of them uploaded twice during a 2022 transition to a new permit-tracking platform called Accela. Each duplicate consumed an average of 4.2 megabytes, small individually but about 26 gigabytes in aggregate. Cloud storage for city operations runs approximately $0.023 per gigabyte per month under the current contract, according to the city's publicly available adopted budget for fiscal year 2025-26. The duplicate image backlog across all departments examined this week represented an estimated 180 gigabytes of redundant data.
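For readers who want to check the math, the figures above can be run through a short calculation. This is a rough sketch that uses only the numbers quoted in this story; it assumes decimal units (1,000 megabytes per gigabyte) and covers raw storage at the quoted rate only, not retrieval, backup, or staff time. The variable and function names are illustrative.

```python
# Figures quoted in the article.
DUPLICATE_PHOTOS = 6_200        # Building Inspection duplicate photographs
AVG_SIZE_MB = 4.2               # average size of each duplicate, in megabytes
PRICE_PER_GB_MONTH = 0.023      # quoted storage rate, USD per gigabyte per month
BACKLOG_GB = 180                # redundant data across departments examined this week

# Assumption: decimal units, 1 GB = 1,000 MB.
MB_PER_GB = 1000


def storage_cost_per_month(gigabytes: float, price: float = PRICE_PER_GB_MONTH) -> float:
    """Return the monthly storage cost in USD for a given volume of data."""
    return gigabytes * price


# Total volume of the Building Inspection duplicates.
building_inspection_gb = DUPLICATE_PHOTOS * AVG_SIZE_MB / MB_PER_GB

# Storage cost of the full 180 GB backlog at the quoted rate.
backlog_monthly_cost = storage_cost_per_month(BACKLOG_GB)
backlog_annual_cost = backlog_monthly_cost * 12

print(f"Building Inspection duplicates: {building_inspection_gb:.2f} GB")
print(f"Backlog storage cost: ${backlog_monthly_cost:.2f} per month")
print(f"Backlog storage cost: ${backlog_annual_cost:.2f} per year")
```

On those inputs, the Building Inspection duplicates come to about 26 gigabytes, and the 180-gigabyte backlog works out to about $4.14 per month, or roughly $50 per year, in raw storage at the quoted rate.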

The SF Planning Department's archives present a more complex challenge. Permit case files often include scanned drawings, neighborhood photos, and environmental review images that were uploaded by multiple staff members at different stages of a project's lifecycle. Deduplication software flags exact-copy files automatically, but near-duplicates — slightly different crops or resolutions of the same image — require human review. The department has assigned two full-time staff members through August to manually verify flagged files before deletion, a precaution taken after a 2023 incident in which an automated cleanup in another city system permanently deleted documents that were not actually redundant.
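The exact-copy step described above can be illustrated with a minimal sketch. The story does not name the city's vendor or its software, so this is not the city's tool: it assumes SHA-256 as the hash and a plain folder of files, and the function names are hypothetical. Files with byte-for-byte identical contents produce the same digest and are grouped together; a cropped or resized copy produces a different digest and is not caught, which is why near-duplicates go to human review.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root by content hash; return only groups with 2+ files."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

A sketch like this only reports candidate groups. It deletes nothing, which leaves room for the kind of manual verification the department describes.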

What Comes Next for City Departments

The Department of Technology plans to extend the deduplication sweep to the Controller's Office financial records system and the Human Services Agency case management platform before the end of July. Both departments handle sensitive client documents, which means the work will require additional legal review before any files are removed. The City Attorney's office issued guidance in May establishing a 90-day hold requirement before deletion of any document class that could be subject to a public records request.
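The effect of a 90-day hold on a cleanup schedule can be sketched in a few lines. The guidance as reported does not say when the 90 days begin, so this example assumes the clock starts on the day a file is flagged as a duplicate; that starting point and the function names are assumptions for illustration only.

```python
from datetime import date, timedelta

HOLD_DAYS = 90  # hold period cited in the City Attorney's guidance


def earliest_deletion_date(flagged_on: date, hold_days: int = HOLD_DAYS) -> date:
    """Return the first date a file could be deleted.

    Assumption: the hold is counted from the day the file was flagged.
    """
    return flagged_on + timedelta(days=hold_days)


def may_delete(flagged_on: date, today: date, hold_days: int = HOLD_DAYS) -> bool:
    """Return True once the hold period has fully elapsed."""
    return today >= earliest_deletion_date(flagged_on, hold_days)


# Example: a file flagged on 4 July 2026.
print(earliest_deletion_date(date(2026, 7, 4)))        # 2026-10-02
print(may_delete(date(2026, 7, 4), date(2026, 9, 30)))  # False
```

Under that assumption, a file flagged on 4 July 2026 could not be removed before 2 October 2026.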

For residents and businesses that regularly interact with city permitting systems — contractors filing on projects in SoMa, architects pulling records for Tenderloin rehabilitation projects — the practical upshot is faster load times and more reliable document retrieval from the city's online portal at sfgov.org. The Department of Technology expects the portal's average document retrieval time to drop from roughly 8.3 seconds to under 3 seconds once the full deduplication project wraps up in September. Whether that timeline holds depends on how many near-duplicate anomalies surface in the remaining department sweeps — and city technology staff acknowledge the human review queue is already running about two weeks behind schedule.

About this article

Published by The Daily San Francisco

This article was produced by The Daily San Francisco editorial desk and covers news in San Francisco. See our editorial standards for how we use AI.
