The Daily San Francisco

SF City Departments Push to Purge Duplicate Images From Public Records This Week

A coordinated cleanup effort targeting redundant digital images in municipal databases is saving storage costs and untangling years of misfiled public records across San Francisco's sprawling city systems.

By San Francisco News Desk · Published 4 July 2026, 12:25 pm

San Francisco city technology staff accelerated a duplicate-image-removal effort this week, targeting tens of thousands of redundant photographs and scanned documents clogging the databases that underpin everything from building permits at the Department of Building Inspection on South Van Ness Avenue to case files at the Department of Public Health. The push, described in a Department of Technology work update circulated internally on July 2, is part of a broader digital records modernization drive tied to the city's ongoing IT infrastructure overhaul.

The timing matters. San Francisco is midway through a fiscal year that saw the city controller's office project a budget shortfall requiring cuts across multiple departments. Storage costs for redundant files are one of the smallest but most avoidable line items in municipal IT budgets, and eliminating duplicate image files — which can accumulate when multiple staff members scan the same form or photograph the same property — frees up cloud resources the city pays for by the gigabyte. For a government operating more than 50 enterprise applications across its roughly 60 departments, the arithmetic adds up fast.

Where the Backlog Built Up

The problem is most acute in departments that digitized paper records quickly during the 2020-2021 pandemic period, when staff were scanning backlogs remotely and quality control was inconsistent. The Planning Department at 49 South Van Ness and the Department of Building Inspection have both flagged large pools of duplicate property photographs — images taken at different points during permit inspections that were uploaded multiple times under slightly different file names. The City Services Auditor, which sits inside the Controller's Office at City Hall, has been working with the Department of Technology since at least early 2026 to develop automated detection protocols that can flag duplicates before staff manually review them.

The San Francisco Public Library's digital collections division, headquartered at the main branch on Larkin Street in the Civic Center, faces a parallel challenge with its historical photograph archive. Librarians there have been working with the Internet Archive, the nonprofit digital library based on Funston Avenue in the Richmond District, to identify duplicate images uploaded during bulk digitization grants. The library's digital services team confirmed this week that it had completed a first-pass automated scan of roughly 14,000 image files in its Golden Gate Park and Haight-Ashbury neighborhood collections, flagging an estimated 2,300 potential duplicates, or about 16 percent of the files scanned, for human review.

Tools, Costs and the AI Factor

San Francisco's Department of Technology has been piloting perceptual-hashing software since January 2026. The technique assigns each image a numeric fingerprint derived from its visual content, so near-identical copies can be spotted even when file names differ. The tool is licensed from a vendor under a contract reviewed by the city's Office of Contract Administration. Its cost is in the range of standard enterprise software licensing, though specific contract figures were not publicly available as of this filing. Early results from the pilot across three departments showed the software flagged potential duplicates in roughly 18 percent of image files reviewed, according to a project summary slide deck posted briefly to the city's DataSF open-data portal before being taken down for revision.
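For readers curious how a perceptual fingerprint works, the following is a minimal sketch of one simple variant, the "average hash." It is an illustration only and not necessarily what the city's vendor tool does. It assumes each image has already been shrunk to a small grayscale grid (real tools typically downscale to about 8x8 first), and the function names, sample pixel values and distance threshold are hypothetical.

```python
from typing import List

Grid = List[List[int]]  # grayscale pixel values (0-255), one inner list per row


def average_hash(pixels: Grid) -> int:
    """Build a fingerprint with one bit per pixel: 1 if brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Grid, b: Grid, max_distance: int = 2) -> bool:
    """Treat two images as likely duplicates if their fingerprints nearly match.

    The threshold of 2 bits is an arbitrary choice for this sketch.
    """
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# Hypothetical 4x4 grids: a scan, a slightly brighter rescan, and an unrelated image.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [11, 21, 201, 211],
]
rescan = [[value + 3 for value in row] for row in original]
different = [[row[2], row[3], row[0], row[1]] for row in original]

print(is_near_duplicate(original, rescan))     # brighter copy, same fingerprint
print(is_near_duplicate(original, different))  # light and dark halves swapped
```

Because the fingerprint depends on which pixels are brighter than the image's own average, a uniformly brighter rescan produces the same bits, while an image with a different layout of light and dark does not. Flagged pairs would still go to a person for review, as the city's process describes.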

The AI boom reshaping San Francisco's private tech sector on Folsom Street and around Mission Bay has a direct municipal application here: city IT staff have been evaluating whether large-scale image-deduplication tasks can be handled by AI-assisted workflows rather than contracted vendor tools, a shift that could lower ongoing costs. Several Civic Center department heads received a briefing on that question during a June 30 inter-departmental meeting, though no procurement decision has been announced.

For residents and small businesses waiting on permit approvals or public-records requests, the practical consequence is faster turnaround. When staff cannot easily find a clean master file because the same document exists under four slightly different names in a shared drive, requests stall. The Department of Building Inspection has said it aims to clear its image-file backlog before the end of the third quarter — September 30 — which would coincide with the start of the city's next budget cycle. Residents with pending records requests can track status through SF311 or the city's online permit portal, both of which are expected to reflect cleaner underlying data once the deduplication work is complete.
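The "same document under four slightly different names" case is the simplest kind of duplicate to catch, because the files are byte-for-byte identical. A minimal sketch of that idea, assuming file contents are already loaded into memory and using hypothetical file names, groups files by a SHA-256 digest of their contents instead of by name. It does not represent the city's actual tooling.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def group_exact_duplicates(files: Dict[str, bytes]) -> List[List[str]]:
    """Return groups of file names whose contents are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for name, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        by_digest[digest].append(name)
    # Only digests shared by more than one file indicate duplicates.
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


# Hypothetical shared-drive contents: one scan saved four times, plus one unrelated file.
files = {
    "permit_1042.pdf": b"scan-A",
    "permit_1042 (1).pdf": b"scan-A",
    "permit-1042-final.pdf": b"scan-A",
    "Permit1042_copy.pdf": b"scan-A",
    "permit_2077.pdf": b"scan-B",
}

groups = group_exact_duplicates(files)
for group in groups:
    print("Keep one master copy of:", group)
```

Exact matching of this kind misses rescans and re-photographed documents, which is where a content-based fingerprint such as perceptual hashing comes in.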

About this article

Published by The Daily San Francisco

This article was produced by The Daily San Francisco editorial desk and covers news in San Francisco. See our editorial standards for how we use AI.
