The Daily San Francisco

San Francisco Is Quietly Leading the Fight Against Duplicate Imagery in City Records — But Rivals Are Catching Up

From permit databases to heritage archives, SF's push to eliminate redundant digital images is becoming a municipal benchmark — and a cautionary tale about how long it took to start.

By San Francisco News Desk · Published 4 July 2026, 12:23 pm

San Francisco's Department of Technology confirmed earlier this year that the city's consolidated digital asset repository held more than 4.2 million images across municipal agencies, an estimated 30 to 40 percent of which are classified as duplicates, near-duplicates, or superseded versions of the same file. By that estimate, at least 1.3 million to 1.7 million files are redundant. The acknowledgment, buried in a January 2026 budget presentation to the Board of Supervisors, has quietly set off a scramble to clean house before the city's new unified permitting portal goes live in the fall.

The timing matters. San Francisco is midway through a housing production push that depends on faster permit processing, and Planning Department staff have flagged that redundant imagery in the city's parcel records system slows automated review tools, the same AI-assisted tools the city began piloting in the Tenderloin and SoMa corridors in late 2025. Every redundant image a machine has to evaluate adds to the time a housing application sits in a queue.

What SF Is Actually Doing

The city's primary deduplication work is being handled through a partnership between the Department of Technology and the San Francisco Planning Department, using a perceptual hashing system procured from a vendor under a contract that began in March 2026. The tool flags images with more than 95 percent visual similarity for human review before deletion. Planning staff in the Civic Center offices on Dr. Carlton B. Goodlett Place are responsible for final sign-off on any image removed from the historic survey archive, which covers more than 80,000 properties citywide.
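
For readers curious how a similarity threshold of this kind can work, here is a minimal Python sketch of one common perceptual hashing technique, the average hash. It is an illustration only: the city has not published its vendor's algorithm, and the function names, the 8-by-8 grayscale grids, and the reading of "95 percent visual similarity" as the share of matching hash bits are all assumptions made for this example.

```python
from typing import List, Sequence

# Assumption: each image has already been downscaled to a small grayscale
# grid (rows of 0-255 brightness values), as average-hash tools typically do.
Grid = Sequence[Sequence[int]]

REVIEW_THRESHOLD = 0.95  # hypothetical mapping of the article's 95 percent figure


def average_hash(pixels: Grid) -> List[int]:
    """Return one bit per pixel: 1 if the pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def similarity(hash_a: Sequence[int], hash_b: Sequence[int]) -> float:
    """Fraction of hash bits that match (1.0 means identical hashes)."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must be the same length")
    matching = sum(a == b for a, b in zip(hash_a, hash_b))
    return matching / len(hash_a)


def needs_human_review(a: Grid, b: Grid, threshold: float = REVIEW_THRESHOLD) -> bool:
    """Flag a pair for human review when similarity exceeds the threshold."""
    return similarity(average_hash(a), average_hash(b)) > threshold


# Made-up sample data: an 8x8 gradient, a slightly brighter copy, an inverse.
original = [[(row * 8 + col) * 4 for col in range(8)] for row in range(8)]
brighter = [[min(255, value + 3) for value in row] for row in original]
inverted = [[252 - value for value in row] for row in original]

print(needs_human_review(original, brighter))  # True
print(needs_human_review(original, inverted))  # False
```

In this sketch a uniformly brightened copy produces the same hash as the original and is flagged, while an inverted image shares no hash bits and is not. Flagged pairs would go to a human reviewer, as in the city's process, rather than being deleted automatically.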

The San Francisco Public Library's San Francisco History Center at the Main Branch on Larkin Street faces a parallel but distinct problem: its digitized collection, built over two decades of scanning drives, contains thousands of duplicate photographs of landmarks including the Ferry Building, Coit Tower, and Dolores Park. The History Center began its own deduplication project in February 2026 using open-source tooling developed at the Internet Archive, which is headquartered a few miles away on Funston Avenue in the Richmond District. Through a library spokesperson, History Center staff declined to provide a completion timeline.

Amsterdam and Singapore have both confronted comparable challenges at larger administrative scale. Singapore's Urban Redevelopment Authority completed a city-wide geospatial image deduplication project in 2024, cutting storage costs by roughly 22 percent across its land-use databases, according to a public technical report the URA published that December. Amsterdam's municipal archive — the Stadsarchief — launched its own project in 2023 and reported removing over 600,000 redundant image files from a collection of approximately 900,000 digitized items by mid-2025. San Francisco's effort, still in early stages, is working against a larger and less consistently formatted dataset.

The Risk of Moving Too Slowly

The stakes go beyond storage bills. San Francisco's permitting backlog has been a persistent political pressure point under Mayor Daniel Lurie, who took office in January 2025 after defeating London Breed. The city's housing element, certified by the state in 2023, requires San Francisco to plan for roughly 82,000 new units by 2031. Any friction in the digital infrastructure supporting permit review gets amplified when the volume of applications rises.

London, for comparison, began a standardized image metadata and deduplication protocol across its 32 borough councils in 2022 under a Greater London Authority digital infrastructure initiative. That program took three years and roughly £4.2 million to reach full implementation, according to a GLA budget document published in early 2025. San Francisco's current contract is considerably smaller in scope and budget, though the Department of Technology has not disclosed the vendor contract value publicly.

For residents and developers, the practical advice is straightforward: if you are submitting permit applications to the San Francisco Planning Department this summer, make sure any imagery you upload — site photos, existing condition documentation — is clearly labeled and not submitted in duplicate. Staff have indicated that redundant attachments in a single application create the same downstream processing delays that the citywide cleanup is designed to eliminate. The new unified portal, expected to launch in October 2026, will include automatic duplicate detection at the point of upload — a feature that, if it had existed a decade ago, might have made this summer's cleanup unnecessary.
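
Applicants who want to check their own attachments before uploading can catch byte-for-byte copies with a file hash. The sketch below is a minimal illustration in Python, not a description of the city's portal, whose duplicate detection method has not been disclosed; the function name and the sample file names are hypothetical.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def find_exact_duplicates(files: Dict[str, bytes]) -> List[List[str]]:
    """Group file names whose contents are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for name, content in files.items():
        # Identical bytes always produce the same SHA-256 digest.
        by_digest[hashlib.sha256(content).hexdigest()].append(name)
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


# Made-up attachments standing in for real image files read from disk.
attachments = {
    "site_front.jpg": b"AAA",
    "site_front_copy.jpg": b"AAA",
    "existing_conditions.jpg": b"BBB",
}

print(find_exact_duplicates(attachments))
```

An exact hash only catches identical files. A resized or re-saved copy of the same photo has different bytes and would pass this check.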

About this article

Published by The Daily San Francisco

This article was produced by The Daily San Francisco editorial desk and covers news in San Francisco. See our editorial standards for how we use AI.
