The Daily San Francisco


San Francisco Deploys AI to Eliminate Duplicate City Records Ahead of Peers

From Planning Department permit files to SFMTA transit maps, the city is deploying AI-assisted deduplication tools that other major cities are only beginning to pilot.

By San Francisco News Desk · Published 4 July 2026, 12:16 pm


San Francisco's Department of Technology quietly expanded its digital records deduplication program to cover Planning Department permit archives this spring, targeting a backlog of roughly 2.3 million scanned documents that city archivists say contain significant volumes of duplicate or near-duplicate image files. The expansion, which began in March 2026, marks the most aggressive phase yet of a citywide push to clean up decades of redundant digital storage — and it puts San Francisco measurably ahead of comparable cities still wrestling with the same problem.

The timing matters. City agencies across the country have been forced to reckon with ballooning cloud storage costs as government digitization programs, accelerated during the pandemic years, produced vast repositories of poorly organized files. For San Francisco, where tech-sector influence and a relatively well-funded Department of Technology have long shaped how city IT operates, the pressure to demonstrate fiscal discipline with digital infrastructure has only intensified. Mayor Daniel Lurie's administration has flagged operational efficiency as a budget priority heading into fiscal year 2027, and archival deduplication — unglamorous as it sounds — sits squarely in that category.

What the Program Actually Does

The city's approach centers on a pipeline built partly on open-source perceptual hashing tools and partly on proprietary software licensed through the San Francisco-based firm that won the Department of Technology contract in late 2025. The system compares image files across permit scan batches, flags near-duplicates above a set similarity threshold, and queues them for human review before any deletion. That human-in-the-loop requirement was written into the contract specifically to address concerns raised by the City Attorney's Office about automated deletion of potential legal records.
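The flag-and-review flow described above can be illustrated with a minimal sketch. The article does not name the open-source tools or the threshold the city uses, so everything here is an assumption: the sketch uses average hashing, one common perceptual hashing technique, on small grayscale grids, and the function names and the 0.9 threshold are hypothetical. It only queues candidate pairs for a human reviewer and never deletes anything.

```python
from itertools import combinations

# A scan is modeled as a small grid of grayscale values (0-255).
# A real pipeline would first downsample each image to a fixed size, e.g. 8x8.
Grid = list[list[int]]


def average_hash(pixels: Grid) -> tuple[int, int]:
    """Return (hash, bit_count): one bit per pixel, set when the pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits, len(flat)


def similarity(a: Grid, b: Grid) -> float:
    """Share of matching hash bits between two scans (1.0 means identical hashes)."""
    hash_a, size_a = average_hash(a)
    hash_b, size_b = average_hash(b)
    if size_a != size_b:
        raise ValueError("scans must be downsampled to the same grid size")
    hamming_distance = bin(hash_a ^ hash_b).count("1")
    return 1 - hamming_distance / size_a


def queue_for_review(scans: dict[str, Grid], threshold: float = 0.9) -> list[tuple[str, str, float]]:
    """Collect pairs at or above the (hypothetical) threshold for human review; nothing is deleted."""
    queue = []
    for (name_a, grid_a), (name_b, grid_b) in combinations(scans.items(), 2):
        score = similarity(grid_a, grid_b)
        if score >= threshold:
            queue.append((name_a, name_b, score))
    return queue
```

Because the hash depends only on which pixels are brighter than the image's own mean, a rescan of the same page that is uniformly a little lighter or darker still produces the same hash, which is the property that lets near-duplicates surface without a byte-for-byte match.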

The Planning Department's archive at 49 South Van Ness Avenue, where the department relocated from the old Civic Center offices, holds permit records stretching back to the 1980s in digital form, with earlier paper records scanned over a decade-long project. Archivists working with the Department of Technology estimate that duplicate and near-duplicate images account for between 18 and 24 percent of the files stored in the permit image database, based on a sample audit completed in January 2026. Removing that redundancy would cut the city's annual cloud storage bill, but the Department of Technology has not published a projection of the savings.
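For a rough sense of scale, the audit's percentage range can be applied to the backlog figure. This is a back-of-the-envelope sketch, not a city estimate: it assumes the 18 to 24 percent share found in the sample audit holds across the roughly 2.3 million scanned documents in the backlog, which the article does not state, and the names below are illustrative.

```python
BACKLOG_DOCUMENTS = 2_300_000        # backlog size reported for the permit archive
LOW_PERCENT, HIGH_PERCENT = 18, 24   # duplicate share range from the January 2026 sample audit


def estimated_duplicates(total: int, percent: int) -> int:
    """Integer estimate of duplicate files, assuming the audit share applies to the whole backlog."""
    return total * percent // 100


low = estimated_duplicates(BACKLOG_DOCUMENTS, LOW_PERCENT)
high = estimated_duplicates(BACKLOG_DOCUMENTS, HIGH_PERCENT)
print(f"{low:,} to {high:,} files")  # prints "414,000 to 552,000 files"
```

Under that assumption, somewhere between 414,000 and 552,000 files would be candidates for human review.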

The San Francisco Public Library's Digital Collections program at the main branch on Larkin Street has been running a smaller-scale version of the same process since 2024, focused on historical photograph collections. That earlier effort served as a proof of concept that helped make the case for the larger Planning Department rollout.

How San Francisco Compares Globally

The UK's Government Digital Service began a similar deduplication pilot for planning records in two London boroughs, Hackney and Southwark, in early 2026, but the effort is limited to newly uploaded documents rather than legacy archives. Singapore's Urban Redevelopment Authority has the most mature program of any comparable city government, having completed a full deduplication sweep of its land-use record system in 2024 using tools developed in-house. Tokyo's municipal government has discussed the issue publicly but has not announced a funded program.

New York City's Department of City Planning acknowledged in a March 2026 budget hearing that its digital archive contains substantial duplication but said a deduplication initiative is not funded in the current fiscal year. Chicago's Department of Buildings has no public-facing program addressing the issue.

That leaves San Francisco, despite its chronic struggles with homelessness response, housing production, and BART and Muni funding, in an unexpectedly strong position on this one measure of administrative modernization. The Department of Technology has presented the program to peer cities through the National League of Cities technology working group, according to meeting agendas posted on the NLC website.

For residents trying to track a permit on a property in the Mission District or Dogpatch, the practical payoff is faster document retrieval through the city's online permit portal, which has historically been slow when pulling scanned image files from the legacy archive. The department expects the first phase of deduplication to be complete by October 2026, with performance improvements to the public portal following shortly after. Anyone with questions about specific permit records can contact the Planning Department's public counter at 49 South Van Ness directly.


About this article

Published by The Daily San Francisco

This article was produced by The Daily San Francisco editorial desk and covers news in San Francisco. See our editorial standards for how we use AI.
