San Francisco's Planning Department confirmed last month that duplicate images — redundant photographs of the same properties filed multiple times across separate city databases — account for roughly 23 percent of storage load in the city's digital property records system, a problem that has quietly inflated IT costs and slowed permit processing times at the Civic Center offices on Dr. Carlton B. Goodlett Place. The department began a formal deduplication pilot in May 2026, making San Francisco one of the first U.S. cities to treat the issue as a budget and efficiency priority rather than a backroom data-hygiene problem.
Why now? Two forces collided. The city's ongoing housing production emergency — San Francisco needs to permit tens of thousands of new units under state mandates tied to its Regional Housing Needs Allocation cycle — means the Planning Department cannot afford permit queues clogged by redundant file lookups. At the same time, the post-layoff AI boom in SoMa and Mission Bay has produced a glut of affordable machine-learning tools that city procurement officers can license without the seven-figure price tags of five years ago.
What San Francisco Is Actually Doing
The pilot, run jointly by the Department of Technology and the San Francisco Planning Department, uses perceptual hashing — a technique that generates a fingerprint for each image and flags near-identical copies — across the city's Accela permit management platform. Staff at the Planning Department's offices on Seventh Street are manually reviewing flagged batches before deletion, a step city technology officers insisted on after a 2023 incident in Oakland, where an automated cleanup tool incorrectly purged before-and-after construction photos needed for code enforcement cases.
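To make the fingerprinting idea concrete, here is a minimal sketch of one simple perceptual hash, the "average hash." The article does not say which algorithm the pilot uses, so the choice of technique, the function names, the 8-by-8 grid and the distance threshold are all illustrative assumptions. Images are represented as plain grids of grayscale values to keep the example self-contained.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Gray = List[List[int]]  # rows of 0-255 grayscale pixel values


def shrink(pixels: Gray, size: int = 8) -> List[float]:
    """Downscale to a size x size grid by averaging blocks of pixels."""
    h, w = len(pixels), len(pixels[0])
    cells = []
    for r in range(size):
        r0 = r * h // size
        r1 = max((r + 1) * h // size, r0 + 1)
        for c in range(size):
            c0 = c * w // size
            c1 = max((c + 1) * w // size, c0 + 1)
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            cells.append(sum(block) / len(block))
    return cells


def average_hash(pixels: Gray, size: int = 8) -> int:
    """Fingerprint: one bit per cell, set when the cell is brighter than the mean."""
    cells = shrink(pixels, size)
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def flag_near_duplicates(
    images: Dict[str, Gray], max_distance: int = 5  # threshold is an assumption
) -> List[Tuple[str, str, int]]:
    """Return candidate pairs for human review; nothing is deleted here."""
    hashes = {name: average_hash(px) for name, px in images.items()}
    flagged = []
    for a, b in combinations(sorted(hashes), 2):
        distance = hamming(hashes[a], hashes[b])
        if distance <= max_distance:
            flagged.append((a, b, distance))
    return flagged
```

The sketch only returns flagged pairs, mirroring the manual review step described above. In practice the grayscale grids would come from decoding real image files with an imaging library, and the threshold would need tuning against known duplicates.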
The San Francisco Public Library's digital preservation team on Larkin Street is separately advising the project, drawing on its own experience deduplicating the San Francisco History Center's photographic archive, a collection of more than 200,000 images whose cleanup began in 2022 and took 18 months. That archive project, funded partly through a California State Library grant, cut redundant storage by 31 percent, a figure city technology staff are using as a benchmark for what the Planning Department pilot might achieve.
Across the Bay, Oakland launched a comparable effort in early 2026 under its Digital Equity and Innovation Office, focusing on business license photographs rather than property records. Richmond has not yet started a formal program. Statewide, the California Department of Technology issued guidance in March 2026 encouraging county and municipal agencies to audit duplicate digital assets before the state's next data-center consolidation review, scheduled for fiscal year 2027-28.
How San Francisco Compares Globally
London's Government Digital Service ran a deduplication audit across the Greater London Authority's 33 borough databases between 2023 and 2025, ultimately removing more than 4 million redundant property photographs and cutting cloud storage costs by roughly £2.1 million annually, according to a summary published by the GLA. Singapore's Urban Redevelopment Authority completed a similar exercise in 2024, automating deduplication across its OneMap property platform and reporting a 40 percent reduction in image-processing lag for public-facing map queries.
San Francisco is moving faster than most comparably sized American cities — Chicago's Department of Assets, Information and Services does not have a dedicated deduplication program as of this writing — but it is operating at a smaller scale than London or Singapore, both of which had central government coordination and larger dedicated budgets. The San Francisco pilot's current budget line sits inside a broader $4.2 million Digital Infrastructure Modernization allocation approved by the Board of Supervisors in December 2025.
For property owners and developers, the practical upshot is straightforward: permit applicants at the San Francisco Planning Department's public counter on Seventh Street should expect faster digital file retrieval during pre-application meetings once the pilot clears its first full audit phase, targeted for completion by September 2026. The department is also advising applicants to label and submit photographs in a standardized format, JPEG at a minimum resolution of 1,200 by 800 pixels, to reduce the chance that their own submissions generate duplicates at the intake stage. The city plans to publish deduplication progress metrics on DataSF, its open-data portal, beginning in the third quarter of 2026.
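Applicants who want to check a photograph against that guidance before submitting could use something like the sketch below. It is not a city tool. The function name is hypothetical, and treating a portrait-orientation photo (800 by 1,200) as acceptable is an assumption, since the guidance as reported does not address orientation.

```python
# Hypothetical pre-submission check; names and the orientation rule are assumptions.
MIN_LONG_SIDE = 1200   # pixels, from the reported 1,200 by 800 guidance
MIN_SHORT_SIDE = 800
ACCEPTED_FORMATS = {"JPEG", "JPG"}


def meets_photo_guidance(fmt: str, width: int, height: int) -> bool:
    """True if the format is JPEG and both sides meet the reported minimums."""
    long_side, short_side = max(width, height), min(width, height)
    return (
        fmt.strip().upper() in ACCEPTED_FORMATS
        and long_side >= MIN_LONG_SIDE
        and short_side >= MIN_SHORT_SIDE
    )
```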