San Francisco's Planning Department began a phased rollout in March 2026 of an automated image-deduplication system covering more than 340,000 digitised property records stored in its Accela permitting database. The system flags duplicate or near-identical photographs attached to building permits, conditional use applications, and variance filings — records that, left unchecked, consume server capacity, slow public records searches, and create confusion during zoning appeals.
The timing matters. City Hall is under pressure to cut administrative costs after the Board of Supervisors approved a fiscal year 2026–27 budget that required every department to identify at least 8 percent in operational savings. For the Planning Department, which maintains one of the largest digitised parcel-record archives west of the Mississippi, duplicate imagery has long been a low-visibility but real drain on storage contracts with the city's cloud vendor. The department manages records covering neighbourhoods from the Bayview–Hunters Point waterfront to the Sunset District's endless rows of Edwardian flats.
The program, internally called Clean Record SF, is being piloted alongside a separate initiative run by the San Francisco Digital Services office on Polk Street, which is overhauling how residents upload documents through the city's 311 portal. Together, the two efforts represent the most systematic attempt the city has made to address data redundancy in municipal imaging since 2019, when a previous cloud migration left a backlog of duplicated files across multiple departments.
How San Francisco Stacks Up Against London and Tokyo
London's equivalent effort, housed inside the Greater London Authority's data directorate, has taken a different approach. Rather than automating deduplication at the point of ingestion, the GLA contracted with a third-party vendor in 2024 to run periodic batch-cleaning passes on its planning portal. Critics inside London's City Hall have described that reactive model, in public council testimony, as slower and more expensive per record than the preventive approach San Francisco is piloting. Tokyo's metropolitan government, managing a property record system that covers more than 9 million parcels across 23 special wards, relies on a hash-matching algorithm embedded in its 2022-era digitisation platform, which catches exact duplicates but struggles with near-duplicates such as slightly cropped or recompressed versions of the same image.
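Why exact matching misses near-duplicates can be shown in a few lines. The sketch below is illustrative only: it assumes SHA-256 as the hash (the algorithm Tokyo's platform actually uses is not stated), and the file names and byte strings are hypothetical stand-ins for uploaded images.

```python
import hashlib
from typing import Dict, List


def exact_fingerprint(data: bytes) -> str:
    """Return a SHA-256 digest of the raw file bytes."""
    return hashlib.sha256(data).hexdigest()


def find_exact_duplicates(files: Dict[str, bytes]) -> Dict[str, List[str]]:
    """Group file names by digest and keep only groups with more than one file."""
    groups: Dict[str, List[str]] = {}
    for name, data in files.items():
        groups.setdefault(exact_fingerprint(data), []).append(name)
    return {digest: names for digest, names in groups.items() if len(names) > 1}


# Hypothetical uploads: two byte-identical files and one "resaved" copy
# in which a single byte has changed.
original = bytes([10, 200, 30, 220] * 16)
resaved = original[:-1] + bytes([221])

uploads = {"a.jpg": original, "b.jpg": original, "c.jpg": resaved}
duplicates = find_exact_duplicates(uploads)

# Only a.jpg and b.jpg are grouped; c.jpg gets a different digest
# even though it differs from the original by one byte.
print(list(duplicates.values()))
```

Because any change to the bytes produces an unrelated digest, a cropped or recompressed copy of the same photograph passes through this kind of check undetected.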
San Francisco's system uses perceptual hashing combined with a convolutional neural network layer, according to a March 2026 Planning Department staff report presented to the Historic Preservation Commission. That combination lets it match images that look the same even when the files are not byte-for-byte identical, for instance because their metadata differs, a common problem when applicants resubmit permits after minor corrections. The department has not yet published outcome data for the first quarter of the rollout, but the staff report projected that the system would flag roughly 14 percent of all newly submitted images as potential duplicates during the pilot phase through June 30.
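The idea behind perceptual hashing can be sketched with one of its simplest variants, the average hash. This is not the department's implementation, which has not been published, and it leaves out the neural network layer entirely. The function names, the 8-by-8 grid, the distance threshold of 5 and the synthetic test images are all assumptions chosen for illustration; images are plain lists of greyscale values at least 8 pixels on each side.

```python
from typing import List

Image = List[List[int]]  # greyscale pixels 0-255, rows of equal length


def average_hash(pixels: Image, size: int = 8) -> int:
    """Shrink the image to size x size by block averaging, then emit one bit
    per cell: 1 if the cell is brighter than the mean of all cells."""
    height, width = len(pixels), len(pixels[0])
    cells = []
    for r in range(size):
        for c in range(size):
            r0, r1 = r * height // size, (r + 1) * height // size
            c0, c1 = c * width // size, (c + 1) * width // size
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Image, b: Image, max_distance: int = 5) -> bool:
    """Treat two images as near-duplicates if their hashes differ in few bits."""
    return hamming(average_hash(a), average_hash(b)) <= max_distance


# Synthetic 16x16 test images: a horizontal gradient, the same gradient with
# slight pixel noise (a stand-in for recompression), and its inverse.
base = [[x * 16 for x in range(16)] for y in range(16)]
noisy = [[max(0, min(255, base[y][x] + (x + y) % 3 - 1)) for x in range(16)]
         for y in range(16)]
inverted = [[255 - v for v in row] for row in base]

print(is_near_duplicate(base, noisy))     # the noisy copy still matches
print(is_near_duplicate(base, inverted))  # the inverted image does not
```

Unlike a cryptographic digest, the hash here changes only a little when the picture changes only a little, so a small Hamming distance is a reasonable signal that two uploads show the same image. Where to set the threshold is a policy choice: too low and recompressed copies slip through, too high and distinct photographs are flagged.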
The cost of doing nothing is measurable. The city's enterprise storage contract with a major cloud provider, disclosed in a 2025 Controller's Office report, runs at a rate that makes every unnecessary gigabyte of retained imagery a recurring line item. Peer cities that have completed similar cleanups — Singapore's Urban Redevelopment Authority finished a comparable project in late 2024 — have reported storage cost reductions in the range of 11 to 18 percent for affected record categories, according to a comparative analysis published by the International Association of Assessing Officers in January 2026.
What Comes Next for Property Applicants and City Staff
Residents and developers filing permits at the Planning Department's counter on Stevenson Street, or submitting digitally through SF Planning's online portal, will notice a new real-time flag if uploaded images closely match files already on record. The department says the flag is advisory, not a hard block, during the pilot phase — applicants can override it with a checkbox confirmation. A full enforcement mode, in which duplicate submissions are automatically rejected at upload, is tentatively scheduled to go live in January 2027, pending a review by the City Attorney's office on data retention obligations.
For neighbourhood groups tracking development in areas like the Mission District or Chinatown, the practical upshot is that public-facing permit searches on SF Planning's website should eventually return cleaner, faster results. The department has said it plans to publish a Q2 2026 deduplication audit before the end of August — that report will be the first real test of whether San Francisco's preventive model is outperforming the reactive strategies London and others have deployed.