San Francisco's Department of Building Inspection has flagged and removed more than 34,000 duplicate images from its public-facing permit portal since January 2025, a figure city officials say represents roughly 18 percent of the redundant visual clutter that had accumulated in the database over nearly a decade of inconsistent digital uploads. The cleanup matters more than it sounds: when inspectors pull records for a fire-damaged Tenderloin rooming house or a contested Mission District renovation, finding the right photo fast can be the difference between a same-day red tag and a week of administrative delay.
The timing is pointed. San Francisco is in the middle of a housing production emergency: the city needs to permit roughly 82,000 new units by 2031 under its state-mandated Regional Housing Needs Allocation, and any friction inside the permitting pipeline draws scrutiny. Supervisors and housing advocates have spent two years arguing that bureaucratic slowdowns at 49 South Van Ness Avenue, DBI's headquarters, are adding months to approvals. Duplicate image records are a symptom of a deeper data hygiene problem that slows automated permit-tracking software and inflates cloud storage costs.
How San Francisco Compares
Amsterdam's municipal building authority completed a comparable database deduplication in 2023, using a computer-vision tool built in-house by the city's Digital Infrastructure team. The Dutch city eliminated duplicates across 1.2 million permit records in roughly eight months, about half the time San Francisco is projecting for its own cleanup, even though San Francisco's database is one-third the size. Seoul's Smart City Division, operating under the Seoul Digital Foundation, automated the entire process in 2024 using a hash-matching algorithm that runs on every image upload in real time, meaning new duplicates are blocked before they enter the system at all. San Francisco has not yet reached that stage.
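For readers curious what upload-time hash matching looks like in practice, the following is a minimal sketch, not a description of Seoul's actual system, whose internals are not public in this reporting. It assumes the simplest variant, an exact SHA-256 digest of the file's bytes, which catches only byte-identical copies; near-identical photos would need a perceptual hash instead. The class name `DuplicateBlocker`, the method `try_upload`, and the record IDs are all hypothetical.

```python
import hashlib


class DuplicateBlocker:
    """Hypothetical upload gate that rejects byte-identical images."""

    def __init__(self):
        # Maps SHA-256 hex digest -> ID of the first record that used it.
        self._seen = {}

    def try_upload(self, record_id, image_bytes):
        """Return (accepted, existing_record_id) for one upload attempt."""
        digest = hashlib.sha256(image_bytes).hexdigest()
        existing = self._seen.get(digest)
        if existing is not None:
            # Same bytes were stored before: block and point to the original.
            return False, existing
        self._seen[digest] = record_id
        return True, None


# Illustrative usage with made-up record IDs and placeholder bytes.
blocker = DuplicateBlocker()
first = blocker.try_upload("permit-001", b"placeholder image bytes")
second = blocker.try_upload("permit-002", b"placeholder image bytes")
third = blocker.try_upload("permit-003", b"different image bytes")
```

In this sketch the second upload is refused and linked back to the first record, while the third goes through. The check happens before anything is stored, which is the property that distinguishes blocking from after-the-fact cleanup.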
Locally, the Office of the City Administrator partnered with San Francisco-based startup Archivist Labs, which holds a $1.2 million contract through June 2027, to deploy an AI-assisted deduplication tool across the DBI's legacy database. The San Francisco Planning Department, which maintains a separate but overlapping image archive on its own permit portal at 1650 Mission Street, is running a parallel but uncoordinated cleanup effort, a gap that housing policy analysts at SPUR flagged in a February 2026 report as a risk of double-counting progress.
London offers a cautionary tale. The Greater London Authority attempted a centralized image deduplication project across all 33 borough planning portals starting in 2022, only to abandon it in 2024 after failing to reconcile incompatible file-naming conventions between boroughs. San Francisco's situation is analogous: DBI and Planning use different metadata schemas, and Archivist Labs has had to build a translation layer to make the two systems talk to each other. That extra work pushed the project's completion date from December 2026 to at least March 2027, according to a scope-of-work amendment filed with the City Controller's office in April.
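A translation layer of this kind is, at its simplest, a field-by-field mapping from one schema to the other. The sketch below illustrates the idea only; neither department's schema is described in the filings cited here, so every field name (`case_no`, `site_addr`, `img_file`, `permit_id`, `address`, `image_filename`) and the function `to_dbi_schema` are invented for illustration.

```python
# Hypothetical mapping: Planning-style field name -> DBI-style field name.
# The real schemas are not public in this reporting; these are placeholders.
PLANNING_TO_DBI = {
    "case_no": "permit_id",
    "site_addr": "address",
    "img_file": "image_filename",
}


def to_dbi_schema(planning_record):
    """Rename known Planning fields to DBI names; set unknown fields aside."""
    translated = {}
    unmapped = {}
    for key, value in planning_record.items():
        if key in PLANNING_TO_DBI:
            translated[PLANNING_TO_DBI[key]] = value
        else:
            # Keep fields with no known equivalent rather than dropping them.
            unmapped[key] = value
    if unmapped:
        translated["_unmapped"] = unmapped
    return translated


# Illustrative usage with a made-up record.
example = to_dbi_schema(
    {"case_no": "2025-001", "site_addr": "123 Example St", "zoning": "RH-2"}
)
```

Setting unmapped fields aside instead of discarding them is a design choice in this sketch: it makes schema gaps visible, which is the kind of mismatch that reportedly sank the London effort.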
What Comes Next
The practical stakes for San Francisco residents and developers are real. Property owners in the Excelsior and Outer Sunset who have submitted multiple rounds of permit drawings often find their project folders cluttered with dozens of near-identical elevation photos, some mislabeled with the wrong address. That confusion has caused at least three documented cases in fiscal year 2025 in which inspectors were dispatched to the wrong unit on a multi-unit parcel, each generating a $480 re-inspection fee that the city had to waive.
Archivist Labs is expected to deliver a real-time duplicate-blocking feature by November 2026, which would bring San Francisco closer to Seoul's current standard. The Planning Department has committed to synchronizing its metadata schema with DBI's by the end of the third quarter. If both milestones hold, the city could enter 2027 with a permit image system that stops creating new problems even before it finishes solving the old ones, a modest but meaningful upgrade for a housing bureaucracy under enormous pressure to move faster.