San Francisco's Department of Technology has been quietly working through a backlog that, by its own internal project logs, ran to more than 340,000 duplicate image files embedded across city databases, permit portals, and public-records archives. The cleanup, which began in earnest in January 2026 under a mandate from the city's Digital Services Office, is now roughly 60 percent complete—and urban technology analysts say the pace is drawing attention from municipal counterparts in Amsterdam, Seoul, and London.
The timing is not accidental. Over the past 18 months, San Francisco's tech-sector implosion and subsequent AI hiring wave created an unusual labor pool: laid-off engineers from companies including Salesforce and Twitter's successor who pivoted into civic contract work. The Department of Technology has drawn on that talent to build deduplication pipelines that several other cities are now licensing or copying outright.
Why This Matters Beyond Server Costs
Duplicate images in government systems are rarely just a storage annoyance. In San Francisco, the problem compounded across at least three major platforms: the Planning Department's permit tracking system on Seventh Street, the SF311 service-request portal, and the Digital Archives held by the San Francisco History Center at the Main Library on Larkin Street. Each system had accumulated years of redundant uploads—the same inspection photograph appearing four, five, sometimes a dozen times under different case numbers or file names.
The practical consequences ranged from slower search responses in the 311 system to errors in permit review. In at least some instances flagged by city auditors in a March 2026 report, reviewers worked from outdated image versions because duplicate files masked which record was current. The city did not release specific figures on how many permits were affected, but the audit described the problem as systemic rather than isolated.
In London, the UK's Government Digital Service began a comparable audit of borough-level planning portals in late 2024, focusing initially on the Tower Hamlets and Southwark councils. Amsterdam's municipal digital team published findings in April 2026 showing it had reduced duplicate asset storage by 41 percent over 18 months using open-source perceptual hashing tools. Seoul's Smart City division, by contrast, contracted the work to a private vendor in 2025 and publicly acknowledged the project ran 30 percent over budget.
SF's Homegrown Approach
San Francisco chose a middle path. The Department of Technology built its deduplication workflow in-house, combining perceptual hashing (a technique that matches images that look the same even when file names, metadata, or encoding differ) with a machine-learning classifier trained on city permit photographs. That classifier runs on infrastructure hosted at the city's Civic Center data center. The total contract value for the project, spread across two fiscal years, is listed at $2.1 million in the city's open-expenditure database.
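To make the idea of perceptual hashing concrete, here is a minimal Python sketch of one common variant, the difference hash. It is an illustration only, not the city's pipeline: the function names, the tiny sample grids, and the assumption that each image has already been reduced to a small grayscale grid are all hypothetical.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grayscale values


def dhash(pixels: Gray) -> int:
    """Difference hash: one bit per horizontally adjacent pixel pair.

    Assumes `pixels` is already a small grayscale thumbnail
    (commonly 8 rows by 9 columns, which yields a 64-bit hash).
    """
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


# Hypothetical 2x3 thumbnails. The second is the first, uniformly
# brightened, as might happen when a photo is re-saved; the third
# is a different picture.
original = [[10, 50, 30], [200, 100, 150]]
brightened = [[20, 60, 40], [210, 110, 160]]
different = [[90, 20, 70], [5, 80, 10]]

print(hamming(dhash(original), dhash(brightened)))  # same brightness pattern
print(hamming(dhash(original), dhash(different)))   # different pattern
```

Because the hash records only whether each pixel is brighter than its neighbor, the brightened copy hashes to the same value as the original, while the unrelated image differs in every bit. A byte-level checksum would treat all three files as different.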
For comparison, Seoul's outsourced effort reportedly cost the equivalent of roughly $3.4 million for a smaller dataset, according to a Seoul Metropolitan Government budget document cited in an April 2026 report by the Urban Digital Governance Forum, a Brussels-based research group. Amsterdam's in-house model cost significantly less but took longer to deploy because of staffing constraints that San Francisco, with its post-layoff engineering talent pool, largely avoided.
The SF model is not flawless. The Planning Department's Permit Center at 49 South Van Ness Avenue still has a manual review queue for flagged images that fall into an ambiguous category: photographs that are visually similar but not identical, and so could be either legitimate updates or genuine duplicates. That queue stood at roughly 18,000 images as of late June, according to figures presented at the June 25 Digital Services Commission meeting.
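One way such an ambiguous middle band can arise is when hash comparisons are sorted by how many bits differ. The Python sketch below is a hypothetical illustration of that triage step, not the city's actual rules; the `Verdict` names and the `REVIEW_MAX_BITS` cut-off are assumptions made for the example.

```python
from enum import Enum


class Verdict(Enum):
    DUPLICATE = "duplicate"
    NEEDS_REVIEW = "needs_review"
    DISTINCT = "distinct"


# Hypothetical cut-off for 64-bit hashes; a real value would need
# tuning against labeled examples.
REVIEW_MAX_BITS = 10


def triage(hash_a: int, hash_b: int,
           review_max_bits: int = REVIEW_MAX_BITS) -> Verdict:
    """Sort a pair of perceptual hashes into one of three buckets."""
    distance = bin(hash_a ^ hash_b).count("1")  # Hamming distance
    if distance == 0:
        # Matching hashes: treated here as a duplicate candidate.
        return Verdict.DUPLICATE
    if distance <= review_max_bits:
        # Similar but not identical: could be an updated photo.
        return Verdict.NEEDS_REVIEW
    return Verdict.DISTINCT


print(triage(0b1010, 0b1010))
print(triage(0b1010, 0b1000))
print(triage(0, (1 << 64) - 1))
```

Where the cut-off sits is a trade-off. A wider review band sends more images to human reviewers, while a narrower one risks treating a legitimate updated photograph as either a duplicate or an unrelated file.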
For residents and small business owners who file permits or service requests through SF311 or the planning portal, the most immediate benefit should arrive by the end of the third quarter: faster load times and a reduced likelihood of pulling the wrong version of a site photograph. The Department of Technology has committed to publishing a public dashboard tracking deduplication progress, with a launch date listed as August 1, 2026. Whether the city hits that target, and whether its counterparts in Amsterdam and London decide to adopt the same open-source toolkit, will determine how far this particular piece of civic infrastructure work travels beyond the Bay.