San Francisco's Department of Technology has identified more than 340,000 duplicate image files across the city's network of public-facing websites, from the Planning Department's permit portal to the SF Public Library's digital catalog system. Officials say the redundancy is contributing to measurable slowdowns on platforms that residents use daily to access services, file complaints, and pay fees.
The sheer scale of the problem reflects years of decentralized content management, where individual departmental staff uploaded the same asset — a logo, a neighborhood map, a headshot — independently and repeatedly, with no automated deduplication layer to catch redundancies before they compounded. The city's Unified Digital Infrastructure initiative, launched in January 2025, was supposed to rein in exactly this kind of sprawl, but its rollout has been uneven across agencies.
What the Data Actually Shows
Internal audits reviewed by The Daily San Francisco show that duplicate image files account for roughly 18 percent of total storage consumption across sf.gov and its affiliated subdomains — a proportion that IT administrators describe as far above what peer municipalities typically report. For comparison, the City of Chicago completed a similar deduplication audit in 2024 and found its redundancy rate was closer to 7 percent before remediation. San Francisco's figure has not been formally published, but documents circulating within the Department of Technology put the raw storage waste at approximately 4.2 terabytes as of the most recent quarterly review in April 2026.
Storage on the city's cloud infrastructure is billed per gigabyte; the city migrated to a hybrid cloud model under a contract awarded in fiscal year 2023-2024. At current contracted rates, the 4.2 terabytes of redundant image data represents an estimated $9,800 in unnecessary annual storage spending. That is a modest figure on its own, but officials note that the downstream costs are harder to quantify: slower page load times on high-traffic portals directly affect whether residents complete permit applications or abandon them midway. The SF Planning Department's online permit system recorded an average page load time of 6.4 seconds during peak hours in the first quarter of 2026, according to department performance metrics, well above the 3-second threshold that usability researchers widely cite as the point where user drop-off accelerates.
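For readers who want to check the storage math, the two reported figures imply a per-gigabyte rate. The short calculation below is only a back-of-the-envelope sketch: it assumes decimal units (1 terabyte = 1,000 gigabytes) and a flat rate with no tiering, neither of which the documents confirm, and the function name is made up for illustration.

```python
# Figures reported in the article.
REDUNDANT_TB = 4.2        # redundant image data, in terabytes
ANNUAL_COST_USD = 9_800   # estimated annual spend on that data

# Assumption: decimal units and a flat per-gigabyte rate.
GB_PER_TB = 1_000


def implied_rates(redundant_tb: float, annual_cost: float) -> tuple[float, float]:
    """Return the implied (per-GB-per-year, per-GB-per-month) rate in dollars."""
    redundant_gb = redundant_tb * GB_PER_TB
    per_gb_year = annual_cost / redundant_gb
    per_gb_month = per_gb_year / 12
    return per_gb_year, per_gb_month


per_year, per_month = implied_rates(REDUNDANT_TB, ANNUAL_COST_USD)
print(f"${per_year:.2f} per GB per year, ${per_month:.3f} per GB per month")
```

Under those assumptions, the implied rate works out to roughly $2.33 per gigabyte per year, or about 19 cents per gigabyte per month.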
The Mission District-based nonprofit SF Digital Equity Coalition, which advocates for accessible city services among lower-income and non-English-speaking residents, has raised the issue in public comment at three Board of Supervisors committee hearings since February. Bloated municipal websites disproportionately affect residents on slower connections — a concern particularly acute in Visitacion Valley and the Tenderloin, two neighborhoods where the city's own broadband mapping data shows significantly lower average connection speeds than the citywide median.
What Cleanup Looks Like — and What It Costs
The Department of Technology began a phased deduplication project in March 2026, starting with the highest-traffic portals: the SF311 service request site, the Office of the Treasurer and Tax Collector's online payment system, and the Recreation and Parks Department's reservation platform. A vendor contract for automated image hashing and replacement tools was awarded to a San Jose-based firm for $187,000, covering the initial 18-month engagement.
The process works by computing a perceptual hash for every stored image, flagging files that are identical or near-identical, replacing redundant copies with a single canonical file path, and updating every reference across the content management system. The workflow sounds straightforward but grows complex when legacy pages were hand-coded by staff who are no longer with the city. The Planning Department alone has more than 1,200 active web pages, some dating to 2011, that require manual review before any automated replacement can proceed.
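The general shape of that workflow can be sketched in a few lines of Python. This is not the vendor's tooling, whose details have not been disclosed. It swaps in an exact SHA-256 content hash for the perceptual hash, so it catches only byte-identical files; near-identical images would need a perceptual hash from an image library. The file paths, function names, and the "keep the alphabetically first path" rule are all hypothetical.

```python
import hashlib
import re
from collections import defaultdict


def build_canonical_map(files: dict[str, bytes]) -> dict[str, str]:
    """Map each duplicate path to one canonical path holding identical bytes."""
    groups: dict[str, list[str]] = defaultdict(list)
    for path, data in files.items():
        # Exact content hash: a stand-in for the perceptual hash in the article.
        groups[hashlib.sha256(data).hexdigest()].append(path)

    canonical: dict[str, str] = {}
    for paths in groups.values():
        keep = min(paths)  # hypothetical rule: keep the alphabetically first path
        for path in paths:
            if path != keep:
                canonical[path] = keep
    return canonical


def rewrite_references(page: str, canonical: dict[str, str]) -> str:
    """Point every duplicate path in a page's markup at its canonical copy."""
    if not canonical:
        return page
    # Longest paths first, in a single pass, so rewritten text is never re-matched.
    # Naive substring matching: a real tool would parse the markup instead.
    ordered = sorted(canonical, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(p) for p in ordered))
    return pattern.sub(lambda m: canonical[m.group(0)], page)


# Hypothetical example: the same logo uploaded by two departments.
files = {
    "/planning/logo.png": b"AAA",
    "/311/logo.png": b"AAA",
    "/parks/map.png": b"BBB",
}
canonical = build_canonical_map(files)
page = rewrite_references('<img src="/planning/logo.png">', canonical)
```

Even in this toy form, the hard part the article describes is visible: the rewrite step only works if every page that references a duplicate can be found and safely edited, which is why hand-coded legacy pages require manual review.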
For residents and small business owners who use these portals regularly, the practical upshot is that load times on the affected sites should improve measurably by the end of the third quarter of 2026, when the first phase of deduplication is scheduled to complete. The Department of Technology has committed to publishing before-and-after performance metrics on its public dashboard at sf.gov/departments/technology — giving San Franciscans a rare, data-grounded look at whether a back-end cleanup project actually delivers the speed improvements officials are promising.