The Daily San Francisco

San Francisco news, every day

San Francisco's City Websites Are Drowning in Duplicate Images — and the Numbers Tell the Story

A deep dive into the municipal digital infrastructure problem costing taxpayers money and slowing down the city's public-facing services.

By San Francisco News Desk · Published 4 July 2026, 12:25 pm

Photo: Mo Eid on Pexels

San Francisco's municipal web infrastructure is carrying thousands of redundant image files across city-run portals, and the cost of that digital clutter is no longer abstract. An internal audit of the SF.gov platform and several department sub-sites found that duplicate image assets account for a measurable share of total server storage, a finding that has pushed the Department of Technology's Digital Services division to begin a formal deduplication project this summer.

The timing matters. The city is midway through a multi-year modernization push that began after the Controller's Office flagged recurring inefficiencies in how individual departments upload and manage media assets. With San Francisco facing a projected general fund deficit of several hundred million dollars in the coming fiscal cycle, every line item that involves avoidable infrastructure spending is drawing scrutiny from the Mayor's Budget Office on Polk Street.

What the Numbers Actually Show

Across the main SF.gov content management system, which consolidates pages for agencies ranging from the Department of Public Health to the Municipal Transportation Agency, file audit tools run by the Digital Services team found that a significant proportion of stored image files share identical pixel data but carry different filenames — the classic fingerprint of duplicate uploads. Industry benchmarks suggest that large government content management systems running on platforms like Drupal or WordPress commonly carry duplicate image rates between 15 and 30 percent of total media library assets. San Francisco's Digital Services division declined to release the precise internal figure pending a final report, but the remediation contract awarded this spring to a San Jose-based vendor covers deduplication of more than 40,000 stored media files across the consolidated platform.

Storage is only one cost. Page load speed is another. Google's Core Web Vitals metrics, which factor into how sites rank in search results, reward pages that load quickly, and a browser cannot reuse a cached image when the same picture is requested from a different file path, so each duplicate is downloaded again at full resolution. Slow load times hit hardest in neighborhoods like the Tenderloin and SoMa, where residents are more likely to access SF.gov on mobile data connections rather than home broadband. San Francisco's Digital Equity Initiative, administered through the Office of Digital Equity on Van Ness Avenue, has been explicitly trying to close that access gap, which means a bloated image library undercuts equity goals as much as it wastes server budget.

SFMTA's public-facing transit pages are a documented pain point. The agency manages a separate Drupal instance that integrates with the main SF.gov header, and its media library has been built up since at least 2017 by staff across multiple divisions uploading route maps, service alert graphics, and promotional images without a centralized naming convention. That fragmented upload history produced a media library where the same bus route diagram can appear under four or five different filenames. The deduplication vendor will use SHA-256 hash matching — a process that identifies files by content rather than name — to collapse those redundant entries.
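For readers curious what hash matching looks like in practice, here is a minimal Python sketch of the general technique. It is not the vendor's tool, which has not been described publicly. The function names and the idea of scanning a local folder are assumptions made for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file's bytes, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large images are never held fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_groups(media_root: Path) -> list[list[Path]]:
    """Group files under media_root (a hypothetical folder) by identical content."""
    by_digest: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(media_root.rglob("*")):
        if path.is_file():
            by_digest[sha256_of_file(path)].append(path)
    # Keep only digests shared by more than one file, whatever the filenames.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

A hash of the raw file only matches byte-for-byte copies. The same route map saved again at a different size or compression level would produce a different digest, so a scan like this would miss it.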

What Comes Next for City Digital Infrastructure

The remediation project is scheduled to run through October 2026, with a phase-two audit targeting the Department of Public Health's separate web properties on the Parnassus and Zuckerberg San Francisco General campuses. Those sites carry large volumes of program photography from the COVID-19 response years, when staff uploaded media in bulk with minimal oversight.

For residents who interact with city digital services daily, the practical effect of a successful deduplication should be faster page loads and more reliable image rendering on service-request pages run through San Francisco 311. The 311 platform processed more than 800,000 service requests in the 2024-2025 fiscal year, according to the Controller's Office annual performance report, making it one of the highest-traffic government digital touchpoints in Northern California.

The Digital Services team is also writing a new media upload policy — expected to be finalized before the October deadline — that would require all department content editors to run images through an automated duplicate-check tool before publishing. It is a procedural fix, not a glamorous one. But in a city that spent years arguing about homelessness dashboards and open data portals, getting the basic digital plumbing right is the unglamorous prerequisite for any of the smarter stuff to follow.
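The city has not said how its duplicate-check tool would work. One plausible shape, sketched here in Python purely as an assumption, is a lookup against an index of digests for files already in the library. The function name and the in-memory set are hypothetical. A real content management system would presumably store the index in its database.

```python
import hashlib


def register_upload(image_bytes: bytes, known_digests: set[str]) -> bool:
    """Accept an upload only if its exact bytes are not already indexed.

    Returns True when the file is new and has been added to the index,
    False when an identical file is already present.
    """
    digest = hashlib.sha256(image_bytes).hexdigest()
    if digest in known_digests:
        return False  # Exact copy already in the library; reject the upload.
    known_digests.add(digest)
    return True
```

A check of this kind costs one hash per upload and one set lookup. It would therefore add little delay for an editor at publish time.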

About this article

Published by The Daily San Francisco

This article was produced by The Daily San Francisco editorial desk and covers news in San Francisco. See our editorial standards for how we use AI.
