The Daily San Francisco

How San Francisco's City Websites Ended Up Buried Under Thousands of Duplicate Images — and What It's Costing Taxpayers

Years of decentralized web publishing, staff turnover, and emergency content migrations have left municipal digital assets in a state of quiet chaos.

By San Francisco News Desk · Published 4 July 2026, 12:45 pm

San Francisco's city government is sitting on a digital hoarding problem it created one rushed upload at a time. Across the network of departmental websites administered under the sf.gov umbrella, duplicate image files have accumulated over more than a decade of fragmented content management — the same headshots, the same stock photos of City Hall's rotunda, the same neighborhood maps, uploaded again and again by staffers who had no way of knowing the file already existed somewhere in the system.

The reckoning is overdue. The Department of Technology's Digital Services division, which oversees the city's Drupal-based web infrastructure, began a formal audit of digital asset repositories in early 2026 after a broader review of sf.gov's backend flagged storage inefficiencies. The review found redundant image files distributed across at least 14 departmental content libraries — a product of years when each agency effectively ran its own publishing operation with minimal coordination.

How the Duplication Problem Was Built, File by File

The roots go back to the early 2010s, when departments from the San Francisco Municipal Transportation Agency to the Department of Public Health each maintained siloed websites on separate CMS platforms. When the city launched its unified sf.gov redesign project around 2019, content teams migrated material in batches, often under deadline pressure, without deduplication protocols in place. Staff uploaded existing images from old systems without checking whether those images had already been transferred.

The problem compounded during the COVID-19 period. Between March 2020 and mid-2022, city agencies pushed out rapid updates to public health pages, permitting portals, and emergency services directories. Turnover among web content coordinators — some departments lost their entire digital communications staff within 18 months — meant institutional knowledge about existing asset libraries evaporated. New hires simply uploaded what they needed. The Tenderloin Economic Development Project's web pages, to name one affected program, ended up with multiple copies of the same infographic illustrating neighborhood service zones.

Storage costs are not abstract. Commercial cloud storage contracts for government entities in California typically run between $0.02 and $0.05 per gigabyte per month at scale. The city's Digital Services office has not yet published a figure for how many redundant gigabytes the audit uncovered, but municipal IT administrators in comparable jurisdictions have found duplicate media libraries consuming anywhere from 15 to 40 percent of total CMS storage allocation. That dead weight slows page load times, complicates search indexing, and inflates licensing costs for digital asset management tools.
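To make the arithmetic concrete, the short sketch below multiplies a library size by the duplicate share and the per-gigabyte price quoted above. The price band and the 15 to 40 percent range come from this article; the library size passed in is purely hypothetical, since the city has not published one, and the function names are illustrative.

```python
# Rough bounds on monthly storage spend attributable to duplicate files.
# The price and duplicate-share ranges are the ones quoted in the article;
# the total library size is a HYPOTHETICAL input, not a city figure.

PRICE_PER_GB_MONTH = (0.02, 0.05)   # dollars, low and high
DUPLICATE_SHARE = (0.15, 0.40)      # fraction of CMS storage, low and high


def monthly_duplicate_cost(total_gb, duplicate_share, price_per_gb_month):
    """Dollars per month spent storing duplicates."""
    if total_gb < 0:
        raise ValueError("total_gb must be non-negative")
    if not 0 <= duplicate_share <= 1:
        raise ValueError("duplicate_share must be between 0 and 1")
    return total_gb * duplicate_share * price_per_gb_month


def monthly_cost_range(total_gb):
    """Return (low, high) monthly cost using the article's quoted ranges."""
    low = monthly_duplicate_cost(total_gb, DUPLICATE_SHARE[0], PRICE_PER_GB_MONTH[0])
    high = monthly_duplicate_cost(total_gb, DUPLICATE_SHARE[1], PRICE_PER_GB_MONTH[1])
    return low, high


if __name__ == "__main__":
    hypothetical_library_gb = 10_000  # assumption for illustration only
    low, high = monthly_cost_range(hypothetical_library_gb)
    print(f"Estimated duplicate storage cost: ${low:,.2f} to ${high:,.2f} per month")
```

The estimate covers raw storage only. It leaves out the slower page loads, search indexing overhead, and licensing costs, none of which reduce to a per-gigabyte price.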

The SFMTA operates one of the most heavily trafficked city sub-sites, publishing real-time route data alongside public meeting archives and project documents for corridors including Van Ness Avenue and the Central Subway. It is understood to be one of the departments with the deepest legacy image backlog. Muni's web team has expanded in recent years, but the underlying asset organization was never cleaned up after the sf.gov migration.

What Comes Next for the City's Digital Housekeeping

The Digital Services division is piloting an automated deduplication tool integrated into the city's existing Drupal environment. The tool flags image files with identical or near-identical hash values and queues them for review before deletion — a step that prevents the kind of automated purge that could accidentally strip images still embedded in legacy pages not yet updated to the new template system.
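The city has not published the tool's code, so the following is only a minimal sketch of the general technique, assuming the image files can be read from a directory tree. It groups files by SHA-256 digest and returns the groups for human review without deleting anything. It catches byte-identical copies only; near-identical images, such as resized or re-compressed versions, would need a perceptual hash, which is not shown. The function names and the list of file extensions are illustrative assumptions.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Assumed set of image extensions; a real media library may use others.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".webp"}


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def build_review_queue(root):
    """Group byte-identical image files under `root`.

    Returns a list of groups, each a list of paths that share a digest.
    Nothing is deleted; the groups are meant for a person to review.
    """
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            groups[file_digest(path)].append(path)
    # Only digests seen more than once indicate duplicates.
    return [paths for paths in groups.values() if len(paths) > 1]


if __name__ == "__main__":
    # "media_library" is a placeholder path, not a real city directory.
    for group in build_review_queue("media_library"):
        print("Possible duplicates:")
        for path in group:
            print(f"  {path}")
```

Keeping a reviewer between detection and deletion matches the safeguard described above: a file that looks redundant may still be embedded in a legacy page.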

Staff training is the other half of the fix. Starting in August 2026, the city's new Digital Publishing Standards will require content coordinators across all departments to run a keyword search of the central media library before uploading any new image. The requirement is modeled on practices already in place at the San Francisco Public Library's digital collections unit, which has maintained a deduplicated image archive since 2021.

For residents, the practical consequence of getting this right is faster-loading city web pages — something that matters most on mobile, where a disproportionate share of lower-income San Franciscans access government services. Getting there requires the less glamorous work of clicking through thousands of thumbnail images and deciding which version of a photograph of the Civic Center steps actually stays.

About this article

Published by The Daily San Francisco

This article was produced by The Daily San Francisco editorial desk and covers news in San Francisco. See our editorial standards for how we use AI.
