The Daily San Francisco


How San Francisco's Digital Archive Got Buried Under Thousands of Duplicate Images — and What the City Is Doing About It

Years of decentralized content management across dozens of city departments left SF.gov and agency websites clogged with redundant files, inflating storage costs and slowing public access to information.

By San Francisco News Desk · Published 4 July 2026, 12:23 pm


Photo: Brett Sayles on Pexels

San Francisco's sprawling network of municipal websites is carrying a quiet structural problem: tens of thousands of duplicate image files uploaded across city department portals over more than a decade, creating a redundant digital infrastructure that technology managers have been working since late 2024 to untangle. The effort — formally tied to the city's ongoing SF.gov consolidation initiative run through the Department of Technology — represents one of the more unglamorous but consequential clean-up jobs in local government IT.

The issue matters now because the city is in the middle of an aggressive push to migrate legacy departmental sites onto a unified content management platform. That migration, which the Department of Technology has been steering since the SF.gov redesign launched in phases beginning around 2021, keeps surfacing the same problem: when agencies operated their own siloed web presences for years, their staff uploaded images — department logos, neighborhood photos, event banners, headshots — with no centralized library, no naming conventions, and no deduplication checks. The same photograph of City Hall's rotunda might exist in 40 slightly different file sizes across the Planning Department, the Mayor's Office, and the Office of Economic and Workforce Development portals.

How the Mess Accumulated

The roots go back to the mid-2000s, when individual departments began spinning up their own websites using a patchwork of vendors and content management systems. The Department of Public Health ran its own Drupal installation. The Municipal Transportation Agency — which oversees Muni — maintained separate image libraries for its dozens of service lines. The Recreation and Parks Department, managing more than 220 parks citywide, uploaded location photography through a system that had no cross-reference capability with anything else in city government.

Each upload cycle created new variants. A communications staffer at, say, the Planning Department on South Van Ness Avenue would resize an aerial photograph of the Mission District for a press release, save it under a new filename, and upload it — unaware that three versions of the same source image already existed in the system. Multiply that behavior across 50-plus departments over 15 years, and the archive bloat becomes structural rather than incidental.

The city's Civic Bridge program, which pairs city staff with private-sector technologists for short-term projects, flagged the duplicate image problem as a secondary concern during a 2023 engagement focused on SF.gov's search functionality. That finding gave Department of Technology project managers a documented baseline for the consolidation work that followed. Storage costs for the city's managed web hosting contracts are not itemized in public budget documents. They are bundled into the Department of Technology's annual appropriation, which the Controller's Office listed at roughly $98 million for fiscal year 2025-26.

The Cleanup Effort and What Comes Next

The practical work of duplicate image replacement involves more than simply deleting files. When a redundant image is removed, every page that referenced it breaks unless a redirect or asset substitution is in place first. The Department of Technology's web services team has been running automated scripts to identify hash-matched duplicates — images that are bit-for-bit identical — and then manually reviewing near-duplicates flagged by similarity algorithms before any replacements go live.
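For readers curious what hash matching and asset substitution look like in practice, the following is a minimal Python sketch, not the Department of Technology's actual scripts, which have not been published. It assumes images sit in a local folder tree and that pages reference them through plain `src="..."` attributes. The function names and the rule for choosing which copy to keep (alphabetically first path) are hypothetical. It handles only bit-for-bit duplicates; resized or recompressed variants would need a separate similarity check and human review.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file's bytes in chunks so large images are not loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_substitution_map(root: Path) -> dict[str, str]:
    """Map each redundant file to the copy that should replace it.

    Files with identical bytes share a digest. Within each group the
    alphabetically first relative path is kept as the canonical copy
    (an arbitrary, hypothetical rule) and the others point to it.
    """
    groups: dict[str, list[str]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[sha256_of(path)].append(path.relative_to(root).as_posix())

    substitutions: dict[str, str] = {}
    for paths in groups.values():
        canonical, *duplicates = sorted(paths)
        for duplicate in duplicates:
            substitutions[duplicate] = canonical
    return substitutions


def rewrite_references(html: str, substitutions: dict[str, str]) -> str:
    """Point page markup at canonical copies before any file is deleted.

    Simplification: only exact src="..." attribute values are rewritten.
    """
    for duplicate, canonical in substitutions.items():
        html = html.replace(f'src="{duplicate}"', f'src="{canonical}"')
    return html
```

The ordering is the point of the sketch: the substitution map is built and page references are rewritten first, so that deleting a redundant file afterward does not leave a broken image behind.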

The SF.gov team has also established a centralized asset library accessible to content editors across departments, a system modeled loosely on approaches taken by govtech consolidations in cities like New York and Los Angeles over the past several years. Training sessions for departmental communications staff have been held at City Hall and at the Department of Technology's offices on Stevenson Street in SoMa.

For residents, the practical payoff is faster page loads on city sites, fewer broken images on archived press releases, and more reliable search results when looking up services. For budget watchdogs on the Board of Supervisors, the cleanup represents a chance to reduce redundant storage line items before the next contract renewal cycle. The Department of Technology has not yet published a completion timeline for the full deduplication effort, but project managers have indicated internally that the highest-traffic department sites — including SF.gov's main portal and the SFMTA's rider-facing pages — are the near-term priority. The deeper archival cleanup of older, lower-traffic agency pages is expected to extend into 2027.


About this article

Published by The Daily San Francisco

This article was produced by The Daily San Francisco editorial desk and covers news in San Francisco. See our editorial standards for how we use AI.
