San Francisco's public-facing digital infrastructure is carrying dead weight. Across the city's sprawling network of government websites, including the main SF.gov portal, the San Francisco Municipal Transportation Agency's rider pages, and the Department of Public Health's community resources site, thousands of duplicate images have piled up over the past decade, clogging content management systems, slowing page load times, and running up cloud storage bills that are ultimately paid by city residents.
The problem didn't arrive overnight. It is the product of at least three overlapping waves of digital migration: the shift away from the old sfgov.org architecture around 2015, a second major overhaul tied to the city's contract with Acquia for Drupal-based hosting that accelerated through 2019 and 2020, and then a pandemic-era scramble in which dozens of departments stood up emergency information pages with little central coordination. Each transition imported legacy media libraries without cleaning them first. Duplicate banner photos, redundant department seals, and recycled stock images of City Hall and the Embarcadero waterfront multiplied across databases with no automated deduplication in place.
The Machinery Behind the Mess
The city's Department of Technology, headquartered on Seventh Street in SoMa, has acknowledged the issue internally as part of its broader Digital Services work plan, though no public report quantifying the full scope has been released. Web managers at individual departments — from the Recreation and Parks Department, which maintains pages for Golden Gate Park programming, to the Office of Economic and Workforce Development in the Tenderloin — have long flagged duplicate assets as a time drain during routine content updates. Editors uploading a new image of, say, the Bayview–Hunters Point community garden often find three or four visually identical versions already in the media library, uploaded at different times under slightly different filenames.
The consequences go beyond aesthetics. Content management systems burdened with bloated media libraries slow editorial workflows. For a city that has spent recent years trying to digitize permitting and housing applications under initiatives like San Francisco's Housing Portal, launched in 2022 to streamline below-market-rate unit lotteries, backend sluggishness is not a trivial complaint. Google also uses its Core Web Vitals page-experience metrics as a search ranking signal, so slow-loading pages can slip in the results. Residents searching for services may then have a harder time finding the right page, or may land on an outdated one instead.
Storage costs, while difficult to isolate without a full audit, follow a predictable pattern in municipal cloud contracts. The city's master agreements with cloud vendors, negotiated through the Office of Contract Administration, typically charge on a per-gigabyte basis above baseline tiers. Industry estimates for comparable mid-sized municipal content libraries put unnecessary duplicate image storage in the range of several thousand dollars annually per major department. That is a modest figure per agency, but one that adds up across the city's roughly 50 public-facing departmental sites.
What Happens When the Cleanup Begins
Digital services teams in cities including Chicago and New York have addressed similar buildup through a combination of automated hashing tools (software that computes a fingerprint from each image file's contents and flags files whose fingerprints match as exact copies) and manual review workflows built into content governance policies. San Francisco's Digital Services unit has piloted automated asset auditing on at least one departmental subdomain since early 2025, though a citywide rollout has not been formally announced.
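For readers curious what hash-based deduplication looks like in practice, the following is a minimal sketch in Python, not a description of any tool the city or other municipalities actually run. It assumes the media library is available as an ordinary folder of files. The function names, the list of image extensions, and the choice of SHA-256 as the fingerprint are illustrative assumptions.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Assumption: these extensions cover the images in the library being audited.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".webp"}


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Group image files under root whose contents are byte-for-byte identical."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            groups[file_digest(path)].append(path)
    # Only digests shared by two or more files indicate duplicates.
    return [paths for paths in groups.values() if len(paths) > 1]


if __name__ == "__main__":
    # "media_library" is a hypothetical folder name used for illustration.
    for duplicate_group in find_exact_duplicates(Path("media_library")):
        print("Identical files:", ", ".join(str(p) for p in duplicate_group))
```

A check like this catches only byte-identical copies, regardless of filename. Images that look the same but were resized, cropped, or re-saved at a different quality produce different fingerprints, which is one reason the approach is typically paired with manual review.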
For residents and the journalists who cover City Hall, the practical upshot is this: the next time a city agency launches a redesigned website — and several are overdue, including the Planning Department's portal on Kearny Street — it should theoretically arrive leaner and faster. But that outcome depends on whether the Department of Technology mandates deduplication as a prerequisite for any new site launch, rather than leaving it as a discretionary step for individual webmasters. Without that requirement baked into procurement and deployment standards, the cycle of accumulation is likely to repeat with each successive platform migration. The city's budget cycle runs through June 30 each fiscal year; advocates for digital infrastructure reform say the window to attach cleanup mandates to the fiscal year 2027 technology budget closes sooner than most people think.