The Daily San Francisco

How San Francisco's City Databases Became Cluttered With Ghost Images — And Why Fixing It Now Costs Real Money

Years of decentralized digital record-keeping across dozens of municipal departments left city systems riddled with duplicate and orphaned images, and the bill for cleaning them up is landing in 2026.

By San Francisco News Desk · Published 4 July 2026, 11:43 am

San Francisco's Department of Technology is working through a citywide audit of digital asset libraries that have accumulated duplicate images across at least 23 separate municipal systems since the early 2010s, when individual departments were largely left to manage their own file storage without shared protocols. The problem, long treated as a low-priority housekeeping matter, has grown expensive enough to reach the budget desk.

The timing is not accidental. A wave of cloud migration contracts signed between 2021 and 2023 — part of the city's push to modernize infrastructure following the COVID-19 pandemic — exposed just how redundant those legacy archives had become. Moving bloated, unstructured data to the cloud costs money per gigabyte, and duplicate image files, often uploaded multiple times by different staff across different platforms, were quietly inflating migration bills.

How the Mess Accumulated

The roots go back to decisions made, or not made, during San Francisco's rapid expansion of digital public-facing services in the early 2010s. SFGov.org, the city's main public portal, was redesigned in 2013 without a unified digital asset management system attached to it. Individual departments, including Recreation and Parks, the Department of Public Health, and the Planning Department and the San Francisco Municipal Transportation Agency, both on South Van Ness Avenue, each built or bought their own content management tools. Staff uploaded photos, maps, and graphics independently, with no deduplication layer and no shared taxonomy.

By the time the city's DataSF program, which sits inside the City Administrator's Office at City Hall, began pushing departments toward standardized open-data practices around 2015, the image problem was already baked in. DataSF focused on structured tabular data — spreadsheets, databases, shapefiles — not the unstructured image libraries sitting inside department intranets and content management systems. Those libraries kept growing.

The SFMTA alone, which manages Muni bus and rail lines that recorded roughly 700,000 daily boardings before the pandemic, maintained separate image repositories for communications, operations, and capital projects. Similar fragmentation existed inside the Planning Department, which handles tens of thousands of permit applications per year, each of which can include multiple uploaded photos of properties across neighborhoods from the Tenderloin to the Outer Sunset.

The Cost of Doing Nothing Changed

Cloud economics made the status quo untenable. Storage on legacy on-premises servers had a fixed cost that made duplication feel free at the margin. Cloud billing doesn't work that way. The city's Office of Contract Administration published general guidelines in early 2025 requiring departments to certify data hygiene standards before finalizing cloud migration contracts — a policy that effectively put a price tag on every redundant file for the first time.

Estimates from comparable mid-sized U.S. municipal cloud migrations suggest duplicate and orphaned files can account for 15 to 30 percent of total unstructured data volume, though San Francisco has not yet released its own citywide figure. The Department of Technology began its formal audit in the first quarter of 2026, contracting with vendors to run deduplication scans across shared city network drives and public-facing content systems.
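For readers curious what a deduplication scan involves, the basic idea can be sketched in a few lines of Python. This is a minimal illustration, not the tooling the city's vendors use, which has not been described publicly. It assumes duplicates are byte-for-byte identical copies, and the function names and the list of image extensions are hypothetical choices made for the example.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Assumed set of image extensions; a real scan would likely cover more types.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".tif", ".tiff"}


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_duplicate_images(root: Path) -> dict[str, list[Path]]:
    """Group image files under root by content hash; keep groups with 2+ files."""
    groups = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            groups[file_digest(path)].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}


def redundant_bytes(duplicates: dict[str, list[Path]]) -> int:
    """Bytes that would be saved by keeping one copy from each duplicate group."""
    return sum(
        paths[0].stat().st_size * (len(paths) - 1)
        for paths in duplicates.values()
    )
```

Hashing file contents catches the same photo saved under different names in different department folders. It does not catch resized, cropped, or re-encoded versions of an image, which need fuzzier matching techniques such as perceptual hashing.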

The practical consequences ripple beyond storage costs. Duplicate images in permit records can slow down Planning Department staff processing applications at the Permit Center at 49 South Van Ness. Outdated or conflicting photos attached to city infrastructure records can create confusion for crews doing repairs. And duplicate visual assets in public communications have, on at least a few documented occasions flagged during internal reviews, resulted in the wrong version of a graphic appearing on city websites.

Departments have been given a phased remediation timeline running through the end of fiscal year 2026-27. The practical advice for San Franciscans interacting with city digital systems right now: if you are submitting permit applications or documents through any SFGov portal, upload only the specific files requested and avoid resubmitting images already attached to a record. City IT staff say redundant uploads from the public side compound the problem that internal workflows created in the first place.

About this article

Published by The Daily San Francisco

This article was produced by The Daily San Francisco editorial desk and covers news in San Francisco. See our editorial standards for how we use AI.
