San Francisco's city government is sitting on a sprawling archive of redundant digital imagery: duplicate photos, scanned documents, and repeated graphics files stored across dozens of departmental servers. The problem accumulated quietly over more than a decade and is now costing the city measurable time and money to reverse. The Department of Technology, headquartered at City Hall, 1 Dr. Carlton B. Goodlett Place, has been working through a phased cleanup initiative that began formally in late 2024, and staff are still identifying how deep the redundancy runs.
The issue matters now because San Francisco is midway through a broader push to modernize its public-facing digital infrastructure. The SF Digital Services office, which operates under the City Administrator, has been rebuilding key resident-facing portals since 2023, and duplicate image files embedded in legacy content management systems are slowing that migration. Every redundant asset has to be reviewed before it can be archived or discarded, a manual process that ties up staff hours and delays the launch of updated service pages that residents and businesses depend on.
How the Pile Grew
The roots of the problem stretch back to the city's first serious wave of web digitization in the early 2010s, when individual departments, including Planning, Public Works, and the Municipal Transportation Agency, each built out their own content workflows with little central coordination. The SF Planning Department, based on Mission Street, ran its own document management system. The SFMTA, headquartered at 1 South Van Ness Avenue, maintained a separate asset library. Neither talked to the other in any automated way. Photographs from hired event photographers, construction progress shots at Central Subway stations, and permit application scans piled up in parallel silos, often uploaded multiple times by different staff members working from different offices.
The pandemic made matters worse. When roughly 35,000 city employees shifted to remote work in March 2020, departments leaned heavily on shared cloud drives, mainly Google Drive folders and SharePoint libraries set up quickly without naming conventions or deduplication protocols. Image files were copied, re-copied, and re-uploaded across platforms. By the time offices reopened on a hybrid schedule in 2022, the redundancy was structural. The Department of Technology's internal audits, shared in summary form with the Budget and Legislative Analyst's office, have described the cleanup effort as touching multiple petabytes of stored city data, though the precise tally of image-specific duplication has not been released publicly.
The financial weight is real. Cloud storage is not free, and San Francisco pays enterprise licensing rates for the platforms hosting this material. The city's technology budget for fiscal year 2025-2026 sits at roughly $130 million, according to the Controller's Office budget summary, and storage optimization is listed as one of several efficiency targets. Separately, the SF Public Library's digital branch, which manages the San Francisco History Center's online photograph collection at the main branch on Larkin Street, spent part of its 2025 digitization grant — funded through the California State Library — resolving duplicate scans in its own catalog, a smaller but illustrative parallel to the citywide problem.
The Path Forward
The cleanup is moving in stages. The Department of Technology's current approach prioritizes high-traffic departments first — those whose websites see the most public use, including SF.gov portal pages handling permit applications, Muni trip planning tools, and the Recreation and Park Department's reservation system for facilities like the Moscone Recreation Center in the Marina. Automated deduplication software flags probable matches; human reviewers confirm before anything is deleted. That two-step process is deliberate but slow.
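For readers curious what the "flag, then review" step can look like in practice, here is a minimal sketch in Python. It is not the city's tool, which has not been named publicly. It assumes the simplest form of matching, byte-identical files found by comparing SHA-256 content hashes, and the function names, the file-extension list, and the folder layout are all hypothetical. Note that it only groups suspected duplicates for a human reviewer; it never deletes anything.

```python
from __future__ import annotations

import hashlib
from collections import defaultdict
from pathlib import Path

# Assumption: which extensions count as "image files" for this sketch.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".tif", ".tiff"}


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def flag_duplicates(root: Path) -> list[list[Path]]:
    """Group byte-identical image files under root for human review.

    Returns one list of paths per set of identical files. Nothing is
    deleted or moved; confirming a match is left to a reviewer.
    """
    by_digest: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            by_digest[file_digest(path)].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Exact hashing of this kind only catches files that are identical byte for byte. A resized or re-saved copy of the same photograph would slip past it, which is one reason a human confirmation step remains useful even with automation in place.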
Residents and small businesses that interact with city web services may notice periodic disruptions as image assets are swapped out or consolidated — broken photo links on older web pages are a common side effect. The Department of Technology has advised webmasters across departments to audit their content libraries before the next major SF.gov platform update, expected in early 2027. Staff at the SF Digital Services office have also been developing a shared asset library standard — essentially a common folder structure and naming protocol — that new departments will be required to follow. The goal is to stop the duplicate problem from rebuilding itself one uploaded photo at a time.