The Daily San Francisco

San Francisco news, every day

San Francisco's Digital Archive Crisis: What Happens Next With Thousands of Duplicate City Images

A backlog of redundant photos is clogging municipal databases and forcing key decisions about how the city manages, purges, and preserves its visual records.

By San Francisco News Desk · Published 4 July 2026, 12:06 pm

Photo: Vision plug on Pexels

San Francisco's Department of Technology is sitting on a problem it can no longer defer: thousands of duplicate images spread across municipal servers are consuming storage, slowing workflows, and complicating a broader push to modernize the city's digital infrastructure. Officials are expected to present a remediation framework to the city's Committee on Information Technology before the end of the third quarter of 2026, according to publicly posted committee agendas reviewed by The Daily San Francisco.

The timing matters. The city is midway through a five-year Digital Services Modernization initiative that touches everything from permitting portals used daily at the Department of Building Inspection on Sansome Street to the public-facing dashboards at the SF Planning Department's offices on Mission Street. Unresolved duplicate-image clutter is slowing the migration of legacy data to newer cloud-based systems, and the longer the backlog grows, the higher the eventual cleanup cost.

How the Backlog Built Up

The problem is largely a product of the city's own growth in digital communications. Over the past decade, every department from the Municipal Transportation Agency to the Department of Public Health began generating its own photography — street conditions, facility inspections, outreach events — without a shared tagging or deduplication protocol. Images uploaded to the city's content management systems were rarely cross-referenced, and automatic syncing between platforms created cascading copies.

The San Francisco Public Library's Digital Collections unit on Larkin Street faced a smaller version of this same issue in 2022 when it began digitizing historical holdings. Librarians there developed an in-house hash-matching workflow to identify and flag identical files before ingesting them into the catalog. That project, completed over roughly 14 months, is now being studied by the Department of Technology as a possible model for the citywide effort.
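For readers unfamiliar with the technique, hash matching generally means computing a fingerprint of each file's bytes and grouping files whose fingerprints are identical. The sketch below is illustrative only and is not the library's actual workflow, which has not been published in detail here; the function names and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large images are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_exact_duplicates(root: Path) -> dict[str, list[Path]]:
    """Group files under root by digest and keep groups of two or more."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    # A digest shared by several paths marks byte-for-byte identical files.
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

A check like this flags only byte-identical copies for review. It deletes nothing, and it will not catch resized or re-encoded versions of the same photo.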

Storage costs are real. Municipal cloud contracts billed under the city's enterprise agreement with its primary infrastructure vendor are priced per gigabyte, so redundant files directly inflate those line items. A 2024 report from the City Controller's Office found that city departments collectively spent more than $4.2 million on cloud storage in fiscal year 2023-24, a figure that has risen each year since the pandemic accelerated remote-work adoption and digital-first service delivery.

The Decisions Ahead

Three questions are expected to dominate the coming committee discussions. First: who owns the deduplication process? Centralizing it under the Department of Technology is the cleanest option operationally, but individual departments have historically guarded control over their own data. Second: what gets deleted versus archived? Some duplicate images are genuinely redundant. Others are near-duplicates that carry metadata differences (timestamps, geotags, versioning notes) that may matter for legal or audit purposes, particularly for files tied to Planning Department permit records or Police Department body-camera-adjacent documentation. Third: what tool or vendor gets the contract?
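The delete-versus-archive distinction can be made concrete with a small triage sketch. It assumes each image record already carries a digest of its pixel content, computed separately from its metadata. The record fields and the labels "redundant" and "needs_review" are hypothetical, chosen for illustration, and do not describe any tool the city has selected.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ImageRecord:
    name: str
    pixel_digest: str  # assumed: hash of image content only, excluding metadata
    metadata: dict[str, str] = field(default_factory=dict)


def triage_duplicates(records: list[ImageRecord]) -> dict[str, str]:
    """Label each group of same-content images by whether metadata differs."""
    by_content: dict[str, list[ImageRecord]] = defaultdict(list)
    for record in records:
        by_content[record.pixel_digest].append(record)

    labels: dict[str, str] = {}
    for digest, group in by_content.items():
        if len(group) < 2:
            continue  # a single copy is not a duplicate
        first = group[0].metadata
        identical = all(record.metadata == first for record in group[1:])
        # Differing timestamps, geotags, or notes send the group to human review.
        labels[digest] = "redundant" if identical else "needs_review"
    return labels
```

Under this sketch, two copies of a park event photo with matching metadata would be labeled "redundant", while two copies of a permit photo with different timestamps would be labeled "needs_review" rather than queued for deletion.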

The San Francisco-based nonprofit Gray Area Foundation for the Arts, which operates on Mission Street in the Mid-Market corridor, has no formal role in the city process, but its experience running open-source digital asset management tools for cultural institutions is the kind of local knowledge city staff are beginning to consult informally. Meanwhile, at least two technology vendors with offices in SoMa have reportedly responded to a preliminary market survey circulated by the Department of Technology in May 2026.

Whatever framework emerges from the committee, implementation will unfold in phases. The first phase is expected to target high-volume, low-sensitivity image libraries — think park department event photography and public works documentation — where deletion risk is minimal and storage savings are immediate. More sensitive records, including anything touching the Department of Public Health's homelessness outreach documentation or permit-dispute files at Building Inspection, will require legal review before any purge.

The committee's next scheduled meeting is July 22 at City Hall. Department of Technology staff are expected to present a draft scope-of-work document at that session, which will be open to public comment. Anyone tracking city contracting decisions should watch the city's procurement portal, SF City Partner, where a formal Request for Proposals is likely to appear before September 30.

About this article

Published by The Daily San Francisco

This article was produced by The Daily San Francisco editorial desk and covers news in San Francisco. See our editorial standards for how we use AI.
