The Daily San Francisco

SF's Digital Archive Overhaul Hits a Fork in the Road: What Happens Next and the Key Decisions Ahead

City agencies and cultural institutions face a tangle of competing systems, redundant image files, and budget pressure — and the choices made this summer will shape public records access for years.

By San Francisco News Desk · Published 4 July 2026, 11:40 am

San Francisco's push to modernize its public digital archives has stalled at a critical juncture, with city departments holding tens of thousands of duplicate image files across incompatible platforms and no unified policy yet in place to resolve them. The problem, long acknowledged inside City Hall but rarely discussed publicly, is now forcing a decision: who owns the cleanup process, who pays for it, and what standard will govern what gets kept.

The timing matters. The city's Department of Technology is midway through a broader IT consolidation effort that began in fiscal year 2024-25, folding scattered departmental servers into centralized cloud infrastructure. That migration has surfaced the duplicate-image problem at scale. Agencies ranging from the San Francisco Public Library's San Francisco History Center on Larkin Street to the Planning Department's environmental review division have discovered overlapping image catalogues, with some files duplicated dozens of times across shared drives, city portals, and legacy content management systems that predate the current mayoral administration.

The Stakes for Cultural Memory and City Services

For the History Center, which holds one of the most significant municipal photography collections on the West Coast, the issue is more than a storage bill. Archivists must determine whether duplicates represent genuinely identical files or slightly different scans of the same physical document — a distinction that matters enormously for preservation. A flawed automated deduplication run in 2023 reportedly flagged thousands of irreplaceable photographs of the Tenderloin, the Fillmore District, and pre-earthquake structures along Market Street for deletion before staff intervened.
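The distinction archivists describe can be illustrated with a short sketch. It is not the city's tooling, and the function names are hypothetical. It assumes that an "identical" file means one whose bytes match exactly, which a content hash such as SHA-256 can detect. Two different scans of the same photograph would produce different hashes, so they would not be grouped here and would still need human review.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_exact_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root whose bytes are identical (hypothetical helper).

    Nothing is deleted; the result is only a report for human review.
    """
    by_digest: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest[sha256_of(path)].append(path)
    # Keep only digests shared by more than one file.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

A report-only approach of this kind leaves the decision about what to keep with staff, which is the safeguard that reportedly mattered in the 2023 incident.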

The San Francisco Arts Commission, which manages the city's Civic Art Collection database, faces a parallel challenge. Its image repository, used by curators, researchers, and the public through the city's online portal, contains multiple photographs of the same artworks, taken under different lighting conditions or by different contractors over the years. Deciding which version is canonical requires both curatorial judgment and technical standardization, and the Commission has not yet published a deduplication protocol as of this Fourth of July weekend.

City budget documents for fiscal year 2025-26 allocated roughly $4.2 million to the Department of Technology's cloud migration program overall, but line-item funding specifically for archival deduplication and metadata remediation was not broken out separately in publicly available budget summaries reviewed by The Daily San Francisco. That ambiguity is itself part of the problem: without a discrete budget and a named project owner, the work keeps getting deferred.

The Decisions That Will Define the Outcome

Three choices are now converging on a tight timeline. First, the city must decide by the end of the summer fiscal quarter whether to procure a dedicated digital asset management platform — vendors have been in conversations with the Department of Technology since at least March — or to extend existing contracts with the current patchwork of systems. A procurement decision pushed past September risks losing the migration momentum built up over the past 18 months.

Second, the Planning Department's imminent rollout of its updated Central SoMa environmental review portal, expected in late 2026, depends on a clean image library. Duplicate or mislabeled photographs of project sites along Folsom Street and Brannan Street have already created inconsistencies in at least two recent environmental impact reports, according to city planning meeting minutes published online.

Third, and most consequential for the public, is whether San Francisco follows the lead of cities like New York — which published a formal open-data image deduplication standard in 2024 — or continues to treat the issue as a back-office IT problem invisible to residents. Advocates at the Internet Archive, headquartered on Funston Avenue in the Richmond District and one of the city's most prominent digital preservation institutions, have argued publicly that municipal image data deserves the same transparency standards applied to other public records.

The Department of Technology is expected to present recommendations to the city's Committee on Information Technology this month. Whatever framework emerges will set the template not just for existing duplicates but for every photograph, scan, and rendering the city produces going forward — a quiet but consequential piece of infrastructure that touches everything from housing permit applications in the Sunset District to the preservation of San Francisco's visual history.

About this article

Published by The Daily San Francisco

This article was produced by The Daily San Francisco editorial desk and covers news in San Francisco. See our editorial standards for how we use AI.
