The Daily San Francisco

San Francisco news, every day

SF City Agencies Push to Fix Broken Digital Archives as Duplicate Image Problem Surfaces This Week

A long-ignored data hygiene issue in San Francisco's public-facing digital systems is drawing fresh scrutiny after city departments flagged thousands of redundant image files clogging municipal records platforms.

By San Francisco News Desk · Published 4 July 2026, 11:57 am

San Francisco's Department of Technology reported this week that an internal audit of the city's digital asset management systems uncovered an estimated 40,000 duplicate image files spread across at least three separate municipal platforms, including the public-facing SF.gov portal, the Planning Department's permit document library, and the Recreation and Parks Department's online venue gallery. The audit, completed in late June, was triggered after Rec and Parks staff noticed the same photographs appearing twice, and sometimes three times, on public-facing pages for Golden Gate Park facilities and the Marina Green event calendar.

The problem matters right now because San Francisco is in the middle of a push to digitize backlogs of paper planning records, a process accelerated by the city's Housing Production Emergency declaration. With thousands of new permit applications expected to flow through the system over the next 18 months, duplicate and mislinked images create compounding errors: building photos attached to the wrong parcel, outdated site photographs showing pre-demolition conditions, and redundant file storage that runs up cloud hosting costs, which officials say are already under review.

Where the Problem Showed Up

The Planning Department's permit portal was identified as the highest-risk system. Staff at the department's offices at 49 South Van Ness Avenue in SoMa discovered that automated ingestion scripts, introduced in early 2025 to speed up document uploads, were saving both an original scan and a compressed duplicate every time a file was processed. Over a busy permit cycle covering neighborhoods like the Inner Sunset and the Tenderloin, that means months of redundant image data sitting on city servers with no deduplication check in place.

The San Francisco Public Library's digital collections arm, based at the main branch on Larkin Street in Civic Center, flagged a separate but related issue: its oral history and neighborhood documentation project, which has been digitizing photographs from the Western Addition and Chinatown since 2023, found roughly 1,200 image pairs where two scans of the same physical photograph were uploaded under different metadata tags. Librarians said the duplicates weren't caught earlier because the project used two different scanning contractors over its three-year run, and the handoff between vendors didn't include a cross-check protocol.
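The kind of cross-check that was missing from the vendor handoff can be sketched in a few lines. This is an illustration only, not the library's process: the `ScanRecord` fields, the `find_mismatched_duplicates` function and the idea that each scan carries a comparable `fingerprint` are all assumptions made for the example. It also assumes two scans of the same photograph receive the same fingerprint, which in practice would require a similarity-tolerant comparison rather than an exact match.

```python
from collections import defaultdict
from typing import FrozenSet, Iterable, List, NamedTuple


class ScanRecord(NamedTuple):
    """One uploaded scan. All field names are hypothetical."""
    record_id: str          # catalogue identifier assigned at upload
    fingerprint: str        # assumed image fingerprint shared by matching scans
    tags: FrozenSet[str]    # metadata tags attached by the scanning contractor


def find_mismatched_duplicates(records: Iterable[ScanRecord]) -> List[List[ScanRecord]]:
    """Return groups of records that share a fingerprint but not their tags."""
    by_fingerprint = defaultdict(list)
    for record in records:
        by_fingerprint[record.fingerprint].append(record)

    groups = []
    for group in by_fingerprint.values():
        distinct_tag_sets = {record.tags for record in group}
        # Flag only groups with more than one scan AND conflicting metadata.
        if len(group) > 1 and len(distinct_tag_sets) > 1:
            groups.append(group)
    return groups
```

Run against a combined export from both contractors, a check like this would list each suspect pair for a librarian to review by hand instead of silently deleting either copy.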

What the City Is Doing About It

The Department of Technology is now piloting a deduplication tool under a short-term contract that runs through September 30, 2026; it has not yet publicly named the vendor. The tool uses perceptual hashing, a technique that identifies visually identical or near-identical images regardless of file name or metadata, and has already been tested on a sample set of 5,000 files from the Planning Department archive. Early results flagged a 22 percent duplication rate in that sample, a figure city staff called higher than expected.
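For readers curious how perceptual hashing works, here is a minimal sketch of one simple variant, the average hash. The city has not said which algorithm its pilot tool uses, so this should not be read as the vendor's method. The function names, the 8x8 grid size and the `max_distance` threshold of 5 are assumptions chosen for illustration, and images are represented as plain grids of grayscale values to keep the example self-contained.

```python
from typing import List

Grid = List[List[int]]  # grayscale pixels (0-255), one inner list per row


def shrink(pixels: Grid, size: int = 8) -> List[float]:
    """Downsample to size x size cells by averaging rectangular blocks."""
    height, width = len(pixels), len(pixels[0])
    if height < size or width < size:
        raise ValueError("image is smaller than the hash grid")
    cells = []
    for row in range(size):
        top, bottom = row * height // size, (row + 1) * height // size
        for col in range(size):
            left, right = col * width // size, (col + 1) * width // size
            block = [pixels[y][x] for y in range(top, bottom)
                     for x in range(left, right)]
            cells.append(sum(block) / len(block))
    return cells


def average_hash(pixels: Grid, size: int = 8) -> int:
    """Build a bit string: 1 where a cell is brighter than the mean cell."""
    cells = shrink(pixels, size)
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


def looks_duplicate(a: Grid, b: Grid, max_distance: int = 5) -> bool:
    """Treat two images as near-duplicates if their hashes are close.

    max_distance is an illustrative threshold, not a recommended value.
    """
    return hamming(average_hash(a), average_hash(b)) <= max_distance
```

Because the hash depends on the overall pattern of light and dark areas and not on the exact bytes, an original scan and a compressed copy of it tend to land on the same or nearly the same hash, while file names and metadata play no part in the comparison.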

Cloud storage is not cheap at municipal scale. San Francisco's city government migrated much of its document infrastructure to cloud hosting between 2021 and 2023, and storage costs have risen steadily since. Redundant image files, particularly high-resolution scans attached to planning and permit records, represent a non-trivial share of that overhead — though the Department of Technology has not released a specific dollar figure tied to the duplication problem ahead of an expected budget report later this month.

For residents and contractors who use the Planning Department's online portal to track permit status on projects in neighborhoods like the Excelsior or Outer Richmond, the practical advice this week is straightforward: if a property record shows a photograph that looks wrong or outdated, file a correction request through the department's document review queue rather than assuming the portal reflects current conditions. The deduplication pilot is expected to produce a full remediation plan by August 15, with a city-wide rollout of the new protocol targeted for the fourth quarter of 2026. The Public Library's digital collections team said it plans to complete its own manual review of the Western Addition and Chinatown photograph archive by the end of July.

About this article

Published by The Daily San Francisco

This article was produced by The Daily San Francisco editorial desk and covers news in San Francisco. See our editorial standards for how we use AI.