The Daily San Francisco

San Francisco news, every day

SF's Digital Archives Push Forward on Duplicate Image Replacement — Here's What Changed This Week

City departments and cultural institutions across San Francisco accelerated efforts to purge and replace redundant digital imagery from public records and online platforms, exposing both technical hurdles and funding gaps.

By San Francisco News Desk · Published 4 July 2026, 12:16 pm

Photo: Mariya Eskina on Pexels

San Francisco's main municipal archiving program took a concrete step forward this week when the Department of Technology flagged more than 14,000 duplicate image files embedded in public-facing city databases, triggering a formal replacement and consolidation process that staff had been preparing for since late 2025. The issue is narrow but consequential: outdated, low-resolution, or redundant photographs lodged inside government websites and open-data portals distort search results, inflate storage costs, and in some cases present residents with contradictory information about city facilities and programs.

The timing is not accidental. San Francisco's budget cycle, which closes in mid-July, created pressure on departments to demonstrate digital housekeeping before new fiscal-year allocations are locked in. The Mayor's Office of Innovation, housed on Van Ness Avenue, has been pushing agencies to adopt a unified digital asset standard since the spring, and this week's audit came partly in response to that directive. With AI-assisted cataloguing tools now available through a city contract with a San Jose–based vendor, officials moved faster than in prior years, when the work was done manually.

Where the Backlog Is Worst

The San Francisco Public Library system, which maintains digitised historical collections at its main branch on Larkin Street, identified roughly 3,200 duplicate image records in its online catalogue as of Wednesday — many of them scanned photographs that were uploaded more than once during migration projects between 2018 and 2022. Librarians working through the Preserve SF initiative began replacing the lower-quality duplicates with higher-resolution master files this week, a process the library estimates will take through the end of August.
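The article does not describe how the library decides which copy to keep. As a rough illustration only, the Python sketch below groups catalogue records by the item they depict and keeps the copy with the most pixels as the master. The `ImageRecord` fields, the `item_id` grouping key, and the `pick_masters` helper are all hypothetical and are not taken from the library's systems.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRecord:
    record_id: str  # hypothetical catalogue record ID
    item_id: str    # hypothetical ID of the scanned photograph
    width: int      # pixel width of this copy
    height: int     # pixel height of this copy


def pick_masters(records):
    """Split records into (masters, duplicates).

    For each item, the copy with the highest pixel count is kept as
    the master; every other copy of that item is reported as a
    duplicate to be replaced.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record.item_id].append(record)

    masters, duplicates = [], []
    for group in groups.values():
        # Highest resolution first; ties broken by record_id so the
        # result is deterministic.
        ranked = sorted(
            group,
            key=lambda r: (-(r.width * r.height), r.record_id),
        )
        masters.append(ranked[0])
        duplicates.extend(ranked[1:])
    return masters, duplicates
```

A real migration cleanup would likely also need to match copies that carry no shared identifier, which this sketch does not attempt.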

The San Francisco Recreation and Parks Department faces a different version of the problem. Its web team discovered that facility images for more than 60 parks — including Golden Gate Park's Koret Children's Quarter and the Excelsior Playground on Russia Avenue — had been uploaded to the department's content management system in multiple conflicting versions, some reflecting renovation states from before 2020. Outdated images showing closed amenities or pre-renovation layouts have caused confusion among residents booking permits online. A replacement batch covering the highest-traffic park pages was pushed live on July 2.

The city's open-data portal, DataSF, is separately running a deduplication sweep across image assets tied to its planning and building permit datasets. That project, managed out of the Office of the City Administrator, started June 23 and is expected to process approximately 40,000 image records before the end of July.

Costs, Timelines, and What Residents Should Know

Storage is money. The Department of Technology's internal estimates put the recurring annual cost of maintaining unreferenced duplicate files across city systems at roughly $180,000, a figure that includes cloud hosting fees and staff time spent troubleshooting broken or conflicting image links. Eliminating confirmed duplicates could reduce that number by an estimated 30 percent (about $54,000 a year at that baseline), according to a budget memo circulated to department heads in May, though that memo has not been released publicly.

For residents, the most immediate practical effect is reliability. If you have searched the map portal run by the SF Planning Department, whose offices are on Mission Street, for photos of a specific parcel and received mismatched or blank images, the cause is frequently a duplicate conflict rather than a missing record. The department's GIS team says the image replacement effort should resolve the majority of those blank-tile errors for parcels in the Tenderloin and SoMa districts by mid-July.

Institutions beyond city government are watching. The Internet Archive, which is headquartered in the Richmond District, runs its own internal deduplication processes at a different scale, but staff there have informally engaged with the city's vendor on shared methodology. No formal partnership has been announced.

The next checkpoint is July 18, when the Department of Technology is scheduled to present a progress report to the city's Committee on Information Technology. Residents who encounter broken or duplicate images on official city portals can flag them through the SF311 app, which routes reports directly to the relevant department's web team — a feedback loop that, until this year, most departments were not consistently monitoring.

About this article

Published by The Daily San Francisco

This article was produced by The Daily San Francisco editorial desk and covers news in San Francisco. See our editorial standards for how we use AI.