San Francisco's Department of Technology is sitting on a digital records problem years in the making: thousands of duplicate and mislabeled images scattered across the city's shared municipal archive system, touching everything from Planning Department permit files to public health clinic records stored at the Civic Center campus. The question now is who cleans it up, how fast, and at what cost to a city budget already stretched thin heading into fiscal year 2027.
The issue matters right now because San Francisco is in the middle of consolidating several legacy data systems under a broader IT modernization push that began in earnest in 2024. When duplicate images pile up inside document management platforms, they slow retrieval times, inflate cloud storage costs, and—most critically for agencies like the Department of Building Inspection on McAllister Street—create conflicting version histories on permit applications. A single Mission District renovation permit can carry four or five attached site photographs that are functionally identical, each uploaded separately by different staff at different stages of review.
The Scale of the Backlog and Where It Lives
City IT auditors flagged the duplication issue in a departmental memo circulated in spring 2026, according to public records reviewed by The Daily San Francisco. The memo identified two systems as carrying the highest concentration of redundant files: the Planning Department's Accela permit portal and the Department of Public Health's electronic document repository, both of which store images uploaded by outside contractors as well as city staff. The Planning Department alone processes roughly 30,000 permit applications annually, and each application can pick up multiple photographs at intake, during field inspection, and at final sign-off.
Cloud storage is not free. The city's enterprise agreement with its primary storage vendor, renewed in January 2025, ties per-gigabyte costs to consumption bands. IT officials have noted internally that unnecessary duplication pushes the city into higher pricing tiers, though the department has not publicly released a line-item figure for the waste. The Department of Technology's total IT infrastructure budget for FY2026 was set at approximately $98 million, per the Controller's Office budget summary published last fall.
The San Francisco Public Library's digital collections team at the main branch on Larkin Street has dealt with a parallel version of this problem for years, building a deduplication workflow into its own digitization pipeline after a 2019 audit found redundant scans consuming storage that should have gone to new acquisitions. That model—automated hash-matching at upload, with human review for borderline cases—is now being studied by the Department of Technology as a potential template for the permit and health systems.
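For readers curious what "hash-matching at upload, with human review for borderline cases" can look like in practice, here is a minimal sketch in Python. It is not the library's or the city's actual code. Everything in it is an assumption made for illustration: the class name `UploadIndex`, the review cutoff of 5 bits, and the idea that each upload arrives with a small grayscale thumbnail already decoded into a flat list of brightness values. The sketch treats a byte-for-byte SHA-256 match as a certain duplicate and uses a simple "average hash" of the thumbnail to flag look-alike images for a person to check.

```python
import hashlib
from dataclasses import dataclass, field


def average_hash(pixels):
    """Fingerprint a thumbnail: one bit per pixel, set when the pixel is brighter than the mean."""
    mean = sum(pixels) / len(pixels)
    bits = 0
    for value in pixels:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a, b):
    """Count the bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


@dataclass
class UploadIndex:
    # Hypothetical cutoff: fingerprints this close are sent to human review.
    review_threshold: int = 5
    by_sha256: dict = field(default_factory=dict)      # digest -> first file id seen
    fingerprints: list = field(default_factory=list)   # (average hash, file id)

    def check(self, file_id, raw_bytes, thumbnail_pixels):
        """Return ("duplicate" | "review" | "new", matching file id or None)."""
        digest = hashlib.sha256(raw_bytes).hexdigest()
        if digest in self.by_sha256:
            # Identical bytes: safe to treat as a duplicate without review.
            return ("duplicate", self.by_sha256[digest])

        fingerprint = average_hash(thumbnail_pixels)
        for known_fingerprint, known_id in self.fingerprints:
            if hamming(fingerprint, known_fingerprint) <= self.review_threshold:
                # Different bytes but a similar picture: a borderline case.
                verdict = ("review", known_id)
                break
        else:
            verdict = ("new", None)

        # Non-identical files are recorded either way so later uploads can match them.
        self.by_sha256[digest] = file_id
        self.fingerprints.append((fingerprint, file_id))
        return verdict


index = UploadIndex()
site_photo = [10] * 32 + [200] * 32        # stand-in for an 8x8 grayscale thumbnail
recompressed = [12] * 32 + [198] * 32      # same scene, slightly different pixel values
other_photo = [200] * 32 + [10] * 32       # a visibly different image

first = index.check("intake.jpg", b"photo-a", site_photo)
second = index.check("inspection.jpg", b"photo-a", site_photo)
third = index.check("signoff.jpg", b"photo-a-recompressed", recompressed)
fourth = index.check("other.jpg", b"photo-c", other_photo)
```

In this toy run the second upload is an exact duplicate of the first, the third is flagged for review because its bytes differ but its fingerprint matches, and the fourth is accepted as new. A production system would need real image decoding and a threshold tuned against actual permit photographs.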
Key Decisions Ahead This Summer
Three decisions are converging over the next 60 days.
First, the Department of Technology must decide whether to pursue an automated deduplication tool bolted onto existing systems or to contract out a one-time manual purge. Automated tools can cost between $40,000 and $150,000 for initial licensing and integration at the scale San Francisco's systems require, based on publicly available pricing from enterprise document management vendors; a manual contract could run higher depending on scope.
Second, the Planning Commission, which holds hearings in Room 400 at City Hall, will need to sign off on any change to how permit attachments are handled inside Accela, since workflow changes affect public-facing permit tracking. A proposal is expected before the commission no later than September.
Third, the Mayor's Office of Civic Innovation, which has been coordinating the broader IT modernization effort, must determine whether deduplication gets folded into the larger system consolidation project or treated as a standalone remediation task with its own timeline and budget line.
For residents and contractors who file permits or pull public records through the city's online portal at sf.gov, the practical effect of getting this right is faster document retrieval and fewer instances of being served an outdated or incorrectly labeled site photograph during a dispute. For the city, the stakes are largely fiscal and operational. The decisions made in Room 400 and inside the Department of Technology's offices on Seventh Street this summer will set the template for how San Francisco handles data hygiene across all its major systems—not just the ones currently flagged.