San Francisco's city government is sitting on a digital mess that costs real money. Municipal IT auditors examining storage usage across eight city departments found that duplicate image files — the same photograph, scan, or document image stored two, three, or sometimes a dozen times across different servers — account for an estimated 34 percent of total occupied storage capacity on city-managed systems, according to internal records reviewed by The Daily San Francisco. That translates to tens of thousands of dollars in avoidable cloud-storage and on-premises hardware costs every fiscal year.
The issue has landed on the agenda of the Department of Technology, headquartered at 1 South Van Ness Avenue, at a moment when city leaders are under intense pressure to trim administrative overhead. Mayor Daniel Lurie's budget office has been hunting for savings since January, when projections showed a structural deficit running into the hundreds of millions of dollars through fiscal year 2027. Eliminating redundant data is one of the unglamorous but measurable ways agencies can reduce spending without cutting services or staff.
Where the Duplication Is Worst
The problem is not uniform. The Department of Building Inspection, which processes permit applications submitted through its Civic Center offices on Polk Street, stores high-resolution scans of architectural drawings, site photographs, and inspection certificates. Staff members frequently re-upload the same document when a permit moves between workflow stages, and legacy software does not flag the repetition. The result: a single renovation permit for a Castro District apartment building can generate a file folder carrying the same JPEG or PDF image seven or eight times.
The San Francisco Public Library system faces a related but distinct version of the problem. Its Digital Collections program, run out of the main branch on Larkin Street, has digitized more than 400,000 historical photographs since the program began. A review completed in March 2026 found that roughly 12 percent of those images existed in at least two locations within the library's asset-management system, the result of migration errors during a 2023 server transition and inconsistent file-naming conventions among the contractors who carried out that work.
The financial stakes are not abstract. Cloud object-storage pricing for government contracts in California currently runs between $0.02 and $0.05 per gigabyte per month, depending on the vendor tier. High-resolution city permit scans average roughly 8 megabytes per image. Multiply that by tens of thousands of duplicate images across a department's archive and the monthly tab grows fast. The Department of Technology's own working estimate, circulated to department heads in a June 2026 briefing document, put city-wide excess storage consumption tied to duplicate images at approximately 47 terabytes. At the top of that price range, the figure represents around $28,000 in wasted expenditure annually, before factoring in staff time spent managing, retrieving, and occasionally reconciling conflicting versions of the same file.
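The annual figure follows from simple arithmetic on the numbers quoted above. A rough sketch, assuming decimal terabytes (1 TB = 1,000 GB) and flat per-gigabyte pricing with no retrieval or egress fees; the function name is illustrative:

```python
# Figures quoted in the article.
EXCESS_TB = 47                 # estimated excess storage, in terabytes
PRICE_LOW, PRICE_HIGH = 0.02, 0.05   # dollars per gigabyte per month

GB_PER_TB = 1_000              # assumption: decimal units, as cloud vendors typically bill


def annual_cost(terabytes, price_per_gb_month):
    """Yearly storage cost in dollars for a given volume and monthly unit price."""
    return terabytes * GB_PER_TB * price_per_gb_month * 12


price_mid = (PRICE_LOW + PRICE_HIGH) / 2   # midpoint of the quoted range

for label, price in [("low", PRICE_LOW), ("mid", price_mid), ("high", PRICE_HIGH)]:
    print(f"{label:>4}: ${annual_cost(EXCESS_TB, price):,.0f} per year")

#  low: $11,280 per year
#  mid: $19,740 per year
# high: $28,200 per year
```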
What Comes Next — and What Agencies Can Do
The Department of Technology has piloted a deduplication tool from a vendor shortlisted through the city's standard procurement process. Testing ran for 90 days inside the Office of the Assessor-Recorder at City Hall, and the software identified and flagged more than 180,000 candidate duplicate files during that window. Staff manually verified a sample before any deletion occurred, a safeguard that adds time but protects against erasing legally significant records.
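The records reviewed do not describe how the piloted tool works internally. One common way to find exact duplicates is to group files by a cryptographic hash of their contents. The sketch below assumes that approach; the function names are hypothetical and it is not the vendor's method. Like the pilot, it only flags candidates and deletes nothing.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_candidate_duplicates(root):
    """Group files under `root` by content hash.

    Returns a list of groups; each group holds two or more paths whose
    bytes are identical. Nothing is deleted: the groups are meant for
    human review.
    """
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling `find_candidate_duplicates("/path/to/archive")` would return one list per set of byte-identical files. A content hash catches only exact copies; the same photograph re-saved at a different resolution or compression level would hash differently and would need a separate technique, such as perceptual hashing.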
Broader rollout is now contingent on funding authorization in the fiscal year 2026-27 budget, which the Board of Supervisors is scheduled to finalize before July 31. If the line item survives, the Department of Technology plans to extend the deduplication program to the Planning Department and the Department of Public Health by the end of calendar year 2026.
For city residents who interact with these systems, whether by submitting permit applications, requesting public records under the Sunshine Ordinance, or accessing library digital archives, the practical upside is faster retrieval times and fewer instances of conflicting document versions appearing in search results. The administrative upside is simpler: a smaller bill, and a city data infrastructure that costs less to maintain than it does today.