San Francisco's city government is sitting on a digital hoarding problem. Across departments from the Municipal Transportation Agency to the Department of Public Works, database audits completed in the first half of 2026 found that duplicate image files account for between 30 and 45 percent of total storage on shared government servers — redundant photographs, scanned permits, and duplicated inspection records that clog workflows and inflate cloud storage bills, according to internal capacity reports reviewed as part of ongoing budget discussions at City Hall.
The timing matters. San Francisco is finalizing a two-year budget cycle under Mayor Daniel Lurie that has forced cuts across multiple departments, and IT procurement is getting harder scrutiny than it has in years. Digital asset management — once treated as back-office housekeeping — is now a line item that budget analysts on the Board of Supervisors Finance Committee are actively questioning. With cloud storage costs rising roughly 12 percent annually for enterprise municipal contracts, the math on unmanaged duplication is getting ugly fast.
Where the Clutter Lives
The problem concentrates in a handful of agencies. SF Environment, headquartered on Sutter Street, manages a public-facing image library for outreach campaigns that ballooned to more than 180,000 files between 2021 and 2025, a figure cited in a departmental efficiency review circulated to staff earlier this year. Roughly 38,000 of those files, about 21 percent of the library, were flagged as probable or confirmed duplicates. The San Francisco Planning Department, which processes thousands of permit applications annually from its offices at 49 South Van Ness Avenue, stores mandatory project photographs that accumulate duplicates every time an application is revised and resubmitted without the original being purged.
The city's Digital Services office, which operates under the umbrella of the Department of Technology on Turk Street, has been piloting a deduplication tool since January 2026 across three agencies. The tool uses perceptual hashing, a technique that identifies visually identical or near-identical images regardless of file name or metadata, rather than simple byte-for-byte comparison. That distinction matters because a photograph exported at two different resolutions produces two different files: a byte-level duplicate checker treats them as unrelated, while perceptual hashing flags them as the same image. Early results from the pilot showed a 27 percent reduction in active image storage across the three participating departments within 90 days of deployment.
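To make the distinction concrete, here is a minimal sketch of one of the simplest perceptual hashes, the average hash. It is an illustration only: the reports do not say which algorithm the city's tool uses, and every function name here is hypothetical. Images are modeled as plain grids of grayscale values so the example needs no imaging library, and `make_image` builds a synthetic stand-in for a photograph rather than loading a real one.

```python
import hashlib

def downsample(pixels, size=8):
    """Box-average a grayscale image (list of rows, values 0-255) to size x size."""
    h, w = len(pixels), len(pixels[0])
    out = []
    for by in range(size):
        y0, y1 = by * h // size, (by + 1) * h // size
        row = []
        for bx in range(size):
            x0, x1 = bx * w // size, (bx + 1) * w // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out

def average_hash(pixels, size=8):
    """64-bit average hash: one bit per cell, set when the cell is brighter than the mean."""
    flat = [v for row in downsample(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits

def hamming(a, b):
    """Number of differing bits; a small distance means visually similar images."""
    return bin(a ^ b).count("1")

def byte_digest(pixels):
    """What a byte-for-byte checker sees: a SHA-256 of the raw pixel data."""
    return hashlib.sha256(bytes(v for row in pixels for v in row)).hexdigest()

def make_image(side, invert=False):
    """Synthetic test picture (an assumption, not real data) rendered at side x side pixels."""
    def tone(y, x):
        v = ((y * 8 // side) * 37 + (x * 8 // side) * 91) % 256
        return 255 - v if invert else v
    return [[tone(y, x) for x in range(side)] for y in range(side)]

small = make_image(64)                 # the "photo" exported at 64 x 64
large = make_image(128)                # the same "photo" exported at 128 x 128
other = make_image(64, invert=True)    # a tonally inverted, visibly different picture

print("byte digests match:      ", byte_digest(small) == byte_digest(large))
print("perceptual hashes match: ", average_hash(small) == average_hash(large))
print("distance to other image: ", hamming(average_hash(small), average_hash(other)))
```

The two exports differ at the byte level but hash to the same 64-bit value, while the inverted picture lands far away in Hamming distance. Production tools typically compare hashes against a distance threshold rather than requiring an exact match, since real re-exports introduce compression noise that this synthetic example does not.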
What Deduplication Actually Costs — and Saves
Storage is only part of the equation. The harder cost is staff time. When city employees searching an asset library pull up 14 versions of the same Mission District streetscape photograph to find the one with the correct date stamp, that is not a trivial inefficiency. A 2025 survey by the National Association of Government Archivists found that municipal employees in mid-to-large U.S. cities spend an average of 4.5 hours per week navigating redundant or poorly tagged digital assets — time that translates directly into salary expense.
At San Francisco salary scales for mid-level administrative staff, 4.5 hours weekly across even 200 affected employees comes to roughly 46,800 hours a year, which works out to a conservative $3.2 million in annual lost productivity, a figure consistent with estimates that have circulated among city IT planners. The cloud storage savings from a full deduplication push across all major departments are estimated at between $400,000 and $700,000 per year, depending on contract renegotiation leverage. That is smaller than the productivity number, but immediate and auditable.
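The arithmetic behind those figures can be checked in a few lines. This is a back-of-envelope sketch, not the planners' model: it assumes a 52-week working year with no adjustment for leave, and it backs out the hourly cost the $3.2 million estimate implies rather than taking one from any published salary table.

```python
HOURS_PER_WEEK = 4.5                 # survey average cited above
EMPLOYEES = 200                      # affected staff, per the estimate
WEEKS_PER_YEAR = 52                  # assumption: no adjustment for leave or holidays
PRODUCTIVITY_LOSS = 3_200_000        # dollars per year, per the estimate
STORAGE_SAVINGS = (400_000, 700_000) # dollars per year, low and high estimates

annual_hours = HOURS_PER_WEEK * EMPLOYEES * WEEKS_PER_YEAR
implied_hourly_cost = PRODUCTIVITY_LOSS / annual_hours
storage_share = [s / PRODUCTIVITY_LOSS for s in STORAGE_SAVINGS]

print(f"hours lost per year:     {annual_hours:,.0f}")
print(f"implied cost per hour:   ${implied_hourly_cost:.2f}")
print(f"storage vs productivity: {storage_share[0]:.1%} to {storage_share[1]:.1%}")
```

Under these assumptions the estimate implies a cost of about $68 per lost hour, so whether $3.2 million is conservative depends on whether that rate reflects wages alone or fully loaded compensation. The storage savings come to roughly an eighth to a fifth of the productivity figure.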
The Digital Services pilot is scheduled to expand to six additional departments by October 2026, including SF Public Works and the Recreation and Parks Department, which manages image archives for more than 220 parks citywide. Staff in the Controller's Office are expected to present a cost-benefit analysis to the Finance Committee before the end of Q3. If the pilot's numbers hold, the recommendation will almost certainly be to fund a citywide rollout. Departments not yet in the program can prepare now by auditing their largest shared drives, establishing a single-source upload policy, and tagging images at the point of intake. That basic hygiene costs nothing but pays off significantly once automated deduplication tools arrive.
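For departments starting that audit, a first pass can be done with nothing more than a standard Python installation. The sketch below is a hypothetical helper, not part of the city's pilot tool: it finds only byte-identical copies, grouping files by size first and then by SHA-256 digest, so resized or re-exported images will not be caught until a perceptual tool is in place.

```python
import hashlib
import os
from collections import defaultdict

def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large scans do not exhaust memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def find_exact_duplicates(root):
    """Return groups of paths under root whose contents are byte-identical."""
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have an exact duplicate
        by_hash = defaultdict(list)
        for p in paths:
            by_hash[file_digest(p)].append(p)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups

if __name__ == "__main__":
    # "." is a placeholder; point this at the shared drive being audited.
    for group in find_exact_duplicates("."):
        print(f"{len(group)} identical copies:")
        for path in group:
            print("   ", path)
```

Grouping by size before hashing keeps the scan cheap, because only files that share a size with another file are ever read in full. The script only reports; deciding which copy is the authoritative one is left to the department's own retention rules.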