Coventry's civic digital archive, maintained partly through the Herbert Art Gallery and Museum on Jordan Well, contains thousands of duplicate image files — some photographs catalogued and uploaded more than thirty times under different reference numbers — creating a backlog that volunteers and archivists have been quietly working through since late 2024.
The problem did not emerge overnight. It is the accumulated result of at least three separate digitisation drives over the past fifteen years, each run on a different software platform and with inconsistent naming conventions, so that no single database ever cleanly inherited the work of the one before it. Now, with Coventry City Council's Heritage and Culture Service undertaking a consolidation review, the scale of the duplication has become impossible to ignore.
Why does this matter now? Pressure has grown on local authorities across the West Midlands to make heritage collections properly searchable online, particularly after the National Lottery Heritage Fund made accessibility benchmarks a condition of its grants from April 2025. Collections that cannot demonstrate clean, de-duplicated metadata risk losing eligibility for future funding rounds. For Coventry, a city still trading on its UK City of Culture 2021 legacy and building a longer-term cultural offer, that is a genuine financial vulnerability.
Three Digitisation Waves, One Persistent Problem
The first major digitisation effort began around 2009, when the Coventry History Centre — based at Mandela House on Bayley Lane — started scanning photographs from its physical collection using early-generation flatbed technology. Files were saved in formats that later proved incompatible with the content management system adopted in 2015. When staff migrated content to the newer platform, automated import tools created duplicate entries for any image that had been touched more than once during quality-checking.
A second round of uploads followed the City of Culture preparations in 2020 and 2021, when community groups, schools and neighbourhood associations from Hillfields, Earlsdon and Foleshill submitted their own photographic collections for inclusion. Volunteer coordinators working under the Coventry 2021 programme logged images using a separate spreadsheet system that was later merged — imperfectly — into the main archive. Archivists working on the consolidation review have described finding images of Broadgate and the old Owen Owen department store appearing under as many as forty distinct catalogue numbers.
The third and most recent duplication wave came from a 2023 partnership with a regional scanning contractor, brought in to process fragile glass-plate negatives held in climate-controlled storage. Invoicing records show the contract was valued at just over £47,000. The contractor delivered files in a batch structure that did not match the archive's existing folder taxonomy, and without a deduplication step built into the handover protocol, mirror copies were ingested alongside originals.
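For readers wondering what a deduplication step at handover might involve, the sketch below shows one common approach: fingerprint each incoming file with a SHA-256 checksum and hold back anything whose fingerprint is already known. It is an illustration only, not the archive's or the contractor's actual process. The function names, file paths and sample data are all hypothetical, and a checksum comparison of this kind catches only byte-for-byte copies, not rescans or resized versions of the same picture.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(data).hexdigest()


def split_batch(existing_hashes: set[str], batch: dict[str, bytes]):
    """Split an incoming batch into new files and byte-identical duplicates.

    `existing_hashes` holds digests of files already in the archive.
    `batch` maps a (hypothetical) delivered filename to that file's bytes.
    """
    seen = set(existing_hashes)  # copy, so the caller's set is untouched
    new_files, duplicates = [], []
    for name in sorted(batch):  # sorted for a repeatable result
        digest = sha256_of(batch[name])
        if digest in seen:
            duplicates.append(name)  # same bytes as something already seen
        else:
            seen.add(digest)
            new_files.append(name)
    return new_files, duplicates


# Hypothetical data: one plate is already archived, one arrives twice.
archive = {sha256_of(b"plate-001")}
incoming = {
    "batch_a/plate_001.tif": b"plate-001",
    "batch_a/plate_002.tif": b"plate-002",
    "batch_b/plate_002_copy.tif": b"plate-002",
}
new_files, duplicates = split_batch(archive, incoming)
print("ingest:", new_files)
print("hold back:", duplicates)
```

In this made-up batch only one file would be ingested; the other two would be held back for a person to check.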
What a Fix Actually Looks Like
Deduplication is not simply a matter of deleting files. Archivists must verify that no two apparently identical images carry different provenance notes or annotation data before one version is retired. The Herbert's collections team, working alongside volunteers recruited through the Coventry and Warwickshire Record Society, began a phased review in November 2024. As of June 2026, roughly 11,000 image records had been assessed, with around 3,400 confirmed duplicates either merged or marked for removal.
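The verification rule described above can be sketched in a few lines. The example below is a simplified illustration rather than the Herbert's actual workflow: it groups records that share an image fingerprint, marks a group as safe to merge only when every copy carries the same provenance and annotation text, and sends everything else to a human reviewer. The record fields, reference numbers and notes are invented for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRecord:
    ref: str            # catalogue reference (hypothetical format)
    content_hash: str   # fingerprint of the image file
    provenance: str     # donor or source note
    annotations: str = ""


def review(records):
    """Sort duplicate groups into 'safe to merge' and 'needs a human'."""
    groups = defaultdict(list)
    for record in records:
        groups[record.content_hash].append(record)

    safe_to_merge, needs_review = [], []
    for group in groups.values():
        if len(group) < 2:
            continue  # a single copy is not a duplicate
        # Collect the distinct metadata carried by copies of this image.
        metadata = {(r.provenance.strip(), r.annotations.strip()) for r in group}
        refs = [r.ref for r in group]
        if len(metadata) == 1:
            safe_to_merge.append(refs)  # identical notes on every copy
        else:
            needs_review.append(refs)   # notes differ: do not auto-retire
    return safe_to_merge, needs_review


# Invented sample records.
records = [
    ImageRecord("CHC/0001", "aaa", "History Centre scan"),
    ImageRecord("CHC/0417", "aaa", "History Centre scan"),
    ImageRecord("COV21/0093", "bbb", "Community submission", "Owen Owen, frontage"),
    ImageRecord("CHC/0550", "bbb", "History Centre scan"),
    ImageRecord("GP/0007", "ccc", "Glass-plate negative"),
]
safe_to_merge, needs_review = review(records)
print("safe to merge:", safe_to_merge)
print("needs review:", needs_review)
```

The point of the design is that software only proposes merges; any disagreement between copies is left for an archivist to resolve.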
The full collection is estimated to run to between 60,000 and 80,000 individual image records. At the current pace of a few hundred records per fortnightly volunteer session at the Herbert, completion is several years away unless additional resource is found.
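A rough back-of-envelope calculation shows why the timescale runs to years. The sketch below uses only the figures reported here (about 11,000 records assessed, about 3,400 duplicates confirmed, and a collection of 60,000 to 80,000 records) and rests on two assumptions that are not from the review itself: that about 19 months passed between November 2024 and June 2026, and that the pace stays constant.

```python
def years_remaining(total: int, assessed: int, months_elapsed: int) -> float:
    """Years left at a constant pace of `assessed / months_elapsed` per month."""
    monthly_rate = assessed / months_elapsed
    return (total - assessed) / monthly_rate / 12


ASSESSED = 11_000     # records assessed by June 2026 (reported figure)
DUPLICATES = 3_400    # confirmed duplicates so far (reported figure)
MONTHS_ELAPSED = 19   # assumption: November 2024 to June 2026

low = years_remaining(60_000, ASSESSED, MONTHS_ELAPSED)
high = years_remaining(80_000, ASSESSED, MONTHS_ELAPSED)
duplicate_share = DUPLICATES / ASSESSED

print(f"Years remaining: {low:.1f} to {high:.1f}")
print(f"Duplicates among records assessed so far: {duplicate_share:.0%}")
```

On those assumptions the remaining work would take roughly seven to ten years, and about three in ten of the records checked so far have turned out to be duplicates. Both figures would shift if the pace or the size of the collection differs from the estimates.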
Anyone with a historical photograph of Coventry — particularly images of Spon Street, the Precinct or the Lower Precinct area before the 1990s redevelopments — is being encouraged to contact the Coventry History Centre directly before submitting digital files, to ensure new material enters the archive under a single, correctly tagged reference from the outset. The Heritage and Culture Service is also in early discussions with the University of Warwick's Centre for Arts, Memory and Communities about whether automated image-matching tools could accelerate the review. No agreement has been announced.
The lesson from Coventry's experience is straightforward: digitisation without a deduplication protocol built in from day one stores up problems that human hands must eventually untangle, one file at a time.