Coventry City Council's digital heritage team has been quietly working through a backlog of duplicate images in the Herbert Art Gallery and Museum's online catalogue, a problem that first surfaced formally during the digitisation push that followed the city's 2021 UK City of Culture year. The issue is neither trivial nor cosmetic: duplicated records inflate search results, skew usage statistics that funders rely on, and frustrate researchers at places like Coventry University's Centre for Arts, Memory and Communities who depend on clean, navigable archives.
The timing matters because the Herbert's digitisation programme, which accelerated sharply after Coventry's UK City of Culture year in 2021, added tens of thousands of records to its public-facing portal in a compressed window. Rapid ingestion almost always generates duplication: identical or near-identical image files that arrive through multiple scanning batches, donor submissions or legacy database migrations. Institutions that ran smaller, slower digitisation efforts largely avoided the problem. Coventry, by contrast, had strong cultural and political reasons to move fast, and is now dealing with the arithmetic consequence.
Where Coventry Sits in a Global Comparison
Amsterdam's Rijksmuseum completed a comparable deduplication exercise across its Rijksstudio platform between 2022 and 2024, processing roughly 900,000 open-access images. The Dutch institution deployed automated perceptual hashing — a technique that generates a fingerprint for each image and flags near-identical pairs — reducing manual review time by an estimated 60 percent according to published case notes from the Europeana Foundation's 2024 digital collections conference. Bristol Museum and Art Gallery, which faced a parallel problem after integrating records from the M Shed on Princes Wharf into a shared West of England collections system, took a slower, more manual route and publicly acknowledged a 14-month delay in completing the clean-up, finishing in early 2025. Montréal's Musée d'art contemporain used a hybrid model in 2023 that combined algorithmic detection with volunteer-assisted review sessions held at the museum itself.
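For readers unfamiliar with perceptual hashing, the idea can be sketched in a few lines of Python. This is an illustrative example only, not the Rijksmuseum's tooling, which the published notes do not describe in detail. It assumes a simple "average hash" variant, and it assumes each image has already been shrunk to a small greyscale grid (a real pipeline would do that step with an imaging library). The function names and the `max_distance` threshold are hypothetical.

```python
from itertools import combinations


def average_hash(pixels):
    """Fingerprint a small greyscale grid (rows of 0-255 values).

    Each pixel contributes one bit: 1 if it is at least as bright
    as the image's mean brightness, otherwise 0.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value >= mean else 0)
    return bits


def hamming_distance(a, b):
    """Count the bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


def flag_near_duplicates(images, max_distance=2):
    """Return name pairs whose fingerprints differ by at most max_distance bits.

    max_distance is an illustrative threshold, not a recommended value.
    """
    hashes = {name: average_hash(grid) for name, grid in images.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(hashes), 2)
        if hamming_distance(hashes[a], hashes[b]) <= max_distance
    ]


# Toy 4x4 grids standing in for downscaled catalogue images.
images = {
    "scan_a": [[10, 10, 200, 200], [10, 10, 200, 200],
               [200, 200, 10, 10], [200, 200, 10, 10]],
    # A rescan of the same object with slightly different pixel values.
    "scan_b": [[12, 9, 198, 205], [11, 13, 202, 199],
               [201, 197, 8, 12], [203, 200, 11, 9]],
    # A visually different image.
    "other": [[200, 200, 200, 200], [200, 200, 200, 200],
              [10, 10, 10, 10], [10, 10, 10, 10]],
}

flagged = flag_near_duplicates(images)
print(flagged)
```

In this toy case the two scans of the same object produce identical fingerprints despite small pixel differences, while the unrelated image differs by eight bits and is not flagged. Flagged pairs would still go to a human reviewer, which is where the reported saving in manual review time comes from.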
Coventry has not yet published a formal completion timeline. The Herbert, on Jordan Well in the city centre, has confirmed to The Daily Coventry that deduplication work is ongoing, though the institution has not provided specific figures on how many records are affected or what software tooling is in use. Coventry City Council's Digital and Customer Services directorate, which supports the broader infrastructure, pointed to its Digital Coventry 2025–2028 strategy as the framework governing data quality work across council-held collections, but that document does not contain specific targets for heritage image deduplication.
At Coventry University's Lanchester Library on Frederick Lanchester Way, staff supporting the university's institutional repository have dealt with a related but distinct version of the problem — duplicate submissions of research images and figures across thesis deposits. The university adopted the VIVO open-source research information system in 2023, which includes basic deduplication flags, though library staff have noted the tool works better on metadata than on raw image files.
What the Gap in Pace Means Practically
The comparison with Amsterdam is instructive precisely because the Rijksmuseum had a budget and technical infrastructure that most UK regional institutions cannot match. Bristol's slower, manual approach is probably the more honest comparator for a city the size of Coventry. What Bristol's experience showed is that without a defined project budget ring-fenced for the work, deduplication gets treated as a residual task, absorbing collections staff time that would otherwise go to new acquisitions cataloguing or public engagement.
For anyone using the Herbert's online catalogue today — accessible via the Coventry City Council website — the practical advice is straightforward: if a search returns what appear to be identical results for the same photograph or object, it is worth clicking through to both records and using the feedback function to flag the duplication. The Herbert has indicated this user-reporting mechanism feeds directly into the review queue. Researchers with time-sensitive projects are advised to contact the Herbert's collections team directly at Jordan Well to confirm whether a specific record has been quality-checked, particularly before citing digital catalogue entries in published work. The institution expects to have a clearer public update on progress by the end of the third quarter of 2026.