Catalogue pipeline
  • Introduction
  • Fetching records from source catalogues
    • What is an adapter?
    • CALM: Our archive catalogue
    • MIRO: Our image collections
  • Transforming records into a single, common model
    • Our single model: the Work
    • Creating canonical identifiers
  • Combining records from multiple sources
    • Why do we combine records?
    • How we choose which records to combine
  • Other topics
    • Catalogue
    • Search
      • wellcomecollection.org query development index
      • Hypotheses
        • Concepts, subjects
        • Contributors
        • Titles
        • Genres
        • Reference numbers
        • Synonymous names and subjects
        • Mood
        • Phrases
        • Concepts, subjects with other field
        • Contributor with other field
        • Title with other field
        • Genre with other field
        • Reference number with other field
        • Behaviours
        • Further research and design considerations
      • Analysis
        • Less than 3-word searches
        • Searches with 3 words or more
        • Subsequent searches
      • Query design
      • Relevance tests
        • Test 1 - Explicit feedback
        • Test 2 - Implicit feedback
        • Test 3 - Adding notes
        • Test 4 - AND or OR
        • Test 5 - Scoring Tiers
        • Test 6 - English tokeniser and Contributors
        • Test 7 - BoolBoosted vs ConstScore
        • Test 8 - BoolBoosted vs PhaserBeam
      • Collecting data
      • Reporting and metrics
      • Work IDs crib sheet
    • Adapters
      • Adapter lifecycle
      • Fetching records from Sierra
    • Sierra
      • Sierra IDs
    • Pipeline
      • Merging
    • APM
Powered by GitBook
On this page
  1. Combining records from multiple sources

Why do we combine records?

PreviousCreating canonical identifiersNextHow we choose which records to combine

Last updated 2 years ago

Sometimes information about an entity is split across multiple source catalogues.

For example, a box of papers in the archive might:

  • have an archive record in CALM (which is the "canonical" record)

  • have a library record in Sierra (so it can be )

  • have a digitised copy in METS (so it can be viewed online)

We combine these records into a single work so that all the information is on one page. This makes it easier for researchers to find everything we know about a work.

(There are instances in the past where somebody found the online record for a physical object, and travelled to see it in the library in-person, unaware that we had a digitised copy.)

viewed in the library