Catalogue pipeline

Merging


Last updated 2 years ago

Merging works happens in two steps:

1. Linking works

We use features from the source data to work out which works are linked.

For example:

  • the BNumber from a Calm record

  • MARC field 776$w, which links a Sierra record to its digitised counterpart

This is carried out by the transformer for each source's data.
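As an illustration of the two examples above, here is a minimal sketch of how a transformer might pull link identifiers out of a source record. It is not the pipeline's actual code: the `Link` type, the function names, and the record shapes (a flat `BNumber` key for Calm, `varFields` with `marcTag` and `subfields` for Sierra) are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Link:
    """Hypothetical pointer from one work to another work it is linked to."""
    identifier: str  # identifier of the linked work
    reason: str      # which source feature produced the link


def links_from_calm(record: dict) -> List[Link]:
    """Assumed shape: a Calm record may carry a 'BNumber' key."""
    bnumber = record.get("BNumber")
    return [Link(bnumber, "Calm BNumber")] if bnumber else []


def links_from_sierra(record: dict) -> List[Link]:
    """Assumed shape: MARC data sits under 'varFields', each with 'subfields'."""
    links = []
    for var_field in record.get("varFields", []):
        if var_field.get("marcTag") != "776":
            continue
        for subfield in var_field.get("subfields", []):
            # Subfield w of 776 holds the identifier of the digitised counterpart.
            if subfield.get("tag") == "w":
                links.append(Link(subfield["content"], "MARC 776$w"))
    return links
```

Each transformer only reports the links it can see in its own record; deciding which works end up grouped together is a separate step.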

How are works currently linked?

2. Merging linked works

Merging linked works consists of a few steps:

  • Choose a field to be merged, e.g. items.

  • Choose a target: the work that we will merge into.

    e.g. works from Calm

  • Choose the sources: the works that will be merged into the target.

    e.g. Sierra works with a single item
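The three steps above can be sketched as follows for the items field. This is a simplified illustration under stated assumptions, not the pipeline's implementation: the `Work` type and function names are hypothetical, and the rule used here (append the source items to the target's items) is a placeholder for the real rules, which are defined by the diagrams.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Work:
    """Hypothetical, pared-down work: just an id, a source and its items."""
    id: str
    source: str                  # e.g. "calm" or "sierra"
    items: Tuple[str, ...] = ()


def choose_target(works: List[Work]) -> Optional[Work]:
    """Pick the work we merge into; here, the first Calm work if there is one."""
    return next((w for w in works if w.source == "calm"), None)


def choose_sources(works: List[Work], target: Work) -> List[Work]:
    """Pick the works merged into the target; here, single-item Sierra works."""
    return [
        w for w in works
        if w is not target and w.source == "sierra" and len(w.items) == 1
    ]


def merge_items(works: List[Work]) -> Optional[Work]:
    """Merge the items field of a group of linked works into the target."""
    target = choose_target(works)
    if target is None:
        return None  # no rule applies to this group
    sources = choose_sources(works, target)
    # Assumption: source items are appended to the target's own items.
    merged = target.items + tuple(i for s in sources for i in s.items)
    return replace(target, items=merged)
```

Keeping target selection and source selection as separate functions mirrors the steps in the list, so a different field or rule can swap in its own choices without changing the merge itself.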

The diagrams below show the merging rules.

Items

Calm

Sierra single item

Sierra multi item

Diagram captions:

  • How works are currently linked

  • How we merge items into Calm

  • How we merge items into Sierra single item

  • How we merge items into Sierra multi item