Catalogue pipeline

Fetching records from Sierra

In general, the Sierra adapter stack keeps us up to date with new records from Sierra, and it shouldn't need any manual intervention.

If you do need to fill in some gaps manually, this document explains how to do it.

What have we ingested?

If you want to see how much we've already pulled in, there's a script that reports the current adapter progress as the time windows fetched so far for each resource:

$ python sierra_adapter/report_adapter_progress.py

===============================================================================
bibs windows
===============================================================================
2003-05-01T00:00:00 -- 2018-03-07T16:31:21.573345


===============================================================================
items windows
===============================================================================
1999-11-01T00:01:00 -- 2018-03-07T16:28:51.955944

In this example each resource has a single unbroken window. These represent a complete run, through to the very earliest bibs and items we need to pull in.
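If the report lists more than one window for a resource, the spaces between them are the gaps to fill. The following is a minimal sketch of how you might find those gaps, assuming each window is printed as a `start -- end` line in ISO 8601 format as in the output above. The function names `parse_window` and `find_gaps` are hypothetical and are not part of the adapter code, and the sample report lines are made up for illustration.

```python
from datetime import datetime


def parse_window(line):
    """Parse a 'start -- end' report line into a pair of datetimes."""
    start, end = (part.strip() for part in line.split(" -- "))
    return datetime.fromisoformat(start), datetime.fromisoformat(end)


def find_gaps(lines):
    """Return (gap_start, gap_end) pairs not covered by any window.

    Lines that do not contain ' -- ' (headings, rules, blanks) are ignored.
    """
    windows = sorted(parse_window(line) for line in lines if " -- " in line)
    gaps = []
    covered_until = None
    for start, end in windows:
        if covered_until is not None and start > covered_until:
            gaps.append((covered_until, start))
        # Track the latest end seen so overlapping windows are handled.
        covered_until = end if covered_until is None else max(covered_until, end)
    return gaps


# Hypothetical report lines with a gap between two windows.
example_report = [
    "items windows",
    "2018-02-02T16:00:00 -- 2018-02-02T16:27:00",
    "2018-02-02T16:44:00 -- 2018-02-02T17:00:00",
]

for gap_start, gap_end in find_gaps(example_report):
    print(gap_start.isoformat(), "--", gap_end.isoformat())
```

Each gap it prints gives a start and end time for the range that still needs windows.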

Filling in some gaps

If the script above shows gaps in the data, you can regenerate the windows for the Sierra reader with a second script.

For example:

$ python sierra_adapter/build_windows.py \
  --interval=5 \
  --resource=items \
  --start='2018-02-02T16:27' \
  --end='2018-02-02T16:44'

Here "interval" is measured in minutes. If there are several gaps, run the script again with a new --start and --end for each one until you're done!
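To illustrate what an interval in minutes means for a time range, here is a minimal sketch that splits a range into consecutive windows. It is not the implementation of `build_windows.py`: the function `build_windows` below is hypothetical, and truncating the final window at the end time is an assumption made for this example.

```python
from datetime import datetime, timedelta


def build_windows(start, end, interval_minutes):
    """Yield consecutive (window_start, window_end) pairs covering start..end.

    Assumption for this sketch: the last window is cut short at `end`
    rather than running past it.
    """
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    step = timedelta(minutes=interval_minutes)
    current = start
    while current < end:
        window_end = min(current + step, end)
        yield current, window_end
        current = window_end


# Same range and interval as the command above.
start = datetime.fromisoformat("2018-02-02T16:27")
end = datetime.fromisoformat("2018-02-02T16:44")

for window_start, window_end in build_windows(start, end, interval_minutes=5):
    print(window_start.isoformat(), "--", window_end.isoformat())
```

Under these assumptions, a smaller interval produces more, shorter windows over the same range.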


Last updated 2 years ago