APM

Some guidance on what you can do with the catalogue's APM (Application Performance Monitoring).

Last updated 2 years ago

Some things you can track with APM:

  • Errors (with nice stack traces). You can see an incident we had (an elevated rate of 500s). There's an alerting tool (Watcher) which can take a Slack webhook - we should probably use this!

  • JVM stats - good for discovering memory leaks and also fun for garbage collection enthusiasts.
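As a rough illustration of the alerting idea, here is a minimal sketch of a Watcher definition that posts to a Slack webhook when the number of 5xx responses in a time window passes a threshold. This is not something we have set up: the function name `build_5xx_watch`, the `apm-*` index pattern, the threshold, the window and the message text are all assumptions, and the webhook path is a placeholder you would supply yourself.

```python
import json


def build_5xx_watch(slack_path, threshold=10, window="5m"):
    """Build a Watcher body that alerts Slack on an elevated 5xx count.

    Everything here is illustrative: the index pattern, threshold and
    message are assumptions, and slack_path is a placeholder for the
    path part of a real Slack webhook URL.
    """
    return {
        # Run the check once per window.
        "trigger": {"schedule": {"interval": window}},
        "input": {
            "search": {
                "request": {
                    "indices": ["apm-*"],  # assumed APM index pattern
                    "body": {
                        "size": 0,
                        "query": {
                            "bool": {
                                "filter": [
                                    {"term": {"processor.event": "transaction"}},
                                    {"range": {"http.response.status_code": {"gte": 500}}},
                                    {"range": {"@timestamp": {"gte": f"now-{window}"}}},
                                ]
                            }
                        },
                    },
                }
            }
        },
        # Fire only when the number of matching transactions is above the threshold.
        "condition": {"compare": {"ctx.payload.hits.total": {"gt": threshold}}},
        "actions": {
            "notify_slack": {
                "webhook": {
                    "scheme": "https",
                    "host": "hooks.slack.com",
                    "port": 443,
                    "method": "post",
                    "path": slack_path,
                    "headers": {"Content-Type": "application/json"},
                    "body": json.dumps({"text": "Elevated 5xx rate on the catalogue API"}),
                }
            }
        },
    }
```

The resulting dictionary would be sent as the body of a Watcher "put watch" request; the sketch only builds the body so it can be reviewed before anything is created.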

The meat of APM is transaction monitoring: for us, that's monitoring the performance of endpoints. For example, for /works we can see:

  • The average duration of a request is about 50ms

  • The 99th percentile is usually around 150ms, but is quite noisy

  • We have some outlier requests where it looks like the API stalls, and/or the network connection to Elastic was very slow - might be worth looking into these!
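To make those three figures concrete, here is a small sketch of how an average, a 99th percentile and a list of outliers could be derived from raw request durations. It is not how APM computes them: the nearest-rank percentile, the "more than 10x the mean" outlier rule and the sample durations are all assumptions chosen for illustration.

```python
import math
from statistics import mean


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of values at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarise(durations_ms, outlier_factor=10):
    """Summarise request durations (in milliseconds).

    outlier_factor is an arbitrary illustrative cut-off: a request counts as
    an outlier if it took more than outlier_factor times the mean.
    """
    avg = mean(durations_ms)
    return {
        "mean_ms": avg,
        "p99_ms": percentile(durations_ms, 99),
        "outliers_ms": [d for d in durations_ms if d > outlier_factor * avg],
    }


# Made-up sample: mostly fast requests, one slow one, one stalled one.
sample = [40] * 98 + [150, 3000]
summary = summarise(sample)
print(summary["p99_ms"])       # 150
print(summary["outliers_ms"])  # [3000]
```

Note how a single stalled request pulls the mean well above the typical 40ms while the 99th percentile stays at 150ms, which is why it is worth looking at both.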

All APM data is stored in Elastic, so we can do our own analyses: for example, comparing the performance of aggregations and "normal" queries.
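One way such a comparison could be expressed is a search body that splits /works transactions into requests with and without aggregations, then computes duration percentiles for each group. This is a sketch under assumptions rather than the query behind any existing dashboard: the function name `build_comparison_query` is hypothetical, it assumes aggregation requests can be recognised by an `aggregations=` parameter in `url.query`, and the exact value of `transaction.name` depends on how the APM agent names transactions.

```python
def build_comparison_query(transaction_name="/works", percents=(50, 99)):
    """Build a search body comparing durations of requests with and without aggregations.

    Assumptions: transactions are named after the endpoint, and a request
    uses aggregations if its query string contains "aggregations=".
    """
    has_aggs = {"wildcard": {"url.query": "*aggregations=*"}}
    duration_stats = {
        "duration_percentiles": {
            "percentiles": {
                "field": "transaction.duration.us",  # APM stores durations in microseconds
                "percents": list(percents),
            }
        }
    }
    return {
        "size": 0,  # only the aggregation results are needed, not the documents
        "query": {
            "bool": {
                "filter": [
                    {"term": {"processor.event": "transaction"}},
                    {"term": {"transaction.name": transaction_name}},
                ]
            }
        },
        "aggs": {
            "by_request_type": {
                # One named bucket per kind of request.
                "filters": {
                    "filters": {
                        "with_aggregations": has_aggs,
                        "without_aggregations": {"bool": {"must_not": [has_aggs]}},
                    }
                },
                "aggs": duration_stats,
            }
        },
    }
```

The body could be run against the APM indices from Kibana's Dev Tools or an Elasticsearch client; each bucket in the response would then carry its own percentile values.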
