
RFC 026: Relevance reporting service

This RFC proposes a service that will allow us to test and report on the efficacy of our elastic-queries, by comparing a set of search-terms with their expected results and ordering.

Last modified: 2020-07-20T12:18:54+01:00

Glossary

  • elastic-query: A blob of JSON we send to Elasticsearch

  • search-terms: A set of terms sent as a string from a client to a search service
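For example, the search-terms "origin of species" might be wrapped in an elastic-query like the following (the field name data.title is borrowed from the pseudocode later in this RFC):

{
    "match": {
        "data.title": "origin of species"
    }
}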

Summary

A way to test and report on the efficacy of our elastic-queries, by comparing a set of search-terms with their expected results and ordering.

Motivation

We need to have confidence in, and instil transparency into, the way our search relevance works.

We should test using the same data people are seeing.

Use cases

  • Developers and data scientists can write new search queries with confidence that they improve relevance, and without fear of regression

  • Internal teams gain a greater understanding of, and more input into, how search works, and can share this with external researchers

  • External researchers can see how their results are being returned

Proposal

  1. Create a public API where you can send elastic-queries

  2. The query will be run against a set of predefined search-terms, each with a preset set of relevant documents that should match

  3. We will report on the response from Elastic's rank_eval API (see the sketch at the end of this section)

To create the rank_eval tests we will:

  • Use the current research to fill in the search-terms and relevant documents

  • Release this to internal and external people

  • Use feedback from internal people to fill out the rank_eval tests

  • Repeat
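For reporting, the shape of a rank_eval response is roughly as follows (a sketch assuming Elasticsearch 7.x; the ids and scores are illustrative). The top-level metric_score combines the scores across all requests, and each entry in details lists any returned documents we haven't yet rated:

{
    "metric_score": 0.4,
    "details": {
        "title_origin_of_species": {
            "metric_score": 0.4,
            "unrated_docs": [
                { "_index": "works_prod", "_id": "xyz12345" }
            ]
        }
    },
    "failures": {}
}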

Collection search-terms

@alicerichmond is going through some rounds of qualitative research with internal people first, working with them on searches that they know should return certain results. @taceybadgerbrook is also working on analysing the quantitative search queries to help come up with these. There are a few documents about this, which we will distil into the list below*.

* Documents to distil

  • https://docs.google.com/document/d/1aIbE4IAOZb1Cbyp9ei81HtZ5w98AVZHIrudP9RUG1Is/edit#heading=h.byr3zbqes7qg

  • https://wellcomecloud-my.sharepoint.com/:w:/g/personal/a_richmond_wellcome_ac_uk/EZ-5bvoQ76NCkIBDok_L0c0BaqsaFKFLN-J9KtnEprUsKw?e=lyzSmO

  • https://app.gitbook.com/@wellcomecollection/s/catalogue/search/intentions-and-expectations

Implementation

The relevance service will be a Lambda that can be triggered by services with the IAM permission to do so.

The service can return a pass/fail response to the caller, which is useful for CI etc., and can store the report in S3 for dashboards.

Pseudo λ

// A sketch of the Lambda handler. Assumes a got-style HTTP client and
// helper functions (parseEsRankEvalResponse, saveReport) defined elsewhere.
const got = require('got');

const { esUrl } = process.env;

exports.handler = async (event) => {
    /*
      The request body contains:
        id: string
        query:
        {
          "match": {
              "data.title": "origin of species"
          }
        }
    */
    const { id, query } = JSON.parse(event.body);

    // There should be loads more of these, and the queries would be more
    // complex; each rating marks a known document as relevant (or not)
    const requests = [{
        "id": "title_origin_of_species",
        "request": {
            "query": query
        },
        "ratings": [{
            "_index": "works_prod",
            "_id": "awfa9aty",
            "rating": 0
        }]
    }];

    // Run the supplied query through Elastic's rank_eval API
    const response = await got.post(`${esUrl}/works_prod/_rank_eval`, {
        json: true,
        body: { requests }
    });

    // Turn the rank_eval response into a pass/fail report, persist it
    // to S3 for dashboards, and return it to the caller
    const report = parseEsRankEvalResponse(response);
    await saveReport(report);

    return report;
};
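As a usage sketch, a CI step could invoke the Lambda directly and fail the build on a failing report. The function name relevance-reporter and the report's pass field are assumptions for illustration:

// Hypothetical CI step: invoke the relevance Lambda and fail the build
// if the report does not pass. Assumes aws-sdk v2 and that the Lambda
// returns a report with a boolean `pass` field.
const AWS = require('aws-sdk');

const lambda = new AWS.Lambda({ region: 'eu-west-1' });

(async () => {
    const { Payload } = await lambda.invoke({
        FunctionName: 'relevance-reporter', // hypothetical function name
        Payload: JSON.stringify({
            body: JSON.stringify({
                id: 'title_origin_of_species',
                query: { match: { 'data.title': 'origin of species' } }
            })
        })
    }).promise();

    const report = JSON.parse(Payload);
    if (!report.pass) {
        process.exit(1); // a non-zero exit fails the CI build
    }
})();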

Getting started

We have work in progress on a set of queries that return known works by their title.

This will be a good place to start, as the terms and relevant documents should be easy enough to get; see the sketch below for what such a test case might look like.
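For illustration, a title-based test case might pair the search-terms with the known work, rated as relevant (the request id and rating value here are hypothetical; the document id is taken from the pseudocode above):

{
    "id": "title_known_work",
    "request": {
        "query": { "match": { "data.title": "the origin of species" } }
    },
    "ratings": [
        { "_index": "works_prod", "_id": "awfa9aty", "rating": 3 }
    ]
}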

Alternatives

Currently we use a set of unit tests which don't pick up on the nuances of our actual metadata, nor account for the data changing over time, whether through new transformations, new sources, or ongoing updates from the different content management systems.

We could write a set of queries and test against the HTTP API. We would then need to write our own reporting and our own way of scanning the returned documents. While this is definitely an option, it feels better to explore the tool that was created for this purpose rather than write our own.

Questions

  • We're not sure how brittle this might make our tests, so we're not sure how tightly to integrate it into CI/CD

  • Where exactly in our testing does this service sit? e.g. when a new pipeline is spun up, how will we ensure that running this service is part of the switch to that pipeline?
