Developing with rank

Last updated 10 months ago

Setting up

  • If you need an up-to-date index, replicate one from the production cluster

  • Copy the query config across from the search API application: yarn copyQueries

Queries

Queries are the easiest part of the search-relevance puzzle to modify and test.

  • Make your changes to WorksMultiMatcherQuery.json or ImagesMultiMatcherQuery.json in public (these have been copied here by yarn copyQueries above).

  • Use the candidate queryEnv on /dev or /search to see the results.

  • When you're happy with the effect of your changes on the rank tests, you'll need to make the Scala used by the API match the JSON used by rank. Edit the images and/or works Scala files until the tests pass.
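
As an illustration of the kind of change this step involves, the sketch below adjusts one field boost in a `multi_match` clause. The query shape, the field names, and the `setBoost` helper are all assumptions made for the example; the real contents of WorksMultiMatcherQuery.json are not shown on this page and may well differ.

```typescript
// Hypothetical, simplified stand-in for a multi_match query config.
// The real WorksMultiMatcherQuery.json may be structured differently.
type MultiMatchQuery = {
  multi_match: {
    query: string;
    fields: string[]; // Elasticsearch "field^boost" syntax
    type?: string;
  };
};

const candidate: MultiMatchQuery = {
  multi_match: {
    query: "{{query}}", // mustache placeholder, filled in per search
    fields: ["title^3", "contributors^2", "description"], // invented field names
    type: "cross_fields",
  },
};

// Return a copy of the query with the boost for one field replaced.
// Fields that do not match are left untouched.
function setBoost(q: MultiMatchQuery, field: string, boost: number): MultiMatchQuery {
  const fields = q.multi_match.fields.map((f) => {
    const [name] = f.split("^");
    return name === field ? `${name}^${boost}` : f;
  });
  return { multi_match: { ...q.multi_match, fields } };
}

const boosted = setBoost(candidate, "contributors", 4);
console.log(JSON.stringify(boosted, null, 2));
```

Working on a copy rather than mutating the original keeps the before and after versions available side by side, which is convenient when comparing their effect on the rank tests.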

Mappings and settings

We often want to test against indices that have new or altered analyzers, mappings, or settings. To create and populate a new index:

  • Run yarn getIndexConfig to fetch mappings and other config from existing indices in the rank cluster. The config for your chosen indices will be written to ./data/indices/.

  • Edit the file(s) in data/indices to your needs, using existing mappings as a starting point.

  • Run yarn createIndex to create the new index in the rank cluster from the edited mappings. This will also give you an option to start a reindex.

  • If you need to monitor the state of a reindex, run yarn checkTask.

  • If you need to delete a candidate index, run yarn deleteIndex.

  • If you need to update a candidate index, run yarn updateIndex.

To see the results of your changes, select your new index on /dev or /search.
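
The editing step can be pictured with a small sketch. It assumes the fetched config follows the usual Elasticsearch create-index body (`settings` and `mappings`); the `IndexConfig` type, the `withAnalyzer` helper, the `title` field, and the `folded` analyzer name are all hypothetical and are not taken from the rank codebase.

```typescript
// Hypothetical shape of an index config file. settings/mappings follow the
// Elasticsearch create-index body; field and analyzer names are invented.
type IndexConfig = {
  settings: { analysis?: { analyzer?: Record<string, unknown> } };
  mappings: { properties: Record<string, { type: string; analyzer?: string }> };
};

const existing: IndexConfig = {
  settings: {},
  mappings: { properties: { title: { type: "text" } } },
};

// Return a copy of the config with a new analyzer defined in the settings
// and applied to one existing field in the mappings.
function withAnalyzer(
  config: IndexConfig,
  field: string,
  analyzerName: string,
  definition: Record<string, unknown>
): IndexConfig {
  const current = config.mappings.properties[field];
  if (!current) throw new Error(`Unknown field: ${field}`);
  return {
    settings: {
      ...config.settings,
      analysis: {
        ...config.settings.analysis,
        analyzer: { ...config.settings.analysis?.analyzer, [analyzerName]: definition },
      },
    },
    mappings: {
      properties: {
        ...config.mappings.properties,
        [field]: { ...current, analyzer: analyzerName },
      },
    },
  };
}

const edited = withAnalyzer(existing, "title", "folded", {
  type: "custom",
  tokenizer: "standard",
  filter: ["lowercase", "asciifolding"],
});
console.log(JSON.stringify(edited, null, 2));
```

An analyzer has to be defined under `settings.analysis` before a field can refer to it by name, which is why the helper changes both halves of the config together.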

Test cases

We collect test cases directly from the stakeholders and feedback channels for wellcomecollection.org.

Each test should represent a search intention - a class of search which we see real users performing. For example

Tests should be grouped according to the following structure:

  • id, label, and description - describing what each group of cases is testing

  • eval - an optional, alternative evaluation method to apply to the metric score returned by Elasticsearch

  • searchTemplateAugmentation - an optional augmentation to the query, e.g. a filter

  • cases - the list of search terms and corresponding results to be tested

Each test case in that list should contain:

  • query - the search terms a researcher uses

  • ratings - IDs of documents that we want to evaluate against the results

  • description - a description of the search intention which is embodied by the test
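
The structure described here can be sketched as a pair of types. The field names are the ones named on this page; the types, the sample values, the placeholder IDs, and the `validateGroup` helper are assumptions for illustration and may not match the definitions used in rank.

```typescript
// Field names come from this page; every type below is an assumption.
type TestCase = {
  query: string; // the search terms a researcher uses
  ratings: string[]; // IDs of documents to evaluate against the results
  description: string; // the search intention embodied by the test
};

type TestGroup = {
  id: string;
  label: string;
  description: string;
  metric: Record<string, unknown>; // an Elasticsearch metric definition
  eval?: (metricScore: number) => boolean; // assumed signature
  searchTemplateAugmentation?: Record<string, unknown>; // e.g. a filter
  cases: TestCase[];
};

// Placeholder data only: the IDs and search terms are not real test cases.
const exampleGroup: TestGroup = {
  id: "example-precision",
  label: "Example precision group",
  description: "Placeholder group showing the structure only",
  metric: { precision: { k: 3, relevant_rating_threshold: 1 } },
  eval: (score) => score >= 0.5,
  cases: [
    {
      query: "placeholder search terms",
      ratings: ["placeholder-id-1", "placeholder-id-2"],
      description: "Placeholder search intention",
    },
  ],
};

// Collect structural problems in a group before running it.
function validateGroup(group: TestGroup): string[] {
  const problems: string[] = [];
  if (group.cases.length === 0) problems.push(`${group.id}: no cases`);
  group.cases.forEach((c, i) => {
    if (c.query.trim() === "") problems.push(`${group.id}[${i}]: empty query`);
    if (c.ratings.length === 0) problems.push(`${group.id}[${i}]: no ratings`);
  });
  return problems;
}

console.log(validateGroup(exampleGroup));
```

A check like this catches empty queries or missing ratings early, before a malformed case produces a confusing score from the cluster.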

In addition to the fields above, each group of tests needs a metric: an Elasticsearch metric to run the cases against.

You might need to edit the query to fit the new mapping, following these instructions.

Before deploying your changes, make sure the Scala version of the config used by the pipeline matches the JSON version you've been testing. Copy your JSON config over to the catalogue pipeline repo, and edit the Scala until the tests pass.