
Test 6 - English tokeniser and Contributors


Candidates

A variant of the scoring tiers query introduced in Test 5 was tested. The variant adds an English analyser (to handle peculiarities like the apostrophe in queries such as "gray's anatomy") and includes contributors in the list of queried fields.

Rough form of a scoring tiers query:
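As a minimal sketch only (not the production query: the index name, field names, analyser subfields, and boost values below are all illustrative assumptions), a scoring tiers query of this shape might be built like this:

```python
# A minimal sketch, not the production query: index name, field names,
# analyser subfields, and boosts below are all illustrative assumptions.
from elasticsearch import Elasticsearch


def scoring_tiers_query(search_terms: str) -> dict:
    """Build a rough scoring-tiers bool query.

    Each tier is a `should` clause with a descending boost, so a document
    matching a higher tier outscores one that only matches lower tiers.
    The `.english` subfield stands in for a field indexed with an English
    analyser, so possessives like "gray's" also match "gray".
    """
    return {
        "bool": {
            "should": [
                # Tier 1: title matches through the English analyser.
                {
                    "multi_match": {
                        "query": search_terms,
                        "fields": ["data.title.english^100"],
                        "operator": "and",
                    }
                },
                # Tier 2: contributor names, the field this variant adds.
                {
                    "multi_match": {
                        "query": search_terms,
                        "fields": ["data.contributors.agent.label^10"],
                        "operator": "and",
                    }
                },
                # Tier 3: a lower-boost catch-all over remaining fields.
                {
                    "multi_match": {
                        "query": search_terms,
                        "fields": ["data.description", "data.subjects.label"],
                        "operator": "and",
                    }
                },
            ],
            "minimum_should_match": 1,
        }
    }


# Usage against a local cluster (the "works" index name is hypothetical):
es = Elasticsearch("http://localhost:9200")
response = es.search(index="works", query=scoring_tiers_query("gray's anatomy"))
print(response["hits"]["total"])
```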

Results

The test ran for three weeks over the Christmas break. It's worth bearing in mind that, with our researcher audience less active over these weeks, the audience we were serving is likely to have been less discerning and less research-focused than usual. The size of the dataset (~9,000 sessions, ~100,000 events) should nevertheless give us a reliable indication of goodness.

Click through rate

|  | default scoring tiers query | variant |
| --- | --- | --- |
| first page only | 0.238 | 0.220 |
| beyond first page | 0.564 | 0.533 |
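The exact definitions behind "first page only" and "beyond first page" aren't spelled out on this page; the sketch below assumes they partition sessions by whether the user paginated past the first page of results, and the event shape is invented for illustration.

```python
# A minimal sketch of how the CTR split might be computed; the event
# shape and the definitions of the two groups are assumptions.
def click_through_rates(events: list[dict]) -> dict[str, float]:
    """CTR split by whether a session paginated beyond the first page.

    Each event is assumed to look like:
    {"session": "s1", "type": "search" | "click" | "paginate", "page": 2}
    """
    sessions: dict[str, dict[str, bool]] = {}
    for event in events:
        state = sessions.setdefault(
            event["session"], {"clicked": False, "paginated": False}
        )
        if event["type"] == "click":
            state["clicked"] = True
        elif event["type"] == "paginate" and event.get("page", 1) > 1:
            state["paginated"] = True

    groups: dict[str, list[bool]] = {"first page only": [], "beyond first page": []}
    for state in sessions.values():
        key = "beyond first page" if state["paginated"] else "first page only"
        groups[key].append(state["clicked"])

    # CTR for a group = sessions with a click / sessions in the group.
    return {k: (sum(v) / len(v) if v else 0.0) for k, v in groups.items()}
```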

Click distribution

[figure: click distribution across result positions for both queries]

Conclusions

The data indicates that the variant performs worse than the default query, particularly at the top end of the results list, where the worse performance shows up as fewer clicks.

The difference isn't substantial, and both queries seem to perform better than some of our previous candidates. However, we should still step back and evaluate why performance is lower before proposing a new set of changes.
