RFC 055: Content API

Status: Draft

Last updated: 2023-03-06

This RFC outlines a new set of API endpoints which will allow wellcomecollection.org users to search and filter content which is stored in Prismic.

Background

We use Prismic to edit and store information about our exhibitions, events, stories, and other pieces of non-catalogue content on wellcomecollection.org.

Recently, we've allowed users to using .

That MVP implementation has demonstrated that Prismic's search functionality isn't good enough to produce relevant results on its own. See Slack threads and .

Prismic matches documents to a user's search terms using a very loose, fuzzy query on all text-like fields, but, unlike Elasticsearch, does not assign each document a score corresponding to its relevance. Instead of sorting by relevance, users are limited to sorting the retrieved documents by date or by title, which often makes the results appear irrelevant (e.g. weak matches appearing at the top of the list due to recency). Prismic's GraphQL API is also unsuitable for filtering content by arbitrary fields, which further limits our users' ability to find the content they're looking for.

We'd like to replace our queries to the Prismic API with something more configurable, like the system we have for the catalogue.

We're building a pipeline which will ingest content from Prismic into a set of Elasticsearch indices (see ). To allow users to search and filter that content, we also need a new set of API endpoints which will query those Elasticsearch indices. The primary purpose of these endpoints is to serve search on wellcomecollection.org. We might later use them for content list pages, but for now the focus is solely on making them useful for search.

This API will live at https://api.wellcomecollection.org/content/v0/, with endpoints for /articles, /exhibitions, and /events.

We won't consider the way that documents are scored as part of this RFC. Relevance requirements should be developed iteratively and independently from the development of the API.

Purposes

The /content API should allow users to:

request a single exhibition, event, or article by ID;
query articles, exhibitions, and events, retrieving relevant results based on their search terms (the focus for v0 will be on articles; the other two might be explored further in a future version of this API);
filter and aggregate lists of articles by a set of predefined filters and aggregations (for v0 of the Content API, exhibitions and events will only support the query parameter).
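
To make these three purposes concrete, here is a minimal sketch of how a client might build request URLs for them. Only the base URL, the three content types, and the query parameter come from this RFC. The function names, the `format` filter, the `aggregations` parameter, and the example ID are hypothetical and are used purely for illustration.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.wellcomecollection.org/content/v0"
CONTENT_TYPES = ("articles", "exhibitions", "events")


def single_item_url(content_type: str, item_id: str) -> str:
    """Build the URL for one exhibition, event, or article by ID."""
    if content_type not in CONTENT_TYPES:
        raise ValueError(f"unknown content type: {content_type}")
    return f"{BASE_URL}/{content_type}/{quote(item_id, safe='')}"


def search_url(content_type, query, filters=None, aggregations=None):
    """Build a search URL; filters and aggregations are articles-only in v0."""
    if content_type not in CONTENT_TYPES:
        raise ValueError(f"unknown content type: {content_type}")
    if content_type != "articles" and (filters or aggregations):
        raise ValueError("v0 only filters and aggregates articles")

    params = {"query": query}
    # Assumption: each filter is passed as its own query-string parameter.
    for name, value in sorted((filters or {}).items()):
        params[name] = value
    # Assumption: requested aggregations are sent as a comma-separated list.
    if aggregations:
        params["aggregations"] = ",".join(aggregations)
    return f"{BASE_URL}/{content_type}?{urlencode(params)}"
```

Under these assumptions, `search_url("events", "sleep")` only ever carries the query parameter, while `search_url("articles", "sleep", filters={"format": "essay"})` adds a filter. The real parameter names would be settled in the per-endpoint RFCs.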

Further requirements

That being said, v0 will follow the Prismic content model rather than the Works model. Should that model prove unsatisfactory, we should consider making the changes in Prismic directly and adjusting the content.
The API should only return enough information for users to determine whether a result is relevant, and provide a link to the relevant page on wellcomecollection.org.
Even though we will be making [contentType]/[id] endpoints, the content of the pages themselves, and the content type list pages, should still be fetched from Prismic directly for the time being.
The API's URL structure should be consistent with what appears on wellcomecollection.org's front-end. For example, if an article on the site appears at /articles/{id}, the API equivalent should be at /content/v0/articles/{id}.
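
The URL consistency requirement can be sketched as a simple path translation. This is an illustration only: the helper name `api_path_for` is hypothetical, and it assumes that exhibitions and events follow the same /{contentType}/{id} pattern on the site as the article example above.

```python
API_PREFIX = "/content/v0"
CONTENT_TYPES = {"articles", "exhibitions", "events"}


def api_path_for(site_path: str) -> str:
    """Translate a front-end path into the equivalent content API path."""
    # Drop empty segments so leading and trailing slashes are ignored.
    segments = [segment for segment in site_path.split("/") if segment]
    if not 1 <= len(segments) <= 2 or segments[0] not in CONTENT_TYPES:
        raise ValueError(f"no content API equivalent for: {site_path}")
    return f"{API_PREFIX}/" + "/".join(segments)
```

With this mapping, a front-end path and its API equivalent differ only by the /content/v0 prefix, so neither side needs a lookup table.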

Notes on implementation

Though the content API will share code with the concepts API, it should be built as a separate service.
The Elasticsearch index mapping should represent the contract between the pipeline and the API. The API shouldn't need to know anything about the structure of the data in Prismic, and any substantial data augmentation should be done by the pipeline.
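
As a rough illustration of the mapping acting as a contract, the sketch below pairs a strict index mapping with an API-side function that reads only mapped fields. Every field name here, the `Article` type label, and the shape of the returned result are assumptions made for the example. The actual mapping would be defined alongside the pipeline.

```python
from typing import Any

# Hypothetical mapping for an articles index. "dynamic": "strict" makes
# Elasticsearch reject documents containing fields the mapping does not declare.
ARTICLES_MAPPING = {
    "dynamic": "strict",
    "properties": {
        "id": {"type": "keyword"},
        "title": {"type": "text"},
        "publicationDate": {"type": "date"},
        "format": {"type": "keyword"},
    },
}


def unmapped_fields(document: dict[str, Any]) -> list[str]:
    """List fields in a pipeline document that the mapping does not declare."""
    declared = ARTICLES_MAPPING["properties"]
    return sorted(name for name in document if name not in declared)


def to_api_result(hit: dict[str, Any]) -> dict[str, Any]:
    """Reduce an Elasticsearch hit to a minimal, hypothetical API result."""
    source = hit["_source"]
    return {
        "id": source["id"],
        "type": "Article",
        "title": source["title"],
        # Link back to the page on the site rather than returning its content.
        "url": f"https://wellcomecollection.org/articles/{source['id']}",
    }
```

Because `to_api_result` touches only fields named in the mapping, the API never needs to know how the same data is structured in Prismic.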

Proposed endpoints

Articles

https://api.wellcomecollection.org/content/v0/articles

Exhibitions

https://api.wellcomecollection.org/content/v0/exhibitions

Events

https://api.wellcomecollection.org/content/v0/events