Content-api: next steps
Background information Challenges encountered and decisions made
Scheduled events
Delist
Sorting and filter by date
Events endpoint and response
Event endpoint and response
Filters
Aggregations
Sorting
Further requests for opinions and insights
Background information
Around October 2023, we started work on adding our Prismic Events to the Content API (see reasoning here for why we need the Content API). We did this without writing an RFC first, opting for a few chats and relying on the Original suggestion for Events that was written at the start of building the Content API. We realised this should not have been the case, and are now choosing to write one for a few reasons; to document the challenges we’ve come across, the decisions we've made along the way and how we propose it changes and ends up looking like.
The active endpoints are:
https://api.wellcomecollection.org/content/v0/events
https://api.wellcomecollection.org/content/v0/events/[eventId]
Document of interest:
Original suggestions for future versions of Content API (includes suggestions of filters and such for events)
Notes:
This document will be using the term “Scheduled” to talk about “children” events.
This API only serves Prismic
Events
content types. Please note that it doesn't fetch anything from theEvents Series
content type. The name of a series does get considered at the display level as it gets displayed on the result card along with a “Part of” prefix, but gets pulled from theEvent
data.Some of these challenges also affect the
Articles
endpoint. If that is the case, it will be indicated.
Challenges encountered and decisions made
Scheduled events
This Github PR described this and applied a solution that is subject to change.
Challenge
There is no way to create an event hierarchy at the moment. In reality, some events come as a set (they become “Scheduled” events) and could use such a thing. The solution adopted by the team is to create a “Parent” event first, which contains the general information including promotional images. Then, “Scheduled” events are added to its “Scheduled” tab:
These Scheduled events are, in reality, sibling events, as, as we said, there is no hierarchy within the Event content type. They are created only to be listed in the Parent event:
As they contain minimal other information (lacking images, description...), the Scheduled events are not to be seen individually, nor displayed in listing pages. This is done by tagging the event with delist.
The problem with search is that the Scheduled events contain important information such as date/time and access needs (interpretations). A Parent event would have no interpretation, but it might have a Scheduled event that is a BSL tour, and another one that is an Audio-described one.
The parent points to its children by ID, the child has no knowledge of what its parent is.
Decision
Scheduled events must be findable/filterable based on their interpretation, audience, etc... which are not set on the parent
The API returns the Parent event, not the Scheduled events to avoid duplication and/or evenst being listed without image or description.
Basically we want to be able to match on a Scheduled event but return its parent in the API response.
Therefore, we need the Parent event document to contain all the filterable values for its Scheduled events, at all levels - display, query, aggregrations. In order to achieve this, we modify the graphQuery to fetch the Scheduled event’s data alongside its Parent’s.
We also modify the event transformer to pull in the Scheduled event's format, interpretations, audiences and locations into the Parent and add them to the Parent's filterable and aggregatable values. We will mostly likely need to deduplicate as some Scheduled events have the same interpretations, locations, etc.
As a result, when a user filters by say, "Audio described" we may return a Parent event that is not audio described but has a Scheduled event that is.
Delist
Challenge
In the articles endpoint, we purposefully chose to allow articles that were tagged with delist
. In our current usage of the Prismic API, we hide those articles from list views. In the context of search, which the Content API was originally built for, the decision was made to make them available. It turns out this is not a blanket decision that can be made for all endpoints, as the delist
tag is used for different purposes across Prismic. In the Event content type, it is used to identify “Scheduled” events so the front-end can hide them, as they are not fit for display.
Decision
In the events endpoint, we chose to add isChildScheduledEvent: true
to events tagged with delist
. The ES query is designed to IGNORE these documents.
Sorting and filter by date
Challenges
Filtering by dates The Prismic API currently offers a date filter for fetching events. In the FE we use the logic below to determine which filter to use based on our needs. We will need something similar in Content API in order to match our current offering, as well as being able to be more efficient in the search. Currently, sorting by date doesn’t discriminate between past and future events. Therefore, you can either go from “farthest event in the past” to “farthest event in the future” or the opposite, which is very confusing and not very intuitive for people not expecting to be offered the whole catalogue of events.
Scheduled events dates Another issue is caused by Scheduled events. Their Parent event gets sorted based on the
startDateTime
of its first Scheduled event, which might be in the past. It might have other Scheduled events coming up, but that gets ignored, so it looks like there is no logic to the order.Granularity of dates In the current Proposal, we only offer filtering by date with
YYYY-MM-DD
, not with the time of the event. So an event that finished at 10 AM this morning would still be considered upcoming/current until the end of today. We could look at offering a more granular query parameter, but do we want to? What kind of filter do we want to offer our users? Date picker? Drop down with “Past/Today/This week/Future” options?
Decisions
Filtering by dates We will add a query filter that allows the FE to query based on
dates.to=
and/ordates.from=
that will take in a Date string (YYYY-MM-DD
).Scheduled events dates The issue with the Scheduled events will be dealt with by having an array of all the scheduled events time on the Parent. If any of the Scheduled events is happening within the filtered dates, the Parent event would display in the results.
Granularity of dates TBC. See ticket.
Proposal
Events endpoint and response
https://api.wellcomecollection.org/content/v0/events?aggregations=format%2Caudiences%2CisAvailableOnline%2Clocations.attendance%2Cinterpretations.label
See response by consulting events-api-response.json.
Changes are in the aggregations
part of the reponse:
Event endpoint and response
https://api.wellcomecollection.org/content/v0/events/[eventId]
See response by consulting event-api-response.json.
No change to the API response.
Filters
This applies both for Events and Articles; we want to align them on API faceting principles & expectations (there is a ticket for this):
Filters are named by the JSON paths of the identified object that they filter or, if applied to an attribute other than the identifier, the path of that attribute.
Aggregations are always paired with identically named filters.
We are proposing an alignment to the filter names, paired aggregations and corresponding query params with the identified object that they filter, see values in "Identified object that they filter" column below.
Events
formatId
format
format
format
audienceIds
audiences
audiences
audience
locationIds
locations.attendance
locations
location
isAvailableOnline
isAvailableOnline
isAvailableOnline
isAvailableOnline
Articles
formatId
format
format
format
contributorsIds
contributors.contributor
contributors
contributors.contributor
publicationDate
publicationDate
no aggregation
publicationDate.from/to
Aggregations
Events
See "Solve duplicate aggregation buckets/filter options (interpretations)" ticket for context.
We want to use labels instead of ids for aggregating interpretations. That means that the (paired/matching) filter needs to be “interpretations.label”: [“Audio described”, "Speech-to-text"]
and the aggregation:
Align
aggregatableValues
in index with their paired filter
Change aggregatableValues.locations
to aggregatableValues.locations.attendance
The other aggregatableValues
are already aligned with the identified object they aggregate on.
Sorting
See challenges for more information on date sorting
Options for sorting:
times.startDateTime asc or desc
relevance, scored on
id
title (x100)
captions (x10)
Currently, the “relevance” for an event’s query is only scored based on id
, title
and caption
. If no query keyword is entered, the tie-breaker is startDateTime
in descending
order.
We suggest adding these field to calculate the relevance score:
format
audiences
series
Last updated