Request for comments (RFC)

Last updated 1 day ago

An RFC is a place to discuss possible changes to the Wellcome Collection platform.

Table of contents

  • What is an RFC?
  • How do I format an RFC?
  • RFC Listing

What is an RFC?

Please create an RFC if you have an idea about how to make a big change to the way we do things currently and need a place to share that with your colleagues.

The process of creating an RFC, discussing it in a pull request, then amending and merging it gives everyone a forum to contribute to the platform.

When an RFC is merged it provides a guide to implementing that change when it is useful to do so, or provides context to an Architecture decision record (ADR) document.

How do I format an RFC?

An RFC is a markdown file in the rfcs directory. It should be named with a number and a short description, e.g. 070-concepts-api-changes.md. The filename should be prefixed with the next available number in the sequence, and the title of the RFC should match the filename.
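The naming rule above can be sketched in a few lines of Python. This is only an illustration, not part of the repository's tooling: the helper name next_rfc_filename is hypothetical, and it assumes "next available number" means one more than the highest number already in use.

```python
import re


def next_rfc_filename(existing: list[str], description: str) -> str:
    """Build a filename from the next number in the sequence and a slugged description.

    Assumption: existing names start with a three-digit number and a hyphen,
    and the next available number is the highest existing number plus one.
    """
    numbers = [int(m.group(1)) for name in existing if (m := re.match(r"(\d{3})-", name))]
    next_number = max(numbers, default=0) + 1
    # Lower-case the description and collapse anything non-alphanumeric into hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return f"{next_number:03d}-{slug}.md"


print(next_rfc_filename(["073-prismic-api", "074-offsite-item-requesting"], "Example proposal"))
# prints "075-example-proposal.md"
```

The RFC's title would then be chosen to match, for example "RFC 075: Example proposal".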

The RFC must include the following sections:

  • Title: A short, descriptive title for the RFC, in the format RFC {number}: {title}.

  • Last modified: The date and time the RFC was last modified, in ISO 8601 format (use date -u +"%Y-%m-%dT%H:%M:%SZ")

  • Context: A brief description of the problem or opportunity that the RFC addresses.

The RFC should include the following sections:

  • Proposal: A detailed description of the proposed solution, including any relevant technical details, diagrams, or examples.

  • Alternatives considered: A discussion of any alternative solutions that were considered, and why they were not chosen.

  • Impact: A description of the impact of the proposed solution, including any potential risks or challenges.

  • Next steps: A list of next steps for implementing the proposed solution, including any dependencies or prerequisites.
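As a rough sketch of how the three required sections could be checked before opening a pull request, the Python below scans an RFC's text. The exact layout is an assumption, not something this page specifies: it assumes the title is a top-level Markdown heading, the timestamp follows a "Last modified:" label (optionally bold), and Context is a heading. The function name check_rfc is hypothetical.

```python
import re
from datetime import datetime


def check_rfc(text: str) -> list[str]:
    """Return a list of problems found against the required sections (empty if none)."""
    problems = []
    # Assumed layout: the title is a top-level heading "# RFC {number}: {title}".
    if not re.search(r"^# RFC \d{3}: \S.*$", text, re.MULTILINE):
        problems.append("missing title in the form 'RFC {number}: {title}'")
    # Assumed layout: "Last modified:" (possibly wrapped in ** for bold) then a timestamp.
    match = re.search(r"Last modified:\**\s*(\S+)", text)
    if match is None:
        problems.append("missing 'Last modified' line")
    else:
        try:
            # Older Python versions reject a trailing 'Z', so normalise it to an offset.
            datetime.fromisoformat(match.group(1).replace("Z", "+00:00"))
        except ValueError:
            problems.append("'Last modified' is not an ISO 8601 timestamp")
    # Assumed layout: Context is a Markdown heading of any level.
    if not re.search(r"^#+ Context\s*$", text, re.MULTILINE):
        problems.append("missing 'Context' section")
    return problems


sample = """# RFC 075: Example proposal

**Last modified:** 2025-03-13T18:14:09Z

## Context

Why this change is needed.
"""
print(check_rfc(sample))  # prints []
```

The recommended sections (Proposal, Alternatives considered, Impact, Next steps) are deliberately not enforced here, since the guidance above says an RFC should, rather than must, include them.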

RFC Listing

This is generated from the RFCs in this directory using .script/create_table_summary.py.
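For readers curious how such a listing could be produced, here is a minimal Python sketch. It is not the contents of create_table_summary.py, which this page does not show. The function name render_listing, the dictionary keys, and the pipe-table output are all assumptions. It orders rows by last-modified time, newest first, which is the order the listing appears to follow.

```python
from datetime import datetime


def render_listing(rfcs: list[dict]) -> str:
    """Render RFC metadata as a pipe-delimited table, most recently modified first."""
    # Parse timestamps so that entries with different UTC offsets sort correctly.
    ordered = sorted(rfcs, key=lambda r: datetime.fromisoformat(r["last_modified"]), reverse=True)
    lines = ["| RFC ID | Title | Summary | Last Modified |", "| --- | --- | --- | --- |"]
    for rfc in ordered:
        lines.append(f"| {rfc['id']} | {rfc['title']} | {rfc['summary']} | {rfc['last_modified']} |")
    return "\n".join(lines)


example = [
    {
        "id": "001-merger_matcher",
        "title": "RFC 001: Matcher architecture",
        "summary": "This RFC proposes an architecture for the matcher and merger components of the reindexer, which are responsible for identifying and merging related works in the catalogue.",
        "last_modified": "2018-11-02T16:46:57+00:00",
    },
    {
        "id": "070-concepts-api-changes",
        "title": "RFC 070: Concepts API changes",
        "summary": "This RFC describes changes to the concepts API, which will be used to support new theme pages on the Wellcome Collection website.",
        "last_modified": "2025-03-13T18:14:09+00:00",
    },
]
print(render_listing(example))
```

In this sketch the metadata is passed in as dictionaries. A real generator would first have to read the title, summary and timestamp out of each RFC file.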

| RFC ID | Title | Summary | Last Modified |
| --- | --- | --- | --- |
| 070-concepts-api-changes | RFC 070: Concepts API changes | This RFC describes changes to the concepts API, which will be used to support new theme pages on the Wellcome Collection website. | 2025-03-13T18:14:09+00:00 |
| 071-python_builds | RFC 071: Python Building and Deployment | Building and deploying Python projects | 2025-03-13T18:14:09+00:00 |
| 068-exhibitions-content-api | RFC 068: Exhibitions in Content API | Exhibitions are to be added to Events search, becoming Events & Exhibitions search. We'll therefore be working on indexing Exhibitions in a more intentional manner. That indexing and subsequent API endpoint will power the Events & Exhibitions search as well as, eventually, the existing listing pages. | 2025-02-18T17:15:49+00:00 |
| 069-catalogue_graph_ingestor | RFC 069: Catalogue Graph Ingestor | Following on from the Catalogue Graph pipeline, this RFC outlines the requirements for the Catalogue Graph Ingestor to replace the existing Concepts Pipeline. | 2025-02-18T14:31:18+00:00 |
| 067-prismic-api-ids | RFC 067: Prismic API ID casing | This RFC proposes a consistent casing for Prismic API IDs across custom types, fields, and slices, to align with Prismic defaults and improve maintainability. | 2025-01-13T12:28:03+00:00 |
| 066-graph_pipeline | RFC 066: Catalogue Graph pipeline | This RFC outlines considerations for the development of the catalogue-graph pipeline. The first iteration of the graph will be focused on concepts and their enrichment with data from external ontologies, as discussed below. | 2025-01-08T15:36:09+00:00 |
| 064-graph-data-model | RFC 064: Graph data model | An update to the previous RFC #62 on the knowledge graph, focusing on a new graph data model for concept enrichment and linking to external ontologies. | 2024-12-05T16:31:45+00:00 |
| 065-library-data-link-explorer | RFC 065: Library Data Link Explorer | This RFC outlines the plan for the Library Data Link Explorer web application. This tool will give Collections Information colleagues the ability to view and debug work relationships independently, potentially replacing the workflow of requesting a developer-run script to produce a matcher graph .dot file. | 2024-11-27T17:37:23+00:00 |
| 062-content-api-all-search | RFC 062: Content API: All search and indexing of addressable content types | Searching for content on wellcomecollection.org is currently split into separate, statically-ordered grids for Stories, Works, Images and Events. This RFC proposes a new "All" search endpoint that will return all Addressable content types in a single, ordered list, improving efficiency and relevance. | 2024-11-18T10:11:11+00:00 |
| 063-catalogue_pipeline_lambdas | RFC 063: Catalogue Pipeline services from ECS to Lambda | Discuss the potential benefits and challenges of moving the catalogue pipeline services from AWS Elastic Container Service (ECS) to AWS Lambda. | 2024-10-25T10:16:40+01:00 |
| 062-knowledge-graph | RFC 062: Wellcome Collection Graph overview and next steps | Enriching concepts in the Wellcome Collection with a knowledge graph to enhance discovery and exploration of the collection online. | 2024-10-11T12:15:29+01:00 |
| 061-content-api-next-steps | RFC 061: Content API next steps | This RFC documents the next steps for the Content API, specifically focusing on the addition of Prismic Events to the API. It outlines the background information, challenges encountered, decisions made, and the proposal for how the API will be structured moving forward. | 2024-07-03T16:37:28+01:00 |
| 074-offsite-item-requesting | RFC 074: Offsite requesting | This RFC outlines the plan for enabling online requesting of items that are held offsite, with a phased approach to accommodate both onsite and offsite viewing. | 2024-04-23T11:57:13+01:00 |
| 060-healthcheck-principles | RFC 060: Service health-check principles | This RFC explores how we should implement health-checks in our services, specifically around services that have HTTP interactions / are serviced by load-balancers that implement health-checking. | 2024-02-07T15:40:51+00:00 |
| 059-splitting-pipeline-terraform | RFC 059: Splitting the catalogue pipeline Terraform | This RFC proposes a change to how we manage the Terraform for instances of the catalogue pipeline. | 2023-07-03T10:39:33+01:00 |
| 058-relevance-testing | RFC 058: Relevance testing | This RFC describes how and why we might write a new version of rank, our relevance testing tool. | 2023-06-20T14:04:56+01:00 |
| 073-prismic-api | RFC 073: Content API | This RFC outlines a new set of API endpoints which will allow wellcomecollection.org users to search and filter content which is stored in Prismic. | 2023-03-08T12:09:12+00:00 |
| 055-genres-as-concepts | RFC 055: Genres as Concepts | This RFC proposes to treat Genres as Concepts, in the same manner as Subjects. | 2023-03-06T15:49:16+00:00 |
| 056-prismic-etl-pipeline | RFC 056: Prismic to Elasticsearch ETL pipeline | This RFC proposes a mechanism for extracting data from Prismic, transforming it, and loading it into Elasticsearch to make our editorial content more discoverable via an API. | 2023-03-02T11:39:12+00:00 |
| 054-authority-vs-canonical-concept-ids | RFC 054: Authoritative ids with multiple Canonical ids. | This RFC proposes a change to the way Concepts are stored in the catalogue-concepts index. | 2023-02-10T11:51:49+00:00 |
| 017-url_design | RFC 017: URL Design | This RFC proposes a set of principles for designing URLs on wellcomecollection.org, ensuring they are persistent, user-friendly, and globally unique. | 2022-12-09T15:25:52+00:00 |
| 053-lambda-logging | RFC 053: Logging in Lambdas | This RFC proposes a solution for logging in AWS Lambdas, aiming to provide a consistent and efficient way to capture and stream logs from Lambda functions to an Elasticsearch cluster. | 2022-11-30T15:57:23+00:00 |
| 051-concepts-adapters | RFC 051: Ingesting Library of Congress concepts | This RFC outlines the design for the first phase of the concepts pipeline, specifically focusing on ingesting concepts from the Library of Congress (LoC) and preparing them for use in the Wellcome Collection catalogue. | 2022-07-08T10:08:48+01:00 |
| 052-concepts-pipeline | RFC 052: The Concepts Pipeline - phase one | This RFC describes the first phase of the Concepts Pipeline, which will be used to ingest and aggregate concepts. | 2022-07-07T12:03:29+01:00 |
| 050-concepts-api | RFC 050: Design considerations for the concepts API | This RFC collects some initial thinking on how we might represent concepts in the catalogue API. It's a starting point for discussions; not a final design. | 2022-05-31T10:06:03+01:00 |
| 049-catalogue-api-aggregations-modelling | RFC 049: Changing how aggregations are retrieved by the Catalogue API | This RFC proposes a change to how aggregations are handled in the Catalogue API, allowing us to remove the internal/display model coupling that currently exists. | 2022-05-13T12:43:30+01:00 |
| 048-concepts-rfcs | RFC 048: Concepts work plan | This RFC outlines the work plan for introducing concepts to the Wellcome digital platform, including the design of a concepts API, knowledge graph population, and integration with works. | 2022-05-10T13:14:58+01:00 |
| 047-catalogue-api-index-structure | RFC 047: Changing the structure of the Catalogue API index | This RFC proposes a change to the structure of the Catalogue API index, which is used to store and retrieve documents for the Catalogue API. | 2022-04-29T10:23:20+01:00 |
| 046-born-digital-iiif | RFC 046: Born Digital in IIIF | This RFC is a proposal for how Wellcome can represent born digital archival material using IIIF. | 2022-04-21T10:00:00+01:00 |
| 072-transitive-sierra-hierarchies | RFC 072: Transitive Sierra hierarchies | This RFC proposes a new stage in the Works pipeline to allow for transitive Sierra hierarchies. | 2022-04-20T15:52:34+01:00 |
| 045-sierra-work-relationships | RFC 045: Work relationships in Sierra, part 2 | This RFC is a continuation of the work started in RFC-044: Sierra Series. | 2022-02-21T15:06:55+00:00 |
| 044-patron-deletions | RFC 044: Tracking Patron Deletions | This RFC describes a proposal for tracking patron deletions in the Sierra API and removing corresponding records from Auth0. | 2022-02-09T14:42:06+00:00 |
| 043-recording-deletions | RFC 043: Removing deleted records from (re)indexes | This RFC proposes a change to the way we handle deleted source records in the Catalogue API index. | 2021-07-26T14:26:24+01:00 |
| 040-tei_adapter | RFC 040: TEI Adapter | This RFC proposes an adapter to harvest TEI files from GitHub and store them in a VersionedStore. | 2021-06-24T17:11:32+01:00 |
| 042-requesting-model | RFC 042: Requesting model | This RFC describes how we will model the data for physical items in the catalogue API, so that users can find out how to access them. | 2021-05-20T12:29:35+01:00 |
| 041-miro-data-changes | RFC 041: Tracking changes to the Miro data | This RFC describes a proposal for tracking changes to the Miro data, which is used to populate the Catalogue API. | 2021-05-19T09:17:59+01:00 |
| 039-requesting-api-design | RFC 039: Requesting API design | This RFC describes the design of a new API for requesting items in the catalogue. | 2021-04-26T12:30:14+01:00 |
| 038-matcher-versioning | RFC 038: Matcher versioning | This RFC describes a proposal for how to version works in the matcher/merger pipeline, to avoid issues with works becoming "stuck" in the pipeline. | 2021-04-19T12:40:41+01:00 |
| 037-api-faceting-principles | RFC 037: API faceting principles & expectations | This RFC describes the principles and expectations for how we expect the Catalogue API to behave in terms of faceting, filtering, and aggregating data. It aims to provide a clear and consistent framework for building a faceted search interface that can effectively handle multiple dimensions of data. | 2021-03-24T12:01:02+00:00 |
| 036-holdings-records | RFC 036: Modelling holdings records | This RFC describes how we will model holdings records in the Catalogue API. | 2021-03-03T13:24:57+00:00 |
| 035-marc-856 | RFC 035: Modelling MARC 856 "web linking entry" | This RFC describes how we will model MARC 856 "web linking entry" in the Catalogue API. | 2021-02-24T14:15:02+00:00 |
| 032-calm-deletions | RFC 032: Calm deletion watcher | This RFC describes a proposal for a Calm deletion watcher, which will allow us to detect deleted Calm records and update the VHS accordingly. | 2021-02-09T17:03:48+00:00 |
| 034-location_location_location | RFC 034: Modelling Locations in the Catalogue API | This RFC describes how we will model locations in the Catalogue API, and how we will return them in the API. | 2021-02-08T09:24:19+00:00 |
| 033-api-internal-model-versioning | RFC 033: Api internal model versioning | This RFC describes a proposal for how to version the internal model used by the catalogue API, to allow for independent deployment of the API and the catalogue pipeline. | 2021-02-01T09:17:45+00:00 |
| 031-relation_batcher | RFC 031: Relation Batcher | This RFC describes a proposal for how to batch works in the relation embedder, to improve performance and reduce duplicate work. | 2020-11-10T11:02:09+00:00 |
| 030-pipeline_merging | RFC 030: Pipeline merging | This RFC describes a proposal for how to merge works in the catalogue pipeline, to avoid issues with works becoming "stuck" in the pipeline. | 2020-10-09T12:12:13+01:00 |
| 029-work_state_modelling | RFC 029: Work state modelling | This RFC proposes a new way of modelling works in the catalogue pipeline and API, separating the type of work from its state in the pipeline. This aims to improve composability, clarity, and ease of adding new types or states. | 2020-09-07T14:40:36+01:00 |
| 028-pipeline-intermediate-storage | RFC 027: Pipeline Intermediate Storage | This RFC describes a proposal for how to store intermediate works in the catalogue pipeline, to allow for more efficient and cost-effective processing of works. | 2020-09-07T10:21:33+01:00 |
| 027-relation-embedder | RFC 026: Relation Embedder | This RFC describes a proposal for how to denormalise relations between works in the catalogue pipeline, to improve the API response times and allow for richer queries. | 2020-09-07T10:16:09+01:00 |
| 025-tagging-our-resources | RFC 025: Tagging our Terraform resources | This RFC describes a proposal for how to tag our Terraform-managed resources, so we can find the corresponding Terraform configuration in the console. | 2020-08-03T15:05:05+01:00 |
| 021-data_science_in_the_pipeline | RFC 021: Data science in the pipeline | This RFC outlines a proposal for integrating data science services into the Wellcome Collection catalogue pipeline. The goal is to augment works and images with data inferred from them using data science techniques, such as feature vectors and colour palettes for images. | 2020-07-29T11:05:28+01:00 |
| 026-relevance_reporting_service | RFC 026: Relevance reporting service | This RFC describes a proposal for a service that will allow us to test and report on the efficacy of our elastic-queries by comparing a set of search-terms and their respective expected results and ordering. | 2020-07-20T12:18:54+01:00 |
| 022-logging | RFC 022: Logging | This RFC describes a proposal for how we log from our services, and how we collect and search those logs. | 2020-06-29T18:14:44+01:00 |
| 002-archival_storage | RFC 002: Archival Storage Service | This RFC proposes a service for storing archival and access copies of digital assets, ensuring long-term preservation and compliance with industry standards. | 2020-06-01T08:46:31+01:00 |
| 023-images-endpoint | RFC 023: Images endpoint | This RFC proposes an initial images endpoint for the catalogue API, allowing for searching and fetching images from the visual collections. | 2020-05-06T15:53:14+01:00 |
| 024-library_management | RFC 024: Library management | This RFC describes a proposal for how we manage our libraries, so we can ensure they work together and are easy to discover. | 2020-05-01T09:57:05+01:00 |
| 013-release_deployment_tracking | RFC 013: Release & Deployment tracking | This RFC proposes a new approach to tracking releases and deployments of services in the Wellcome Collection platform, moving away from the current reliance on Terraform for deployment. The approach described has been superseded by improvements in native AWS ECS deployment capabilities, but the tagging and tracking concepts remain relevant. | 2020-04-08T14:40:00+01:00 |
| 019-platform_reliability | RFC 019: Platform Reliability | This RFC proposes a set of actions to improve the reliability of the Wellcome Collection platform, based on a review of current issues and discussions with the team. | 2020-03-20T15:57:38+00:00 |
| 020-locations_requesting | RFC 020: Locations and requesting | This RFC describes a proposal for how to model item locations in the Catalogue API, and how to build a new Stacks API that allows users to request physical items. | 2020-03-06T12:02:29+00:00 |
| 016-holdings_service | RFC 016: Holdings service | This RFC proposes a now deprecated approach to building a holdings service, which has been superseded by the 020-locations_requesting RFC. | 2020-03-03T16:46:23+00:00 |
| 018-pipeline_tracing | RFC 018: Pipeline Tracing | This RFC outlines a proposal for adding distributed tracing to the Wellcome Collection catalogue pipeline. The goal is to improve debugging and monitoring of the pipeline by tracking the flow of data through it, from the adapters right through to ingest. | 2020-01-29T10:36:52+00:00 |
| 011-network_architecture | RFC 011: Network Architecture | This RFC proposes a network architecture for Wellcome Collection services, ensuring effective security, maintenance, and scalability as the number of services grows. | 2019-10-16T16:33:39+01:00 |
| 008-api_filtering | RFC 008: API Filtering | This RFC proposes a consistent approach to filtering and sorting resources in our APIs, ensuring a uniform developer experience. | 2019-09-24T13:00:32+01:00 |
| 015-how_we_work | RFC 015: How we work | This RFC outlines a set of principles for how we work together as a team, including our approach to collaboration, communication, and decision-making. | 2019-09-09T12:39:36+01:00 |
| 014-born_digital_workflow | RFC 014: Born digital workflow | This RFC proposes an initial workflow for managing born-digital archives using Archivematica, which will be integrated with our new storage service. | 2019-06-13T14:27:03+01:00 |
| 012-api_architecture | RFC 012: API Architecture | This RFC proposes a solution for serving Wellcome Collection APIs from a single domain, api.wellcomecollection.org, using AWS API Gateway and CloudFront. | 2019-01-25T15:26:28+00:00 |
| 006-reindexer_architecture | RFC 006: Reindexer architecture | This RFC proposes a new architecture for the reindexer, which is responsible for updating records in DynamoDB to trigger events for downstream applications. | 2019-01-09T14:56:11+00:00 |
| 010-data_model | RFC 010: Data model | This RFC outlines the process and models used to create ontologies for Wellcome Collection's digital platform, focusing on a unified graph of linked data. | 2019-01-09T14:56:11+00:00 |
| 009-aws_account_layout | RFC 009: AWS account setup | This RFC proposes a solution for breaking up the monolithic "wellcomedigitalplatform" AWS account into smaller, more manageable accounts, improving security and access control. | 2019-01-09T13:53:56+00:00 |
| 007-goobi_upload | RFC 007: Goobi Upload | This RFC proposes a new mechanism for uploading assets to Goobi workflows, replacing existing mechanisms with a more efficient and automated solution. | 2018-11-02T16:46:57+00:00 |
| 005-reporting_pipeline | RFC 005: Reporting Pipeline | This RFC proposes a reporting pipeline for the Wellcome Collection data, allowing for analytics and reporting on data from various sources. | 2018-11-02T16:46:57+00:00 |
| 003-asset_access | RFC 003: Asset Access | This RFC proposes a solution for restricting access to digital assets based on their access provisions and the authentication status of the viewer, while allowing these assets to be served via a CDN. | 2018-11-02T16:46:57+00:00 |
| 001-merger_matcher | RFC 001: Matcher architecture | This RFC proposes an architecture for the matcher and merger components of the reindexer, which are responsible for identifying and merging related works in the catalogue. | 2018-11-02T16:46:57+00:00 |
| 004-mets_adapter | RFC 004: METS Adapter | This RFC proposes a solution for ingesting METS files from the digitisation workflow software Goobi, converting them to JSON, and integrating them into the Wellcome Collection digital catalogue. | 2018-11-02T16:46:57+00:00 |