RFC 010: Data model

This RFC outlines the process and models used to create ontologies for Wellcome Collection's digital platform, focusing on a unified graph of linked data.

Last modified: 2019-01-09T14:56:11+00:00

Background

As part of the platform development we create ontologies that describe our collections, events and editorial content as a unified graph of linked data. Domain modelling, and thinking about data as a semantic graph of typed entities and relationships, helps us to create more richly linked digital experiences that aid exploration and discovery.

The ontologies are documented using OWL.

We think that this is the best way to formally describe a complex domain model, as it makes the semantics of our data self-documenting and widely shareable.

However, it is worth noting that although we use OWL to document the ontologies, we don’t actually store or process any data as RDF. Our APIs do provide a context, which can be used to transform the data to an RDF model if required, but we consider them JSON-first.
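As a rough illustration of what "JSON-first with a context" can mean, the sketch below maps the keys of a plain JSON document to property IRIs and emits subject/predicate/object triples. Everything here is an assumption made for illustration: the `example.org` IRIs, the `id`/`type`/`title`/`creator` keys and the `to_triples` helper are hypothetical, and this is a simplified stand-in for a real context processor, not a description of how the platform's APIs behave.

```python
import json

# Hypothetical context: maps JSON keys to property IRIs (all example.org IRIs are made up).
CONTEXT = {
    "title": "http://example.org/ontology/title",
    "creator": "http://example.org/ontology/creator",
}

# Hypothetical JSON-first API response; field names are assumptions.
RESPONSE = json.loads("""
{
  "id": "http://example.org/works/a1b2c3",
  "type": "Work",
  "title": "An example work",
  "creator": "An example person"
}
""")

# The standard rdf:type IRI; the class base IRI is an assumption.
TYPE_IRI = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
CLASS_BASE = "http://example.org/ontology/"


def to_triples(doc, context):
    """Turn a flat JSON document into (subject, predicate, object) triples.

    Keys without an entry in the context are ignored, which mirrors the idea
    that the JSON stays usable on its own and the context is optional.
    """
    subject = doc["id"]
    triples = [(subject, TYPE_IRI, CLASS_BASE + doc["type"])]
    for key, value in doc.items():
        if key in context:
            triples.append((subject, context[key], value))
    return triples


for triple in to_triples(RESPONSE, CONTEXT):
    print(triple)
```

A consumer that only wants JSON can ignore the context entirely; one that wants a graph applies the mapping on its own side, so no RDF needs to be stored or processed by the service.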

These pages give an overview of the process and the documents describing the models.

Process

Domain modeling enables us to:

  • Establish a shared language for things in the model.

  • Identify the scope of the model to describe the domain.

  • Design a consistent representation of the core bibliographic properties, for example by bridging MARC and ISAD(G).

A CSV file documents the list of properties identified in the domain modeling sessions. It also provides a summary of how each property maps onto MARC fields and whether it has been implemented yet.

  • (https://github.com/wellcometrust/platform/raw/master/ontologies/WIP/list-of-transformation-tasks.csv)
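A file like this can be checked programmatically, for instance to list the properties that are still waiting to be implemented. The sketch below is hypothetical: the column names (`property`, `marc_field`, `implemented`) and the sample rows are assumptions for illustration and may not match the layout of the real file.

```python
import csv
import io

# Hypothetical sample; the real file's columns and contents may differ.
SAMPLE = """property,marc_field,implemented
title,245,yes
description,520,no
"""


def unimplemented(csv_text):
    """Return the names of properties not yet marked as implemented."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row["property"]
        for row in reader
        if row["implemented"].strip().lower() != "yes"
    ]


print(unimplemented(SAMPLE))
```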

Reference data:

  • Reference data for populating types in the data.

  • (https://github.com/wellcometrust/platform/tree/master/ontologies/Reference%20data)

Transformation:

  • The transformation file documents in detail how the transformation rules work.

  • (https://github.com/wellcometrust/platform/raw/master/ontologies/WIP/sierratransformable)

Models currently in use

Core model

A model describing common, non-domain-specific classes and properties used across our ontologies.

Work model

A model describing library and archive works, including their physical items and relationships.

Location model

A model describing how museum and library items can be accessed.

Concept model

A model describing concepts, their classification and relationships.

Other models representing Wellcome domains (largely work in progress)

Article/Story model

A model describing editorial articles (stories) and their relationships.

Agency model

A model describing people, organisations and their relationships.

Public event model

A model describing museum and library events, exhibitions and installations.

Authorities

A mapping has been conducted from a candidate model for people and their positions.

  • (https://github.com/wellcometrust/platform/raw/master/ontologies/WIP/Person%20authority-Table.csv)

Example

  • (https://github.com/wellcometrust/platform/raw/master/ontologies/Examples/michael-ashburner-example.json)

Archives

Example of applying the work model to a CALM archive record:

  • (https://github.com/wellcometrust/platform/raw/master/ontologies/Examples/ashburner-record-example.json)

We have used domain modeling as an approach to identify a common model for describing works from different locations in the Wellcome Collection. The emphasis here is on the shared understanding and language between disciplines in the teams.