RFC 017: URL Design

This RFC proposes a set of principles for designing URLs on wellcomecollection.org, ensuring they are persistent, user-friendly, and globally unique.

Last modified: 2022-12-09T15:25:52+00:00

Context

URLs are part of the user's online experience. They affect a user's ability to reliably share and reference online resources, and to discover and interact with the content and services we produce.

They can also affect the amount of long-term technical debt we carry, since getting the URL scheme wrong means we need to maintain redirects for the lifetime of the site.

We therefore adopt the following principles:

  1. URLs are designed to be persistent—this is the most important thing to consider when minting new URLs or devising a URL scheme. In practice, this means that URLs MUST NOT include:

    a. reference to specific technology;
    b. dates, unless the URL is about a date;
    c. status (old/new/draft etc.);
    d. subject, unless the URL is about that subject.

  2. If a URL changes and a semantically equivalent resource exists, we SHOULD redirect to the new URL and maintain that redirect forever.

  3. When content is removed and there is no equivalent resource we SHOULD return 410 (HTTP Gone) with a link to an archived copy of the resource.

  4. URLs SHOULD be as short as possible but no shorter. Deeply nested URLs have an impact on SEO, and long URLs can cause problems when used in emails etc. Short URLs are therefore preferable, provided they keep sufficient entropy to support future growth.

  5. There MUST be one URL per thing, and all things have a URL. URLs are there to identify things on the Web, and people use URLs to point to those things. This means:

    a. all resources MUST have a unique URL;
    b. URLs can't be used to identify two or more resources;
    c. all fragments SHOULD dereference, i.e. .../foo#bar should be addressable at .../foo/bar;
    d. fragments (anything after a #) don't count as unique URLs;
    e. hash-bang URLs (#!) and other techniques that rely on client side JS MUST NOT be used.

  6. URLs are globally unique. A user MUST be able to share a URL, and anyone, anywhere in the world MUST be able to dereference the same resource.

  7. URLs can identify: things, lists of things and forms. Query parameters SHOULD be avoided for anything that’s not a list.

  8. URLs should use nouns (never verbs) - URLs are for identifying things.

  9. The base of a URL path should be a plural (e.g. stories) - it identifies the collections of things. The resource can be singular.

  10. A resource can be a singleton or a collection e.g. /stories/$storyID (the URL for a story) or /stories/by/formats/$format (the URL for all stories of a specific format)

  11. URLs SHOULD be hackable. A user should be able to hack back a URL and get a broader set of resources, e.g. it should be possible to hack back the URL for a story from .../stories/$story to .../stories/ and be returned a list of all stories, or to .../stories/by/formats/ for all story formats. Similarly, .../stories/by/date/yyyy/mm/dd should be hackable to return all stories published in that year, in that month or on that day.

  12. URLs MUST NOT include any personally identifiable information, tracking parameters or state.

  13. All content MUST be served over HTTPS.

  14. URLs MUST be designed alongside the user interface and given the same level of care as any other UI component (possibly more, because they are harder to change). We SHOULD try to have beautiful URLs.
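
Some of the MUST and MUST NOT rules above (principles 1, 5, 12 and 13) lend themselves to an automated check. The following is a minimal sketch of such a check, not an existing tool: the function name `lint_url` and the deny-lists are hypothetical and would need tuning for a real site. Rules that need human judgement, such as whether a date or subject in a URL is what the URL is about, are deliberately left out.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical deny-lists, chosen only for illustration.
TECH_SUFFIXES = (".php", ".aspx", ".jsp", ".html", ".cgi")
STATUS_SEGMENTS = {"old", "new", "draft", "beta", "tmp"}
TRACKING_PREFIXES = ("utm_",)


def lint_url(url: str) -> list[str]:
    """Return the principle violations found in `url` (empty list if none)."""
    parts = urlsplit(url)
    problems = []

    # Principle 13: all content is served over HTTPS.
    if parts.scheme != "https":
        problems.append("13: not served over https")

    segments = [s for s in parts.path.lower().split("/") if s]

    # Principle 1a: no reference to a specific technology.
    if any(s.endswith(TECH_SUFFIXES) for s in segments):
        problems.append("1a: path refers to a specific technology")

    # Principle 1c: no status words as path segments (a heuristic only).
    if any(s in STATUS_SEGMENTS for s in segments):
        problems.append("1c: path contains a status word")

    # Principle 5e: no hash-bang URLs.
    if parts.fragment.startswith("!"):
        problems.append("5e: hash-bang URL")

    # Principle 12: no tracking parameters.
    for key, _ in parse_qsl(parts.query, keep_blank_values=True):
        if key.lower().startswith(TRACKING_PREFIXES):
            problems.append(f"12: tracking parameter '{key}'")

    return problems
```

For example, `lint_url("http://example.org/old/story.php?utm_source=x#!/foo")` reports one problem for each of the five checks, while a URL such as `https://wellcomecollection.org/stories/abc123` (with a made-up story ID) passes cleanly. The status check is crude: a segment named `new` may be legitimate in some schemes, so a hit is a prompt for review rather than a verdict.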

We are publishing a website, not a book: make links, link between things and make those links hackable (whether or not they are linked to yet).
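
To make the idea of hacking back a URL concrete, the sketch below lists the broader URLs a user would reach by trimming one path segment at a time. The function name `hack_back` is hypothetical, and the example date is made up; the path layout follows the `/stories/by/date/yyyy/mm/dd` pattern from principle 11.

```python
from urllib.parse import urlsplit, urlunsplit


def hack_back(url: str) -> list[str]:
    """List the broader URLs reached by trimming one path segment at a time."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    broader = []
    # Drop the last segment repeatedly, stopping before the site root.
    for end in range(len(segments) - 1, 0, -1):
        path = "/" + "/".join(segments[:end]) + "/"
        # Query and fragment are discarded: they belong to the original resource.
        broader.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return broader


for candidate in hack_back("https://wellcomecollection.org/stories/by/date/2022/12/09"):
    print(candidate)
```

Under a fully hackable scheme, each URL this prints would resolve to a meaningful list. Note that the output includes intermediate paths such as `.../stories/by/`, which the principles above do not explicitly address, so whether those should return a resource is left as an open design question here.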
