Repository layout

Last updated 2 years ago

The code for the storage service is spread across multiple repositories rather than contained in a single one. This document lists the key repositories and explains how to find the code within them.

  • wellcomecollection/storage-service. This contains:

    • Code for our Scala applications. For a guide to the projects within the repo, see the project guide.

    • Documentation for the storage service, in the docs directory.

    • Infrastructure definitions in Terraform, in the terraform directory. This includes both the infrastructure for the Wellcome instance of the storage service and modules that can be used to run other instances of the storage service.

  • wellcomecollection/scala-libs – some Scala code shared with other Wellcome services.

    This repo has a lot of the code that interacts directly with AWS services (S3, DynamoDB, SQS, etc.). The storage service uses more abstract traits, so implementation details of those services don't leak into the applications.

    Any Scala in the weco namespace but not in the weco.storage_service namespace is defined in scala-libs.

  • wellcomecollection-terraform-* – shared Terraform modules. These give us a consistent approach to deploying resources across all of our services (e.g. ECS tasks, SNS topics, SQS queues).
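
The namespace rule for scala-libs can be sketched as a small lookup. This is only an illustration of the rule as stated on this page: the `RepoLookup` object, the `Repo` type and the `repoFor` function are hypothetical names invented for the example and do not exist in any of the repositories.

```scala
// Hypothetical helper, not part of any Wellcome repo. It only encodes the
// namespace rule described above.
object RepoLookup {
  sealed trait Repo
  case object StorageService extends Repo // wellcomecollection/storage-service
  case object ScalaLibs extends Repo      // wellcomecollection/scala-libs
  case object Elsewhere extends Repo      // e.g. third-party or standard library

  // True if `name` is exactly `pkg` or sits somewhere beneath it.
  private def isWithin(name: String, pkg: String): Boolean =
    name == pkg || name.startsWith(pkg + ".")

  // Check the more specific namespace first, then fall back to `weco`.
  def repoFor(fullyQualifiedName: String): Repo =
    if (isWithin(fullyQualifiedName, "weco.storage_service")) StorageService
    else if (isWithin(fullyQualifiedName, "weco")) ScalaLibs
    else Elsewhere
}
```

For example, with made-up class names, `RepoLookup.repoFor("weco.storage_service.Example")` gives `StorageService`, while `RepoLookup.repoFor("weco.storage.Example")` gives `ScalaLibs`. The more specific namespace is checked first because everything in weco.storage_service also sits under weco.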