
Our three replicas: S3, Glacier, and Azure


Last updated 1 year ago

We have three replicas for the storage service content:

1. A warm replica in S3

This is an S3 bucket in the storage AWS account, in Amazon's eu-west-1 (Ireland) region. Objects are stored in a mixture of the Standard-IA and Glacier storage classes and versioning is enabled.

This is the copy intended for day-to-day access.
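Because the warm replica mixes storage classes, not every object is necessarily readable straight away. The sketch below shows one way to check this before attempting a download. It assumes that "Glacier" here means the S3 `GLACIER` storage class (Flexible Retrieval), whose objects must be restored before they can be read. The function names `needs_restore` and `check_object` are illustrative and are not part of the storage service.

```python
from typing import Optional

# Storage classes whose objects must be restored before they can be downloaded.
# GLACIER is S3 Glacier Flexible Retrieval; DEEP_ARCHIVE is S3 Glacier Deep Archive.
RESTORE_REQUIRED = {"GLACIER", "DEEP_ARCHIVE"}


def needs_restore(storage_class: Optional[str],
                  restore_header: Optional[str] = None) -> bool:
    """Return True if an object cannot be read until a restore completes.

    storage_class is the StorageClass value from a HeadObject response
    (S3 omits it for Standard objects, hence Optional).
    restore_header is the Restore value from the same response, if present.
    """
    if storage_class not in RESTORE_REQUIRED:
        return False
    # A finished restore is reported as: ongoing-request="false", expiry-date="..."
    if restore_header and 'ongoing-request="false"' in restore_header:
        return False
    return True


def check_object(bucket: str, key: str) -> bool:
    """Look up a real object and report whether it needs restoring first."""
    import boto3  # third-party; imported lazily so needs_restore has no dependencies

    head = boto3.client("s3").head_object(Bucket=bucket, Key=key)
    return needs_restore(head.get("StorageClass"), head.get("Restore"))
```

The same check applies to the cold S3 replica, where every object is in Deep Archive and so always needs a restore before it can be read.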

Developers get access to these buckets as part of their standard AWS account permissions, but note there are specific IAM exclusions to prevent us from modifying objects in the prod bucket.

You can access these buckets using the AWS CLI or the AWS console.

  • Prod: wellcomecollection-storage

  • Staging: wellcomecollection-storage-staging

2. A cold replica in S3

This is an S3 bucket in the storage AWS account, in Amazon's eu-west-1 (Ireland) region. Objects are stored in the Glacier Deep Archive storage class and versioning is enabled.

This is the copy intended for disaster recovery.

Developers get access to these buckets as part of their standard AWS account permissions, but note there are specific IAM exclusions to prevent us from modifying objects in the prod bucket.

  • Prod: wellcomecollection-storage-replica-ireland

  • Staging: wellcomecollection-storage-staging-replica-ireland

3. A cold replica in Azure

This is the copy intended for worst-case disaster recovery. It's stored in a different geographic location and with a different service provider, to minimise the risk of a problem affecting all three copies at once.

You can only get access to these containers by asking D&T, and we don't grant access to them by default. Ideally there should be nobody who has write access to all three replica locations, to reduce the risk of somebody inadvertently deleting all three copies of an object.

This is an Azure Blob container in the D&T account, in Azure's West Europe (Netherlands) region. The containers have a legal hold policy applied, and blobs are stored in the archive access tier.

The storage service accesses these containers using shared access signatures (SAS). These are signed URIs that we keep in Secrets Manager. They're tied to the external IP address of the NAT Gateway in the storage account, so you can't use them locally.

  • Prod: wellcomecollection-storage-replica-netherlands, in the wecostorageprod storage account, in the rg-wcollarchive-prod resource group

  • Staging: wellcomecollection-storage-staging-replica-netherlands, in the wecostoragestage storage account, in the rg-wcollarchive-stage resource group
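A SAS URI carries its restrictions in its query string, which is useful when debugging why one is rejected (for example, because the request comes from the wrong IP address or the token has expired). The sketch below reads those fields using only the standard library. The function names are illustrative, and the account and container in the usage note are made-up examples rather than our real URIs. It deliberately never returns the `sig` value.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit


def describe_sas(uri: str) -> dict:
    """Extract the non-secret restrictions from a SAS URI's query string."""
    parts = urlsplit(uri)
    params = {k: v[0] for k, v in parse_qs(parts.query).items()}
    return {
        "account_host": parts.netloc,
        "container": parts.path.strip("/"),
        "permissions": params.get("sp"),  # signed permissions, e.g. "rl" = read + list
        "allowed_ip": params.get("sip"),  # signed IP address or range, if restricted
        "expires": params.get("se"),      # signed expiry time, ISO 8601 in UTC
    }


def is_expired(uri: str, now: datetime) -> bool:
    """Return True if the SAS expiry time is at or before `now` (timezone-aware)."""
    expiry = describe_sas(uri)["expires"]
    if expiry is None:
        # No expiry in the URI itself (it may come from a stored access policy).
        return False
    parsed = datetime.fromisoformat(expiry.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Date-only expiry values carry no offset; SAS times are always UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed <= now
```

For example, calling `describe_sas("https://exampleaccount.blob.core.windows.net/example-container?sv=2020-08-04&sr=c&sp=rl&sip=203.0.113.10&se=2030-01-01T00%3A00%3A00Z&sig=FAKE")` reports read and list permissions, restricted to requests from 203.0.113.10.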