Archivematica @ Wellcome Collection

Downloading a package from the storage service

Last updated 2 years ago

If you store a package with Archivematica and you want to retrieve it later, you can download it from the storage service.

To do this, you need access to the underlying S3 buckets, e.g. using the storage-developer role or with AWS access keys configured in FileZilla Pro. If you don't have these, ask a developer in the #wc-platform-feedback channel in Slack.

1. Identify the correct bucket

  • If you uploaded the package to prod Archivematica, then you want to look in the wellcomecollection-storage bucket.

  • If you uploaded the package to staging Archivematica, then you want to look in the wellcomecollection-storage-staging bucket.

2. Identify the space

  • If you uploaded a catalogued born-digital package, then the space is born-digital.

  • If you uploaded a born-digital accession, then the space is born-digital-accessions.
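
Steps 1 and 2 amount to two lookups, which can be sketched in Python as below. The bucket and space names come from the lists above; the labels "prod", "staging", "catalogued" and "accession" and the function name locate are made up for this illustration and are not part of any Wellcome tooling.

```python
# Bucket and space names are taken from steps 1 and 2 above.
# The dictionary keys are illustrative labels, not official names.
BUCKETS = {
    "prod": "wellcomecollection-storage",
    "staging": "wellcomecollection-storage-staging",
}

SPACES = {
    "catalogued": "born-digital",
    "accession": "born-digital-accessions",
}


def locate(environment: str, package_type: str) -> tuple[str, str]:
    """Return the (bucket, space) pair to look in for a package."""
    try:
        return BUCKETS[environment], SPACES[package_type]
    except KeyError as err:
        raise ValueError(f"Unrecognised environment or package type: {err}") from None


print(locate("prod", "catalogued"))
```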

3. Find the package in S3

Open the bucket; you should see a list of top-level folders, including born-digital, born-digital-accessions and digitised.

Click on the space you identified in step 2. You should see a list of packages.

Navigate to find the package you're looking for. If you have a hierarchical identifier like PPCRI/1/a, then you need to look in the corresponding folders – PPCRI, which should contain 1, which should contain a.
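
If you prefer to jump straight to the package rather than clicking through folders, the folder path can be derived from the identifier. This is a minimal sketch, assuming that each component of the identifier maps to exactly one folder level under the space, as described above; the function name package_prefix is hypothetical.

```python
def package_prefix(space: str, identifier: str) -> str:
    """Build the folder path for a package inside a bucket.

    Assumes each "/"-separated part of the identifier is one folder level.
    """
    parts = [part for part in identifier.split("/") if part]
    if not parts:
        raise ValueError("identifier must not be empty")
    return "/".join([space, *parts]) + "/"


print(package_prefix("born-digital", "PPCRI/1/a"))
```

For the example identifier this gives born-digital/PPCRI/1/a/, which is the folder that should contain the version folders.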

You’ll get to a folder containing folders like v1, v2, v3, and so on. These are the individual versions of a package. Pick the latest version, and download all the files it contains.

Note: if a version doesn't contain any files, then it's a "shallow update" in the storage service – it updated the metadata, not the files. You can retrieve the files by downloading a previous version of the package.
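
The version choice, including the shallow-update case, can be sketched as follows. This is an illustration only: it assumes you have already listed the files under each version folder by some means, and the function name and file names are hypothetical. It compares versions numerically, so v10 counts as later than v9, and it steps back through earlier versions until it finds one that contains files.

```python
import re


def latest_version_with_files(files_by_version: dict[str, list[str]]) -> str:
    """Return the newest version folder (e.g. "v3") that contains files.

    files_by_version maps a version folder name to the files listed under it.
    An empty list stands for a shallow update with no files of its own.
    """

    def version_number(name: str) -> int:
        match = re.fullmatch(r"v(\d+)", name)
        if match is None:
            raise ValueError(f"Not a version folder: {name!r}")
        return int(match.group(1))

    # Walk from the newest version to the oldest, skipping empty ones.
    for name in sorted(files_by_version, key=version_number, reverse=True):
        if files_by_version[name]:
            return name
    raise LookupError("No version contains any files")


# Hypothetical listing: v3 is a shallow update, so v2 holds the files.
listing = {
    "v1": ["data/objects/letter.pdf"],
    "v2": ["data/objects/letter.pdf", "data/objects/photo.jpg"],
    "v3": [],
}
print(latest_version_with_files(listing))
```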