Archivematica @ Wellcome Collection
  • Introduction
  • High-level design
  • Storing born-digital files
    • Creating a transfer package
    • Upload a transfer package to S3
    • Check a package was stored successfully
    • Downloading a package from the storage service
    • Following a package in the dashboard
  • Service architecture
    • How does Archivematica work?
      • The Archivematica apps
      • Microservices, tasks and jobs
      • Gearman, ElastiCache, and the MCP server/client
    • How is our deployment unusual?
      • What are our extra services?
      • ECS containers on EC2, not Fargate
      • Why we forked Archivematica
    • How it fits into the wider platform
  • About our deployment
    • Using Wellcome catalogue identifiers
    • Different environments
    • Working storage: MySQL, Redis, and EBS
  • Administering Archivematica
    • Bootstrapping a new Archivematica stack
    • User management
      • How to add or remove users
      • Authentication with Azure AD
    • Upgrading to a new version of Archivematica
    • Running an end-to-end test
    • Clearing old transfers from the dashboard
  • Debugging Archivematica
    • Where to find application logs
    • Troubleshooting known errors
      • Timeout waiting for network interface provisioning to complete
      • 401 Unauthorized when the s3_start_transfer Lambda tries to run
      • "pull access denied" when running containers (and other ECS agent issues)
      • "Unauthorized for url" when logging in
      • "gearman.errors.ExceededConnectionAttempts: Exceeded 1 connection attempt(s)" in MCP server
      • NotADirectoryError in the Extract zipped transfer stage
    • Restarting services if a task is stuck
    • SSH into the Archivematica container hosts

What are our extra services?


Last updated 2 years ago

We've written several of our own services which sit around Archivematica.

The s3_start_transfer Lambda watches for uploads to the S3 transfer bucket. It checks that new transfer packages are correctly formatted, and if so, it sends them to Archivematica for processing. It also uploads a feedback log explaining whether the package was accepted.

  • For archivists, this means they can start processing a transfer package by uploading it to S3, rather than using the Archivematica dashboard.

  • For the platform team, this means we can do some checks on packages before they're sent to Archivematica (e.g. that the metadata has been supplied correctly).
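To make the checking step concrete, here is a minimal sketch of the kind of validation the s3_start_transfer Lambda performs before handing a package to Archivematica. The specific rule shown (requiring a metadata/metadata.csv inside the zip) and the function names are illustrative assumptions, not the Lambda's actual rule set:

```python
import io
import zipfile

def validate_transfer_package(zip_bytes):
    """Return a list of problems with a transfer package; an empty list means OK.

    The checks here are illustrative -- the real Lambda's rules are defined
    in our infrastructure repo.
    """
    try:
        zf = zipfile.ZipFile(io.BytesIO(zip_bytes))
    except zipfile.BadZipFile:
        return ["not a valid zip file"]

    problems = []
    # Assumed rule: every package must include a metadata/metadata.csv
    if not any(name.endswith("metadata/metadata.csv") for name in zf.namelist()):
        problems.append("missing metadata/metadata.csv")
    return problems

# Build a minimal in-memory package to show the check in action
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("metadata/metadata.csv", "filename,dc.identifier\n")
    zf.writestr("objects/report.pdf", "fake file contents")

print(validate_transfer_package(buf.getvalue()))   # -> []
print(validate_transfer_package(b"not a zip"))     # -> ['not a valid zip file']
```

In the real service, a non-empty problem list would be written back to the bucket as the feedback log, rather than printed.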

The start_test_transfer Lambda gives us a way to do end-to-end testing of Archivematica. When you run it, it creates and uploads a new transfer package to the S3 bucket, simulating the behaviour of an archivist. We can then monitor that package being processed by Archivematica. Any packages created this way are stored in a special testing space in the storage service, so they can be distinguished from real content.

The born-digital listener sends notifications of newly-stored born-digital bags to an SNS topic. This tells iiif-builder about new born-digital content, and allows it to create an IIIF Presentation manifest for each archive.

The transfer monitor tracks the state of transfer packages in Archivematica. In particular, once a week it scans for new transfer packages in the transfer source bucket, and checks whether they're in the storage service:

  • If a package has been successfully stored, it deletes the copy in the source bucket

  • If a package hasn't been successfully stored, it leaves the package as-is and logs a warning

It posts its results to the #wc-preservation channel in Slack, so we're alerted to any packages that didn't store correctly.
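The transfer monitor's weekly pass can be sketched as a simple loop over the source bucket, with the S3 and storage-service calls injected so the decision logic is easy to follow. The key names and helper signatures here are placeholders, not the service's real interface:

```python
# Illustrative sketch of the transfer monitor's weekly check. In the real
# service, the (deleted, not_stored) summary is posted to #wc-preservation.
import logging

logger = logging.getLogger("transfer_monitor")

def check_transfers(source_keys, is_stored, delete_object):
    """Decide what to do with each package in the transfer source bucket.

    source_keys      -- iterable of S3 keys in the transfer source bucket
    is_stored(k)     -- True if the storage service has a bag for this key
    delete_object(k) -- deletes the copy in the source bucket
    """
    deleted, not_stored = [], []
    for key in source_keys:
        if is_stored(key):
            # Safely stored, so the source copy is redundant
            delete_object(key)
            deleted.append(key)
        else:
            # Leave the package as-is and flag it for a human to look at
            logger.warning("package %s is not in the storage service", key)
            not_stored.append(key)
    return deleted, not_stored

# Example run against stubbed dependencies
removed = []
deleted, missing = check_transfers(
    source_keys=["PP/1/a.zip", "PP/1/b.zip"],
    is_stored=lambda key: key == "PP/1/a.zip",
    delete_object=removed.append,
)
print(deleted, missing)  # -> ['PP/1/a.zip'] ['PP/1/b.zip']
```

Injecting the S3 and storage-service calls also makes the "delete only when stored" rule trivially testable without touching real buckets.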

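For the born-digital listener, the notification itself is just a small JSON message on an SNS topic. The field names and the example identifier below are assumptions for illustration; the exact schema is whatever iiif-builder consumes:

```python
import json

def build_notification(space, external_identifier, version):
    """Describe a newly-stored bag for downstream consumers (assumed schema)."""
    return json.dumps({
        "space": space,                            # e.g. "born-digital"
        "externalIdentifier": external_identifier,
        "version": version,                        # e.g. "v1"
    })

message = build_notification("born-digital", "PP/CRI/1/2", "v1")
print(message)

# Publishing with boto3 would then look something like:
#
#   import boto3
#   sns = boto3.client("sns")
#   sns.publish(TopicArn=topic_arn, Message=message)
```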