Archivematica @ Wellcome Collection

The Archivematica apps


Last updated 2 years ago

Archivematica is made up of four different apps. This is a brief summary of those apps, and how they apply to our use case.

  • dashboard – the interface to Archivematica. This includes both the graphical component (i.e. the web dashboard) and the Archivematica API.

    It's used by humans to monitor the state of Archivematica transfers, and by machines to manage transfers.

  • storage service – another term for this might be "storage orchestrator" or "storage adapter". It provides a common interface to various storage backends (e.g. S3, DuraCloud, DSpace), so the rest of Archivematica can use any of them in the same way. This is where we've added code to interact with the Wellcome storage service.

  • MCP services – these are the tasks that do the actual processing in Archivematica. See "Gearman, ElastiCache and the MCP server/client" for more details.

    • MCP Server decides what tasks need to be performed. It uses Gearman and Redis to store persistent information about tasks, so that they survive e.g. a restart.

    • MCP Client gets tasks from MCP Server (possibly via Gearman), and actually does the work. It may use other containers to help do its work, in particular FITS (for file format identification) and ClamAV (for virus scanning).
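
As a rough illustration of how a machine might use the dashboard's API to monitor a transfer, here is a minimal Python sketch. It assumes the upstream Archivematica API conventions (a `/api/transfer/status/<uuid>/` endpoint and an `Authorization: ApiKey <username>:<key>` header). The helper name, hostname, username, key and UUID are all hypothetical placeholders, not values from our deployment.

```python
import urllib.request


def build_transfer_status_request(base_url, username, api_key, transfer_uuid):
    """Build (but do not send) a request for the status of one transfer.

    Assumes the upstream Archivematica dashboard API layout; all argument
    values are supplied by the caller and are placeholders in this sketch.
    """
    url = f"{base_url.rstrip('/')}/api/transfer/status/{transfer_uuid}/"
    return urllib.request.Request(
        url,
        # Assumed auth scheme: "ApiKey <username>:<api key>"
        headers={"Authorization": f"ApiKey {username}:{api_key}"},
        method="GET",
    )


# Hypothetical values, for illustration only
request = build_transfer_status_request(
    base_url="https://archivematica.example.org/",
    username="example-user",
    api_key="example-key",
    transfer_uuid="00000000-0000-0000-0000-000000000000",
)
```

To actually send the request, you could pass `request` to `urllib.request.urlopen` and decode the JSON response body. That step is left out here so the sketch runs without network access.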