Archivematica @ Wellcome Collection

Restarting services if a task is stuck

Sometimes a task gets stuck in the Archivematica dashboard. A common way to unstick it is to restart all the services, usually from the ECS console.

The MCP Client/Server tasks can get stuck if there's an issue with the MySQL database, e.g. if the database server has been rebooted:

OperationalError: (2006, 'MySQL server has gone away')

If you don't want to restart all the services, here are some notes on restarting individual services and the potential impact:

  • Restarting the MCP client tends to be okay, but not all tasks cope with being restarted. If a task doesn't expect to be run twice, the entire transfer/ingest may fail; if so, you just have to retry the whole thing, sorry.

  • Restarting the MCP server is more disruptive, and seems to cause all in-flight transfers/ingests to be dropped. I've seen it get stuck once or twice, but it's unusual.

  • Restarting the Gearman server is probably fine (all the data should be in Redis), but I've never tried it. Gearman has been pretty robust and never been the source of issues.
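If you'd rather script a restart than click through the ECS console, the following is a minimal sketch using boto3's `update_service` call with `forceNewDeployment=True`, which asks ECS to replace the running tasks of a service. The cluster and service names are hypothetical placeholders, not the real names in our deployment; look up the actual names in the ECS console before running anything. It restarts only the MCP client by default, since that is the least disruptive option in the notes above.

```python
def restart_services(ecs_client, cluster, services):
    """Force a new deployment of each named ECS service.

    ecs_client is a boto3 ECS client (or anything with the same
    update_service method). Returns the list of services restarted.
    """
    restarted = []
    for service in services:
        # forceNewDeployment makes ECS start fresh tasks and stop the old
        # ones, even though the task definition has not changed.
        ecs_client.update_service(
            cluster=cluster,
            service=service,
            forceNewDeployment=True,
        )
        restarted.append(service)
    return restarted


if __name__ == "__main__":
    import boto3

    # Hypothetical names: replace with the real cluster/service names.
    CLUSTER = "archivematica-example"
    SERVICES = ["mcp-client-example"]

    client = boto3.client("ecs")
    for name in restart_services(client, CLUSTER, SERVICES):
        print(f"Requested restart of {name}")
```

Passing the client in as an argument keeps the function easy to try against a stub before pointing it at a real cluster.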


Last updated 2 years ago