Using multiple storage tiers for cost-efficiency (A/V, TIFFs)
Last updated
Last updated
Within , we can store content as a mixture of Standard-IA and Glacier; this is primarily for cost efficiency. Storing objects in Glacier is than storing them in Standard-IA.
At time of writing (May 2023), there are two use cases for this feature:
Digitised A/V. Our digitised A/V workflow produces both a high-resolution MXF and a lower-resolution MP4.
The MP4 is the "access copy" β if somebody is watching the video through DLCS, itβs being transcoded from the MP4.
The MXF is the "preservation copy" β it's considered the canonical copy of the video and we could use it to create new access copies in the future, but it's too big to serve in a sensible way (some of the files are >100GB). We don't need immediate access to it.
We store the MP4 in Standard-IA and the MXF in Glacier.
Digitised manuscripts. In our digitised manuscripts workflow, we keep both the original TIFF and the edited JP2 from LayoutWizzard.
The JP2 is the "access copy" used by DLCS to serve images on the web
The TIFF is the "preservation copy" that we don't access on a day-to-day basis.
We store the JP2s in Standard-IA and the TIFFs in Glacier.
You can see the current set in in the bag tagger.
When the bag register finishes storing a bag, it sends a notification "We've successfully stored a new bag in space X
with identifier Y
and version Z
"
The bag tagger picks up this message, and applies key-value tags to certain objects in the newly stored bag, e.g. we add Content-Type: application/mxf
for our high-resolution MXF video files.
We set up S3 lifecycle configuration rules on our storage buckets to transition objects with certain tags into the Glacier storage tier, e.g. "Move any object with the tag Content-Type: application/mxf
to Glacier 90 days after it was created."