This RFC is a proposal for how Wellcome can represent born digital archival material using IIIF.
The extensible nature of IIIF means that this is not a particularly difficult technical challenge, with many possible models, but there are considerable implications for interoperability. The work is intended to be generally useful to others beyond Wellcome, and should not cause problems for people wishing to consume IIIF from publishers that have a mixture of born digital and digitised content, especially when consumers can't be expected to know beforehand whether a particular Manifest contains born digital material or not.
Rather than discuss these concerns in the RFC, there is an accompanying document, "Why the above approach?", linked under Background.
The approach described in this RFC ensures that any Manifest adopting it will still produce a valid user interface in a standard, compliant IIIF viewer: it will not cause errors or appear to be in some way broken.
While a standard viewer won't be able to render the born digital content directly, it should still render links to that content, without requiring any modification. That is, loading a Wellcome Born Digital Manifest into UV or Mirador (for example) will not prevent a user from finding the born-digital resource and viewing it in their own software, via the user interface generated by the viewer.
A Manifest that represents an object comprising multiple files, arranged in a directory structure, will produce a navigable representation of that structure via IIIF Ranges (if the viewer supports ranges).
Where a Manifest contains a mixture of formats, some of which are supported by current IIIF and some are not, the supported formats will render as expected in standard, compliant IIIF viewers. This allows born digital content to be mixed with image or AV content in the same Manifest.
The approach leaves open the possibility of rendering the resource directly in a viewer (e.g., render a PDF or a spreadsheet), as well as providing multiple alternative renderings - these are decisions that client software can make based on the information in the Manifest (e.g., on wellcomecollection.org the work page can decide whether it invokes its "Word renderer", should it ever have one - the IIIF model doesn't change).
Canvas as both carrier and placeholder
This approach retains the Canvas as the carrier of content. We still assume that all Canvases continue to be renderable in a standard, compliant viewer. The publisher needs to provide something for a standard client to render - usually an image - while indicating that the significant content is somewhere else, and that the rendered Canvas is a placeholder or billboard for that content.
At its very simplest this placeholder content could be a stock image, a simple notice that the user should look somewhere else in the Manifest for the "real" content. Publishers can choose to put anything in that image; if they have a suitable image representation of the born digital content, they can use it here. The key point is that they must provide something; the items property of a Canvas cannot be empty, and it must contain one AnnotationPage with at least one annotation with the motivation painting.
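As a minimal sketch of that rule, the following Python builds a Canvas whose only painted content is a stock placeholder image, and checks that a Canvas carries at least one painting annotation. The function names, the `example.org` URLs, the annotation id pattern and the default dimensions are all illustrative assumptions, not part of this RFC.

```python
def make_placeholder_canvas(canvas_id, image_url, width=1000, height=1000):
    """Build a Canvas that paints a single stock placeholder image.

    The id patterns and the image/png format are illustrative assumptions.
    """
    return {
        "id": canvas_id,
        "type": "Canvas",
        "width": width,
        "height": height,
        "items": [
            {
                "id": f"{canvas_id}/page",
                "type": "AnnotationPage",
                "items": [
                    {
                        "id": f"{canvas_id}/page/painting",
                        "type": "Annotation",
                        "motivation": "painting",
                        "body": {
                            "id": image_url,
                            "type": "Image",
                            "format": "image/png",
                            "width": width,
                            "height": height,
                        },
                        # The placeholder image fills the whole Canvas.
                        "target": canvas_id,
                    }
                ],
            }
        ],
    }


def has_painting_content(canvas):
    """Return True if any AnnotationPage holds a painting annotation."""
    for page in canvas.get("items", []):
        if page.get("type") != "AnnotationPage":
            continue
        for annotation in page.get("items", []):
            if annotation.get("motivation") == "painting":
                return True
    return False
```

A publisher could run a check like `has_painting_content` before publishing, so that a Canvas with an empty `items` list is caught early rather than reaching a viewer.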
Extensions: two new behavior values
As always, the names of these two new behaviors are tricky to get right - consider them provisional. placeholderCanvas already exists as a property name in IIIF v3, so there is a deliberate echo of that.
This approach provides a custom behavior value, placeholder, for the Canvas, to indicate to an aware client that while the painting annotations on the Canvas may be rendered, they aren't the significant content, and can be ignored if the client knows what to do with the significant content.
The Canvas uses the rendering property, as now, to link to other representations, including the actual born digital file (the real, significant content), as well as any other versions or surrogates the publisher wants to provide (e.g., link to a WordPerfect document, but also link to PDF and text representations).
This approach provides a second custom behavior value, original, for a resource linked via rendering, to indicate that it is the significant content.
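To make the two behaviors concrete, here is a hedged sketch of how an aware client might locate the significant content. The Canvas below is invented for illustration: the `example.org` URLs, labels and file names are assumptions, and `find_original` is a hypothetical helper rather than an API of any existing viewer.

```python
def find_original(canvas):
    """Return the rendering flagged as the significant content, or None.

    Only Canvases flagged "placeholder" are inspected; on any other Canvas
    the painted content is itself the significant content.
    """
    if "placeholder" not in canvas.get("behavior", []):
        return None
    for resource in canvas.get("rendering", []):
        if "original" in resource.get("behavior", []):
            return resource
    return None


# Illustrative Canvas: a placeholder image is painted, and the real file
# plus a PDF surrogate are linked through the rendering property.
canvas = {
    "id": "https://example.org/canvas/1",
    "type": "Canvas",
    "behavior": ["placeholder"],
    "width": 1000,
    "height": 1000,
    "items": [
        {
            "id": "https://example.org/canvas/1/page",
            "type": "AnnotationPage",
            "items": [
                {
                    "id": "https://example.org/canvas/1/page/painting",
                    "type": "Annotation",
                    "motivation": "painting",
                    "body": {
                        "id": "https://example.org/placeholder.png",
                        "type": "Image",
                        "format": "image/png",
                    },
                    "target": "https://example.org/canvas/1",
                }
            ],
        }
    ],
    "rendering": [
        {
            "id": "https://example.org/files/essay.wpd",
            "type": "Text",
            "label": {"en": ["Original WordPerfect file"]},
            "format": "application/vnd.wordperfect",
            "behavior": ["original"],
        },
        {
            "id": "https://example.org/files/essay.pdf",
            "type": "Text",
            "label": {"en": ["PDF version"]},
            "format": "application/pdf",
        },
    ],
}

original = find_original(canvas)
print(original["id"] if original else "no original flagged")
```

A client that does not know these behaviors never calls anything like `find_original`; it simply paints the placeholder image and lists both rendering links, which is the fallback this approach relies on.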
As an abstract representation of structure, the IIIF Ranges do not need additional behavior properties defined; they would use label to indicate (for example) that a Range represents a folder; their purpose is to generate visual navigation hierarchy without further semantics. Wellcome know that for their born digital archive content, these ranges mean folders - how the UI reflects that shouldn't affect the model here.
As always, publishers can further extend the model with their own additional behavior values and properties.
The two behavior values introduced above are defined in a JSON-LD @context document, whose location can be decided later. That context is listed first in the Manifest's @context, before the Presentation 3 context.
Note that while the Manifest itself could have the rendering property, this approach puts it on the Canvas, because one Canvas is associated with one original file. The pattern extends naturally to multiple Canvases each having their own rendering property, as in the next example.
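The ordering can be sketched as follows. The extension context URL is a hypothetical placeholder, since the RFC leaves its location undecided; the Presentation 3 context URL is the published one. `with_contexts` is an illustrative helper name.

```python
PRESENTATION_3_CONTEXT = "http://iiif.io/api/presentation/3/context.json"
# Hypothetical location: the RFC has not yet decided where this lives.
BORN_DIGITAL_CONTEXT = "https://example.org/born-digital/context.json"


def with_contexts(manifest):
    """Return a copy of the manifest with the extension context listed
    first and the Presentation 3 context listed last."""
    existing = manifest.get("@context", [])
    if isinstance(existing, str):
        existing = [existing]
    # Keep any other extension contexts between the two fixed entries.
    others = [
        context
        for context in existing
        if context not in (BORN_DIGITAL_CONTEXT, PRESENTATION_3_CONTEXT)
    ]
    ordered = [BORN_DIGITAL_CONTEXT, *others, PRESENTATION_3_CONTEXT]
    rest = {key: value for key, value in manifest.items() if key != "@context"}
    return {"@context": ordered, **rest}


manifest = with_contexts(
    {
        "@context": PRESENTATION_3_CONTEXT,
        "id": "https://example.org/manifest/1",
        "type": "Manifest",
    }
)
print(manifest["@context"])
```

Keeping the Presentation 3 context last means its definitions take precedence over anything an extension context declares, which is why the extension is placed ahead of it rather than after.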
Example: representing structure
The following example is brief, for the sake of this RFC, but demonstrates how the original file layout might be conveyed, as well as including an image that can be presented as a completely normal Canvas.
The original arrangement would have been like this and included an empty folder:
/draft
- My Notes 1.doc
- My Notes 2.doc
/inspiration
/final
- Essay.doc
- illustration.jpg
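One way to turn an arrangement like this into Ranges is sketched below. It is an illustration under stated assumptions: the listing does not make clear where the empty folder sits, so the sketch assumes "inspiration" is nested inside "draft"; the `example.org` URIs, the Range id pattern and the `build_ranges` helper are hypothetical. The sketch emits the empty folder as a Range with an empty items list; whether validators and viewers accept that is not settled here.

```python
from urllib.parse import quote


def build_ranges(base_uri, folders, files):
    """Build nested Ranges from folder paths and a {file path: Canvas id} map.

    Folders are passed explicitly so that empty ones still produce a Range.
    """
    top_level = []  # Ranges for folders at the root of the arrangement
    by_path = {}    # folder path -> Range already created for it

    def ensure_folder(path):
        if path in by_path:
            return by_path[path]
        parent, _, name = path.rpartition("/")
        folder_range = {
            "id": f"{base_uri}/range/{quote(path)}",
            "type": "Range",
            "label": {"en": [name]},  # the label is the folder name
            "items": [],
        }
        by_path[path] = folder_range
        siblings = ensure_folder(parent)["items"] if parent else top_level
        siblings.append(folder_range)
        return folder_range

    for folder in folders:
        ensure_folder(folder)
    for path, canvas_id in files.items():
        parent = path.rpartition("/")[0]
        # Files are assumed to live inside a folder; a file at the root
        # would need a wrapping Range, which this sketch does not create.
        siblings = ensure_folder(parent)["items"] if parent else top_level
        siblings.append({"id": canvas_id, "type": "Canvas"})
    return top_level


structures = build_ranges(
    "https://example.org/manifest/1",
    folders=["draft", "draft/inspiration", "final"],
    files={
        "draft/My Notes 1.doc": "https://example.org/canvas/1",
        "draft/My Notes 2.doc": "https://example.org/canvas/2",
        "final/Essay.doc": "https://example.org/canvas/3",
        "final/illustration.jpg": "https://example.org/canvas/4",
    },
)
```

The resulting list would be assigned to the Manifest's `structures` property. Each Range carries only a label, in line with the point above that Ranges need no additional behavior values to convey folders.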
This section is very tentative but is worth considering at the same time as the above.
The idea of profiles has been present in the IIIF Image API from the start, and they prove useful to quickly identify what features an image service offers. While the idea of profiles for the Presentation API has come up a few times, it has never got very far due to the complexity of any Presentation "feature matrix".
However, given the born digital approach above, it might be useful to consider profiles for Manifests that are restricted to describing the type of content that appears on the Manifest's canvases, and no other features of the Manifest. One Manifest could have multiple profiles (in fact this would be common), so these may work better as custom behavior properties, as above.
| profile | description |
| --- | --- |
| simpleImage | At least one of the Manifest's canvases has a single painting annotation where the body is an Image resource and the target is the whole canvas |
| simpleImageAPI | The same as simpleImage, but at least one of the images has an Image service |
| simpleAudio | At least one of the Manifest's canvases has a single painting annotation where the body is an Audio resource and the target is the whole canvas |
| simpleVideo | At least one of the Manifest's canvases has a single painting annotation where the body is a Video resource and the target is the whole canvas |
| complexImage | At least one of the Manifest's canvases has multiple painting annotations where the bodies are Image resources and the targets may be the whole canvas or may be parts (regions) of it |
| complexAudio | At least one of the Manifest's canvases has multiple painting annotations where the bodies are Audio resources and the targets may be the whole canvas or may be parts (duration extents) of it |
| complexAV | At least one of the Manifest's canvases has multiple painting annotations where the bodies are Audio and/or Video resources and the targets may be the whole canvas or may be parts (duration extents) and/or regions |
| accessControlled | At least one resource linked by a painting annotation is access controlled via the IIIF Auth API |
| placeholder | At least one Canvas in the Manifest has the placeholder behavior described above |
Most current published Manifests would have the simpleImageAPI behavior.
Many Wellcome Manifests (most of the current archive material) would have simpleImageAPI and accessControlled behaviors.
The second Manifest example above would have simpleImageAPI and placeholder behaviors.
Many Wellcome Manifests are simpleAudio or simpleVideo, and many of those also have the accessControlled behavior.
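A publisher could derive some of these profiles mechanically. The sketch below covers only the simple profiles and placeholder; the complex and accessControlled profiles are left out. It rests on several assumptions that this tentative section does not settle: a Canvas flagged placeholder is not also classified by its painted image, simpleImageAPI is reported instead of simpleImage for a Canvas whose image has a service, and "whole canvas" is tested as a target equal to the Canvas id. Audio is matched on the Presentation 3 type `Sound`. All function names are hypothetical.

```python
# Presentation 3 content types mapped to the profile names in the table.
SIMPLE_PROFILES = {
    "Image": "simpleImage",
    "Sound": "simpleAudio",  # Presentation 3 types audio content as Sound
    "Video": "simpleVideo",
}


def painting_annotations(canvas):
    """Collect every painting annotation on the Canvas."""
    return [
        annotation
        for page in canvas.get("items", [])
        for annotation in page.get("items", [])
        if annotation.get("motivation") == "painting"
    ]


def has_image_service(body):
    """True if the body lists a service typed ImageService2/ImageService3."""
    for service in body.get("service", []):
        service_type = service.get("type") or service.get("@type") or ""
        if service_type.startswith("ImageService"):
            return True
    return False


def content_profiles(manifest):
    """Return the set of simple content profiles found in the manifest."""
    profiles = set()
    for canvas in manifest.get("items", []):
        if "placeholder" in canvas.get("behavior", []):
            # Assumption: the painted stand-in image is not classified.
            profiles.add("placeholder")
            continue
        annotations = painting_annotations(canvas)
        if len(annotations) != 1:
            continue  # complex profiles are outside this sketch
        annotation = annotations[0]
        if annotation.get("target") != canvas.get("id"):
            continue  # targets a region or time extent, not the whole Canvas
        body = annotation.get("body", {})
        profile = SIMPLE_PROFILES.get(body.get("type"))
        if profile == "simpleImage" and has_image_service(body):
            profile = "simpleImageAPI"
        if profile:
            profiles.add(profile)
    return profiles
```

The returned names could then be written into the Manifest's behavior property, if profiles are expressed as custom behavior values as suggested above.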
Background
2021 Affinity Group https://docs.google.com/document/d/1Am8fcO5A2JTR56cPoGdu5Ny0Yp_0hq-EsBIUDHixoXA/edit#
Tom Crane's original Born Digital document https://docs.google.com/document/d/1ir-b99Mq7t3_PspOrORSiJW2BKoYxsw8E-h9ddyraGo/edit#
Why the above approach? https://docs.google.com/document/d/1007W_PpQefrFhMrF7mIZMiGNDGG6hbJMsa-XX6kEl9A
Appendix: REJECTED APPROACHES
Manifest items property can take any resource
The simplest thing would be to allow the manifest.items property to contain resources that were not canvases - like this (massively simplified):
    {
      "type": "Manifest",
      "items": [
        { "type": "Canvas" },
        { "type": "Text", "format": "application/x-msword" },
        // or whatever the 3D approach produces, perhaps still Canvas
        { "type": "Scene" }
      ]
    }

This approach would result in some Manifests working outside of Wellcome's controlled environment and some not, with users none the wiser as to why. That would be especially confusing for two adjacent items in the same archival subseries.
It is easier - it removes the requirement to publish something renderable. But it's not preferable; the MVP for "publish something renderable" may not be that useful to users either, but it doesn't break viewers and tools. And there is room to be more sophisticated in that placeholder - e.g., rasterising an image of a PDF.
Alternative sequence
This approach is similar to Wellcome's initial IXIF model, before there was AV support in IIIF. It's more suited to the IIIF v2 model.
In this approach there was a new top-level property of the Manifest called mediaSequences (mirroring the existing sequences property). This took resources rather than canvases.
In IIIF v3 the primary child collection on any resource is accessed by a property uniformly called items, and there is only one sequence: a Manifest can't have multiple sets of items. If you want the edge case of alternative sequencing, do it with Ranges.
The proposed approach retains .items and is much more compatible.