# Incident retro - wc.org intermittently available

**Incident from:** 2023-08-10

**Incident until:** 2023-08-10

**Retro held:** 2023-08-11

* [Timeline](#timeline)
* [Analysis of causes](#analysis-of-causes)
* [Actions](#actions)

## Timeline

See <https://wellcome.slack.com/archives/C01FBFSDLUA/p1691655733784049>

7 August 2023

Prismic model changes

UI change that included removing query for imageList model - merged to main 16.49\
Remove imageList from the model - merged to main 16.53\
? Model applied c16.55\
Nothing published - needs something published for the model to apply.\
Web site broke after something was published on 8 August

8 August 2023

09.17 Report in #wc-platform-feedback “I'm getting a server error when I try to navigate to the homepage”

09.20 RC It seems to be back up now - is it for you?\
MD Hmm, no\
MD Ok, cleared my cache again. We're back\
RC we'll still investigate, thanks for flagging!

Then the web site intermittently showed server errors

09.22 RC The website was down for a bit, as we can see in the alerts channel, seems to be back up now? I'm going to say from 8:48.\
Still looks down on my 4G though, but up on my wifi, different servers issue?\
Edit: Looks fine on my 4G in incognito, so maybe just cache\
I couldn't see anything on [Prismic status](https://status.prismic.io/)

09.25 AG There's still an error even though the page loads. Not sure if it's one of these that just exist: "Hydration failed because the initial UI does not match what was rendered on the server."

RC That's a React error, I'm thinking more of a warning?

AG Down again. Same: "A client-side exception has occurred"

09.33 AC Cannot find slice of type imageList\
RC I deleted that yesterday they were all gone\
Let me deploy to prod

09.34 AC fourth line of the application logs, have we changed something here\
RC Then we know the fix\
I'll deploy the latest changes to production\
We made a lot of changes to the prismic model yesterday, that one's on me though

09.36 RC <https://buildkite.com/wellcomecollection/wc-dot-org-deployment/builds/2646> Should be < 10mins

09.40 RC End to ends are running but prod has been deployed so it should be fixed now

09.48 NP I've just got the error again.\
RC I can't see that error in our logs since 9:38

09.49 RC Maybe I'll try a lil cache clear

09.51 RC Right cache cleared, and still no logs since 9:38 that were related to that problem

09.54 RC We are still getting errors in the alerts channel though but I can't understand why, just looks like login logs.

09.55 RC It's the only thing I can see `/account/api/auth/login?returnTo=[redacted]` and it's not even an error, just a log

09.56 AC more likely the log link is funky

09.58 AC so the list of failing errors comes from the CloudFront logs and then it makes a best-guess attempt at application logs

## Analysis of causes

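* The Prismic model was changed, but the corresponding UI change had not been deployed to prod; prod was still expecting a field that no longer existed (slice of type imageList)
* Were the CloudFront errors confusing? The alert's log link is a best-guess match from CloudFront logs to application logs, and it pointed at login logs that were not errors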

## Actions

**Paul/Alex**

* Widen time window of Kibana log link by adding an hour either side
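
As a rough illustration of the action above, the sketch below pads a set of error timestamps by an hour either side and formats the result as an absolute time range in the style of a Kibana URL. The function names and the sample timestamps are hypothetical, and it assumes the log link carries its time range in Kibana's `_g` URL state; the real alerting tool may build its link differently.

```python
from datetime import datetime, timedelta, timezone


def padded_window(error_times, padding=timedelta(hours=1)):
    """Return (start, end) covering every error time plus padding on each side."""
    if not error_times:
        raise ValueError("need at least one error timestamp")
    return min(error_times) - padding, max(error_times) + padding


def kibana_time_param(start, end):
    """Format the window as an absolute time range for a Kibana-style _g state."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return f"_g=(time:(from:'{start.strftime(fmt)}',to:'{end.strftime(fmt)}'))"


# Illustrative timestamps only, not taken from the real alert.
errors = [
    datetime(2023, 8, 10, 8, 48, tzinfo=timezone.utc),
    datetime(2023, 8, 10, 9, 38, tzinfo=timezone.utc),
]
start, end = padded_window(errors)
print(kibana_time_param(start, end))
```

A wider window makes it more likely that the application log line that explains a CloudFront error falls inside the linked view, at the cost of more unrelated lines to scan.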

**Raphaëlle**

* Create a Prismic model change log page in Prismic; publishing an update to it records the change and is the publish that's needed for a model change to apply
* Modify tool to automatically update the Prismic model change log page
* Add a check to the script: if your change contains queries, is it in production?
* Investigate using fetch links only and removing graph queries
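
The "is it in production?" check could take roughly the following shape. This is a minimal sketch under stated assumptions: the function name is hypothetical, the slice names other than `imageList` are made up, and how the script would actually discover which slices production queries is left open.

```python
def slices_unsafe_to_remove(slices_to_remove, slices_queried_in_prod):
    """Return the slice types that production code still queries, sorted."""
    return sorted(set(slices_to_remove) & set(slices_queried_in_prod))


# Hypothetical inputs: "text" and "quote" are placeholder slice names.
blocked = slices_unsafe_to_remove(
    slices_to_remove=["imageList"],
    slices_queried_in_prod=["imageList", "text", "quote"],
)
if blocked:
    print("Deploy the UI change to production first; still queried:", ", ".join(blocked))
else:
    print("Safe to change the model")
```

In this incident the UI change that removed the `imageList` query was merged to main but production was only deployed after the errors appeared, which is the ordering such a check is meant to catch.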


