Collecting data
Collecting feedback and data on how our services are being used helps us iterate on and improve them over time.
While the insight from behavioural data is valuable, we don't believe that bigger data is necessarily better. Our philosophy is that it would be foolish to start collecting data without first establishing which questions we want to answer, and wrong to collect data that we don't need. For example, we see no need to personalise users' search results, so our search logs are kept entirely anonymous.
We restrict the data we collect to what is needed to answer specific questions that we have. This allows us to iterate quickly while limiting risks to the people using our services.
What we track is primarily split into two interactions:
What a person has searched for
How they have interacted with the results
Examples of the data we store for these are
Identification and anonymisation
We store no personally identifiable information with each interaction collected.
We do store whether a request was made from within Wellcome's network.
We label each interaction with an anonymous ID from Segment.
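To make the three statements above concrete, an interaction record might be shaped roughly as follows. This is a minimal sketch and not the schema actually stored: the type `SearchInteraction`, its field names, and the helper `buildInteraction` are all hypothetical.

```typescript
// Hypothetical record shape; the field names are assumptions, not the stored schema.
type InteractionKind = "search" | "result_click";

interface SearchInteraction {
  anonymousId: string;          // anonymous ID issued by Segment, not tied to a person
  kind: InteractionKind;        // which of the two tracked interactions this is
  query: string;                // what was searched for
  fromWellcomeNetwork: boolean; // whether the request came from within Wellcome's network
  resultPosition?: number;      // set only for result clicks
}

// Builds a record from an explicit list of fields, so anything else the
// caller happens to hold (name, email, IP address) cannot end up in it.
function buildInteraction(
  anonymousId: string,
  kind: InteractionKind,
  query: string,
  fromWellcomeNetwork: boolean,
  resultPosition?: number
): SearchInteraction {
  const record: SearchInteraction = { anonymousId, kind, query, fromWellcomeNetwork };
  if (kind === "result_click" && resultPosition !== undefined) {
    record.resultPosition = resultPosition;
  }
  return record;
}
```

Constructing the record from named arguments, rather than copying an arbitrary object, is one way to keep the "no personally identifiable information" rule enforced by the code's structure.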
Storage
Data is collected on the frontend via Segment's analytics.js, sent to a Kinesis stream, and then stored in Elasticsearch.
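The frontend step of this pipeline could look something like the sketch below. It assumes only the `track(event, properties)` call that analytics.js exposes; the `Tracker` interface, the event names, and the property names are hypothetical, and the Kinesis and Elasticsearch stages are not shown.

```typescript
// Minimal subset of the analytics.js interface that this sketch relies on.
interface Tracker {
  track(event: string, properties?: Record<string, unknown>): void;
}

// Hypothetical event names for the two tracked interactions.
const SEARCH_EVENT = "Search";
const RESULT_CLICK_EVENT = "Search result selected";

// Records what a person searched for.
function trackSearch(tracker: Tracker, query: string): void {
  tracker.track(SEARCH_EVENT, { query });
}

// Records which result a person selected for a given query.
function trackResultClick(tracker: Tracker, query: string, position: number): void {
  tracker.track(RESULT_CLICK_EVENT, { query, position });
}
```

In the browser, the `tracker` argument would be the global `analytics` object that analytics.js provides. Passing it in as a parameter keeps the functions testable without loading Segment.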
We currently retain anonymised data in perpetuity.
This document does not cover general data collection across wellcomecollection.org, only the data collected for work on the catalogue search.
Last updated