Data science

Information on data science and predictive analytics and how we use them for policing purposes.

Data science is the use of methods to obtain useful information from data, especially substantial amounts of data. We use data science tools and predictive models to help develop and deploy advanced data techniques that support operational decision making.

For example:

using previous demand patterns to predict future demand so we can resource correctly

highlighting those who are at risk and only just coming to police attention, so we can offer early support before things escalate, reducing the harm caused to the person
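
As an illustration of the first example, the sketch below forecasts demand with a seasonal-naive baseline: each future day is predicted as the average of the same weekday in previous weeks. The function name, the weekly season length and the incident counts are hypothetical and invented for illustration. It is a minimal sketch of the general idea, not a description of any deployed model.

```python
from statistics import mean

def seasonal_naive_forecast(history, season_length=7, horizon=7):
    """Forecast each future period as the mean of past values at the
    same position in the season (for example, the same day of the week).

    Assumes history[0] falls on position 0 of the season.
    """
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")
    forecast = []
    for step in range(horizon):
        # Position of the period being forecast within the season.
        position = (len(history) + step) % season_length
        # All past observations that share that position.
        same_position = history[position::season_length]
        forecast.append(mean(same_position))
    return forecast

# Hypothetical daily incident counts for two weeks (Monday to Sunday).
daily_counts = [100, 90, 95, 105, 130, 160, 120,
                110, 94, 97, 103, 134, 170, 124]
next_week = seasonal_naive_forecast(daily_counts)
print(next_week)
```

A baseline like this is useful mainly as a yardstick: a more complex model would only be worth deploying if it forecasts better than the simple average.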

Outputs of data science help officers to get a better understanding of risk or vulnerability. No automated decision making is deployed to intervene or make decisions. The data supports human decision making, and every decision is finalised by an officer.

Deployment of data science

Since 2013, we have deployed 40 to 50 models. Once a model is deployed, we track the effects of implementing it and monitor it at regular intervals to confirm it is performing as expected.
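
One way such a regular check could work is sketched below: the model's recent error is compared with the error measured when it was first validated, and the model is flagged for review if the error has grown beyond a tolerance. The function names, the 1.25 tolerance factor and all figures are assumptions made for illustration, not the monitoring process actually in use.

```python
from statistics import mean

def mean_absolute_error(actual, predicted):
    """Average absolute difference between observed and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must be the same length")
    return mean([abs(a - p) for a, p in zip(actual, predicted)])

def needs_review(actual, predicted, baseline_error, tolerance=1.25):
    """Return True if the recent error exceeds the baseline error by more
    than the tolerance factor (hypothetical threshold)."""
    return mean_absolute_error(actual, predicted) > baseline_error * tolerance

# Hypothetical figures: predictions made for two monitoring periods.
predicted = [105, 92, 96, 104]
period_one = [102, 95, 99, 110]    # observed values close to predictions
period_two = [130, 120, 125, 140]  # observed values far from predictions

print(needs_review(period_one, predicted, baseline_error=4.0))
print(needs_review(period_two, predicted, baseline_error=4.0))
```

In this sketch a flag only prompts a person to look at the model; it does not change or switch off the model by itself.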

Officers and other users are trained to interpret the model output. The level of training depends on the product and how widely it is used. If a model is used force-wide, we provide information on how it works.

We regularly review the models and decommission or turn them off as appropriate, for example once they have served their purpose.

We share outputs with other agencies, mainly those that have requested support in an area and have shared data with us for this purpose. Bristol City Council share data with us and in return we have created exploitation models for them which we share back.

No data breaches resulting from use of data science tools have occurred within Avon and Somerset Constabulary.

External companies providing predictive analytics

We use the following external companies:

Qlik Sense (procured in August 2016) – primarily used as a visualisation tool to monitor demand, trends and harm. Currently, only a minority of applications used on the programme would be considered ‘predictive’ tools.

These external company applications do not make automated decisions. All outputs are reviewed alongside professional judgement.

Models and algorithms we use

We use a variety of models, mainly for classification and forecasting.

For these we use different models depending on what works best for the project, mainly decision trees and regression models.
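
To show what a regression model does in a forecasting setting, the sketch below fits a straight line to past monthly totals by ordinary least squares and extends it one month ahead. The function name and the monthly figures are hypothetical, and the data is deliberately simple. It illustrates the technique in general and is not one of the models listed on this page.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical monthly call totals, invented for illustration.
months = [1, 2, 3, 4, 5, 6]
calls = [200, 210, 220, 230, 240, 250]

slope, intercept = fit_line(months, calls)
forecast_month_7 = slope * 7 + intercept
print(slope, intercept, forecast_month_7)
```

Real demand data is rarely this regular, which is why the choice between regression, decision trees and other models depends on what works best for each project.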

All models and algorithms have been developed in-house using our own staff and data. See our full list of models.

We do not use any image detection or categorisation software (typically used to find explicit images).
None of our predictive analytics assesses the likelihood of an individual committing a crime. We provide a risk score to aid professional judgement in making a broader and more detailed assessment. None of our models calculate the likelihood or probability of undergoing any action or event.
Some of our tools make use of Artificial Intelligence (AI). AI is broad and includes anything that replicates the human thought process.
We use machine learning for forecasting demand. Machine learning is the use and development of computer systems that are able to learn and adapt without following explicit instructions, by using algorithms and statistical models to analyse and draw inferences from patterns in data.
We do not use commercial geodemographic segmentation products such as Experian’s MOSAIC or CACI’s ACORN (we have previously purchased data from ACORN to use in predictive analytics, but this was never used in data science).
Our algorithms use the offender person record and other records that exist within the source systems (including those for officers and staff). These are used for related models and work. For our vulnerability and victim models, we use victim data.

Since 2013, the only costs for data science have been for the SPSS Modeler software and for staff. We stopped using SPSS in June 2024.

Ethical considerations

Our data science ethics group consists of people with relevant business experience. The group reviews each product and project and advises if it should be referred to the Constabulary’s ethics committee. Any recommendations received by the group will be recorded. This is also the point at which a decision could be made to pause processing.

Data used or generated by a product will be fully understood to ensure that bias is not a factor in any output. We do not use any protected characteristics, and we try to avoid information with human involvement, such as risk assessments or flags that do not have specific criteria. It is vital that individuals using products are trained appropriately so that processing is fair and they understand that bias can exist.

All our models follow ALGO-CARE, a decision-making framework for the deployment of algorithmic assessment tools in the policing context.

ALGO-CARE is based on the principle that algorithmic software in policing must be:

advisory
lawful
granular
ownership
challengeable
accurate
responsible
explainable