MSc Module in Public Health Data Science ASSESSMENT
The Annual Report of the Chief Medical Officer 2018 made a case for health as one of the primary assets of the UK, contributing to both the economy and happiness. The report described how we must measure and track progress in our development of health as a nation and our fairness as a society in delivering improving health outcomes.
The report called for a composite Health Index to be developed (see the Chief Medical Officer’s summary, particularly Chapter 1, page 2, link below). Your Public Health Data Science assignment is to create a composite Health Index using PHE Fingertips data disaggregated to small areas (e.g. Local Authority or Middle Super Output Areas) in England. As per the CMO report, the index should:
● Be inclusive of health outcome measures, modifiable risk factors and the social determinants of health;
● Be able to be disaggregated by composition and geographical area allowing tracking of performance of each component additional to the overall metric;
On the first face-to-face day you will be given a chance to discuss the assessment in detail. You will then work in groups for the day to begin to tackle the creation of the index along with other tasks that we will set. Whilst you will work on this problem together for the day, it is essential that your final code and report submitted is entirely your own work. The deadline for the coursework submission is Mon 29 April - 12:00pm. Submissions will be made via Moodle.
We would like you to submit an R Markdown file that contains the analysis in addition to a Report that has a maximum of 1,500 words of text (not including analytical code, tables, results or Figures) and contains the following sections:
1. Background
a. A statement of the context and setting
2. Aims
a. A description of the aim for the index you have developed
3. Methods:
a. Description of data items included in the index
b. Description of the methods for creation of the index
4. Results
a. Presentation of the overall index
b. Presentation of disaggregated index
5. Discussion
a. Interpretation of results, including their strengths and weaknesses
b. How the index could be used to improve the health of the public by conducting one of the following:
i. A brief stakeholder analysis
ii. A brief media engagement and dissemination strategy
Public Health Data Science-(CHME0017)
The Health of a Nation
Developing a Composite Health Index for England
Background
In 2018 the Chief Medical Officer called for a Composite Health Index (CHI) to be developed in order to quantify health as a national asset and to track changes in health over time (Davies 2018).
Dahlgren & Whitehead’s (1992) model suggests that health is determined by genetics, life stage, individual risk factors and behaviours, wider social determinants and the interaction between all of these. The variation in risk factors and social determinants is closely linked to deprivation: the least deprived people have a life expectancy almost 10 years longer than the most deprived, and can expect almost 20 more years of life without a disability (Marmot and Bell 2012). Social determinants of health are mutable, shifting with changes in local and central government policy, growth of GDP, and changes in the distribution of wealth. Risk factors can be modified through policy and societal change. Health outcomes are a product of the interaction of social determinants of health and individual risk factors with health services, and so health service performance is included.
Tackling health inequalities requires understanding social determinants of health, risk behaviours, and health outcomes, and tracking change over time across the whole population.
Aims
This CHI will form a baseline measure of health from 2018, comprising three components:
● Social determinants of health
● Individual risk factors and behaviours
● Health outcomes
The CHI will be presented by geographical area to allow policy-makers to better understand the needs of their local populations and target interventions effectively.
Data Items
The CHI is based on academic theory as to the core components of health across the life course. Data items were selected to fulfill the definitions of social determinants, risk factors, and health outcomes defined in Marmot and Bell (2012) (Table 1).
Table 1: Component parts of the determinants of health
Social Determinants | Risk Factors | Health Outcomes
Early years | Alcohol | Loss of years of life
Education | Smoking | Loss of years of healthy life
Work | Obesity | Economic costs
Income | Drug Use |
Communities | |
Data items were selected from publicly available indicators that will be reproduced over time to satisfy the components. This will allow benchmarking to current levels. A search was carried out using the terms in Table 1 on Public Health England’s (PHE) Fingertips platform and indicators were selected from a variety of profiles to provide coverage (Appendix 1).
Some indicators are replicated in more than one component as they are important to more than one component. For example, infant mortality is important to both early years and loss of years of life and so it is included in both social determinants and health outcomes components.
Data from the Indices of Multiple Deprivation (IMD) subdomains were added to the social determinants component. These subdomains are designed to quantify different forms of deprivation that impact people’s lives in various ways. They are included because performance varies widely between areas across the subdomains, which warrants the extra granularity that the subdomain scores provide (Senior, 2019). For example, Wokingham has the lowest deprivation in the ‘income’ subdomain, but has high ‘geographical barriers’ deprivation.
The most recent data were used for each indicator; however, the reporting period was not consistent, as different indicators have differing publication schedules and timeframes. This could introduce some confounding into the analysis, as performance is not always compared within the same period. The alternative was to limit the analysis to indicators produced over the same period; however, this would have omitted potentially valuable information. For example, suicide rate is an important health outcome measure that has to be produced over a three-year period to avoid being disclosive.
Methods
Data items were retrieved using the fingertipsR package for PHE indicators, and web scraping for IMD data. IMD data were available at Lower Super Output Area (LSOA) level; these were aggregated up to Upper Tier Local Authority (UTLA) level using Office for National Statistics lookup tables. Population-weighted subdomain totals were used to calculate z-scores as a measure of geographical dispersion around the England mean.
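The aggregation and z-score step can be sketched as follows. This is a minimal Python illustration, not the code used for the report (which was written in R). The function names, the tuple layout of the input rows, the use of the mean across UTLAs as the England mean, and the choice of the population standard deviation are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean, pstdev


def population_weighted_scores(lsoa_rows):
    """Aggregate LSOA-level scores to UTLA level, weighting by population.

    lsoa_rows: iterable of (utla_name, lsoa_population, lsoa_score) tuples
    (a hypothetical layout standing in for the ONS lookup join).
    """
    weighted_sum = defaultdict(float)
    population = defaultdict(float)
    for utla, pop, score in lsoa_rows:
        weighted_sum[utla] += pop * score
        population[utla] += pop
    return {utla: weighted_sum[utla] / population[utla] for utla in weighted_sum}


def z_scores(values_by_area):
    """Standardise area values around the mean across areas.

    Assumption: the England mean is approximated by the mean of the
    UTLA values, and the population standard deviation is used.
    """
    mu = mean(values_by_area.values())
    sigma = pstdev(values_by_area.values())
    return {area: (value - mu) / sigma for area, value in values_by_area.items()}


# Toy example with two UTLAs and three LSOAs.
rows = [("A", 1000, 10.0), ("A", 3000, 20.0), ("B", 2000, 30.0)]
utla_scores = population_weighted_scores(rows)
utla_z = z_scores(utla_scores)
```

In the toy data, area A's larger LSOA pulls its weighted score towards 20, which is the behaviour population weighting is meant to give.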
Fingertips data were filtered to retrieve the most recent data point for each indicator, to exclude any breakdowns by sex, and to include only UTLAs. After this, UTLAs with over 50% missing data were excluded, which led to the removal of two areas: City of London and the Isles of Scilly. Indicators with more than 20% of data missing were removed from the analysis as they were deemed incomplete.
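The two missing-data exclusions might look like the sketch below. It is a Python illustration under stated assumptions rather than the report's R code: the nested-dictionary layout, the function name `drop_sparse`, and the decision to compute indicator missingness only after sparse areas have been dropped are all hypothetical choices.

```python
def drop_sparse(data, area_threshold=0.5, indicator_threshold=0.2):
    """Remove sparse areas, then sparse indicators.

    data: {area: {indicator: value or None}} (hypothetical layout).
    Areas with MORE than area_threshold missing are dropped first;
    indicators with MORE than indicator_threshold missing across the
    remaining areas are dropped second (the ordering is an assumption).
    """
    indicators = sorted({ind for row in data.values() for ind in row})

    def is_missing(area, ind):
        return data[area].get(ind) is None

    kept_areas = [
        area for area in data
        if sum(is_missing(area, i) for i in indicators) / len(indicators)
        <= area_threshold
    ]
    kept_indicators = [
        ind for ind in indicators
        if sum(is_missing(a, ind) for a in kept_areas) / len(kept_areas)
        <= indicator_threshold
    ]
    return {a: {i: data[a].get(i) for i in kept_indicators} for a in kept_areas}


# Toy example: area D is mostly empty, indicator z is patchy.
toy = {
    "A": {"x": 1, "y": 2, "z": 3},
    "B": {"x": 1, "y": 2, "z": 3},
    "C": {"x": 1, "y": 2, "z": None},
    "D": {"x": 1, "y": None, "z": None},
    "E": {"x": 1, "y": 2, "z": None},
    "F": {"x": 1, "y": 2, "z": 3},
}
filtered = drop_sparse(toy)
```

Here area D is removed at the first step (two of three values missing), and indicator z is removed at the second (missing in two of the five remaining areas).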
Missing data from the remaining indicators were imputed using multiple imputation. Data were fitted to 4 Bayesian generalized linear models over 30 iterations, and missing values were predicted using these models, with the best fit being selected for imputation (Su et al. 2011). Heatmaps of the spread of the data before and after imputation are shown in Figure 1.
Normality of distribution within the indicators was assumed on the basis of the Central Limit Theorem. Indicators were tested for collinearity using a pairwise correlation matrix. Pairs of indicators with R² ≥ 0.9 were selected for removal, with the most globally correlated indicator of each pair being removed (Table 2).
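One way to read the correlation-based removal step is sketched below in Python. The report does not define "most globally correlated" precisely, so this example assumes it means the flagged indicator with the highest mean squared correlation against all other remaining indicators, removed one at a time until no pair reaches the threshold. The function names and that interpretation are assumptions, not the report's method.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def drop_collinear(columns, threshold=0.9):
    """Iteratively remove indicators involved in pairs with r^2 >= threshold.

    columns: {indicator_name: [values aligned by area]}.
    Returns (remaining_columns, removed_names).
    """
    remaining = dict(columns)
    removed = []
    while True:
        r2 = {
            frozenset(pair): pearson_r(remaining[pair[0]], remaining[pair[1]]) ** 2
            for pair in combinations(remaining, 2)
        }
        flagged = {name for pair, v in r2.items() if v >= threshold for name in pair}
        if not flagged:
            return remaining, removed

        def mean_r2(name):
            # Average squared correlation with every other remaining indicator.
            vals = [v for pair, v in r2.items() if name in pair]
            return sum(vals) / len(vals)

        worst = max(sorted(flagged), key=mean_r2)
        removed.append(worst)
        del remaining[worst]


# Toy example: "a" and "b" are nearly collinear, "c" is unrelated.
toy_columns = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10.5],
    "c": [5, 1, 4, 2, 3],
}
kept, dropped = drop_collinear(toy_columns)
```

Removing one indicator per pass, rather than all flagged indicators at once, avoids discarding both members of a correlated pair.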
Z-scores were calculated for all indicators, with polarity switched where necessary, and mean z-scores and ranks were calculated for each component. These were then aggregated into the CHI, giving equal weighting to each of the components. Summary statistics for each indicator were output as a baseline measure for future years (Appendix 4).
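The scoring steps in this paragraph can be sketched as follows. This is an illustrative Python version rather than the report's R code: the function names, the input layout, and the use of the population standard deviation are assumptions. Polarity is handled by flipping the sign of indicators where a higher raw value means worse health, so that lower z-scores always indicate poorer health.

```python
from statistics import mean, pstdev


def polarised_z_scores(values, higher_is_better=True):
    """Z-scores with the sign flipped when higher raw values are worse."""
    mu, sigma = mean(values), pstdev(values)
    sign = 1 if higher_is_better else -1
    return [sign * (v - mu) / sigma for v in values]


def composite_index(components):
    """Build component scores and the overall index.

    components: {component_name: [(values, higher_is_better), ...]},
    with every values list aligned to the same area order.
    Indicators are equally weighted within a component, and components
    are equally weighted within the index.
    """
    component_scores = {}
    for name, indicators in components.items():
        z_by_indicator = [polarised_z_scores(vals, hib) for vals, hib in indicators]
        component_scores[name] = [mean(area_z) for area_z in zip(*z_by_indicator)]
    chi = [mean(area) for area in zip(*component_scores.values())]
    return component_scores, chi


def rank_lowest_first(scores):
    """Rank 1 goes to the lowest score, i.e. the area with greatest need."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0] * len(scores)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


# Toy example with three areas and one indicator per component.
toy_components = {
    "social_determinants": [([1, 2, 3], True)],
    "risk_factors": [([30, 20, 10], False)],  # higher value means worse health
    "health_outcomes": [([70, 80, 90], True)],
}
scores, chi = composite_index(toy_components)
chi_ranks = rank_lowest_first(chi)
```

Because each component is a mean of z-scores and the index is a mean of components, any area's overall score can be traced back to its component and indicator values.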
Figure 1: Heatmaps showing spread of data, missing data and imputed data for a) Social Determinants b) Risk Factors, and c) Health Outcomes
Table 2: Indicators from each component removed for covariance reasons
Social Determinants | Risk Factors | Health Outcomes
 | Smoking attributable mortality | Under 75 mortality rate from cardiovascular diseases considered preventable
 | CHD: Recorded prevalence (all ages) | Under 75 mortality rate from cancer
 | | Under 75 mortality rate from cancer considered preventable
 | | Under 75 mortality rate from liver disease
 | | Under 75 mortality rate from respiratory disease
 | | Hip fractures in people aged 65 and over
 | | Mortality rate from causes considered preventable
Results
A CHI based on publicly available, national, long-term data was created from three components: social determinants of health, individual risk factors, and health outcomes (Appendix 3). The contribution of individual indicators to each component was calculated, and each component was given equal weighting in the construction of the CHI. This index was mapped to UTLAs to give an overall view of the distribution of health across England (Figure 2).
Figure 2: The CHI total mapped to UTLAs in England. Lower mean z-scores indicate poorer health in that geographical area.
A total of 79 Fingertips indicators and 16 IMD subdomains were included after exclusions for missing data and collinearity (Appendix 2). It is possible to disaggregate the CHI into each component (Figure 3) and into individual indicators. The plotted outputs are designed to highlight the areas with greatest need.
Figure 3: Component parts of the CHI mapped to UTLAs in England. a) The social determinants component, b) the risk factors component and c) the health outcomes component. NB. The scale for these plots is the same as Figure 2.
Rankings were also produced for the CHI and its components, primarily as a tool to validate the index against other indices that use differing methodologies. The five areas ranked as having the greatest need are shown in Table 3.
Table 3: CHI and component scores (ranks) for the 5 areas identified as having the greatest need.
Area | CHI | Social Determinants Component | Risk Factors Component | Health Outcomes Component
Blackpool | -1.132 (1) | -0.911 (2) | -0.959 (3) | -1.53 (1)
Kingston upon Hull | -1.079 (2) | -0.931 (1) | -1.055 (2) | -1.241 (2)
Knowsley | -0.885 (3) | -0.871 (3) | -0.757 (6) | -1.027 (5)
Middlesbrough | -0.873 (4) | -0.749 (7) | -0.764 (5) | -1.107 (3)
Hartlepool | -0.961 (5) | -0.511 (12) | -1.091 (1) | -0.961 (8)
A comparative analysis was undertaken to verify the validity of CHI. Health Adjusted Life Expectancy (HALE) estimates, produced as part of the Global Burden of Disease study (Institute for Health Metrics and Evaluation 2015), were selected as a comparator. Ranks were compared using a Spearman rank correlation test. There was a significant strong positive correlation between CHI and HALE (rs = 0.89, p < 0.0001).
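The rank comparison can be illustrated with a short Python sketch. It computes Spearman's coefficient as the Pearson correlation of average ranks, which handles ties. It does not compute a p-value, and the function names and toy values are assumptions for illustration only, not data from the report.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 (lowest), giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical index scores and life-expectancy estimates for five areas.
toy_index = [-1.1, -0.9, 0.2, 0.5, 1.3]
toy_hale = [58, 60, 59, 66, 70]
rho = spearman_rho(toy_index, toy_hale)
```

Working on ranks rather than raw values means the comparison is unaffected by the two measures being on different scales.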
To track change over time, the means and standard deviations from this analysis are to be used in future z-score calculations to ensure the baseline stays stable. To this end, descriptive statistics were output for reference in future years (Appendix 4).
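A fixed baseline of this kind might be applied as in the sketch below. It is a Python illustration under assumptions: the function names are hypothetical, and the population standard deviation is used. The point it shows is that scoring future values against the stored baseline mean and standard deviation lets a national improvement appear as a shift in z-scores, whereas re-standardising each year would always centre the scores on zero.

```python
from statistics import mean, pstdev


def fit_baseline(values):
    """Store the baseline-year mean and standard deviation for an indicator."""
    return {"mean": mean(values), "sd": pstdev(values)}


def z_against_baseline(values, baseline, higher_is_better=True):
    """Score values from any year against the stored baseline statistics."""
    sign = 1 if higher_is_better else -1
    return [sign * (v - baseline["mean"]) / baseline["sd"] for v in values]


# Toy example: every area improves by 10 units in a later year.
baseline_year = [10, 20, 30]
later_year = [20, 30, 40]
baseline = fit_baseline(baseline_year)
z_baseline = z_against_baseline(baseline_year, baseline)
z_later = z_against_baseline(later_year, baseline)
```

In the toy data, the baseline-year scores average zero while the later-year scores are all shifted upwards by the same amount, so the overall improvement stays visible.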
Discussion
A CHI was created to quantify, and track changes to, health across England. The outputs from this index appear to be in line with alternative measures of overall health, and also in line with previous literature on health inequalities. CHI shows a clear rural/urban divide that has been previously described (Connolly, O’Reilly, and Rosato 2007; Doran, Drever, and Whitehead 2004). There is also a north/south divide viewable if a line is drawn from the Bristol Channel to the Wash, as has been previously described (Kontopantelis et al. 2018). In addition to this, outputs from CHI may provide evidence for the ‘healthy London effect’, where London overperforms in health outcomes given the social determinants present (Figure 4) (Minton and McCartney 2018).
Figure 4: A comparison of the London boroughs’ performance in the a) social determinants and b) health outcomes components of CHI. There are lower than average z-scores for social determinants and above average z-scores for health outcomes. This is in alignment with the ‘healthy London effect’ phenomenon.
CHI highlights previously described health inequality divides and strongly correlates with alternative measures. This suggests that CHI is a robust measure of health across England that can be used as a baseline measure for future reference and evaluation. Each indicator is given an equal weighting within each component, and each component has an equal weighting in CHI. There might be an increase in accuracy with a more sophisticated weighting mechanism in future. There is also a risk of changes to the methods used in creating the indicators used in CHI, and this would have to be accounted for in future years. Indicators have been included that have different update schedules, which means that there may be a lag in CHI reflecting changes in local areas. Effort should be made to replicate CHI at clinical geographies, such as Clinical Commissioning Groups, to allow more targeted policy action.
Figure 5: Stakeholder analysis assessing key stakeholders from public interest, national & international organisations, and local area commissioning bodies. Plot is of hypothetical power/interest plane with current positions plotted and aims of CHI marketing strategy denoted as arrows.
A stakeholder analysis was undertaken to assess the current power and interest of selected stakeholders (Figure 5) (Buse, Mays, and Walt 2012). The CHI is intended to give evidential support for interventions to those with position power, such as the CMO and public health organisations. Local commissioning will use CHI to target resources to reduce health inequalities. There is the potential for media attention to develop a negative narrative around geographical variations (e.g. “postcode lottery”). The intention would be to combat negative power using expert power from the CMO and central government to control the media narrative.