Summative Assessment
0179070
DO NOT WRITE YOUR NAME ON YOUR WORK. Instead, please write your student number on this coversheet.
Student No.
Date Submitted:
Word Count:
PLAGIARISM DECLARATION:
This assignment is entirely my own work, and it adheres to the University of Bristol's policy on plagiarism and academic integrity.
Quotations from secondary literature are indicated by the use of inverted commas around ALL such quotations AND by citations in the text or notes to the author concerned. ALL primary and secondary literature used in this piece of work is indicated in the bibliography placed at the end, and dependence upon ANY source used is indicated at the appropriate point in the text.
I confirm that no sources have been used other than those stated.
I confirm that I have not used artificial intelligence or chatbot software to create any writing or content in this assessment.
I confirm that I have not written this work in another language and translated it into English using translation tools.
I confirm that I have not used grammar checkers that suggest rewrites.
I understand that plagiarism, collusion, and cheating constitute misconduct and may result in disciplinary action being taken.
Please type your answer within this document directly below each question
Question 1: The Cystic Fibrosis Foundation (CFF) in the US reports annual data on people being treated in the US for cystic fibrosis, a genetic disease usually diagnosed in childhood. Routine screening at birth (referred to as NBS: newborn screening) has reduced the time to diagnosis since its adoption. A recent annual report from the CFF presented the following two graphs.
Based on these two graphs, describe the changes in incidence and prevalence of cystic fibrosis between 2001 and 2021 in the US. (2pts)
Answer:
Question 2: A team of researchers sought to study the health of firefighters in the US and determine how their health compared to the general population. They undertook a prospective cohort study of active firefighters in 10 urban areas and recruited random samples of adults from the general population of those areas. Adults in both groups were followed up for 10 years to ascertain health outcomes. After performing their analyses, the researchers were surprised to see that the incidence of heart disease was much lower in firefighters than in the general population. Is there something protective about fighting fires? How might these results have arisen? (2pts)
Answer:
Question 3: A group of researchers at the University of Bristol are interested in understanding the relationship between menopausal hormone therapy and dementia in women over the age of 50. The researchers plan on using a national database of primary care data to identify women with dementia in this age group and are confident that they will be able to identify menopausal hormone therapy prescriptions from this database. They have asked you for advice on who their controls should be and where they could find them.
Make a proposal to the team indicating who the controls should be and where they can be identified from (2pts). If you think there are important confounders to measure, provide details (1pt) and explain why accounting for this is important (1pt).
Answer:
Question 4: It is generally recommended that pregnant women experiencing stillbirth do not have a caesarean section as it exposes the mother to the risk of surgery without benefitting the child. A team of African researchers sought to understand how common caesarean sections are in women experiencing stillbirth using a cross-sectional study of 17138 women who gave birth between 2007 and 2017. They found that the proportion of caesarean sections in women experiencing stillbirth or very early neonatal deaths was 19% (86/447) compared to 10% in women who experienced live births (1602/16691).
Calculate and interpret the OR (3pts)
Answer:
They also sought to determine whether birth outcome (stillbirth/very early neonatal death vs live birth) modified the association between delivery (caesarean section or vaginal birth) and household wealth of the mother. They presented the following results:
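| Household wealth index | Stillbirths and very early neonatal deaths (N=447): Caesarean section (N=86) | Stillbirths and very early neonatal deaths (N=447): Vaginal birth (N=361) | Stillbirths and very early neonatal deaths: OR (95% CI) | Live births who survived the first day (N=16,691): Caesarean section (N=1602) | Live births who survived the first day (N=16,691): Vaginal birth (N=15089) | Live births who survived the first day: OR (95% CI) | Interaction p-value |
|---|---|---|---|---|---|---|---|
| Poor | 20 | 157 | 1 | 340 | 6813 | 1 | 0.33 |
| Middle | 22 | 79 | 2.2 (1.0 to 4.8) | 261 | 3082 | 1.7 (1.4 to 2.1) | |
| Rich | 45 | 125 | 2.9 (1.4 to 5.9) | 1001 | 5194 | 3.9 (3.3 to 4.6) | |

Interpret the results. Reflect on the results overall and on the ORs and the 95% CIs (5pts)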
Answer:
Based on the results presented above, does the relationship between household wealth and method of delivery differ by birth outcome? Justify your answer (2pts)
Answer:
Question 5: A team of researchers sought to explore the long-term effects of exposure to chemical warfare agents in military veterans. They identified participants for the study by examining historic records of military veterans. They identified one group of veterans who had attended a programme at Porton Down (UK) where many were exposed to low doses of chemical warfare agents. They then identified a comparable group of veterans who had not attended the programme at Porton Down. They then used civil registration data to determine if and when individuals died as well as the cause of death. One of the outcomes of interest is all-cause mortality and they presented the following graph:
What kind of graph is this? (1pt)
Answer:
What does it tell us? (2pts)
Answer:
What kind of test would you perform to compare the two curves in this graph? (1pt)
Answer:
Question 6: A team of researchers sought to evaluate a new screening method for identifying pancreatic cancer. This new tomography-based method was compared with the reference standard. The following results were obtained for the new method:
Sensitivity: 46%
Specificity: 98%
Positive predictive value: 70%
Negative predictive value: 96%
Interpret each of the 4 statistics above in the context of this study (4pts)
Answer:
Question 7: Monkeypox (mpox) is an infectious disease that is caused by the mpox virus. Symptoms include painful rash, enlarged lymph nodes and fever. Those infected with mpox can spread it to others through touch, kissing and sex, and mothers can pass it on to their unborn baby. There is also evidence it can be spread through using bedsheets, clothes or needles contaminated with mpox, or through hunting, skinning or cooking infected animals.
A team of researchers sought to identify risk factors and clinical features of mpox in a large urban centre which was experiencing a sudden and alarmingly high influx of cases. They decided to conduct a case-control study based on those individuals seeking mpox testing at a large teaching hospital to understand risk factors for infection. Cases were the first 70 people who tested positive for mpox and the first 70 people who tested negative for mpox were selected as the controls.
Comment on the advantages and disadvantages of their case/control selection. (4pts; 1pt for each advantage/disadvantage).
Answer:
Question 8: A group of researchers decided to study the effectiveness of an alternative anti-obesity drug requiring weekly injections which went on the market in 2020. This drug is limited to adults with type 2 diabetes who have a BMI greater than 25. Using routinely collected data from 2020 to now, they proposed to undertake a retrospective cohort study, as they can identify exposure to this drug (and the dose given) and patients' BMI as recorded on their GP records. The researchers have asked you for advice on designing the study.
Propose inclusion criteria for the study and justify your choice. (2pts)
Answer:
This retrospective cohort study will be much cheaper to conduct than the randomised controlled trial described above. It will, however, have limitations that the trial will not have. What do you think are the two biggest limitations of the retrospective cohort study? Justify your answer (4pts)
Answer:
Question 9: A clinical trial was performed to examine whether the early use of a diabetes treatment in women with gestational diabetes improved maternal outcomes. Mothers were randomised to early use of the diabetes treatment or placebo. They found that of the 264 women receiving the diabetes treatment, 101 needed insulin later in the pregnancy. This compared with 134 of the 262 women in the placebo group. Based on these numbers, calculate the risk ratio and absolute risk differences and interpret the results (6pts)
Answer:
Question 10: Malaria and HIV account for significant morbidity and mortality in some African countries. A survey was conducted in 2018 across outpatient departments and emergency units in Cameroon, collecting data on patients' vital signs and analysing blood samples. The researchers found that 33% of all patients tested positive for malaria parasites and, of those who tested positive for malaria, 7% were also positive for HIV. 38% of patients had either malaria or HIV.
a) What kind of study design is this (1pt)? Justify your answer (1pt).
Answer:
b) What proportion had both? Show your workings (2pts; 1pt for the correct answer and 1pt for the calculations) and explain why you took this approach (2pts).
Answer:
Question 11: The Our World in Data website collates data from various national and international databases to allow researchers to study various health outcomes and how they vary between countries. The graph below uses national-level data from countries around the world to present child mortality rates in terms of GDP per capita.
What kind of study is this? Justify your answer (2pts)
Answer:
Describe the results. (1pt)
Answer:
Propose two forms of analysis that could be used to study this data. What information would each analysis give us (4pts)
Answer:
Question 12: Using police data, a team of researchers conducted an analysis of injuries experienced by people while walking. They classified the injuries as either fatal or non-fatal. They considered a range of exposures they thought might be associated with an injury being fatal. Of particular interest were the lighting conditions at the time of the injury and they presented the following results.
| Light conditions | All injuries | Fatal injuries | Non-fatal injuries |
|---|---|---|---|
| Daylight | 28521 | 380 | 28141 |
| Darkness-lit | 8543 | 212 | 8331 |
| Darkness-unlit | 6634 | 793 | 5841 |
| Unknown | 790 | 24 | 766 |
Based on these data, calculate the RRs for fatal injuries by light conditions (6pts) and interpret the results (3pts). Justify your choice of reference group (1pt).
Answer:
Question 13: Imagine yourself in the year 2030 sitting on the funding panel of a wealthy research organisation. Following a call for research to understand the impact of a sedentary lifestyle on dementia risk you received 4 applications. Details of the applications are outlined below:
Application 1: An international ecological study using national-level data from 161 countries participating in the Global Health Observatory data repository. Archived patient-level national survey data will be used to derive each country's level of physical activity. This is expressed as the proportion of adults in that country not meeting the World Health Organisation recommendations for physical activity. These data will be correlated with national estimates of dementia prevalence using data from 2023 to 2027 depending on country. Cost: 150,000. Duration: 12 months
Application 2: A large national cross-sectional survey of UK care-home residents in 2032. Participants will complete a questionnaire asking about their current levels of physical activity and diagnoses of dementia and other dementia-like illnesses. Cost: 300,000. Duration: 24 months
Application 3: A case-control study evaluating the association between dementia as reported in primary care and self-reported activity levels. Researchers will identify cases of dementia from GP records. Controls will be adults without dementia of the same age, gender and GP practice. Cases and controls will then be contacted to enquire about their physical activity levels now and over the previous 10 years. Cost: 500,000. Duration: 2 years.
Application 4: A prospective cohort study of adults aged 40-45 years attending 40+ checkups with their GP. Participants will be approached at their clinic where they will be asked to wear a pedometer (a wearable device monitoring physical activity) for 1 week every 6 months. Participants will be followed up for 25 years. Diagnoses of dementia will be captured through GP records. Cost: 10,000,000. Duration: 30 years.
In your review of the applications you are asked to list strengths and limitations of each study. For each study, outline one strength and one limitation. (2pts for application 1; 2pts for application 2; 2pts for application 3; 2pts for application 4)
| Application | Strengths | Limitations |
|---|---|---|
| Application 1: Ecological study | | |
| Application 2: Cross-sectional study | | |
| Application 3: Case-control study | | |
| Application 4: Cohort study | | |

The research organisation would prefer to fund only one application. Of the four applications, which would you recommend for funding based on your epidemiological training? Justify your answer. (2pts)
Would you require any amendments? If so, outline them (1pt)
Answer:
Question 14: Ice hockey players are exposed to frequent head trauma despite the use of protective equipment. A team of researchers sought to explore the long-term effects of repetitive brain injury in professional hockey players.
The team explored historical records from the National Hockey League (NHL) identifying all players who played at least one game between 1967 and 2022. For each player, the exposure of interest was the number of fights the player was involved in over their career and the outcome was mortality as obtained from civil registration data. All players were followed up from their first match until 2023 or death.
The researchers presented the following graph showing the total number of fights players were involved in over their career on the horizontal axis and the number of players on the vertical axis. The inset graph in the box below shows the data from 71 to 261, which are hard to read on the larger axis.
What kind of graph is this? (1pt)
Answer:
Describe the shape of the distribution. (1pt)
Answer:
How might you report this data numerically? Justify your answer. If more than one approach is sensible, describe and justify each. (5pts)
Answer:
What kind of study design is this? (1pt) Justify your answer. (1pt)
Answer:
Propose one way you could analyse the mortality outcome (1pt). Justify your answer. (1pt)
Answer:
Question 15: Read the following abstract from a research paper and then answer the questions below.
Background: There is increasing concern regarding the potential impact of social media use on the mental health of young people. Previous research has relied heavily on retrospective accounts of social media screen-time. Yet recent evidence suggests that such self-report measures are unreliable, correlating poorly with more objective measures of social media use. In principle, time use diaries provide a less biased measure of social media use.
Methods: We analysed cross-sectional data from the sixth sweep of the Millennium Cohort Study to explore associations between social media screen-time as recorded in time use diaries (TUD) and key mental health outcomes in adolescence: self-harm in the past year, depressive symptoms (Short Mood and Feelings Questionnaire) and self-esteem (shortened Rosenberg scale). TUDs assessed activities (including social media use) during two 24-hour periods: a randomly selected weekday and a weekend day. Social media TUD data were available for 4,032 participants aged 13-15 years.
Results: Following adjustment for confounders, a greater amount of time spent on social media was associated with an increased risk of self-harm (adjusted OR per 30-minute increase in weekday use: 1.13, 95% CI 1.06 to 1.21) and depression (adjusted OR=1.12, 95% CI 1.07 to 1.17) and lower levels of self-esteem (adjusted B=-0.12, 95% CI -0.20 to -0.04) in females. Findings were similar for weekday and weekend use.
Conclusions: Future research should examine the direction of the associations with self-harm and other mental health outcomes and explore how adolescents engage with social media as well as how much time they spend online.
a) What type of study design is this? (1pt)
Answer:
b) The Millennium Cohort Study began with an original sample of 18,818 cohort members. Time use diary data were available for 4,032 participants at this time point. What type of bias could this lead to and why? What impact does this bias have? (3pts)
Answer:
c) Could the time use diary data be affected by measurement error? Explain your answer. (2pts)
Answer:
d) The analyses conducted here were cross-sectional. What is an advantage of this approach? (1pt)
Answer:
e) What is the main limitation of this cross-sectional approach? (1pt)
Answer:
f) How could you design the research differently to address this limitation? (1pt)
Answer:
g) Suggest two confounders that might be important to adjust for. (2pts)
Answer:
h) The authors conducted a subgroup analysis to look for gender differences. What other analysis should they have done to investigate this? (1pt)
Answer:
Question 16: In 2004 the International Agency for Research on Cancer concluded that there was sufficient evidence in humans that tobacco smoking causes cancer of the nasopharynx (NPC). The magnitude and patterns of associations between smoking and NPC in high-incidence regions like China were uncertain, so a team of researchers conducted a population-based case-control study in southern China.
Eligible subjects were individuals between the ages of 20 and 74 who were living in the area at the time of NPC diagnosis with no history of cancer. Cases were those with confirmed incident diagnoses of NPC. Timely identification of cases was made possible using a rapid case ascertainment system using 10 hospitals and 2 cancer research units in the region. Controls were randomly selected using population registries covering the study area. Cases and controls were matched for age, sex and geographic location.
Data on smoking histories were collected by trained interviewers who administered a questionnaire. The interviewers were not blinded to case-control status, but they were trained to interact in the same way with cases and controls. Ever smoking tobacco was defined as having smoked at least 1 cigarette every 1-3 days for at least 6 months. Among cases, current smokers were defined as those who had smoked within the last 3 years, and former smokers were those who had quit at least 3 years before diagnosis. Among controls, current smokers were defined as those who had smoked within the last year, and former smokers were those who smoked between 1 and 5 years ago.
The researchers indicated that the interviewers who collected the smoking history data were not blind to case-control status. Why might blinding have been desirable? (1pt)
Answer:
What kind of variable is smoking status? (1pt)
Answer:
The researchers classified any smoking within the 3 years before NPC diagnosis as current. Why do you think such a window was important? Why not just look at smoking status on the day of diagnosis? (1pt)
Answer:
Do you think that the authors chose well in identifying their controls? Justify your answer and make reference to the matching criteria (3pts).
Answer:
The association between NPC and smoking was studied using logistic regression. The results are presented below:
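| Smoking status | Number of NPC cases | Number of controls | Age- and area-adjusted OR | Age- and area-adjusted 95% CI | Multivariable-adjusted* OR | Multivariable-adjusted* 95% CI |
|---|---|---|---|---|---|---|
| Never smoker | 462 | 544 | 1.00 | | 1.00 | |
| Former smoker | 179 | 242 | 0.92 | 0.73, 1.17 | 0.92 | 0.72, 1.18 |
| Current smoker | 1216 | 1121 | 1.32 | 1.14, 1.53 | 1.32 | 1.15, 1.57 |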
* Adjusted for age, geographic area, educational level, current housing type, current occupation, first-degree family history of NPC, tea drinking and consumption of salt-preserved fish
What is the reference group for smoking status? (1pt)
Answer:
Interpret the results of the age- and area-adjusted model above (4pts).
Answer:
What has been the effect of additionally adjusting for educational level, current housing type, current occupation, first-degree family history of NPC, tea drinking and consumption of salt-preserved fish on the effect of smoking? What does that imply? (1pt)
Answer:
Question 17: A survey of students aged 13-18 years was conducted in Northern Ireland in 2016 to explore the relationship between mental well-being, religion and family activities. Students were asked to complete questionnaires asking about:
Mental well-being: measured using a validated scale from which a numeric score is derived with higher values indicating better quality of life. Histograms of the data suggest that the data are normally distributed
Age: analysed as a numeric variable
Sex: collected and analysed as a binary variable
Ethnicity: collapsed into two groups, white European and other
School type: these were categorised as grammar and non-grammar schools
Family affluence: measured using a validated scale from which a numeric score is derived with higher values indicating higher levels of affluence
Religion: classified as Protestant, Catholic, atheist or other
a) What kind of study is this? Justify your answer (2pts)
Answer:
The researchers conducted a linear regression analysis using the mental well-being score as the outcome and the variables presented in the table below as independent variables in a multivariable linear regression.
| Variable | Adjusted Beta | 95% CI |
|---|---|---|
| Age | -0.82 | (-1.23 to -0.42)** |
| Sex: Female (reference=male) | -5.21 | (-6.17 to -4.25)** |
| Ethnicity: Other (reference=white European) | -0.64 | (-2.66 to 1.39) |
| School type: Non-grammar (reference=grammar) | 0.00 | (-1.37 to 1.38) |
| Family affluence | 0.76 | (0.48 to 1.03)** |
| Locale of residence: Rural (reference=urban) | 0.99 | (-0.31 to 2.29) |
| Religion (reference=Catholic): Protestant | -0.82 | (-2.23 to 0.59) |
| Religion (reference=Catholic): Other | -2.95 | (-6.18 to 0.28) |
| Religion (reference=Catholic): Atheist | -2.98 | (-4.57 to -1.40)** |

*p-value<0.05; **p-value<0.001
b) Why was a linear regression model used? (1pt)
Answer:
c) Interpret the results of the regression model, commenting on the beta coefficients and 95% confidence intervals (9pts in total; 0.5pt for each beta coefficient and 0.5pt for each confidence interval).
Answer:
d) Comment on how p-values are presented in this table. Do you think this is sufficient? (1pt) Justify your answer (2pts)
Answer:
Question 18: Air pollution is an international problem and is one of the major environmental determinants of health. Studies suggest that exposure to air pollutants such as nitrogen dioxide (NO2) and Particulate Matter (PM) is associated with a range of adverse health outcomes including cardiovascular disease and poor lung health. On average, air pollution reduces the life expectancy of every resident in the United Kingdom (UK) by 7-8 months. However, there is substantial variation in pollution levels between regions.
Following your MSc in Public Health/Epidemiology/Digital Health you get a job at Public Health Wales (Wales is a country in the UK). Your manager has asked you to design a study to investigate the relationship between deprivation and air pollution levels across the different regions in Wales. The regions of interest are 1909 neighbourhood areas, which have an average of 1600 residents and 650 households.
You have been asked to give a presentation on your findings in 3 months' time. You do not have any funding to conduct this research but have been given access to the following databases, which contain information for each neighbourhood area.
Data from the UK Government's Pollution Climate Mapping model. This model generates validated annual estimates of area-level pollutant concentrations. Mean levels of NO2 and PM were available for the year 2011
Deprivation data from Welsh Government's Welsh Index of Multiple Deprivation (WIMD). This includes an income deprivation score which is a composite measure reflecting the proportion of all residents of a neighbourhood area with income below a defined level. Quintiles were then derived by ranking income-deprivation composite scores for all neighbourhoods and dividing the data into five roughly equal parts
Write up your proposal using the following format
Aim of the study [max 40 words, including spaces] (1pt)
Study design used and reason for choice [max 50 words, including spaces] (2pts)
Exposure [max 150 words, including spaces]: Specify the main exposure. Explain how this data will be collected and how it is calculated. (3pts)
Outcome [max 150 words, including spaces]: Specify the primary outcome and outline how this data will be collected. (3pts)
Air pollution and deprivation are also related to mortality. What data would you need to request access to in order to explore relationships with mortality? (1pt)
Statistical methods [max 150 words, including spaces]: Briefly outline the statistical methods you will use to analyse the data collected and what statistics you will present in your final report. (4pts)
Strengths and limitations of this study design, 5 in total [max 100 words, including spaces] (5pts)