
Exploratory Data Analysis and Statistical Inference: Three Study Questions Addressed Using Stata

Added on: 2024-04-16 08:21:25
  • Subject Code : HSH746-HSH946

Question 1

 The Framingham Heart Study is a longitudinal, prospective study of the causes of cardiovascular disease among individuals in the community of Framingham, Massachusetts, USA. The primary objective of the study was to identify the common factors or characteristics that contribute to cardiovascular disease in the general population of Framingham. Participants were followed over a long period to monitor their cardiovascular health. To sample the population, the investigators considered certain key characteristics relevant to cardiovascular health (e.g. age and sex) and the fact that each member had a chance of being chosen for the study. (Total Marks: 5)

  1. What is the target population of interest? (0.5 marks)
  2. What sampling method(s) would be most appropriate, and why? (2.5 marks)
  3. If the most appropriate sampling method identified in question 2 above is used, would you consider the study’s findings to be generalisable to the general population of Framingham, Massachusetts? Give a reason why. (1 mark)
  4. Would you consider the study’s findings to be generalisable to the general population of the USA? Give a reason why. (1 mark)

Question 2

Read the data description below and answer the questions that follow. A study collected data on GP visits for 500 adults aged 45 years and above. The data for this study can be found in the data set AT1_GPvisits data. (Total Marks: 5)

The variables in the data set include the following:

| Variable | Description | Units | Range or count |
| --- | --- | --- | --- |
| id | Respondent individual id | | 1-500 |
| sex | Sex of respondent | 1 = Male; 2 = Female | n = 202; n = 297 |
| age | Respondent age | Years | 45-79 |
| older | Respondent 65 years and older | 1 = yes; 0 = no | n = 344; n = 155 |
| GPvisit | Respondent visited GP | 1 = yes; 0 = no | n = 60; n = 440 |
| NGP_visits6m | Respondent number of GP visits in the past 6 months | counts | 0-14 |

The data are synthetic; you may reference them in your answers as coming from the assignment 1 GP_visits study.

  1. In this question, we will focus on an exploratory analysis of the data. Check all individual variables and associated variables for any invalid and/or inconsistent values and take appropriate action. Clearly explain each step. (3.5 marks)
  2. Older people have been reported to visit the GP more frequently than younger people. Indicate whether this is true for our sample and use statistics to support your answer. Hint: use the GPvisits variable and report statistics to 1 decimal place. (1.5 marks)
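
The checks described in Question 2 can be sketched in Stata roughly as follows. This is only an illustrative outline, not a model answer: the file name `AT1_GPvisits.dta` is an assumption, the variable names are taken from the table above, and the cross-checks (for example, treating `GPvisit` as referring to the same 6-month window as `NGP_visits6m`) are assumptions that should be confirmed against the data description before any value is changed.

```stata
* Sketch only: the file name and the choice of checks are assumptions.
use "AT1_GPvisits.dta", clear

* Overview of storage types, ranges and missing values
describe
codebook, compact
misstable summarize

* id: look for duplicated respondent ids
duplicates report id

* Categorical variables: show every code, including missing
tabulate sex, missing
tabulate older, missing
tabulate GPvisit, missing

* Numeric variables: compare observed ranges with the documented ones
summarize age NGP_visits6m, detail

* Cross-check age against the derived indicator `older`
list id age older if !missing(age, older) & ///
    ((age >= 65 & older == 0) | (age < 65 & older == 1))

* Cross-check the visit indicator against the visit count
* (assumes both refer to the past 6 months)
list id GPvisit NGP_visits6m if !missing(GPvisit, NGP_visits6m) & ///
    ((GPvisit == 0 & NGP_visits6m > 0) | (GPvisit == 1 & NGP_visits6m == 0))

* Compare GP visiting between age groups using row percentages
tabulate older GPvisit, row
```

The row percentages from the last command show the share of each age group that visited the GP; they would be rounded to 1 decimal place when reported.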

Question 3

A study in Australia collected the weights (in grams) of 300 full-term newborn babies. The data for this study can be found in the data set AT1_newborn_weight data. (Total Marks: 10)

  1. In Stata, using the drop-down menu, create a histogram of newborn weight (grams). Adjust the bin width to 200, set "suggest # ticks" = 5 for major ticks, set "suggest # between major ticks" = 5 for minor ticks, and include height labels. Give the graph an appropriate title and footnote. (2.5 marks)
  2. Is the distribution of newborn weight symmetric or not? Give a reason why. Report statistics to 1 decimal place. (1.5 marks)
  3. Using the histogram, what is the probability that a newborn chosen at random from this sample will have a weight greater than 3800g? Report statistics to 2 decimal places. (1 mark)
  4. There is evidence of a strong association between birth weight and infant mortality, with birth weight shown to be a determinant of infant survival. Suppose you are now interested in categorising the newborns into different weight groups based on their birth weight, using the following criteria.
    Low birth weight: birth weight <2500g
    Normal birth weight: birth weight between 2500g and 4000g
    High birth weight: birth weight >4000g
    Generate a new variable (bweight_group) based on the criteria above. (Hint: generate bweight_group=. then replace the variable using the criteria above). (1 mark)
    Add value labels as follows:
    bweight_group = 1 for Low birth weight
    bweight_group = 2 for Normal birth weight
    bweight_group = 3 for High birth weight
  5. Add value labels, and tabulate bweight_group. (1 mark)
  6. What percentage of newborn babies were classified as low birth weight (report to 1 decimal place)? (0.5 marks)
  7. How does the percentage of newborns classified as low birth weight compare to that of newborns classified as high birth weight? (1 mark)
  8. What are the variable types of the examples illustrated above concerning newborn weight? (1 mark)
  9. Which variable type is more informative? (0.5 marks)
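
The Stata steps behind Question 3 can be sketched with command syntax roughly as below. The question asks for the drop-down menu, so this is only an approximate command-line equivalent, not the expected submission. The file name `AT1_newborn_weight.dta`, the variable name `weight`, the label name `bweight_lbl`, and the graph title and note text are all assumptions made for illustration; the boundary values 2500g and 4000g are treated here as belonging to the normal group.

```stata
* Sketch only: file name, variable name `weight` and label text are assumptions.
use "AT1_newborn_weight.dta", clear

* Histogram: bin width 200, about 5 major ticks, minor ticks between
* major ticks set by the ##5 rule, and bar-height labels
histogram weight, width(200) frequency addlabels ///
    xlabel(#5) xmtick(##5) ///
    title("Distribution of newborn weight") ///
    xtitle("Newborn weight (grams)") ///
    note("Data: AT1_newborn_weight, 300 full-term newborns")

* Symmetry: compare mean and median, and inspect skewness
tabstat weight, statistics(mean median sd skewness) format(%9.1f)

* Cross-check of the histogram-based proportion above 3800 g
quietly count if !missing(weight)
local n_total = r(N)
quietly count if weight > 3800 & !missing(weight)
display "Proportion above 3800 g: " %4.2f r(N) / `n_total'

* Weight groups; missing weights stay missing because Stata
* treats missing values as larger than any number
generate bweight_group = .
replace bweight_group = 1 if weight < 2500
replace bweight_group = 2 if weight >= 2500 & weight <= 4000
replace bweight_group = 3 if weight > 4000 & !missing(weight)

* Value labels and frequency table
label define bweight_lbl 1 "Low birth weight" ///
    2 "Normal birth weight" 3 "High birth weight"
label values bweight_group bweight_lbl
label variable bweight_group "Birth weight group"
tabulate bweight_group, missing
```

The `!missing(weight)` guard on the high group matters because `weight > 4000` would otherwise also be true for any missing weight.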

  • Uploaded By : Mohit
  • Posted on : April 16th, 2024