# MATH 1068 - Statistical Methods Assessment


## Instructions

• This assignment has three questions, each associated with a special data set. All data files needed for this assignment are in the Assignment Tab folder. The files are in Excel format and should be copied directly into Minitab.
• The number of marks for each question is displayed next to the question.
• It is important that you follow any instructions given in the questions, such as “Use Minitab” where required. It is up to you whether to use hints or not.
• Refer to the Minitab guide posted on the learnonline website in Data Files & Minitab.
• Minitab output will be required. To avoid losing any information when uploaded, your assignment file should be submitted as a single PDF document.

• Submission is online on the learnonline website via a special link. Assignments will be marked and returned online.
• Late submission without an approved extension will attract a penalty of 10% of the maximum marks for each extra day after the due date. The cut-off time is 5pm each day.

## Question 1 (30 marks) – Describing Distributions

St Kilda Housing Sale Prices Investigation.

In the last few years, the housing market has changed significantly. It is important to understand historical data on sale prices and obtain key insights from the pre-Covid era to understand current trends. The data used in this investigation was collected in the St Kilda suburb of Melbourne and includes housing sale prices for 368 properties recorded between September 2016 and February 2018. There are two types of housing in this data set: house (h) and unit (u). The main aim is to analyse and provide some insights on housing sale prices before the Covid-19 pandemic.

Data file: St Kilda Housing.csv

• (a) (2 marks) Use Minitab to produce a histogram of the Price (\$).
• (b) (2 marks) Use Minitab to produce Descriptive Statistics for the Price (\$).
• (c) (2 marks) Use Minitab to produce a boxplot describing the Price (\$). Show the boxplot horizontally (the Minitab booklet has instructions on how to rotate a boxplot).
• (d) (5 marks) Using your output from (a) to (c), comment on the shape of the distribution for the Price (\$). In particular:
  • Discuss whether there is one peak or multiple peaks in the distribution.
  • Describe the shape of the distribution (skewed or symmetric).
  • Determine whether there are any outliers.
  • Explain, characterise, and give reasons for any irregular patterns in the prices.
  Hint: The distribution contains different sets of prices for different types of housing, and missing data recorded as a \$0 price. Use Minitab to plot a boxplot of the prices for each housing type (house and unit) and compare with your earlier findings for the selling prices of all housing types.
• (e) (4 marks) Use Minitab to produce boxplots for each housing type identified in your answer to (d), excluding zero-dollar (missing data) prices. Hint: Use the condition function.
• (f) (5 marks) Use Minitab to produce new Descriptive Statistics summaries of Price (\$) for each type of housing with missing data removed. Which measures of central tendency and dispersion are the most appropriate to numerically summarise and compare the different selling prices for houses and units? For full marks, justify your choice of measures and interpret the corresponding values. Hint: Use the sort function and create a new sorted entire dataset.
• (g) (4 marks) Use Minitab to plot the Normality curves for each housing type Price (\$), briefly describe each plot and determine whether the data follows a Normal distribution.
• (h) (6 marks) Based on the knowledge obtained in part (g), use Minitab to calculate the probability that the Average Price (\$) of a House property type will be less than \$1,400,000.
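As a cross-check outside Minitab, the idea of dropping the \$0 (missing) prices and summarising each housing type separately can be sketched in a few lines of Python. This is only an illustration: the `records` list is made up and is not taken from St Kilda Housing.csv, and the helper name `summarise` is hypothetical. Quartile conventions also differ between tools, so the IQR here may not match Minitab's value exactly.

```python
from statistics import mean, median, quantiles, stdev

# Hypothetical (type, price) records for illustration only.
# "h" = house, "u" = unit; a price of 0 stands for missing data.
records = [
    ("h", 1_250_000), ("h", 0), ("h", 1_600_000),
    ("h", 2_100_000), ("h", 1_450_000),
    ("u", 420_000), ("u", 510_000), ("u", 0),
    ("u", 380_000), ("u", 1_200_000),
]

def summarise(prices):
    """Return centre and spread measures for one list of prices."""
    q1, _, q3 = quantiles(prices, n=4)  # default "exclusive" method
    return {
        "n": len(prices),
        "mean": mean(prices),
        "stdev": stdev(prices),    # sample standard deviation
        "median": median(prices),
        "iqr": q3 - q1,
    }

# Group prices by housing type, skipping zero-dollar (missing) rows.
by_type = {}
for kind, price in records:
    if price > 0:
        by_type.setdefault(kind, []).append(price)

summaries = {kind: summarise(prices) for kind, prices in by_type.items()}

for kind, stats in summaries.items():
    print(kind, stats)
```

Comparing the mean with the median, and the standard deviation with the IQR, for each group gives a quick sense of how much skew or outliers affect each measure.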

## Question 2 (30 marks) – Normal Distribution & CLT

Generating Exam Scores. Recently, a cohort of 100 students of a university statistics course sat their final examination and their exam scores were recorded. The statistics lecturer is interested in analysing the data and calculating the probabilities for a student to achieve a certain score and in finding information regarding the average score for a cohort.

Data file: exam.csv

• (a) (5 marks) Use Minitab to produce a histogram and the Descriptive Statistics for the exam scores. Describe the shape of the distribution and justify your answer.
• (b) (6 marks) Calculate the probability that a student’s exam score is greater than  For full marks, show all your working out by hand (except for a Normality test), provide a correct probability statement and include the Minitab output to verify your answer.
• (c) (5 marks) If a student scores in the top 5% of the cohort, what is the minimum required exam score to get into the top 5% of the class? For full marks, show all your working out by hand, provide a correct probability statement, and include the Minitab output to verify your answer.
• (d) (4 marks) By using Minitab, produce five random samples of size 30 by randomly selecting 30 values from the exam score dataset. For full marks, provide a screenshot of each sample.
• (e) (4 marks) Use Minitab to produce the Descriptive Statistics for each sample. What are the parameters of the sampling distribution of exam scores based on your first sample? Briefly justify your answer.
• (f) (6 marks) Calculate the probability that the average (mean) exam score is less than 75 based on your sampling distribution of the means from part (e). You can assume here that the cohort of 100 students is a  For full marks, show all your working out by hand, provide a correct probability statement, and include all Minitab output supporting your answer.
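The three kinds of Normal calculation asked for in this question (an upper-tail probability, a top-5% cut-off, and a probability for a sample mean) can be sketched as follows. The values of `mu`, `sigma`, and `x` are hypothetical placeholders, not estimates from exam.csv; only the sample size of 30 and the threshold of 75 come from the question. The sketch assumes the scores are approximately Normal, which still has to be checked against the data.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical parameters for illustration; replace with your own estimates.
mu, sigma = 70.0, 10.0
scores = NormalDist(mu, sigma)

# Upper-tail probability: P(X > x) = 1 - P(Z <= (x - mu) / sigma)
x = 85.0                      # hypothetical threshold
z = (x - mu) / sigma          # standardised score
p_greater = 1 - NormalDist().cdf(z)

# Top 5% cut-off: the score x* with P(X <= x*) = 0.95
cutoff_top5 = scores.inv_cdf(0.95)

# Sample mean of n = 30 scores: Xbar ~ N(mu, sigma / sqrt(n))
n = 30
standard_error = sigma / sqrt(n)
xbar_dist = NormalDist(mu, standard_error)
p_mean_below_75 = xbar_dist.cdf(75.0)

print(f"z = {z:.2f}, P(X > {x}) = {p_greater:.4f}")
print(f"top 5% cut-off = {cutoff_top5:.2f}")
print(f"SE = {standard_error:.3f}, P(Xbar < 75) = {p_mean_below_75:.4f}")
```

Note how the standard error, not the standard deviation of individual scores, sets the spread of the sample mean; this is why the last probability differs so much from the corresponding probability for a single student.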

## Question 3 (25 marks) – Confidence Intervals

Analysing data on Fuel Efficiency: Fuel efficiency (in liters per 100 km) is one of the most important parameters we look at when choosing a new car. A car reviewer wants to find the average fuel efficiency on highways for modern cars with an engine capacity of 2L and report it to new buyers on his blog. He tests the efficiency of 79 cars while driving them on the same highway. Investigate and analyse the data obtained by the car reviewer to find the average fuel consumption on highways for modern cars with an engine capacity of 2L.

Data file: fuel.csv

• (a) (2 marks) Use Minitab to produce a histogram of the Fuel Consumption (L/100km).
• (b) (2 marks) Use Minitab to produce Descriptive Statistics for the Fuel Consumption (L/100km). Identify the sample standard deviation.
• (c) (4 marks) Comment on the shape of the distribution and test the distribution for Normality. Would it be reasonable to construct a confidence interval for the population mean fuel consumption based on the Normal distribution using this dataset? Explain your answer.
• (d) (6 marks) Construct and interpret the 99% confidence interval for the population mean fuel consumption based on the sample data. All calculations should be done manually without using Minitab; however, you can use Minitab to visualise and verify the results.
• (e) (4 marks) Repeat part (d) for the 95% confidence interval for the population mean fuel consumption.
• (f) (2 marks) The car maker’s website claims that the average (mean) Fuel Consumption for their 2L engine cars is 7.1L per 100km. Discuss whether this claim is valid using the 95% confidence interval and then the 99% confidence interval. What conclusions can be made?
• (g) (5 marks) The car maker would prefer to update their claim and focus on specifying the margin of error to be within 75% of the standard deviation at 95% confidence. What would be the approximate required sample size to make this possible? Discuss this result.
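The manual interval and sample-size calculations in this question follow the pattern sketched below. The summary statistics `xbar` and `s` are hypothetical and not taken from fuel.csv; only n = 79 and the claimed mean of 7.1 come from the question. The sketch also assumes z critical values, which is a common large-sample choice; if your course uses the t distribution with n - 1 degrees of freedom, the intervals will be slightly wider. The function names are illustrative only.

```python
from math import ceil, sqrt
from statistics import NormalDist

def z_interval(xbar, s, n, level):
    """Interval xbar +/- z* . s / sqrt(n) for the population mean."""
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z_star * s / sqrt(n)
    return xbar - margin, xbar + margin

def sample_size_for_margin(ratio, level):
    """Smallest n with z* . sigma / sqrt(n) <= ratio . sigma.

    sigma cancels, so n = ceil((z* / ratio) ** 2).
    """
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return ceil((z_star / ratio) ** 2)

# Hypothetical sample mean and standard deviation (L/100km).
xbar, s, n = 7.4, 1.2, 79

ci95 = z_interval(xbar, s, n, 0.95)
ci99 = z_interval(xbar, s, n, 0.99)

# A claimed mean is consistent with the data at a given level
# if it lies inside the corresponding interval.
claim = 7.1
claim_in_95 = ci95[0] <= claim <= ci95[1]
claim_in_99 = ci99[0] <= claim <= ci99[1]

# Margin of error expressed as a fraction of the standard deviation.
n_needed = sample_size_for_margin(ratio=0.75, level=0.95)

print("95% CI:", ci95, "contains claim:", claim_in_95)
print("99% CI:", ci99, "contains claim:", claim_in_99)
print("required n:", n_needed)
```

Because the 99% interval is always wider than the 95% interval for the same data, a claimed value can fall outside the narrower interval yet inside the wider one, so the conclusion can depend on the confidence level chosen.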
