MATH1068 - Statistical Methods Assessment
- Subject Code: MATH-1068
Instructions
- This assignment has three questions, each associated with a special data file. All data files needed for this assignment are in the Assignment Tab folder. The files are in Excel format and should be copied directly into Minitab.
- The number of marks for each question is displayed next to the question.
- It is important that you follow any instructions given in the questions, such as "Use Minitab" where required. It is up to you whether to use hints or
- Refer to the Minitab guide posted on the learnonline website in Data Files & Minitab
- Minitab output will be required and, to avoid losing any information when uploaded, your assignment file should be submitted as a single PDF document file.
- Submission is online on the learnonline website via a special link. Assignments will be marked and returned online.
- Late submission without an approved extension will attract a penalty of 10% of the maximum marks for every extra day after the due date. The cut-off time is 5pm each
Question 1 (30 Marks) Describing Distributions
St Kilda Housing Sale Prices Investigation.
In the last few years, the housing market has changed significantly. It is important to understand historical data on sale prices and obtain key insights from the pre-Covid era to understand current trends. The data used in this investigation was collected in the St Kilda suburb of Melbourne and includes housing sale prices for 368 properties recorded between September 2016 and February 2018. There are two types of housing in this dataset: house (h) and unit (u). The main aim is to analyse and provide some insights on housing sale prices before the Covid-19 pandemic.
Data file: StKildaHousing.csv
- (2 marks) Use Minitab to produce a histogram of the Price ($).
- (2 marks) Use Minitab to produce Descriptive Statistics for the Price ($).
- (2 marks) Use Minitab to produce a boxplot describing the Price ($). Show the boxplot horizontally (the Minitab booklet has instructions on how to rotate a boxplot).
- (5 marks) Using your output from (a) to (c), comment on the shape of the distribution for the Price ($). In particular,
  - Discuss whether there is one peak, or multiple peaks, in the distribution,
  - Describe the shape of the distribution (skewed or symmetric),
  - Determine if there are any outliers,
  - Explain, characterise, and give reasons for any irregular patterns in the prices. Hint: The distribution contains different sets of prices for different types of housing, and missing data recorded as a $0 price. Use Minitab to plot the boxplot of prices for each housing type (house and unit) and compare with your earlier findings for the selling prices of all housing types combined.
- (4 marks) Use Minitab to produce boxplots of each house type categorised in your answer from (d), excluding zero-dollar (missing data) prices. Hint: Use the condition function.
- (5 marks) Use Minitab to produce new Descriptive Statistics summaries of Price ($) for each type of housing with missing data removed. Which measures of central tendency and dispersion are the most appropriate to numerically summarise and compare the different selling prices for houses and units? For full marks, justify your choice of measures and interpret the corresponding values. Hint: Use the sort function and create a new sorted entire dataset.
- (4 marks) Use Minitab to plot the Normality curves for each housing type Price ($), briefly describe each plot, and determine whether the data follows a Normal distribution.
- (6 marks) Based on the knowledge obtained in part (g), use Minitab to calculate the probability that the Average Price ($) of a House property type will be less than $1,400,000.
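Note: the numerical summaries discussed in the parts above can be cross-checked outside Minitab. The following is a minimal Python sketch, assuming a small made-up list of (type, price) records rather than the contents of StKildaHousing.csv; the names `sales`, `robust_summary` and `summarise_by_type` are hypothetical. It drops the $0 (missing) prices, groups the rest by housing type, and reports the median, the interquartile range (IQR) and the usual 1.5 x IQR outlier fences, which are measures commonly preferred when a distribution is skewed or contains outliers. Quartile conventions vary between packages, so the values may differ slightly from Minitab's.

```python
from statistics import quantiles

# Hypothetical (type, price) records for illustration only.
# A price of 0 marks missing data, as described in the hint.
sales = [
    ("h", 1_200_000), ("h", 0), ("h", 1_500_000), ("h", 2_100_000),
    ("h", 1_350_000), ("u", 450_000), ("u", 0), ("u", 520_000),
    ("u", 610_000), ("u", 480_000), ("u", 1_900_000),
]


def robust_summary(prices):
    """Return count, median, IQR and 1.5*IQR outlier fences for a list of prices."""
    q1, q2, q3 = quantiles(prices, n=4)  # quartiles; q2 is the median
    iqr = q3 - q1
    return {
        "n": len(prices),
        "median": q2,
        "iqr": iqr,
        "lower_fence": q1 - 1.5 * iqr,
        "upper_fence": q3 + 1.5 * iqr,
    }


def summarise_by_type(records):
    """Group non-missing prices by housing type and summarise each group."""
    groups = {}
    for kind, price in records:
        if price > 0:  # exclude zero-dollar (missing) prices
            groups.setdefault(kind, []).append(price)
    return {kind: robust_summary(prices) for kind, prices in groups.items()}


summary = summarise_by_type(sales)
for kind, stats in summary.items():
    print(kind, stats)
```

Any price outside the two fences would be flagged as a potential outlier, which mirrors the rule a boxplot uses to draw individual points beyond the whiskers.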
Question 2 (30 marks) Normal Distribution & CLT
Generating Exam Scores. Recently, a cohort of 100 students of a university statistics course sat their final examination and their exam scores were recorded. The statistics lecturer is interested in analysing the data, calculating the probabilities for a student to achieve a certain score, and finding information regarding the average score for a cohort.
Data file: exam.csv
- (5 marks) Use Minitab to produce a histogram and the Descriptive Statistics for the exam scores. Describe the shape of the distribution and justify your answer.
- (6 marks) Calculate the probability that a student's exam score is greater than [...]. For full marks, show all your working out by hand except a Normality test, provide a correct probability statement, and include the Minitab output to verify your answer.
- (5 marks) If a student scores in the top 5% of the cohort, what is the minimum required exam score to get into the top 5% of the class? For full marks, show all your working out by hand, provide a correct probability statement, and include the Minitab output to verify your answer.
- (4 marks) By using Minitab, produce five random samples of size 30 by randomly selecting 30 values from the exam score dataset. For full marks, provide a screenshot of each sample.
- (4 marks) Use Minitab to produce the Descriptive Statistics for each sample. What are the parameters of the sampling distribution of exam scores based on your first sample? Briefly justify your answer.
- (6 marks) Calculate the probability that the average (mean) exam score is less than 75 based on your sampling distribution of the means from part (e). You can assume here that the cohort of 100 students is a [...]. For full marks, show all your working out by hand, provide a correct probability statement, and include all Minitab output supporting your answer.
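Note: the hand calculations requested in this question follow three standard Normal-distribution steps: an upper-tail probability, a percentile cut-off, and a probability for a sample mean using the standard error. The sketch below illustrates them with Python's standard library, assuming hypothetical values (mean 70, standard deviation 10, threshold 85) that are NOT taken from exam.csv; only the sample size of 30 and the mean threshold of 75 come from the question text. It is a way to check working, not a substitute for the Minitab output.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical population parameters, chosen only for illustration.
mu, sigma = 70.0, 10.0
scores = NormalDist(mu, sigma)

# 1. Upper-tail probability: P(X > x) = 1 - P(Z <= (x - mu) / sigma)
x = 85.0  # hypothetical threshold
z = (x - mu) / sigma
p_greater = 1 - NormalDist().cdf(z)

# 2. Minimum score for the top 5%: the 95th percentile of X
cutoff = scores.inv_cdf(0.95)

# 3. Sampling distribution of the mean for samples of size n:
#    the mean stays at mu, the standard error is sigma / sqrt(n)
n = 30
standard_error = sigma / sqrt(n)
sample_means = NormalDist(mu, standard_error)
p_mean_below = sample_means.cdf(75.0)  # P(sample mean < 75)

print(f"z = {z:.2f}, P(X > {x}) = {p_greater:.4f}")
print(f"Top 5% cut-off score = {cutoff:.2f}")
print(f"Standard error = {standard_error:.4f}, P(mean < 75) = {p_mean_below:.4f}")
```

In practice the population mean and standard deviation would be replaced by the estimates from the Descriptive Statistics output, and step 3 relies on the sample mean being approximately Normal.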
Question 3 (25 marks) Confidence Intervals
Analysing data on Fuel Efficiency: Fuel efficiency (in liters per 100 km) is one of the most important parameters we look at when choosing to buy a new car. A car reviewer is interested in finding out the average fuel efficiency on highways for modern cars with an engine capacity of 2L and reporting it to new buyers in his blog. He tests the efficiency for 79 cars while driving them on the same highway. Investigate and analyse the data obtained by the car reviewer to find out the average fuel consumption on highways for modern cars with an engine capacity of 2L.
Data file: fuel.csv
- (2 marks) Use Minitab to produce a histogram of the Fuel Consumption (L/100km).
- (2 marks) Use Minitab to produce Descriptive Statistics for the Fuel Consumption (L/100km). Identify the sample standard deviation.
- (4 marks) Comment on the shape of the distribution and test the distribution for Normality. Would it be reasonable to construct a confidence interval for the population mean fuel consumption based on the Normal distribution using this dataset? Explain your answer.
- (6 marks) Construct and interpret the 99% confidence interval for the population mean fuel consumption based on the sample data. All calculations should be done manually without using Minitab; however, you can use Minitab to visualise and verify the results.
- (4 marks) Repeat part (d) for the 95% confidence interval for the population mean fuel consumption.
- (2 marks) The car maker's website claims that the average (mean) Fuel Consumption for their 2L engine cars is 7.1 L per 100 km. Discuss whether this claim is valid using the 95% confidence interval and then the 99% confidence interval. What conclusions can be made?
- (5 marks) The car maker would prefer to update their claim and focus on specifying the margin of error to be within 75% of the standard deviation at 95% confidence. What would be the approximate required sample size to make this possible? Discuss this result.
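Note: the interval and sample-size calculations in this question share one formula, margin of error = critical value x s / sqrt(n). The sketch below is a minimal Python illustration, assuming a hypothetical sample mean and standard deviation (7.5 and 1.2, not taken from fuel.csv) and the Normal critical value as a large-sample approximation; an interval built with the t critical value, as Minitab's one-sample t procedure does, would be slightly wider. The function names are hypothetical. Only n = 79, the claimed 7.1 L/100km, and the 75% margin come from the question text.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided standard Normal critical value, e.g. about 1.96 for 0.95."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def mean_confidence_interval(xbar, s, n, confidence):
    """Approximate CI for a population mean: xbar +/- z * s / sqrt(n)."""
    margin = z_critical(confidence) * s / sqrt(n)
    return xbar - margin, xbar + margin


def required_sample_size(confidence, margin_in_sd_units):
    """Smallest n with z * s / sqrt(n) <= margin, margin given as a fraction of s."""
    return ceil((z_critical(confidence) / margin_in_sd_units) ** 2)


# Hypothetical sample summary; n = 79 is the number of cars in the question.
xbar, s, n = 7.5, 1.2, 79
ci95 = mean_confidence_interval(xbar, s, n, 0.95)
ci99 = mean_confidence_interval(xbar, s, n, 0.99)

claimed_mean = 7.1  # value claimed on the car maker's website
for label, (low, high) in (("95%", ci95), ("99%", ci99)):
    inside = low <= claimed_mean <= high
    print(f"{label} CI: ({low:.3f}, {high:.3f}); contains {claimed_mean}: {inside}")

# Margin of error equal to 75% of the standard deviation at 95% confidence.
print("Required n:", required_sample_size(0.95, 0.75))
```

Because the margin is expressed as a fraction of the standard deviation, the standard deviation cancels out of the sample-size formula, so the required n depends only on the confidence level and that fraction. For very small n, the Normal critical value understates the margin and a t-based calculation would be more appropriate.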