CC5067 Smart Data Discovery Assignment
- Subject Code: CC5067
- University: London Metropolitan University
- Country: Australia
Part 1 [40 Marks total]
This task will test your ability to find relevant data, cleanse and transform it, and build a dimensional database or apply any other data mining technique to analyse it. Using what you have learned in this module, apply the Data Analysis Lifecycle steps to complete the tasks below. Students are expected to use the software taught in this module, but this is optional. Students who decide to use other tools need to prepare a ‘Walkthrough’ or ‘Readme’ to explain how the solution works.
Old cities like London have narrow streets. Although these streets are not even wide enough to handle cars, buses and pedestrians, the use of scooters, bicycles, etc. has increased. The purpose of this coursework is to find out whether this is affecting road safety. Make sure to section your coursework methodology according to the Data Analysis Lifecycle.
- Find data relevant to Road Safety and describe it. Hint: search for Road Safety Data from data.gov.uk and download casualties, vehicles and accidents [5 marks]
- Import the data into your preferred environment and complete the following:
- Apply data profiling, check for errors, inconsistencies and correct them accordingly. [15 marks]
- Design a model to answer the questions in Part 2 [20 marks]
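As an illustration of the data profiling step, here is a minimal sketch that counts missing and distinct values per column. It uses only the Python standard library and is not a required approach. The column names, the sample rows, and the choice of empty strings and "-1" as missing-value markers are all assumptions made for this example; check the guide published with the actual dataset before relying on them.

```python
import csv
import io
from collections import Counter

# Assumption: empty strings and "-1" mark missing values in this sketch.
MISSING_MARKERS = {"", "-1"}


def profile_rows(rows):
    """Return, per column, the number of missing values and distinct values."""
    missing = Counter()
    distinct = {}
    for row in rows:
        for column, value in row.items():
            value = (value or "").strip()
            if value in MISSING_MARKERS:
                missing[column] += 1
            else:
                distinct.setdefault(column, set()).add(value)
    columns = set(missing) | set(distinct)
    return {
        column: {
            "missing": missing[column],
            "distinct": len(distinct.get(column, ())),
        }
        for column in sorted(columns)
    }


# Made-up sample standing in for a downloaded CSV (hypothetical columns).
SAMPLE_CSV = """accident_index,accident_severity,sex_of_driver
A1,3,1
A2,3,-1
A3,,2
A4,2,1
"""

sample_rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
profile = profile_rows(sample_rows)
for column, stats in profile.items():
    print(column, stats)
```

For a real file, the same function could be fed from `csv.DictReader(open(path, newline=""))`. Columns with many missing values or an unexpected number of distinct codes are candidates for closer inspection and correction.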
Part 2 [20 Marks total]
Answer the questions below using one of the methods learned in this module:
- Which accident severity type occurs the most? Do we know what it is labelled as? [5 marks]
- Is accident severity affected by vehicle type? If yes, which vehicle type? If no, why? [5 marks]
- Which gender is involved in more accidents? What are the top 5 most used vehicle types for that gender? Are the ‘most used vehicles’ the same for the other genders? [5 marks]
- List 5 more questions to answer and write down the answers. [5 marks]
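Questions of this kind usually reduce to frequency counts and cross-tabulations. The sketch below shows one possible way to do both with the Python standard library. The records, field names and codes are invented for illustration and do not come from the real dataset; it assumes the accident and vehicle tables have already been joined.

```python
from collections import Counter, defaultdict

# Made-up records, assumed already joined on the accident identifier.
# Field names and codes are hypothetical.
records = [
    {"severity": "3", "vehicle_type": "car"},
    {"severity": "3", "vehicle_type": "car"},
    {"severity": "2", "vehicle_type": "pedal cycle"},
    {"severity": "3", "vehicle_type": "pedal cycle"},
    {"severity": "1", "vehicle_type": "motorcycle"},
]


def most_common_severity(rows):
    """Return (severity_code, count) for the most frequent severity."""
    counts = Counter(row["severity"] for row in rows)
    return counts.most_common(1)[0]


def severity_share_by_vehicle(rows):
    """Cross-tabulate severity by vehicle type as within-vehicle proportions."""
    table = defaultdict(Counter)
    for row in rows:
        table[row["vehicle_type"]][row["severity"]] += 1
    shares = {}
    for vehicle, counts in table.items():
        total = sum(counts.values())
        shares[vehicle] = {sev: n / total for sev, n in counts.items()}
    return shares


top_severity = most_common_severity(records)
shares = severity_share_by_vehicle(records)
print(top_severity)
for vehicle, dist in shares.items():
    print(vehicle, dist)
```

Comparing proportions within each vehicle type, rather than raw counts, avoids the most common vehicle type dominating every comparison. The result is descriptive only; deciding whether severity is really affected by vehicle type would still need one of the methods taught in the module.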
Part 3 [30 Marks total]
Add the items below to your report, with captions and explanations of where they belong in Part 2.
- Share all the visualisations created/code written to answer the questions in Part 2. [10 marks]
- Download the embedded CSV files from here ( ) and answer these questions:
- If you had these in Part 1 and Part 2, would you change your design in Part 1 B.?
- Would you answer any questions from Part 2 differently? Why? [5 marks]
- Given these 2015 datasets ( ), prove or disprove the ‘White-Van Man’ (https://en.wikipedia.org/wiki/White_van_man) stereotype. [15 marks]
Report Structure [10 Marks total]
You are required to write an academic report. Make sure you follow the structure below:
- Cover Page
- Table of Contents
- Introduction
- Methodology and Design (Part 1)
- Findings (Answers to Part 2 and 3)