Deadline: Hand-in by afternoon Sunday, 20 April 2024.
Purpose: Implement the entire data science/analytics workflow. Learn to correctly apply and reason about different machine learning techniques to solve real-world problems. Gain skills in extracting data from the web using APIs and web scraping. Build on the data wrangling, data visualization and introductory data analysis skills gained up to this point, as well as problem formulation and presentation of findings. This project covers learning outcomes 1 - 5 from the course outline.
Project Requirements:
Project details: Each student should aim to create a unique and distinctive data problem to work on, made original by combining different data sources. The goal of the project is to perform predictive analysis.
Questions to consider in your experiments and tasks to perform once you have chosen your domain:
- Build multiple regression and kNN models and compare their outputs.
- Experiment with models using different features. Which features are most effective? Why?
- Experiment with kNN using different distance metrics and different values of k, and compare the outputs. Which values of k are most robust for the size of your dataset and your problem domain? Do variables on different scales affect the algorithm's accuracy? How have you tried to overcome this?
- Experiment with linear, multiple linear and polynomial regression models and compare them. At what point does a regression model become too complex and no longer capture the true relationships in the data?
- How reliable are your prediction models? What do the confidence intervals and prediction bands tell you? Could you recommend this predictive model to a client? Would you expect this model to preserve its accuracy on data beyond the range it was built on?
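As a starting point, the comparisons asked for in the questions above could be set up along the following lines. This is only a minimal sketch, assuming scikit-learn and NumPy are installed. The data is synthetic and invented purely for illustration (two features on very different scales), and the names cv_rmse and results are hypothetical helpers, not part of the brief. Your own project should substitute the dataset you have collected.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic stand-in data (an assumption for illustration only): two features
# on very different scales and a mildly nonlinear target.
rng = np.random.default_rng(0)
n = 300
x_small = rng.uniform(0, 1, n)        # e.g. a ratio between 0 and 1
x_large = rng.uniform(0, 10_000, n)   # e.g. a price-like quantity
X = np.column_stack([x_small, x_large])
y = 5 * x_small + 3 * (x_large / 10_000) ** 2 + rng.normal(0, 0.2, n)

# The same shuffled 5-fold split is reused so every model sees identical folds.
cv = KFold(n_splits=5, shuffle=True, random_state=0)


def cv_rmse(model):
    """Mean cross-validated RMSE over the folds (lower is better)."""
    scores = cross_val_score(
        model, X, y, cv=cv, scoring="neg_root_mean_squared_error"
    )
    return -scores.mean()


results = {}

# Multiple linear regression on both features.
results["multiple linear regression"] = cv_rmse(LinearRegression())

# Polynomial regression of increasing complexity.
for degree in (2, 3, 8):
    poly = make_pipeline(
        StandardScaler(), PolynomialFeatures(degree), LinearRegression()
    )
    results[f"polynomial regression, degree={degree}"] = cv_rmse(poly)

# kNN without scaling: distances are dominated by the large-scale feature.
results["kNN k=5, unscaled"] = cv_rmse(KNeighborsRegressor(n_neighbors=5))

# kNN with standardised features, across distance metrics and values of k.
for metric in ("euclidean", "manhattan"):
    for k in (1, 5, 15, 50):
        knn = make_pipeline(
            StandardScaler(), KNeighborsRegressor(n_neighbors=k, metric=metric)
        )
        results[f"kNN k={k}, {metric}, scaled"] = cv_rmse(knn)

# Print all models from lowest to highest cross-validated error.
for name, rmse in sorted(results.items(), key=lambda item: item[1]):
    print(f"{rmse:.3f}  {name}")
```

Cross-validation is used here rather than a single train/test split so that the ranking of models depends less on one lucky partition. Placing StandardScaler inside the pipeline means the scaling is refit on each training fold, which avoids leaking information from the held-out fold.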
You may install and use any additional Python packages that will help you with this project. When submitting your project, include a README file that lists the additional Python packages you have installed, so that I can install any extra modules needed to reproduce your project on my computer.
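One possible way to produce the package list for the README is sketched below, using only the standard library (importlib.metadata, available from Python 3.8). The PACKAGES list and the requirement_lines helper are hypothetical. Replace the list with the packages your project actually imports.

```python
from importlib.metadata import PackageNotFoundError, version

# Hypothetical list: replace with the packages your project actually uses.
PACKAGES = ["numpy", "pandas", "scikit-learn", "matplotlib", "requests"]


def requirement_lines(packages):
    """Return 'name==version' for each installed package, or a note if missing."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={version(name)}")
        except PackageNotFoundError:
            lines.append(f"# {name}: not installed in this environment")
    return lines


# Copy the printed lines into the README (or a requirements file).
print("\n".join(requirement_lines(PACKAGES)))
```

Pinning exact versions in this way makes it more likely that the project runs the same on another machine.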