CSIP5203: Big Data Analytics Applications
Time Series Assignment (Coursework 1)
P-Number ONLY
(Word count)
Table of Contents
Abstract
Problem Description and Methodology
Activity 1
Exploratory Data Analysis
Time Series Visualization
Time Series Decomposition
Naive Method
Fit, forecast and visualization
Accuracy Metrics.
Comments
Average Historical method
Fit, forecast and visualization
Accuracy Metrics.
Comments
Activity 2
Time Series Decomposition
Simple Average Method
Fit, forecast and visualization
Accuracy Metrics.
Comments
Exponential Smoothing methods (Single Exponential Smoothing, Holt's Linear, Holt-Winters)
Fit, forecast and visualization
Accuracy Metrics.
Comments
Activity 3
Time Series Stationary test and Differencing
ACF and PACF
ARIMA
Fit, forecast and visualization
Accuracy Metrics.
Comments
SARIMA
Fit, forecast and visualization
Accuracy Metrics.
Comments
Conclusion
References and Bibliography
Other Resources
Appendix (if necessary)
Faculty of Computing, Engineering & Media (CEM) Coursework Brief 2024/25
Module name: Big Data Analytics Applications
Module code: CSIP5203
Title of the Assessment: Coursework 1
This coursework item is (delete as appropriate): Summative
This summative coursework will be marked anonymously (delete as appropriate): Yes
The learning outcomes that are assessed by this coursework are:
1. Conceptual understanding of various predictive analytics techniques.
2. Demonstrate self-direction and originality in analyzing time series data using appropriate predictive methods.
3. Critically evaluate data mining and machine learning algorithms in time series problems.
This coursework is (delete as appropriate): Individual
If other or mixed ... explain here:
You should normally receive feedback on your coursework no later than 15 University working days after the formal hand-in date, provided that you have met the submission deadline.
If for any reason this is not forthcoming by the due date your module leader will let you know why and when it can be expected. The Associate Professor Student Experience should be informed of any issues relating to the return of marked coursework and feedback.
Late submission of coursework policy: Late submissions will be processed in accordance with current regulations. Please check the regulations carefully to determine what late submission period is allowed for your programme.
Tasks to be undertaken: Coursework 1 (see assignment brief)
Deliverables to be submitted for assessment: Word processed report 3000 words max.
How the work will be marked: see marking scheme (appendix B).
Should you need any further information or advice please email
Big Data Analytics Application Coursework 1 Report Individual Assignment
Coursework 1 is an individual piece of assessment, requiring you to analyse a real Financial Time Series within Python, using the Time Series models and algorithms covered in the module, and detailing your results, interpretations, conclusions, and recommendations in a well-structured technical report. You are provided with:
- This
- Information about how to choose your real Financial Time Series from the website: https://www.nasdaq.com/
- Template Report in the Appendix
- The coursework marking grid in the Appendix
Important information
To help you produce this report in a timely manner, the report is built up over three weeks. You have an opportunity to modify your work in light of your own reflection.
You need to produce and deliver a report with conclusions and recommendations, which you will complete independently. The final report must contain a maximum of 3000 words, excluding the table of contents, diagrams, Python code and appendices (if necessary). You are provided with a template to help you build your report.
This type of assessment gives you an opportunity to improve your work over the term and reduces the stress of having to produce one piece of writing at the last minute.
Individual Data Set
You will each choose a unique TIME SERIES data set (monthly periodicity) personal to you. The time series data set MUST contain at least 60 monthly observations.
Please follow the instructions below to choose a VALID time series:
- Go to https://www.nasdaq.com/
- Click Market Activity
- Choose a time series by clicking on the desired Stock/Currency/Crypto
- Under Time Period, select 5Y
- Click Historical Quotes and Click
- Set Frequency of data as Monthly.
- Open the CSV file. Consider the Close Price column as your Time Series. Exclude the other columns. Save your CSV file.
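The last step can be sketched in Python as follows. This is a minimal illustration, not a prescribed method: it assumes the downloaded file has a `Date` column in MM/DD/YYYY format and a closing-price column named `Close/Last` with an optional `$` prefix, listed newest first. Check these assumptions against your own file. The sample rows are made up for illustration only, and `load_close_series` is a hypothetical helper name.

```python
import csv
import io
from datetime import datetime

# Hypothetical excerpt in the layout ASSUMED for the downloaded file.
# The values are made up for illustration only.
SAMPLE_CSV = """Date,Close/Last,Volume,Open,High,Low
03/01/2024,$12.50,1000,$12.00,$13.00,$11.50
02/01/2024,$11.75,1200,$11.00,$12.10,$10.90
01/01/2024,$10.25,900,$10.00,$10.80,$9.70
"""


def load_close_series(handle, date_col="Date", close_col="Close/Last"):
    """Return (date, close) pairs sorted from oldest to newest.

    Only the date and closing-price columns are kept; the others are ignored.
    """
    rows = []
    for record in csv.DictReader(handle):
        date = datetime.strptime(record[date_col].strip(), "%m/%d/%Y").date()
        close = float(record[close_col].replace("$", "").replace(",", "").strip())
        rows.append((date, close))
    rows.sort(key=lambda pair: pair[0])  # chronological order for time series work
    return rows


series = load_close_series(io.StringIO(SAMPLE_CSV))
print("first:", series[0], "last:", series[-1])

MIN_OBSERVATIONS = 60  # minimum series length required by the brief
print("enough observations:", len(series) >= MIN_OBSERVATIONS)
```

To read a real download instead of the sample string, open the file (for example `open("my_series.csv", newline="")`, where the file name is a placeholder) and pass the handle to `load_close_series`.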
IMPORTANT: You need to send the dataset (CSV file) to my email (mubeen.ghafoor@dmu.ac.uk) by 13/12/2024 for my approval with the following information in the email body: your P-number, email, name of the Time series, period (from/to).
Submission
You will need to submit a copy of your report using the Learning Zone link in the assessments section of the module shell on Learning Zone (to be made available prior to the coursework deadline).
Activities and Report
The analysis you are conducting will represent the use of the Time Series techniques covered in the module in the lab sessions. You need to produce a technical, well-structured, comprehensive but concise report. The report should be structured into THREE activities:
Activity 1
- Describe the problem, including the time series methods that will be applied, the forecast metrics and the forecast horizon (I recommend between 6 and 12 months).
- Describe your Time Series and make appropriate use of exploratory visualization techniques and descriptive statistics to present and explore the data.
- Decompose the time series into level, trend, seasonal and residual components. Evaluate the additive and multiplicative effects.
- Produce a forecast for the chosen period using the Naive method and the Average Historical method.
- Compare the predictive performance of both methods using suitable accuracy metrics.
- Interpret the results.
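The two baseline forecasts and their comparison can be sketched in plain Python as below. This is an illustrative sketch under stated assumptions, not the required implementation: the price list is made up, the hold-out length `HORIZON` is a hypothetical choice, and MAE, RMSE and MAPE are used as examples of suitable accuracy metrics (the brief does not name specific ones).

```python
import math

# Made-up monthly closing prices, oldest first (illustration only, not real data).
closes = [10.0, 12.0, 11.0, 13.0, 14.0, 15.0, 13.0, 16.0]

HORIZON = 2  # hypothetical hold-out length; the brief recommends 6 to 12 months
train, test = closes[:-HORIZON], closes[-HORIZON:]


def naive_forecast(history, steps):
    """Naive method: every future value equals the last observed value."""
    return [history[-1]] * steps


def historical_average_forecast(history, steps):
    """Average Historical method: every future value equals the mean of the history."""
    return [sum(history) / len(history)] * steps


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error; assumes no actual value is zero."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


forecasts = {
    "naive": naive_forecast(train, HORIZON),
    "historical average": historical_average_forecast(train, HORIZON),
}
for name, predicted in forecasts.items():
    print(f"{name}: MAE={mae(test, predicted):.3f} "
          f"RMSE={rmse(test, predicted):.3f} MAPE={mape(test, predicted):.2f}%")
```

Holding out the final observations as a test set is one common way to compare methods; whichever split you use, apply the same split and the same metrics to every method so the comparison is fair.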
Activity 2
- Decompose the time series into level, trend, seasonal and residual components. Evaluate the additive and multiplicative effects.
- Produce a forecast for the chosen period using the Simple Average Method and the 3 variants of the Exponential Smoothing methods (automatically optimize the values of the parameters).
- Compare the predictive performance of the methods using suitable accuracy metrics.
- Interpret the results.
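To make "automatically optimize the values of the parameters" concrete, the sketch below implements Single Exponential Smoothing by hand and picks the smoothing parameter alpha with a simple grid search that minimises the in-sample sum of squared one-step-ahead errors. The grid search is a stand-in chosen for transparency, not the method the module prescribes; in practice a library such as statsmodels (`SimpleExpSmoothing`, `Holt` and `ExponentialSmoothing` in `statsmodels.tsa.holtwinters`) can estimate the parameters when fitting. The function names and the data are hypothetical.

```python
def ses_sse(history, alpha):
    """Run simple exponential smoothing over the history.

    Returns (sum of squared one-step-ahead errors, final level).
    The level is initialised with the first observation (an assumption;
    other initialisations exist).
    """
    level = history[0]
    sse = 0.0
    for value in history[1:]:
        error = value - level          # the one-step forecast is the current level
        sse += error ** 2
        level = level + alpha * error  # same as alpha*value + (1 - alpha)*level
    return sse, level


def fit_ses(history, grid_size=100):
    """Grid-search alpha in (0, 1] and return (best alpha, final level)."""
    best_alpha, best_sse, best_level = None, float("inf"), None
    for i in range(1, grid_size + 1):
        alpha = i / grid_size
        sse, level = ses_sse(history, alpha)
        if sse < best_sse:
            best_alpha, best_sse, best_level = alpha, sse, level
    return best_alpha, best_level


def ses_forecast(history, steps, grid_size=100):
    """Flat forecast: SES projects the final level for every future step."""
    alpha, level = fit_ses(history, grid_size)
    return alpha, [level] * steps


# Made-up monthly closing prices, oldest first (illustration only).
closes = [10.0, 12.0, 11.0, 13.0, 14.0, 15.0, 13.0, 16.0]
alpha, forecast = ses_forecast(closes[:-2], steps=2)
print("chosen alpha:", alpha)
print("forecast:", forecast)
```

Holt's Linear and Holt-Winters extend the same recursion with a trend equation and a seasonal equation respectively, each with its own smoothing parameter to optimise.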
Activity 3
- Run a Dickey-Fuller test to check whether the series is stationary. Difference the series if necessary until it becomes stationary.
- Build the ACF and PACF plots and choose adequate values for the AR term (p) and the MA term (q). Find the best ARIMA model using the AIC criterion.
- Produce the model summary and analyse the model.
- Produce a forecast for the chosen period using the selected ARIMA model.
- Repeat the four steps above, (a)-(d), for the SARIMA model.
- Compare the predictive performance of both methods using suitable accuracy metrics.
- Overall conclusions
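The differencing and ACF steps can be illustrated with a small plain-Python sketch. It does not implement the Dickey-Fuller test or fit an ARIMA model; for those, statsmodels provides `adfuller` (in `statsmodels.tsa.stattools`) and `ARIMA` (in `statsmodels.tsa.arima.model`). The sketch only shows what differencing does and how sample autocorrelations are computed and compared with the approximate 95% bound of 1.96/sqrt(n). The data and function names are made up for illustration.

```python
import math


def difference(series, lag=1):
    """Return the lag-differenced series: y[t] - y[t - lag]."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag.

    Assumes the series is not constant (otherwise the denominator is zero).
    """
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


# Made-up monthly closing prices, oldest first (illustration only).
closes = [10.0, 12.0, 11.0, 13.0, 14.0, 15.0, 13.0, 16.0]

diffed = difference(closes)            # first difference (d = 1)
bound = 1.96 / math.sqrt(len(diffed))  # approximate 95% significance bound

for lag, value in enumerate(acf(diffed, max_lag=3)):
    flag = "significant" if lag > 0 and abs(value) > bound else ""
    print(f"lag {lag}: {value:+.3f} {flag}")
```

Lags whose autocorrelation falls outside the bound are candidates when choosing q from the ACF plot; the PACF plays the same role for p. With a series this short the bound is wide, so real analysis needs the full data set of at least 60 observations.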
Develop and integrate ALL the activities into a full technical report. Include the Python code used to create the outputs (plots, time series models, tables, etc).
Notes
Report Guidance:
Your contribution to the report should be no longer than 3000 words and use a minimum font size of 12. You are given a report template, with the first steps of the work and a recommended structure, and a table of contents. You are free to modify the layout to suit your own style. The aim of the template is to provide you guidance as to the level of presentation that is expected in a technical report.
It's important to include a brief description of models, appropriate justification for some models and model limitations, data insights, and analysis supported by appropriate tables, metrics and charts. Reports are expected to be written to a professional standard: clear and concise, with text supported by a relevant choice of plots and tables to summarise data, no repetition or redundancy, appropriate use of appendices, a table of contents, page numbers, table and figure numbering, and an informative abstract. All plots, code and outputs must be legible and appropriately labelled. You will be penalized for inappropriate use of techniques that you cannot adequately justify or explain.