Big Data Small Project using R Language Research Project
The work will combine research on a specific problem or technique related to Big Data processing, implementation of a computing solution, and presentation of the results. There is an initial set of proposed topics. However, you are allowed (and encouraged) to propose alternative small projects, or variants, which must be discussed with the lecturers. The expected work includes researching the chosen topic, selecting and implementing a computing solution based on technologies covered in the lectures, such as Map/Reduce, MongoDB and data analytics techniques, and obtaining the desired results from the processed data. Each topic provides only general guidelines on what the coursework will consist of. You are expected to take these guidelines and put forward a specific proposal of what you aim to achieve in the project.
Music Recommendation System
Music recommendation systems are becoming a hot topic due to the growing number of online listeners on services like Spotify. Recommending relevant songs to users, and predicting which songs a particular user will like, is a valuable feature for any music application. You are to develop a music recommendation system based on the Million Song Dataset.
Dataset: http://labrosa.ee.columbia.edu/millionsong/pages/getting-dataset
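As one possible starting point, the sketch below shows item-based collaborative filtering on a tiny, made-up play-count table. The column names (`user`, `song`, `plays`), the toy values, and the helper `recommend_songs` are illustrative assumptions. The real Million Song Dataset files have their own layout and are far larger, so this shows the idea only and is not a prescribed method.

```r
# Toy play counts: hypothetical data, NOT taken from the Million Song Dataset.
plays <- data.frame(
  user  = c("u1", "u1", "u2", "u2", "u2", "u3", "u3"),
  song  = c("s1", "s2", "s1", "s2", "s3", "s2", "s3"),
  plays = c(5, 3, 4, 2, 6, 1, 7),
  stringsAsFactors = FALSE
)

# Build a user x song matrix; pairs with no plays stay at 0.
users <- sort(unique(plays$user))
songs <- sort(unique(plays$song))
ratings <- matrix(0, nrow = length(users), ncol = length(songs),
                  dimnames = list(users, songs))
ratings[cbind(plays$user, plays$song)] <- plays$plays

# Cosine similarity between song columns.
norms <- sqrt(colSums(ratings^2))
sim <- crossprod(ratings) / outer(norms, norms)
diag(sim) <- 0  # a song should not recommend itself

# Hypothetical helper: score each unplayed song by its similarity
# to the songs the user has already played, weighted by play count.
recommend_songs <- function(user, n = 1) {
  listened <- ratings[user, ]
  scores <- as.vector(sim %*% listened)
  names(scores) <- colnames(ratings)
  scores <- scores[listened == 0]  # keep only songs not yet played
  names(head(sort(scores, decreasing = TRUE), n))
}

print(recommend_songs("u1"))
```

On data of realistic size the dense matrix above would not fit in memory, which is where the lecture technologies (for example Map/Reduce for computing the co-occurrence counts) would come in.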
Predict short term movements in stock prices
The basic assumption is that a stock price depends largely on both inside and outside factors. Inside factors include company performance (earnings and profits) and company news (introducing new products, securing a large new contract, etc.). Outside factors include industry performance, investor sentiment (bull or bear market, news sentiment) and the economic environment (interest rates, economic outlook, inflation, etc.).
Dataset: https://www.kaggle.com/c/battlefin-s-big-data-combine-forecasting-challenge
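A minimal sketch of one way to frame short-term prediction is given below: a logistic regression that estimates the probability of an upward move from the previous period's return. The price series is simulated (an assumption made purely for illustration, not data from the Kaggle challenge), and the variable names `up`, `lag1` and `prob_up` are hypothetical. Outside factors such as a sentiment score could be added as further columns of the same data frame.

```r
# Simulated price path: illustrative only, NOT real market data.
set.seed(42)
price <- 100 * cumprod(1 + rnorm(200, mean = 0, sd = 0.01))

# Log returns between consecutive periods.
ret <- diff(log(price))
n <- length(ret)

# Response: did the price go up in period t?
# Predictor: the return observed in period t - 1.
d <- data.frame(
  up   = as.integer(ret[-1] > 0),
  lag1 = ret[-n]
)

# Logistic regression of direction on the lagged return.
fit <- glm(up ~ lag1, data = d, family = binomial)

# Estimated probability that the next move is upward,
# given the most recent observed return.
prob_up <- predict(fit, newdata = data.frame(lag1 = ret[n]),
                   type = "response")
print(prob_up)
```

Because the simulated returns are independent noise, the fitted model should have little predictive power here; the point is the data layout, not the result.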
Twitter to predict the next best restaurant
Yelp has a dataset that includes restaurant rankings and reviews. One idea for this project is to use tweets to predict restaurant star ratings. This would enable you to combine Yelp data with Twitter data.
Dataset: https://www.yelp.com/academic_dataset
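Combining the two sources could look roughly like the sketch below: each tweet gets a crude word-list sentiment score, scores are averaged per restaurant, and the result is merged with the star ratings. The restaurant names, tweets, word lists and column names (`business`, `stars`, `text`, `sentiment`) are all made up for illustration; they are assumptions and do not reflect the actual field names in the Yelp dataset or the Twitter API.

```r
# Hypothetical star ratings, standing in for the Yelp data.
yelp <- data.frame(
  business = c("Cafe A", "Bistro B", "Diner C"),
  stars    = c(4.5, 3.0, 2.0),
  stringsAsFactors = FALSE
)

# Hypothetical tweets already matched to a restaurant.
tweets <- data.frame(
  business = c("Cafe A", "Cafe A", "Bistro B", "Diner C", "Diner C"),
  text     = c("great coffee, love it", "good brunch", "food was ok",
               "bad service", "terrible and awful burger"),
  stringsAsFactors = FALSE
)

# Tiny illustrative word lists; a real project would use a proper lexicon.
positive <- c("great", "love", "good")
negative <- c("bad", "terrible", "awful")

# Score = number of positive words minus number of negative words.
score_tweet <- function(text) {
  clean <- tolower(gsub("[^A-Za-z ]", "", text))
  words <- strsplit(clean, " +")[[1]]
  sum(words %in% positive) - sum(words %in% negative)
}

tweets$sentiment <- vapply(tweets$text, score_tweet, numeric(1),
                           USE.NAMES = FALSE)

# Average sentiment per restaurant, then join with the star ratings.
mean_sent <- aggregate(sentiment ~ business, data = tweets, FUN = mean)
combined  <- merge(yelp, mean_sent, by = "business")
print(combined)
```

With real data, the combined table would be the input to whichever predictive model you choose. Matching tweets to restaurants is the harder problem and is not addressed here.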
- Have you provided a context for the project? Have you provided a description of the data?
- Have you loaded the data? Have you explored/processed the data? Have you provided script(s) for pre-analysis?
- Have you identified the objective of the analysis and the technique to be used?
- How are you presenting the results of the analysis?