LD7084 Database and Analytics Principles
- Subject Code :
LD7084
- Scenario
Newbury Business School (NBS) provides a new blended course to improve students English skills in speaking, reading and writing. This course uses interactive multimedia activities and targets people interested in learning English, German, French and Spanish languages around the world. NBS has 10 campuses around the world including London, Hongkong, Paris, Berlin, Rome, Dubai, Copenhagen, Oslo, Ankara and Cairo.
Since the college intends to deliver all classes in blended model (i.e. online and in campus), hence they are looking to develop an effective Relational Database Management System (RDBMS) and Data Warehouse to cater the needs of their growing educational business.
As a data analyst, you have been asked to develop a DB system for NBS, which should satisfy the following information requirements:
- The database should contain information about campuses, learners, tutors, managers, enrollment, fees, learning resources and related activities. For each learner, the database should at least store the learner id, last name, first name, email, age, gender, learning style (Fulltime or parttime) and phone number.
- For each tutor, the database should at least store their name, salary, type, email address and qualifications.
- Each tutor is assigned a maximum of five classes each term. Each class has a starting date, ending date and size, where minimum class size is 5 students and maximum size should be 20 students.
- Each campus runs four semesters fulltime each year and two parttime semesters each year.
- Student fees is 300$ per semester for Fulltime and 150$ for parttime. A learner who enrolls into more than one course per year gets 20?es deduction.
- Each course leader manages at least one tutor and maximum 10 tutors, whereas each tutor supervises minimum 5 students and maximum 20 students in each class.
- The college provides four different types of learning resources including videos, audios, PowerPoint presentations and reading text. Each learning resource should have relevant data such as a unique id, type, duration, author, date of issue etc.
- Each learning resource will be connected with a set of learning activities such as drag and drop, fill in the missing, write full answers, sort order of words or sentences etc. Each activity should have an id, issue date, author, marks etc.
- For each enrollment, it is required to store enrollment data such as learner id, course duration, fees, learner address and contact details etc. Each student can be enrolled only into one course at once.
Relevant assumptions can be made, if required.
Section 1 - Database (35%) (Learning Outcome 1 & 2)
- Using Crows foot notation, design a logical data model (ERD) forNewbury Business Schooldatabase in a professional modelling tool of your choice such as starUML or draw.io. All entity types, attributes and the relationships between them must be labelled clearly. Your design should show all cardinality and participation constraints accurately. Discuss and justify your design choices in detail.
- Produce a script using appropriate SQL commands (DDL) to create a set of database tables using MySQL relational database. Your tables should show all constraints applied at either the column or the table level. You are expected to implement the database so that it must meets the requirements described in the scenario.Populate all the tables in the database you created with some meaningful data (At least 5 records in each table).
- As a Data Analyst, produce 5 DML statements that will provide valuable insights for Newbury Business School. Your DML statements should contain at least one of the following:
- A statement involving a self-join.
- A statement involving an equi-join.
- A statement involving at least one group function
- A statement involving at least one subquery.
- A statement involving null values.
Section 2 - Data warehouse (20%) (Learning Outcome 1 & 2)
The management of Newbury Business School wants to gain competitive advantages over competitors and proposed to invest in latest tools and technologies that will enable them to make strategic as well as management decision. One of the suggestions is to build a Data Warehouse.
- Using Kimballs four step dimensional design process, draw/design either a star schema or a snowflake schema, which will help the management understand the business revenue and recruitments across the different countries in the world.
- Using the concept of a Data Cube, show all possible combination of revenue insight that can be derived from the Warehouse. Justify your Modelling choice.
Section 3 Business Intelligence (Tableau) (30%) (Learning Outcome 2 & 3)
You are hired as a Data Analyst by one of the leading supermarket in the US and provided with the Dataset in Appendix 1. You must use Tableau to analyse the data set provided and find answers to the following:
- Using Tableau, explain what data integrity checks you will perform on each of the columns in the Dataset. [3 marks]
- Given the item ordered comes with 60 days warranty from the date it was ordered. Use Tableau to calculate the Warranty End date. [3 marks]
- Using an appropriate chart, display the total sales made in each City. Order the result by the City with the highest sales. [3 marks]
- Using Tableau, display all cities that generated more than $3,000,000 for the year 2019. [3 marks]
- Using Tableau, display the top 5 selling products for the year 2019. Your chart should also show the quantity of products sold. [3 marks]
- Using Tableau, display the total sales made in each month for the year 2019. [3 marks]
- Given that the profit on each item is set at 30%. Use an appropriate chart to display the total sales and total profit made by each product for the year 2019. [3 marks]
- Using tableau, find the maximum sales generated in Boston, Los Angeles, Seattle and New York. [3 marks]
- Build a dashboard to display the results obtained in Number 3, 4, 5 and 6. [6 marks]
For questions 3 8, each answer should be on a separate worksheet with a chart, graph or table that shows your finding.
Section 4 Critical Reflection (15%) (Learning Outcome 4)
The manager of NBS management system has decided to create a social network dedicated for staff and students to share their experiences in learning new languages.
This social network will present a new digital platform and discussion forum to be developed on Web and smart phone devices. It will be used by the management as well to analyze data in order to improve their services to their students.
You are asked to investigate whether it is better to use a Relational Database or a NOSQL database for the new proposed social network. You are required to write a report to explain the factors that will influence your choice of database and why you will be recommending one type of database over the other.