Planning and Creating Static Visualizations Assessment 2
James Cook University Data Visualization
INTRODUCTION
The raw data provided for analysis is shown below in table 1. It gives the marks attained by 8 students in 9 subjects. Three visualizations are created from it, each for a different audience and a different purpose.
TABLE 1
VISUALIZATION 1
Why
Target audience
The target audience for this visualization is the lecturers of the science department who teach mathematics to first-year students.
Intended meaning of visualization
This visualization is made for teachers to identify students who are weak in mathematics and to see which students are good, average, or below average, taking the average mark for mathematics to be 50%.
Assumptions required:
Data assumption:
The average mark for the subject under study is 50%.
The lecturer can understand or interpret statistical terms such as mean and range, percentages of marks, and how a normal or desired distribution of grades looks graphically.
The marks are assumed not to have been adjusted to meet the average.
Chart type assumption:
The teacher understands common chart types such as bar charts and pie charts.
Time assumption:
The teacher is not time poor and has at least 5 to 10 minutes to analyze the visualization and reach the desired conclusion about the students' marks in mathematics.
Intent:
To understand which students have poor marks and need attention relative to the subject average.
Desired response:
Rapid interpretation.
Special attention to students with below-average marks.
Appreciate students with good marks and ask them to help other students.
What
Subject \ Student | A | B | C | D | E | F | G | H
Mathematics | 80 | 67 | 71 | 90 | 34 | 66 | 76 | 45
TABLE 2
Name | Description | Type | Range of values | Derivation | Intent
Subject | Particular subject for which the exam was conducted | Categorical nominal | 9 subjects | Given data | To know the subject corresponding to each score
Student | Student whose scores are displayed | Categorical nominal | A to H | Given data | To know the student corresponding to each mark
Score | The marks scored in subjects | Numerical discrete | 23 to 96 | Given data | Marks scored, in percent, used to calculate and draw conclusions
Attention | An indicator to express concern | Categorical ordinal | Less attention, Medium attention, High attention | Computed by checking whether the score is high, above average, or below average, assuming the subject average to be 50% | To decide how much attention a student requires from the teacher
TABLE 3
The dataset used for analysis is a sample of 8 students [A, B, C, D, E, F, G, H] in one subject [Mathematics] out of the 9 subjects available in the raw data. The dataset is then sorted so that the visualization is clearer and quicker to understand.
The data attributes shown in the table are required to answer the questions. For further information about how the attributes were derived, please refer to table 1.
Attribute | Type
Subject | Categorical
Score | Numerical
Student | Categorical
Attention | Categorical
TABLE 4
The range, a measure of dispersion, is calculated from the above dataset, which contains the mathematics marks of the 8 students.
The range is calculated by subtracting the minimum mark from the maximum mark obtained by the students, to show the variation.
Range = 90 - 34 = 56. The range is high, which leads to the conclusion that there is high variation between the students of the same class.
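The range calculation above can be reproduced with a short Python sketch. The marks are the Mathematics row of table 2; the function name mark_range is only an illustrative choice and does not come from the assessment.

```python
# Mathematics marks for students A to H, copied from table 2.
marks = {"A": 80, "B": 67, "C": 71, "D": 90, "E": 34, "F": 66, "G": 76, "H": 45}


def mark_range(scores):
    """Return (maximum, minimum, range) for a collection of marks."""
    scores = list(scores)
    highest = max(scores)
    lowest = min(scores)
    return highest, lowest, highest - lowest


highest, lowest, spread = mark_range(marks.values())
print(f"Range = {highest} - {lowest} = {spread}")  # Range = 90 - 34 = 56
```

The range depends on only the two extreme marks, so with a cohort of 8 students a single very low or very high score can change it considerably.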
Limitations:
The table covers only a small sample and a small cohort.
The results are given as percentages, so the original marks attained are unknown.
How
VIZ 1
Three attributes are represented in the visualization: subject scores, students, and the attention required relative to the average (taken as 50).
Subject and average are treated as the primary attributes, as one is categorical and the other is numerical. The visualization used is a bar chart, which is simple and highly familiar to the audience. The bars should be wider than the white space between them. As you add more series of data, it becomes more difficult to focus on one at a time (Knaflic, 2015). Therefore, only one series is used in the above visualization, which reduces the cognitive load on the user. Clutter has also been reduced. The gestalt principles of continuity and similarity are applied: because the bottoms of the bars are aligned and the white space between them is consistent, the user can see that the bars start from a common position, so the x axis can be omitted, and the data labels make it possible to remove the axes and gridlines. The figure is ordered so that the students most in need of attention are located where they are most likely to be noticed.
The visualization shows which students have scored well and which have scored below average and need more attention and practice. It was assumed that the teachers know common chart types and what a chart of high or low marks normally looks like. The bar chart was chosen for its simplicity and because it gives a rapid idea of the data.
According to the visualization, students A, C, D, and G have good marks, well above the average of 50%, and therefore need little attention from the teacher. B and F need some attention because their results are only moderately above the average. H and E need the most attention from the lecturer because their scores are below the average of 50%. This is the conclusion a person viewing this visualization should reach.
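The grouping described above can be sketched in Python. The 50% average comes from the stated data assumption; the report does not give the cutoff that separates "good" from "some attention", so the value of 70 below is a hypothetical threshold chosen only because it reproduces the groups named in the text.

```python
# Mathematics marks for students A to H, copied from table 2.
marks = {"A": 80, "B": 67, "C": 71, "D": 90, "E": 34, "F": 66, "G": 76, "H": 45}

SUBJECT_AVERAGE = 50  # assumed subject average, as stated in the report
GOOD_MARK = 70        # hypothetical cutoff, NOT stated in the report


def attention_level(score, average=SUBJECT_AVERAGE, good=GOOD_MARK):
    """Map a percentage mark to the ordinal Attention attribute."""
    if score < average:
        return "High attention"
    if score < good:
        return "Medium attention"
    return "Less attention"


# Sort ascending so the students needing the most attention come first,
# mirroring the ordering used in the chart.
ranked = sorted(marks.items(), key=lambda item: item[1])
for student, score in ranked:
    print(f"{student}: {score} -> {attention_level(score)}")
```

With these thresholds E and H fall under high attention, B and F under medium attention, and A, C, D, and G under less attention. A different cutoff for good marks would move students between the last two groups.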
VISUALIZATION 2
WHY
Target Audience:
The target audience is the dean of the science stream of an institution.
Intended meaning of visualization
This visualization is made for the dean to see in which subjects students perform well and in which they perform poorly.
Assumptions required:
Data assumption:
The dean can understand or interpret statistical terms such as measures of central tendency (mean) and measures of dispersion (variance, standard deviation), and knows how a normal or desired distribution of grades looks graphically.
Chart type assumption:
The dean understands common chart types such as bar charts.
Time assumption:
The dean is time poor. He or she is usually very busy and has to arrive at the desired conclusion from the visualization in a very short span of time.
Intent:
To understand in which subjects the students perform poorly and in which they perform well, relative to the subject average and the deviation from the subject means.
Desired response:
Rapid interpretation, in line with the time assumption.
Check whether the lecturer taking the subject is qualified and able to maintain the standard of the institution.
Arrange the hours of teaching according to the performance of the students and the difficulty of the subject.
What
Subject | Average of subject
Database | 76.25
Java | 66.5
C++ | 76.5
Physics | 67.5
Mathematics | 66.125
English | 50
Chemistry | 45.5
Networks | 52.125
Security | 54.5
Name | Description | Type | Range of values | Derivation | Intent
Subject | Subject for which the exam was conducted | Categorical nominal | 9 subjects | Given data | To know the subject corresponding to each score
Student | Student whose scores are displayed | Categorical nominal | A to H | Given data | To know the student corresponding to each mark
Subject average | The average of the marks scored in each subject | Numerical discrete | 45.5 to 76.5 | Calculated | Average marks calculated to find the subjects in which students are strong and weak
Attention | An indicator to express concern | Categorical ordinal | Low attention, High attention | Computed by checking whether the average is high, low, or very low | To make a decision about a teacher or the hours of teaching
The dataset used for analysis consists of the 9 subjects, each with the average of the marks obtained by the 8 students in that subject. The dataset is then ordered so that the visualization is clearer and quicker to understand.
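As a rough illustration of how this dataset is derived and ordered, the Python sketch below recomputes the Mathematics average from the eight marks shown for visualization 1 and then sorts the subject averages from highest to lowest. Only the Mathematics marks appear in the text, so the other averages are copied from the subject average table rather than recalculated; the variable names are illustrative.

```python
from statistics import mean

# Mathematics marks for students A to H (the only per-student row in the text).
maths_marks = [80, 67, 71, 90, 34, 66, 76, 45]

# Subject averages as listed in the subject average table.
subject_averages = {
    "Database": 76.25, "Java": 66.5, "C++": 76.5, "Physics": 67.5,
    "Mathematics": 66.125, "English": 50, "Chemistry": 45.5,
    "Networks": 52.125, "Security": 54.5,
}

# The recomputed Mathematics average agrees with the tabulated 66.125.
maths_average = mean(maths_marks)
print(f"Mathematics average = {maths_average}")

# Order the subjects from highest to lowest average for plotting.
ordered = sorted(subject_averages.items(), key=lambda item: item[1], reverse=True)
for subject, average in ordered:
    print(f"{subject:<12}{average:>8}")
```

Sorting in descending order places C++ and Database first and Chemistry last.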
Limitations:
The main limitation is the very small sample size. Conclusions drawn from such a small sample might not hold for the population.
Another limitation is that no data on teaching hours is available to support a decision. Therefore, it may not be possible to say whether poor performance is due to an ineffective teacher, too few teaching hours, or the difficulty of the subject.
HOW
[Figure: horizontal bar chart of subject averages, listed as C++, Database, Physics, Java, Mathematics, Security, Networks, English, Chemistry; value axis from 0 to 90 with the average marked at 50]
VIZ 2
In this visualization we have used a horizontal bar chart. The visualization principles used here are like those used in the first visualization. Each subject is plotted against the average of the marks scored by the students in that subject. Color is used to draw the user's attention to the low-scoring subjects as soon as the chart is seen, because that is where the dean must make improvements. The other subjects have a good average score and can therefore be set aside.
In the figure, Chemistry and English have low average scores, whereas Networks has an average just above that of English and needs some more improvement. The dean can quickly go through the chart and arrive at the desired response.
VISUALIZATION 3
WHY
Target audience
The target audience is first-year science students.
Intended meaning of visualization
This visualization is made for students to know their marks in subjects.
Assumptions required:
Data assumption:
Pass mark for subject under study is 40%
High school student
The student can understand or interpret statistical terms such as mean, and percentages of marks.
Chart type assumption:
The student understands common chart types such as bar charts and pie charts.
Time assumption:
The student has plenty of time to analyze the visualization.
Intent:
To know the percentage of marks he or she has attained and to improve wherever necessary.
Desired response:
Work on subjects with low marks.
Apply for re-evaluation if the student believes the marks were assessed incorrectly.
Keep up the performance in subjects in which the student is already doing very well.
WHAT
Subject | Student D
Database | 78
Java | 83
C++ | 68
Physics | 72
Mathematics | 90
English | 48
Chemistry | 23
Networks | 23
Security | 46
The dataset used for analysis is a sample of the 9 subjects for one student [D] out of the 8 students available in the raw data. The dataset is then sorted to obtain an ordered visualization which can be easily comprehended.
Subject | Student D
Database | 78
Java | 83
C++ | 68
Physics | 72
Mathematics | 90
English | 48
Chemistry | 23
Networks | 23
Security | 46
Average | 59
Name | Description | Type | Range of values | Derivation | Intent
Subject | Particular subject for which the exam was conducted | Categorical nominal | 9 subjects | Given data | To know the subject corresponding to each score
Student | Student whose scores are displayed | Categorical nominal | D | Given data | To know the student corresponding to each mark
Score average | The average of the marks scored in each subject | Numerical discrete | 59 | Calculated | Average marks calculated to find the subjects in which students are strong and weak
Attention | An indicator to express concern | Categorical ordinal | Less attention, Medium attention, High attention | Computed by checking whether the average is high, low, or very low | To make a decision about a teacher or the hours of teaching
[Figure: horizontal bar chart of marks by subject, listed as Security, Networks, Chemistry, English, Mathematics, Physics, C++, Java, Database; value axis from 0 to 90; axis title: Subject]
HOW
VIZ 3
In this visualization we have used a horizontal bar chart. The visualization principles used here are like those used in the previous visualization. Each subject is plotted against the mark scored by the student in that subject.
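The figures behind this chart can be checked with a short Python sketch. It recomputes Student D's average from the nine subject marks and lists the subjects that fall below the 40% pass mark given in the data assumption. Comparing each mark with the pass mark is an illustrative reading of the "work on subjects with low marks" response, not a rule stated in the assessment.

```python
from statistics import mean

# Student D's marks by subject, copied from the Student D table.
student_d = {
    "Database": 78, "Java": 83, "C++": 68, "Physics": 72, "Mathematics": 90,
    "English": 48, "Chemistry": 23, "Networks": 23, "Security": 46,
}

PASS_MARK = 40  # pass mark stated in the data assumption

# 531 marks over 9 subjects gives the tabulated average of 59.
average = mean(list(student_d.values()))
print(f"Average = {average}")

# Subjects below the pass mark are the ones the student should work on first.
below_pass = sorted(s for s, mark in student_d.items() if mark < PASS_MARK)
print("Below pass mark:", ", ".join(below_pass))
```

Under these assumptions only Chemistry and Networks fall below the pass mark, while English and Security sit between the pass mark and the student's own average.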
REFERENCES:
(Munzner & Maguire, 2015) (Knaflic, 2015)