BMS3: Data Analysis Report ICA 2023-24
*Deadline is 13:00 on Tuesday 21 November 2023*
Read these instructions carefully.
This ICA consists of four questions (total 100 marks): Question 1 (50 marks), Question 2 (30 marks), Question 3 (10 marks) and Question 4 (10 marks).
Think carefully about how to present information. An excellent data analysis report will combine:
(1) Clear, concise and accurate presentation.
(2) Appropriate analysis and reporting of data.
(3) Overall coherence = making it easy for the reader to grasp the salient points and conclusions (please see additional point below).
Overall coherence is integral to this assessment. You will need to make decisions about what material needs to be included, how it is presented and explained, and what is peripheral and should not be included because it degrades clarity.
*Question 1 has no page limit or word limit.* However, part of your Q1 mark will assess your ability to communicate clearly and concisely, so your answer should be only as long as necessary. We do not set a page limit primarily to provide flexibility with reporting figures (for example, residual plots).
*Question 2 must not exceed a single side of A4*
*Question 3 has a word limit of 50 words*
*Question 4 has a word limit of 200 words*
*All text must be in 12-point font*
IMPORTANT: Include your exam number in the name of the document that you upload to Learn. Note: we highly recommend that you submit your document in pdf format.
Question 1 (50 marks):
For this question, we ask you to prepare a mini-report that allows you to use the skills you have learned in BMS3. You may additionally draw from BMS2 if you wish, by using a t-test or randomization test to analyze your data.
Your task is to (1) think of a question or choose a question from the list below, (2) collect appropriate data to answer the question, (3) analyze the data, (4) draw a conclusion, and (5) communicate your work in a mini-report.
If you want to proceed with your own question, please note the following: (i) your question need not be biological in nature; (ii) before proceeding you must obtain approval from Crispin Jordan (to ensure the question raises no ethical concerns); (iii) to avoid ethical concerns, data collection cannot involve interaction (e.g. speaking) with other animals, including humans - however, observation of animals (including humans) is allowed.
Or please choose a question from the list below:
Is whether a dog-walker has curly hair associated with whether their dog also has curly hair?
Does the probability of a dog being on a lead or not differ among regions (e.g. various parks) of Edinburgh?
Does the average sugar content per serving differ among cereals presented on the top, middle, vs. bottom shelves of supermarkets?
Does the average (estimated) price of a car parked on the street differ between regions of Edinburgh?
Does the average price of groceries differ between stores?
Does a difference in average price of groceries between stores depend on the type of item (e.g. dairy vs. baking products)?
Does the average size (e.g. circumference) of a tree's trunk differ between parks?
How does the height of a tree vary with the size of its trunk? (There are ways using trigonometry to estimate height by measuring the distance from a tree and the angle required to look at the top of the tree.)
Does the relationship between a tree's height and the size of its trunk differ between species?
How does the amount of green space in a region (e.g. city or neighbourhood) relate to the average income of that area? (Data obtained from the internet.)
Does the amount of green space per city differ between countries within the UK? Does this answer change when we account for population size of the cities? (Data obtained from the internet.)
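For the tree-height question in the list above, the trigonometric estimate can be sketched as follows. This is a minimal illustration, not a required method: it assumes level ground, a horizontal distance measured to the base of the tree, and an elevation angle measured from the observer's eye level. The function name `tree_height` and the parameter `eye_height_m` are hypothetical names chosen for this example.

```python
import math


def tree_height(distance_m, angle_deg, eye_height_m=0.0):
    """Estimate tree height (metres) from distance and elevation angle.

    distance_m   -- horizontal distance from the observer to the trunk
    angle_deg    -- angle above horizontal needed to sight the top of the tree
    eye_height_m -- height of the observer's eye above the ground (assumption:
                    the angle is measured from eye level, on level ground)
    """
    return distance_m * math.tan(math.radians(angle_deg)) + eye_height_m


# Example with made-up measurements: standing 10 m away, sighting the top at 45 degrees,
# with the observer's eye 1.6 m above the ground.
print(round(tree_height(10.0, 45.0, eye_height_m=1.6), 2))
```

Small errors in the measured angle matter more when standing close to the tree (steep angles), so a consistent measuring distance is worth describing in the Methods.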
Your mini-report must include the following sections:
1. The question and population you address (5 marks). State the question that your mini-report addresses. State the population for which you aim to answer your question. For example, do you aim to answer a question that informs us about all of Edinburgh? About a specific neighbourhood? A specific population of squirrels? (Note: you need to state your question to allow a marker to understand the purpose of your work. Marks will not be associated with simply stating the question, as some students may choose to use one of the suggested questions; marks will be associated with how you articulate the population to which the results aim to apply.)
2. Methods (20 marks). Explain what you did in sufficient detail to allow another person to replicate your work. In addition, briefly state the analysis you will use, and why. (Notes on sample size: (i) students are not expected to conduct a power analysis as this is not taught in BMS3; (ii) at a minimum, sample size must match requirements for a test, for example t-tests and 1-Factor GLMs require at least 2 independent observations per group to allow any analysis; (iii) the sample size obtained should reflect several hours of work; (iv) higher marks will not be awarded for sample sizes that exceed point (iii). However, students should note that analyses and forming conclusions are often easier with larger sample sizes; therefore, any additional time spent collecting more data may be offset by time saved with easier analyses and interpretation.)
3. Results (20 marks). Report your results as you learn to do throughout BMS3. For example, you learn that you need to report the following for a 1-Factor GLM: a figure that presents the individual measurements plus a box-plot or means with SE's, the test's name, the test statistic, degrees of freedom, p-value(s), means with SE's, and effect sizes with SE's (and optionally 95% CI's). Please note that not all types of test will require all of these details; for example, you would not present 95% CI's when you use a Chi-square test (although you can comment on the data to describe the effect size). The Methods should also report whether your data met all of the assumptions for your test, and include appropriate information (e.g. residual plots) that allows a marker to judge for themselves whether assumptions were met. (Note that we require this last information pertaining to evidence that data meet assumptions only because the marker is not familiar with the data (unlike for other questions in this ICA); in other circumstances (e.g. when publishing a paper or for Question 2 of this ICA) you would not provide such evidence (e.g. residual plots) of whether the data met the assumptions of the analysis.)
4. Conclusion(s) (5 marks). Present an appropriate conclusion that includes comment on (lack of) trends in the data and effect size (if your analysis allows estimation of effect size). The marking here will focus on whether you have appropriately interpreted your results in light of (i) your analysis, (ii) the question you aimed to answer, and (iii) the population for which you aim to apply the conclusions. The Conclusions section should not include a Discussion (for example, of how your results fit into a larger understanding of science).
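As an illustration of some of the quantities listed under Results for a 1-Factor GLM, the sketch below computes group means with SEs, the F statistic and its degrees of freedom. It is only a sketch under stated assumptions: the data are made-up numbers, the names `groups`, `mean_and_se` and `one_way_f` are hypothetical, and it uses only the Python standard library, so it does not compute a p-value, effect-size SEs or residual plots. In practice you would obtain these from the statistical software used in BMS3.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical measurements for three groups (made-up numbers, illustration only).
groups = {
    "top": [10.0, 12.0, 11.0, 13.0],
    "middle": [14.0, 15.0, 13.0, 16.0],
    "bottom": [9.0, 8.0, 10.0, 9.0],
}


def mean_and_se(values):
    """Return the sample mean and its standard error (SD / sqrt(n))."""
    return mean(values), stdev(values) / sqrt(len(values))


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-factor design."""
    all_values = [v for vals in groups.values() for v in vals]
    grand_mean = mean(all_values)
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(vals) * (mean(vals) - grand_mean) ** 2 for vals in groups.values()
    )
    # Variation of observations around their own group mean (residual variation).
    ss_within = sum(
        (v - mean(vals)) ** 2 for vals in groups.values() for v in vals
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


for name, vals in groups.items():
    m, se = mean_and_se(vals)
    print(f"{name}: mean = {m:.2f}, SE = {se:.2f}, n = {len(vals)}")

f_stat, df_b, df_w = one_way_f(groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The per-group SEs here are computed separately for each group; a GLM typically reports SEs based on the pooled residual variance, so the two can differ.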
Additional guidance / notes:
We expect your work to conform to good principles of experimental design, and part of your mark will assess whether you have followed these principles. You should familiarize yourself with these principles by reviewing Chapter 9, "Experimental design", on the website "Experiments and what to do with them": https://www.ed.ac.uk/biomedical-sciences/experimental-design-and-data-analysis/what-to-do-with-experiments. That said, given the logistical challenges involved, we do not expect your experiment to implement "blinding". In addition, your experiment does not need to include blocking or covariates.
The purpose of this question is to engage students to apply principles of experimental design, data collection, data analysis, appropriate interpretation of results, and presentation of work in a clear and concise manner. While 'creativity' is a cornerstone of science, creativity per se is not a learning objective here. Therefore, answers that address novel questions will be marked in the same way as those that address one of the suggested questions.
Question 2 (30 marks):
Fuller et al. (2010) studied the effects of environment and genetics on Bluefin killifish sensitivity to blue light. Two genetic populations of fish were studied (Spring water or Swamp water) under two water clarity conditions (Clear water or Tea water). Fish were selected randomly from each population, and each fish was randomly assigned to one of the water clarity conditions. The researchers measured the expression of the SWS1 gene (short wave sensitive) once per fish.
The data (taken from Chapter 18 of Whitlock & Schluter's text The Analysis of Biological Data) are in the file OpsinChap18.csv. Note that these data are unbalanced.
Decide upon the most appropriate approach to analyse these data and assess whether the data meet the assumptions of your approach.
Test whether (and how) genetic background (i.e., population) and water clarity affect blue light sensitivity.
Report your conclusions following the approaches illustrated in BMS3 lectures, including:
An appropriate figure (and caption) to illustrate the data.
All appropriate details of the results. Among these details, ensure that you:
List all of the test's assumptions.
Explain whether your data meet the required assumptions and how you tested them. If the original data do not meet the test's assumptions, explain how you approached your analysis to reach a valid conclusion.
Appropriate means and standard errors for the appropriate groups of data.
Your explanation of whether (and how) genetic background (i.e., population) and water clarity affect blue light sensitivity. Base your conclusions on both p-values and effect size.
Question 3 (10 marks):
A scientist who was interested in mice wished to test whether a mouse's heart-rate depended on whether it was running uphill versus running downhill. The scientist randomly selected 8 mice from a wild population, and randomly assigned 4 mice to each of two treatments: Uphill (or Up) and Downhill (or Down). In each treatment, a mouse was offered a piece of cheese placed 2m away, either at the top of a hill above the mouse (Up) or at the bottom of a hill below the mouse (Down). The scientist measured the mouse's heart-rate when it reached the cheese (the cheese served as a reward, causing the mouse to run). [These data are obviously not real.]
After using 1-way ANOVA, the scientist concluded that running uphill significantly increased a mouse's heart-rate compared to running downhill.
Is the scientist justified in making this conclusion? In 50 words or fewer, explain why or why not the scientist is justified in making this conclusion, using the Figure and ANOVA output, below.
Question 4 (10 marks):
A physiologist wished to determine how exercise affected respiration in mice. The physiologist randomly assigned mice to one of four treatments, where each mouse had either:
i) a 1 gram weight attached to its back (1g);
ii) a 5 gram weight attached to its back (5g);
iii) a 10 gram weight attached to its back (10g);
iv) no weight attached to its back (Control).
After each mouse had experienced its treatment for 15 minutes, the physiologist measured its rate of respiration.
To understand the effect of the three Weight treatments relative to the Control, the physiologist determined the difference between the average respiration rate for each Weight treatment and the Control (e.g., for the 1g treatment, the physiologist calculated: Respiration rate for 1g treatment − Respiration rate for Control); they also determined the 95% Confidence Interval (CI) for each of these differences. These results are illustrated in the figure below: a dot indicates the mean difference in respiration rate between a Weight treatment and the Control, and the lines around each dot represent the 95% CI for that difference. The dotted line indicates zero on the y-axis; positive values lie above the dotted line.
Interpret the results, below, with respect to effect size. Specifically, comment upon evidence, or lack thereof, for differences among Weight treatments (relative to the Control) with respect to their effects on respiration rate. For example, given the results below, can the researcher say that one Weight treatment affected respiration the least, or the most? [Note that this question does not ask you to interpret why differences arose (or not) among treatments.] (Please use 200 words or fewer).
[END OF Data Analysis Report ICA]