BIO3021_2024
Fertilisation curve
Loading packages
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
Data wrangling
Here we load the data for the fertilisation curve and store it in an object named fert_curve.
fert_curve <- read.csv("fert_curve.csv")
The next two steps check that the CSV has loaded correctly and that there are no missing data points.
str(fert_curve)
## 'data.frame':    20 obs. of  4 variables:
##  $ treatment   : chr  "none" "none" "none" "none" ...
##  $ fert        : int  2 0 3 1 23 15 33 10 49 43 ...
##  $ total       : int  50 50 50 50 50 50 50 50 50 50 ...
##  $ percent_fert: int  4 0 6 2 46 30 66 20 98 86 ...
summary(fert_curve)
##   treatment              fert           total     percent_fert 
##  Length:20          Min.   : 0.00   Min.   :50   Min.   : 0.0  
##  Class :character   1st Qu.:13.75   1st Qu.:50   1st Qu.:27.5  
##  Mode  :character   Median :34.50   Median :50   Median :69.0  
##                     Mean   :28.85   Mean   :50   Mean   :57.7  
##                     3rd Qu.:44.00   3rd Qu.:50   3rd Qu.:88.0  
##                     Max.   :49.00   Max.   :50   Max.   :98.0
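The missing-data check can also be made explicit instead of being read off the str() and summary() output. The sketch below is only an illustration: the data frame toy is a made-up stand-in for fert_curve, with one missing value added on purpose so that the check has something to find.

```r
# Hypothetical stand-in for fert_curve (made-up values, not the course data).
# One fert value is set to NA on purpose.
toy <- data.frame(
  treatment = c("none", "none", "10^4", "10^4"),
  fert      = c(2L, 0L, 23L, NA),
  total     = c(50L, 50L, 50L, 50L)
)
toy$percent_fert <- toy$fert / toy$total * 100

# Number of missing values in each column
na_counts <- colSums(is.na(toy))
print(na_counts)

# Rows that contain at least one missing value
incomplete_rows <- toy[!complete.cases(toy), ]
print(incomplete_rows)
```

If every entry of na_counts is zero, the file has no missing data points.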
Visualising data
Use these graphs in your results section. You can pick which graph you want to use.
Make sure the graph has:
- Axes titles
- No main title
- Figure caption below the graph
- No lines in the background of the graph
- Your figure caption should describe ALL components of the graph (what is the graph showing?)
summary_data <- fert_curve %>%
  dplyr::group_by(treatment) %>%
  dplyr::summarize(mean_percentage = mean(percent_fert, na.rm = TRUE))

ggplot(summary_data, aes(x = treatment, y = mean_percentage)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  labs(x = "Sperm Concentration", y = "Mean Percentage of Fertilisations") +
  ggtitle("Percentage of Fertilisations Across Different Sperm Concentrations") +
  theme_minimal()

summary_data <- fert_curve %>%
  group_by(treatment) %>%
  summarize(mean_percentage = mean(percent_fert),
            se_percentage = sd(percent_fert) / sqrt(n()))

ggplot(summary_data, aes(x = treatment, y = mean_percentage)) +
  geom_point(position = position_dodge(width = 0.5)) +
  geom_errorbar(aes(ymin = mean_percentage - se_percentage,
                    ymax = mean_percentage + se_percentage),
                width = 0.2, position = position_dodge(width = 0.5)) +
  labs(x = "Sperm Concentration", y = "Mean Percentage of Fertilisations") +
  ggtitle("Percentage of Fertilisations Across Different Sperm Concentrations") +
  theme_minimal()
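The two plots above still carry a main title and background grid lines, which the checklist asks you to remove. One possible way to adapt the second plot is sketched below, under some assumptions: toy is a made-up stand-in for fert_curve, the summary is built with base R rather than dplyr so the block runs on its own, and the factor() step (not in the original code) is an optional extra that orders the x axis from no sperm to the highest concentration.

```r
library(ggplot2)

# Hypothetical stand-in for fert_curve (made-up values, not the course data)
toy <- data.frame(
  treatment    = rep(c("none", "10^4", "10^5"), each = 3),
  percent_fert = c(2, 4, 6, 30, 40, 50, 80, 90, 94)
)

# Optional: order the x axis from no sperm to the highest concentration
toy$treatment <- factor(toy$treatment, levels = c("none", "10^4", "10^5"))

# Mean and standard error (sd / sqrt(n)) for each treatment
plot_data <- do.call(rbind, lapply(split(toy, toy$treatment), function(d) {
  data.frame(
    treatment       = d$treatment[1],
    mean_percentage = mean(d$percent_fert),
    se_percentage   = sd(d$percent_fert) / sqrt(nrow(d))
  )
}))

# No ggtitle() call, so there is no main title;
# theme_classic() draws axis lines without background grid lines
p <- ggplot(plot_data, aes(x = treatment, y = mean_percentage)) +
  geom_point() +
  geom_errorbar(aes(ymin = mean_percentage - se_percentage,
                    ymax = mean_percentage + se_percentage),
                width = 0.2) +
  labs(x = "Sperm concentration", y = "Mean percentage of fertilisations") +
  theme_classic()
```

Printing p draws the figure. The figure caption itself is written in your report, below the graph, rather than in the plot code.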
Model structure
Running a one-way ANOVA.
We are using a one-way ANOVA because we have one categorical explanatory variable (the sperm concentration treatments we are comparing) and a continuous response variable (percentage of fertilisations), and we want to determine whether the treatment means differ significantly.
fert_curve.aov <- aov(percent_fert ~ treatment, data = fert_curve)
Results
Creating a summary table to see if there are any significant effects.
There is a significant main effect of treatment (F = 18.5, Df = 4, 15, p < 0.001).
Make sure to report:
- the main effect (p value)
- the F value (the value of the test) and the Df values
- p values larger than 0.01 to two decimal places, those between 0.01 and 0.001 to three decimal places, and p values smaller than 0.001 as p<0.001
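The p value reporting rule can be written as a small helper function. This is only a sketch: format_p is a hypothetical name introduced here for illustration, not a function from any package.

```r
# Hypothetical helper that applies the reporting rule described above
format_p <- function(p) {
  if (p < 0.001) {
    return("p < 0.001")
  }
  if (p <= 0.01) {
    # between 0.001 and 0.01: three decimal places
    return(paste0("p = ", formatC(p, format = "f", digits = 3)))
  }
  # larger than 0.01: two decimal places
  paste0("p = ", formatC(p, format = "f", digits = 2))
}

format_p(1.15e-05)   # "p < 0.001"
format_p(0.0098140)  # "p = 0.010"
format_p(0.133)      # "p = 0.13"
```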
summary(fert_curve.aov)
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## treatment    4  20197    5049    18.5 1.15e-05 ***
## Residuals   15   4093     273                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
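To see where the F value comes from, the table can be reproduced by hand: each mean square is a sum of squares divided by its degrees of freedom, and F is the treatment mean square divided by the residual mean square. The sketch below uses the rounded values printed in the table, so the results agree with R's output only approximately.

```r
# Values copied from the ANOVA table above (rounded as printed)
ss_treatment <- 20197
df_treatment <- 4
ss_residual  <- 4093
df_residual  <- 15

# Mean square = sum of squares / degrees of freedom
ms_treatment <- ss_treatment / df_treatment
ms_residual  <- ss_residual / df_residual

# F = treatment mean square / residual mean square
f_value <- ms_treatment / ms_residual

# p value = upper tail of the F distribution with (4, 15) degrees of freedom
p_value <- pf(f_value, df_treatment, df_residual, lower.tail = FALSE)

print(round(f_value, 1))
print(p_value)
```

Because p_value is smaller than 0.001, it is reported as p<0.001.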
We are performing a post hoc test (Tukey's HSD) to find out which treatments differ from each other, given the significant main effect of treatment.
The output shows pairwise comparisons of the different treatments. Read each comparison from left to right if the diff value is positive, and from right to left if the diff value is negative. For example, 10^5 had a significantly greater percentage of fertilisations than 10^4 (first line of results), and 10^4 had a significantly greater percentage of fertilisations than none (fourth line of results).
When reporting values, present the diff comparisons and p values. You can do this in a table or include the statistics in your results summary paragraph; I'd prefer the latter.
TukeyHSD(fert_curve.aov)
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = percent_fert ~ treatment, data = fert_curve)
## 
## $treatment
##            diff         lwr        upr     p adj
## 10^5-10^4  40.0    3.931597  76.068403 0.0264646
## 10^6-10^4  37.5    1.431597  73.568403 0.0397407
## 10^7-10^4  46.0    9.931597  82.068403 0.0098140
## none-10^4 -37.5  -73.568403  -1.431597 0.0397407
## 10^6-10^5  -2.5  -38.568403  33.568403 0.9994755
## 10^7-10^5   6.0  -30.068403  42.068403 0.9846536
## none-10^5 -77.5 -113.568403 -41.431597 0.0000673
## 10^7-10^6   8.5  -27.568403  44.568403 0.9466553
## none-10^6 -75.0 -111.068403 -38.931597 0.0000971
## none-10^7 -83.5 -119.568403 -47.431597 0.0000286
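If you want the comparisons in a form that is easy to copy into a table, the Tukey output can be converted to a data frame. The sketch below uses PlantGrowth, a dataset that ships with R, as a stand-in for the course data. It also shows what the sign of diff means: each row is named first-second, and diff is the mean of the first group minus the mean of the second.

```r
# PlantGrowth ships with R and stands in for the course data here
model <- aov(weight ~ group, data = PlantGrowth)

# TukeyHSD() returns one matrix per factor; convert it to a data frame
tukey_table <- as.data.frame(TukeyHSD(model)$group)
print(tukey_table)

# Row "trt1-ctrl": diff is mean(trt1) minus mean(ctrl)
group_means <- tapply(PlantGrowth$weight, PlantGrowth$group, mean)
diff_check <- group_means[["trt1"]] - group_means[["ctrl"]]

# A negative diff means the second-named group (ctrl) has the larger mean
print(diff_check)
```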
Checking assumptions
This step checks that the assumptions of ANOVA are met. These include homogeneity of variances and normality of residuals.
Assumptions were met. You don't need to include this graph in your results section, but state in your statistical methods section that you checked the assumptions.
par(mfrow = c(2,2))
plot(fert_curve.aov)
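The diagnostic plots are the check used in this handout. If you would like a numerical check alongside them, base R provides formal tests for both assumptions. The sketch below is an optional extra rather than part of the required analysis, and it uses PlantGrowth, a dataset that ships with R, as a stand-in for the course data.

```r
# PlantGrowth ships with R and stands in for the course data here
model <- aov(weight ~ group, data = PlantGrowth)

# Normality of residuals: Shapiro-Wilk test
normality <- shapiro.test(residuals(model))

# Homogeneity of variances: Bartlett test
homogeneity <- bartlett.test(weight ~ group, data = PlantGrowth)

# p values above 0.05 give no evidence against the assumption
print(normality$p.value)
print(homogeneity$p.value)
```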
Experiment
Data wrangling
fert_copper <- read.csv("fert_copper.csv")
str(fert_copper)
## 'data.frame':    23 obs. of  4 variables:
##  $ treatment: chr  "e" "es" "ex" "sx" ...
##  $ fert     : num  1 1.5 6 2 10 5.5 1 4.5 10.5 22.5 ...
##  $ replicate: int  1 1 1 1 2 2 2 3 3 3 ...
##  $ date     : chr  "13/2/2024" "13/2/2024" "13/2/2024" "13/2/2024" ...
The str() output above shows that R read fert as numeric (num), with decimal values such as 1.5, rather than as an integer. In the next step we convert it so that R treats fert as an integer. Note that as.integer() drops the decimal part rather than rounding, so 1.5 becomes 1 and 22.5 becomes 22.
fert_copper$fert <- as.integer(fert_copper$fert)
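The effect of the conversion can be checked on a few values. The sketch below uses a short made-up vector, fert_raw, containing some of the values shown in the str() output.

```r
# A few values like those shown in the str() output above
fert_raw <- c(1, 1.5, 6, 2, 10, 5.5, 22.5)

# as.integer() truncates towards zero; it does not round
fert_int <- as.integer(fert_raw)
print(fert_int)  # 1 1 6 2 10 5 22
```

This is why the maximum in the summary that follows is a whole number.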
summary(fert_copper)
##   treatment              fert          replicate         date          
##  Length:23          Min.   : 0.000   Min.   :1.000   Length:23         
##  Class :character   1st Qu.: 1.000   1st Qu.:2.000   Class :character  
##  Mode  :character   Median : 4.000   Median :4.000   Mode  :character  
##                     Mean   : 6.739   Mean   :3.565                     
##                     3rd Qu.:10.000   3rd Qu.:5.000                     
##                     Max.   :22.000   Max.   :6.000
Visualising data
summary_data_2 <- fert_copper %>%
  dplyr::group_by(treatment) %>%
  dplyr::summarize(mean_fert = mean(fert, na.rm = TRUE))

ggplot(summary_data_2, aes(x = treatment, y = mean_fert)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  labs(x = "Treatment", y = "Mean Percentage of Fertilisations") +
  ggtitle("Percentage of Fertilisations Across Different Treatments") +
  theme_minimal()

summary_data_2 <- fert_copper %>%
  group_by(treatment) %>%
  summarize(mean_fert = mean(fert),
            se_fert = sd(fert) / sqrt(n()))

ggplot(summary_data_2, aes(x = treatment, y = mean_fert)) +
  geom_point(position = position_dodge(width = 0.5)) +
  geom_errorbar(aes(ymin = mean_fert - se_fert, ymax = mean_fert + se_fert),
                width = 0.2, position = position_dodge(width = 0.5)) +
  labs(x = "Treatment", y = "Mean Percentage of Fertilisations") +
  ggtitle("Percentage of Fertilisations Across Different Treatments") +
  theme_minimal()
Model structure
fert_copper.aov <- aov(fert ~ treatment, data = fert_copper)
Results
There was no significant difference in fertilisation success between treatments (F = 2.10, Df = 3, 19, p = 0.13).
summary(fert_copper.aov)
##             Df Sum Sq Mean Sq F value Pr(>F)
## treatment    3  269.5   89.82   2.104  0.133
## Residuals   19  811.0   42.68
Checking assumptions
par(mfrow = c(2,2))
plot(fert_copper.aov)
Experiment
Data wrangling
fert_copper <- read.csv("fert_copper.csv")
str(fert_copper)
## 'data.frame':    29 obs. of  4 variables:
##  $ treatment: chr  "e" "es" "ex" "sx" ...
##  $ fert     : num  1 1.5 6 2 2.5 10 5.5 1 10 4.5 ...
##  $ replicate: int  1 1 1 1 1 2 2 2 2 3 ...
##  $ date     : chr  "13/2/2024" "13/2/2024" "13/2/2024" "13/2/2024" ...
The str() output above shows that R read fert as numeric (num), with decimal values such as 1.5, rather than as an integer. In the next step we convert it so that R treats fert as an integer. Note that as.integer() drops the decimal part rather than rounding, so 1.5 becomes 1 and 2.5 becomes 2.
fert_copper$fert <- as.integer(fert_copper$fert)
summary(fert_copper)
##   treatment              fert          replicate         date          
##  Length:29          Min.   : 0.000   Min.   :1.000   Length:29         
##  Class :character   1st Qu.: 2.000   1st Qu.:2.000   Class :character  
##  Mode  :character   Median : 4.000   Median :4.000   Mode  :character  
##                     Mean   : 7.931   Mean   :3.552                     
##                     3rd Qu.:10.000   3rd Qu.:5.000                     
##                     Max.   :34.000   Max.   :6.000
Visualising data
summary_data_2 <- fert_copper %>%
  dplyr::group_by(treatment) %>%
  dplyr::summarize(mean_fert = mean(fert, na.rm = TRUE))

ggplot(summary_data_2, aes(x = treatment, y = mean_fert)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  labs(x = "Treatment", y = "Mean Percentage of Fertilisations") +
  ggtitle("Percentage of Fertilisations Across Different Treatments") +
  theme_minimal()

summary_data_2 <- fert_copper %>%
  group_by(treatment) %>%
  summarize(mean_fert = mean(fert),
            se_fert = sd(fert) / sqrt(n()))

ggplot(summary_data_2, aes(x = treatment, y = mean_fert)) +
  geom_point(position = position_dodge(width = 0.5)) +
  geom_errorbar(aes(ymin = mean_fert - se_fert, ymax = mean_fert + se_fert),
                width = 0.2, position = position_dodge(width = 0.5)) +
  labs(x = "Treatment", y = "Mean Percentage of Fertilisations") +
  ggtitle("Percentage of Fertilisations Across Different Treatments") +
  theme_minimal()
Model structure
fert_copper.aov <- aov(fert ~ treatment, data = fert_copper)
Results
There was no significant difference in fertilisation success between treatments (F = 1.65, Df = 4, 24, p = 0.19).
summary(fert_copper.aov)
##             Df Sum Sq Mean Sq F value Pr(>F)
## treatment    4  427.4  106.85   1.652  0.194
## Residuals   24 1552.5   64.69
Checking assumptions
par(mfrow = c(2,2))
plot(fert_copper.aov)