London Marathon Winners Since 1981

STA/ISS 313 - Spring 2024 - Project 1

team_ryan: Ryan Yu, Sam Alkalay, Rohit Gunda, Cooper Likosar

Introduction

London Marathon Dataset

  • Data from Tidy Tuesday’s London Marathon page, which has two datasets: winners.csv and london_marathon.csv

  • winners: year, category of race (men, women, wheelchair men, wheelchair women), nationality, and winning time.

  • london_marathon: date of race, number of applicants, number accepted, number of starters, number of finishers, how much was raised for charity, and each year’s official charity.

Question 1: Examining Relative Win Time

Question 1 Introduction & Approach

We want to answer how winning performances among the four different race categories have changed over time and how these performances vary from year to year.

  • Show the changes in the winning race times over our time period and the variability from year to year using geom_line() and geom_point()
  • Using geom_col() to create a sideways bar chart showing the relationship between winning time and temperature

Plots

Plot 1.1:

This figure is a line graph with points at each year titled “Relative Winning Time Over Time by Race Category”. Relative winning time is defined as the winning time for that year divided by the average winning time from 1981-2020. It has lines for Men’s Running, Women’s Running, Wheelchair Men and Wheelchair Women. Times from both Wheelchair Men and Wheelchair Women have remained relatively constant with relative win time around 0%. Men’s Running and Women’s Running started above 80% and have trended downs since then, with both remaining below 0% since 2003.

Plot 1.2:

This figure is a line graph with points at each year titled “Relative Winning Time Over Time by Race Category”. Relative winning time is defined as the winning time for that year divided by the average winning time from 1981-2020. It has lines for Men’s Running, Women’s Running, Wheelchair Men and Wheelchair Women. This is a zoomed in version of the previous plot, only including relative win times under 20%. Zooming in allows us to better notice the trends of relative win times for men and women running: the relative win times trend from +5% in the beginning of the time period to nearly below 5% of the average win time.

Plot 1.3:

This figure is a column graph titled “Relative Wheelchair Win Times Plotted Against Relative Temperature” that displays relative win times for Men’s and Women’s Wheelchair along with the relative average temperature by year. The plot shows that 2018 had the highest relative temperature and 1991 had the lowest. We are visualizing the relative win time via a sideways bar chart rather than a line chart which was used in the first two plots.

Plot 1.4:

This figure is a column graph titled “Relative Running Win Times Plotted Against Relative Temperature” that displays relative win times for Men’s and Women’s Running along with the relative average temperature by year. The plot shows that 2018 had the highest relative temperature and 1991 had the lowest.

Discussion

Question 1 Discussion

  • Wheelchair Men and Women started with the largest relative win time (+85% and +95% for men and women wheelchair, respectively), but has decreased significantly over time

  • Included zoomed-in version of the plot to gain more insight into trends of men & women running over time

  • There appears to be no apparent relationship between temperature and marathon winning time

Question 2: Examining Race Acceptance Rate

Question 2 Introduction & Approach

We’re seeking to better understand how the London Marathon acceptance rate varies from year to year from 1981-2020 and gain insight into potential trends.

  • Show the deviation from the overall average acceptance rate by utilizing geom_point() and geom_segment()
  • Use annotate() to display mean acceptance rate, provide context, and show outliers

Plots

Plot 2.1:

The figure is a scatter plot with line segments titled “London Marathon Acceptance Rate Over the Years” with a subtitle “Rates from 1981-2020”. Each point is connected to a line at 36.3%, the mean acceptance rate from 1981-2020. The acceptance rate was above the mean in 1981 and from 1988 to 2008 and was below the mean acceptance rate for all other years. 2020 is labeled as an outlier with an almost 0% acceptance rate, with an annotation indicating this was due to COVID where only elite runners were permitted to compete.

Plot 2.2:

The figure is a line graph with points at each year titled “London Marathon Acceptance Rate Over the Years” with a subtitle “Separated by Decade”. It is facet by decade with Decade 1 from 1981 to 1990, Decade 2 from 1991 to 2000, Decade 3 from 2001 to 2010, and Decade 4 from 2011 to 2020. Decade 1 had an acceptance rate of 34.9%. Decade 2 had an acceptance rate of 48.6%. Decade 3 had an acceptance rate of 39.6%. Decade 4 had an acceptance rate of 21.9%. Decade 1 and 4 were below the mean acceptance rate of 36.3% while Decade 2 and 3 were above.

Discussion

Question 2 Discussion

  • London Marathon acceptance rate was above the mean in 1981 and from 1988 to 2008 and was below the mean acceptance rate for all other years

  • 2020 had an almost 0% acceptance rate due to COVID-19 pandemic, as only elite runners were permitted to compete

  • Decade 1 and 4 were below the mean acceptance rate of 36.3% while Decade 2 and 3 were above

Thanks for Listening!