Total Medals Over the Years with 2024 Forecast

Medals Over the Years with 2024 Forecast

Athletes, Events, and Sports

Men and Women Over the Years with 2024 Forecast

Predicted Medals by Sport

Predicted Medals by Sport: Sex

Age vs Predicted Medal by Sex

Age vs Height by Predicted Medal

Weight Distribution by Predicted Medal

Weight vs Height by BMI and Predicted Medal

Violin Plot: Age Distribution by Predicted Medal

Nation Count

Discipline Count

Distribution of Athletes Age

Distribution of Athletes Height by Sex

Distribution of Athletes Weight by Sex

Top 15 Sports Distribution

Medal Distribution

Medal Distribution by Country

Approach 1: Time-series Forecast
Our first approach to forecasting performance at the Paris 2024 Olympics used a Vector Autoregressive (VAR) model. VAR is a multivariate time series model that extends the autoregressive model by letting several time series influence one another: each variable is predicted from its own past values and from the past values of the others. This lets it capture the dynamic relationships and dependencies among the series, which suits interdependent data such as Olympic results. We used historical data spanning several Olympic cycles to forecast medal counts and the number of male and female athletes across countries and events.
Medals (Gold, Silver, Bronze) Over the Years with 2024 Forecast
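
To make the idea of variables predicting one another concrete, here is a minimal sketch of a first-order VAR fitted by ordinary least squares with NumPy. It is not the model used for the forecasts above: the lag order, the helper names `fit_var1` and `forecast_var1`, and the synthetic two-variable series are all assumptions chosen for illustration.

```python
import numpy as np


def fit_var1(series: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Fit y_t = c + A @ y_{t-1} + e_t by ordinary least squares.

    `series` has shape (T, k): T time steps, k variables.
    Returns the intercept c (shape (k,)) and coefficient matrix A (shape (k, k)).
    """
    lagged = series[:-1]                                   # y_{t-1}
    target = series[1:]                                    # y_t
    design = np.hstack([np.ones((len(lagged), 1)), lagged])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)  # shape (k + 1, k)
    return coef[0], coef[1:].T


def forecast_var1(c: np.ndarray, A: np.ndarray, last: np.ndarray, steps: int = 1) -> np.ndarray:
    """Roll the fitted model forward `steps` periods from the last observation."""
    current = np.asarray(last, dtype=float)
    predictions = []
    for _ in range(steps):
        current = c + A @ current
        predictions.append(current)
    return np.array(predictions)


# Synthetic series generated from known coefficients (NOT Olympic data).
true_c = np.array([1.0, 2.0])
true_A = np.array([[0.5, 0.1],
                   [0.2, 0.6]])
rows = [np.array([10.0, 5.0])]
for _ in range(11):
    rows.append(true_c + true_A @ rows[-1])
series = np.vstack(rows)

c_hat, A_hat = fit_var1(series)
next_step = forecast_var1(c_hat, A_hat, series[-1], steps=1)
print("Estimated A:\n", A_hat)
print("One-step forecast:", next_step[0])
```

The off-diagonal entries of `A` are what distinguish VAR from fitting a separate autoregressive model per series: they let the previous value of one variable shift the forecast of another. In practice a library implementation with lag-order selection would likely be preferable to this hand-rolled version.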

To validate the VAR model, we compared its forecasts for 2021 against the actual medal counts. The forecast for the United States closely matched the country's actual 2021 medal counts, which supports the model's reliability.
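
A comparison like this can be summarized with simple error metrics. The sketch below computes the mean absolute error and mean absolute percentage error between forecast and actual counts. The metric choice, the function names, and the numbers are assumptions for illustration; the values are placeholders, not real medal counts or output from our model.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and forecast values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_percentage_error(actual, predicted):
    """Average absolute gap as a percentage of the actual value.

    Assumes every actual value is non-zero.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    ratios = (abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * sum(ratios) / len(actual)


# Placeholder values for illustration only (hypothetical, not real results).
medal_types = ["Gold", "Silver", "Bronze"]
actual_counts = [40, 40, 30]
forecast_counts = [38, 42, 33]

mae = mean_absolute_error(actual_counts, forecast_counts)
mape = mean_absolute_percentage_error(actual_counts, forecast_counts)
print(f"MAE: {mae:.2f} medals, MAPE: {mape:.1f}%")
```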

Geographical Distribution of Total Medals

Medal Composition for USA

Approach 2: Classifier
Our second approach aims to forecast not only whether an athlete will win a medal but also which type (Gold, Silver, or Bronze), based on a range of influential factors. It combines historical data from previous Olympics with a machine learning classifier. The model considers athlete attributes such as age, sex, BMI, and historical performance metrics across sports and events. We also incorporated external factors such as GDP and population data to capture broader socio-economic influences on athletic performance.

To understand which factors most strongly influence the predictions, we conducted a feature importance analysis. It showed that age, BMI, and socio-economic indicators were among the most influential factors in determining medal outcomes.
Feature Importances from Random Forest
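
As a rough illustration of how such a ranking can be produced, the sketch below trains a scikit-learn `RandomForestClassifier` on synthetic data and reads its `feature_importances_` attribute. Everything about the data is an assumption: the column names, the value ranges, the "None" class for athletes without a medal, and the arbitrary rule that generates the labels. The resulting ranking therefore says nothing about real athletes; it only shows the mechanics.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 400

# Synthetic athlete and country attributes (hypothetical ranges, not real data).
age = rng.uniform(16, 40, n)
sex = rng.integers(0, 2, n)                  # 0 / 1 encoding
height_m = rng.uniform(1.5, 2.1, n)
weight_kg = rng.uniform(45, 110, n)
bmi = weight_kg / height_m ** 2              # BMI = weight (kg) / height (m)^2
gdp_per_capita = rng.uniform(1e3, 6e4, n)
population = rng.uniform(1e5, 1e9, n)

# Arbitrary labelling rule so the forest has a signal to find (illustration only).
score = bmi + 0.2 * age + rng.normal(0, 1, n)
cuts = np.quantile(score, [0.7, 0.8, 0.9])
labels = np.array(["None", "Bronze", "Silver", "Gold"])[np.digitize(score, cuts)]

feature_names = ["age", "sex", "bmi", "gdp_per_capita", "population"]
X = np.column_stack([age, sex, bmi, gdp_per_capita, population])

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X, labels)

# Impurity-based importances; they are non-negative and sum to 1.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranked:
    print(f"{name:>15}: {importance:.3f}")
```

Impurity-based importances can favor continuous or high-cardinality features, so a permutation-based check is a reasonable complement when interpreting a ranking like this.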

Our model achieved an accuracy of 89% in predicting the correct medal type across sports and events.

The confusion matrix below breaks down the model's performance by comparing predicted and actual medal outcomes, showing where the model excels and where it could be refined.
Confusion Matrix for Medal Predictions
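
For readers unfamiliar with the layout, the following sketch builds a confusion matrix from paired actual and predicted labels and derives accuracy from its diagonal. The label set (including a "None" class for no medal) and the eight example predictions are hypothetical and are not taken from our model's output.

```python
from collections import Counter

# Assumed class order; "None" stands for athletes who did not win a medal.
LABELS = ["None", "Bronze", "Silver", "Gold"]


def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


def accuracy(matrix):
    """Share of predictions on the diagonal, i.e. predicted == actual."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical outcomes for eight athletes (illustration only).
actual = ["Gold", "Silver", "Bronze", "None", "None", "Gold", "Silver", "None"]
predicted = ["Gold", "Bronze", "Bronze", "None", "None", "Silver", "Silver", "None"]

matrix = confusion_matrix(actual, predicted, LABELS)
for label, row in zip(LABELS, matrix):
    print(f"{label:>6}: {row}")
print(f"Accuracy: {accuracy(matrix):.2f}")
```

Off-diagonal cells show which classes get confused with each other. In this toy example one actual Silver is predicted as Bronze and one actual Gold as Silver, which a single accuracy figure would hide.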
