Academy

Welcome to papAI Academy, a comprehensive platform where you can embark on your journey to create your own Data Science projects from start to finish. Our tutorials cover a wide range of real-life use cases, allowing you to explore various aspects and capabilities of the platform. Here's a list of the available tutorials:

Binary Classification

Heart disease use case (~ 50 min)

This binary classification use case aims to improve the detection of patients at risk of developing cardiovascular disease by analyzing a range of factors. Here are the steps followed in this tutorial:

# Steps Avg Duration
1 Project Creation 2 min
2 Local Data Import 2 min
3 Data Analysis 7 min
4 Data Visualization 5 min
5 Data Cleaning 5 min
6 ML Pipeline & Training 10 min
7 Model Evaluation & Interpretability 5 min
8 Prediction on Test Set 10 min

Multi Classification

Customer Segmentation use case (~ 45 min)

This tutorial presents a multi-class classification use case for customer segmentation. Segmentation divides a customer base into groups with similar characteristics, enabling businesses to optimize their marketing strategies by delivering personalized messages. The main focus of this tutorial is to classify new clients into four distinct classes (A, B, C, and D) based on attributes such as age, gender, education status, and profession. Here are the steps followed in this tutorial:

# Steps Avg Duration
1 Project Creation 2 min
2 Local Data Import 2 min
3 Data Visualization 5 min
4 Advanced Visualization 8 min
5 Data pre-processing 8 min
6 Train and test the model 10 min
7 Evaluate and Interpret the model 5 min
8 Test the predictions made by the model 5 min

Regression

Parcoursup use case (~ 50 min)

This regression project focuses on predicting the acceptance rate for French universities, giving high school students useful input when choosing a university. By analyzing factors such as academic performance and proposed courses, students can make informed decisions about their preferred institutions and increase their chances of admission. Here are the steps followed in this tutorial:

# Steps Avg Duration
1 Project Creation 2 min
2 Local Data Import 2 min
3 Data Cleaning 5 min
4 Data Analysis 5 min
5 Data Visualization 10 min
6 Python & SparkSQL Recipes 10 min
7 ML Pipeline & Training 5 min
8 Model Evaluation 5 min
9 Prediction 5 min

TS Forecasting

Airport traffic use case (~ 40 min)

This TS Forecasting use case predicts the traffic volume of an airport, specifically the number of passengers per week, and identifies any temporal trends within a given interval. The forecasts are intended to support airport management and planning. Here are the steps followed in this tutorial:

# Steps Avg Duration
1 Project Creation 3 min
2 Local Data Import 5 min
3 Data Analysis 5 min
4 Data Visualization 5 min
5 Data Cleaning 7 min
6 ML Pipeline & Training 5 min
7 Model Evaluation 5 min
8 Prediction on Test Set 5 min

Uber fare costs and customer demand forecast (~ 1 hour)

In this tutorial, we estimate Uber ride prices from factors such as ride distance, time of the request, and day of the week. We also forecast hourly customer demand for a given date using data analysis techniques and TS Forecasting models. Here are the steps followed in this tutorial:

# Steps Avg Duration
1 Project Creation 2 min
2 Local Data Import 2 min
3 Data Preprocessing 13 min
4 Data Visualization 13 min
5 Train and test the model for Price Prediction 15 min
6 Train and test the model for Hourly Demand Forecast 15 min

Clustering

Customer Relationship Management (CRM) use case (~ 30 min)

This Clustering use case shows how client segmentation can yield actionable insights for a sales team. By grouping customers based on shared characteristics and behaviors, the sales team can improve its prospecting for new customers and raise overall customer satisfaction. The resulting segments let the team tailor its strategies and messaging to specific customer groups. Here are the steps followed in this tutorial:

# Steps Avg Duration
1 Project Creation 2 min
2 Local Data Import 2 min
3 Cohort Analysis 5 min
4 Data Visualization 5 min
5 Customer Segmentation 5 min
6 Clustering Pipeline 5 min
7 Model Evaluation 5 min
8 Prediction on Test Set 2 min

Survival analysis

Churn analysis and prediction use case (~ 40 min)

This Survival analysis use case analyzes and predicts customer attrition (also using binary classification). Survival analysis goes beyond traditional churn analysis by considering not just whether a customer churns, but also when churn is likely to happen. By understanding the key factors that influence churn, we aim to forecast churn probabilities over time and optimize customer retention strategies. In this tutorial, we'll follow these steps:

# Steps Avg Duration
1 Project Creation 2 min
2 Local Data Import 2 min
3 Dataset analysis 5 min
4 Data Cleaning 5 min
5 Train and test the model 5 min
6 Churn analysis with survival models 5 min
7 Feature Impact interpretation 5 min
8 Predict Future churns 5 min
9 Model evaluation 3 min
10 Test the prediction made by the model 3 min
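
To illustrate the "when, not just whether" idea behind survival analysis, here is a minimal sketch of a Kaplan-Meier survival estimate in plain Python. It is not taken from the tutorial and does not reflect how papAI implements its survival models; the function name `kaplan_meier` and the sample data are hypothetical, chosen only for illustration.

```python
from collections import Counter


def kaplan_meier(durations, churned):
    """Return [(time, survival_probability)] at each time where a churn occurs.

    durations: how long each customer was observed (e.g. in months).
    churned:   True if the customer churned at that time, False if the
               customer was still active when observation ended (censored).
    """
    events = Counter(t for t, c in zip(durations, churned) if c)
    exits = Counter(durations)  # churned or censored, per time point
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Fraction of customers still at risk who do not churn at time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Everyone who churned or was censored at t leaves the risk set.
        at_risk -= exits[t]
    return curve


# Hypothetical sample: observed months and churn flags for seven customers.
months = [2, 3, 3, 5, 6, 6, 8]
churned = [True, True, False, True, False, True, False]

curve = kaplan_meier(months, churned)
for t, s in curve:
    print(f"month {t}: P(still a customer) = {s:.3f}")
```

Unlike a binary classifier, which only outputs whether a customer is expected to churn, the curve gives the estimated probability of a customer still being active at each point in time, while still making use of customers whose outcome is not yet known.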