Predictive Modeling for Lead Conversion in EdTech

As part of the Data Science and Machine Learning program at MIT’s Institute for Data, Systems, and Society (IDSS), I worked on a case study for ExtraaLearn, a fictional early-stage EdTech startup in the growing online education market. The goal was to efficiently identify the leads most likely to convert to paid customers.

The project was carried out in Python and began with an exploratory data analysis (EDA) of the provided dataset, which included attributes such as lead ID, age, current occupation, and details of each lead’s interactions with ExtraaLearn. The EDA focused on understanding the factors that influence lead conversion. I then built decision tree and random forest models to predict conversion and tuned their hyperparameters with grid search.
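The modeling step described above can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the code from the case study: the data is synthetic (generated with `make_classification`), and the parameter grids, the choice of recall as the scoring metric, and all variable names are assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the lead data (assumption: the real dataset is not shown here).
# Class 1 plays the role of "converted" and is the minority class.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4,
    weights=[0.7, 0.3], random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# Hypothetical parameter grids; the values searched in the project are not stated.
candidates = {
    "decision_tree": (
        DecisionTreeClassifier(random_state=42),
        {"max_depth": [3, 5, None], "min_samples_leaf": [1, 5, 20]},
    ),
    "random_forest": (
        RandomForestClassifier(random_state=42),
        {"n_estimators": [25, 50], "max_depth": [3, 5, None]},
    ),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # 5-fold cross-validated grid search, scored on recall for the positive class.
    search = GridSearchCV(estimator, grid, scoring="recall", cv=5)
    search.fit(X_train, y_train)
    results[name] = {
        "best_params": search.best_params_,
        "cv_recall": search.best_score_,
        # With scoring="recall", GridSearchCV.score also reports recall.
        "test_recall": search.score(X_test, y_test),
    }

for name, summary in results.items():
    print(name, summary)
```

Recall is used here on the assumption that missing a likely converter costs more than following up on an unlikely one; a different metric (for example F1 or precision) may suit other priorities.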

The models identified patterns and factors associated with lead conversion for ExtraaLearn. The key insights came from attributes such as age, current occupation, website interaction details, and digital marketing engagement. Together, these attributes describe a profile of the leads most likely to convert, which can sharpen the company’s lead conversion strategy and support a more efficient allocation of resources. The case study shows how data science and machine learning can be applied to optimize business processes in the EdTech industry.
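One common way to see which attributes drive a tree-based model is to rank its feature importances. The sketch below illustrates the idea with scikit-learn on synthetic data. The feature names are hypothetical stand-ins loosely based on the attributes mentioned above, not the dataset's actual columns, and the target is constructed so that website time dominates by design.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 400

# Hypothetical, synthetic features (illustrative names only).
feature_names = ["age", "is_professional", "time_spent_on_website", "saw_digital_ad"]
X = np.column_stack([
    rng.integers(18, 65, n),      # age in years
    rng.integers(0, 2, n),        # occupation flag
    rng.exponential(10.0, n),     # minutes spent on the website
    rng.integers(0, 2, n),        # digital marketing exposure flag
])

# Synthetic target: conversion depends only on website time plus noise.
y = (X[:, 2] + rng.normal(0, 3, n) > 10).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances sum to 1; higher means the feature was more useful for splits.
ranking = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")
```

Impurity-based importances can favor high-cardinality features, so `sklearn.inspection.permutation_importance` is a reasonable cross-check on real data.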

The code for this case study is available on my Kaggle profile.

Skills & Tools

Data Science, Predictive Models, Machine Learning, Exploratory Data Analysis, Data Visualization, Statistics, Python, Decision Trees, Random Forest