Thomas Draths
An aspiring data scientist | Springboard alum | Former VP of Sales
Education
- Certificate in Data Science, Springboard, 2021
- Certificate, Business Essentials, University College, London, 2019
- Bachelor's Degree in Political Science, University of Notre Dame, 2003-2007
Projects
Medium Post
This project is my second attempt at predicting soccer team strength using transfer data. It builds on the work in the first version with noticeable improvements.
- Expanded the dataset to include team data from nine top European soccer leagues.
- Used FuzzyWuzzy to standardize team and league names, instead of manually replacing them with dictionary values.
- Tested multiple regressors and selected RandomForest as the best suited to the dataset, improving R2 scores by about 50% over the previous version of this project.
- Used SHAP values to explain the impact of each feature on predictions.
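The name-standardization step can be illustrated with a minimal sketch. It is not the project's code: it uses the standard library's `difflib` as a stand-in for FuzzyWuzzy so that it runs without extra installs, and the team list, the `standardize` helper, and the `cutoff` value are all hypothetical.

```python
from difflib import get_close_matches

# Hypothetical canonical names; the project's real list is not shown here.
CANONICAL_TEAMS = ["Manchester United", "Bayern Munich", "Paris Saint-Germain"]


def standardize(name, choices=CANONICAL_TEAMS, cutoff=0.6):
    """Return the closest canonical name, or the input unchanged if none is close enough."""
    # Compare in lowercase, but return the canonical spelling.
    lookup = {choice.lower(): choice for choice in choices}
    matches = get_close_matches(name.lower(), list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else name


raw_names = ["Manchester Utd", "Bayern Munchen", "Zzz"]
cleaned = [standardize(n) for n in raw_names]
```

The appeal over a hand-written replacement dictionary is that new spelling variants are handled without adding an entry for each one. The trade-off is that the similarity cutoff has to be chosen with care to avoid false matches.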
Medium Post
In my second larger ML project, I used XGBoost to predict employee attrition at a fictional company.
- I began with a very clean dataset, so I focused on feature selection and hyperparameter tuning to find the best fit for the data.
- Compared algorithms using ROC-AUC scores, given the imbalance in the data.
- Used SHAP values to explain feature impact on predictions.
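To show why ROC-AUC is a reasonable metric when data is imbalanced, here is a small hand-rolled sketch. The labels and scores are made up, and the `roc_auc` helper is written for illustration only. In practice a library function such as scikit-learn's `roc_auc_score` would normally be used.

```python
def roc_auc(y_true, scores):
    """ROC-AUC as the chance a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy imbalanced labels: 2 leavers (1) among 10 employees (made-up numbers).
y_true = [0] * 8 + [1] * 2

# A useless model that gives everyone the same score, i.e. always predicts "stays".
constant_scores = [0.0] * 10

# Accuracy at a 0.5 threshold looks good only because most employees stay.
accuracy = sum((s >= 0.5) == bool(y) for y, s in zip(y_true, constant_scores)) / len(y_true)

# ROC-AUC exposes the model as no better than chance.
auc = roc_auc(y_true, constant_scores)
```

Here the constant model reaches 80% accuracy but an ROC-AUC of 0.5, so accuracy alone can be misleading when one class dominates.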
Medium Post
My first ML project and the first capstone for the Springboard Data Science program.
- As it was my first project, my efforts were focused on cleaning the dataset, creating features, and choosing an algorithm.
- Identified actionable next steps for a future version of the project, including better algorithm selection, hyperparameter tuning, and improved feature investigation.
Simple Web Apps
A classifier using the Palmer Penguin dataset developed by Allison Horst in R.
- Use the ‘User Input Features’ menu in the browser window to select your penguin and watch the classifier predict the penguin type.
A classifier using the Iris dataset.
- Use the left-hand column to select your flower features and watch the classifier predict the flower type.
Non-Deployed Apps
View the Repository
Simple .py files that demonstrate how much can be done in just a few lines of code.