Exploratory data analysis
EDA with Tableau
5000 movies from the IMDB database
Pima Indians Diabetes Database
My kernels on Kaggle
A simple linear regression on the Ames housing dataset (see also the EDA above), a random forest model, and a support vector regression model.
A linear model with extensive feature engineering for the Prudential Life Insurance challenge (it earned a silver medal in the kernels category).
Extracting plot keywords and genres from the IMDB 5000 movie dataset.
Open data
I published a dataset of star-cluster simulations for your plotting pleasure. More is coming!