Predicting non-functional water pumps in Tanzania
Data: Competition data from drivendata.org
Techniques: Classification, random forest, PCA
Data: Wine sales data
Techniques: k-means clustering, recommender systems, matrix factorization
Data: Telecom customer data
Techniques: Calculating churn probability and expected loss, random forest
Data: Student test scores
Techniques: Bayesian analysis, hypothesis testing, MCMC
Data: Records of consumers selecting different candy brands
Techniques: Feature selection, regression
Data: Income data from the UCI ML Repository
Techniques: Classification, random forest
Data: Monthly armed robberies
Techniques: ARIMA models, stationarity tests, differencing
Data: Time series of mining accidents
Techniques: Bayesian analysis, MCMC
Data: Boston housing dataset
Techniques: Gradient boosted regression trees
Data: Synthetic data
Techniques: MCMC
Data: RNA-Seq data
Techniques: Multiple comparison testing, PCA, clustering
Data: Twitter API
Techniques: NLP, logistic regression, latent Dirichlet allocation, scraping
Data: Twitter API
Techniques: NLP, sentiment analysis with various models, scraping
Data: Accident data in NYC
Techniques: Exploratory analysis