Intensive 1-year master's degree covering statistics and computer science.
Coursework:
AC209 Data Science \\
CS181 Machine Learning \\
AM207 Stochastic Methods for Data Analysis \\
ST139 Statistical Sleuthing for Linear Models \\
CS205 Computing Foundations for Computational Science \\
CS207 Systems Development for Computational Science \\
AC290R Extreme Computing \\
AC297R Capstone
Minor in Statistics
http://nicodri.github.io/CS109_crunchbase
http://abhishekmalali.github.io/spark-ml
In this project, we compared two questioning schemes for recovering true class labels from noisy labels provided by crowdsourced non-experts when there are more than two classes to choose from. We generated data from a confusion matrix for each labeler and an underlying class distribution, then attempted to recover both sets of parameters along with the true labels. We used Expectation Maximization (EM), Simulated Annealing, and PyMC to compare efficiency and verify results.
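As an illustration of the EM part of this setup, here is a minimal sketch in the style of the Dawid-Skene model, assuming every labeler labels every item. It is not the project's actual code: the function names `simulate_labels` and `dawid_skene_em`, the `smoothing` parameter, and the vote-share initialization are all assumptions made for this example.

```python
import numpy as np


def simulate_labels(n_items, class_probs, confusions, rng):
    """Draw true classes, then one noisy label per (item, labeler) pair.

    confusions[w, k, l] is the probability that labeler w reports class l
    when the true class is k (hypothetical layout chosen for this sketch).
    """
    n_workers, n_classes, _ = confusions.shape
    truth = rng.choice(n_classes, size=n_items, p=class_probs)
    labels = np.empty((n_items, n_workers), dtype=int)
    for i in range(n_items):
        for w in range(n_workers):
            labels[i, w] = rng.choice(n_classes, p=confusions[w, truth[i]])
    return truth, labels


def dawid_skene_em(labels, n_classes, n_iter=50, smoothing=1e-2):
    """Estimate class distribution, confusion matrices, and label posteriors."""
    # One-hot encode the observed labels: shape (items, labelers, classes).
    onehot = np.eye(n_classes)[labels]
    # Initialize the posterior over true classes with per-item vote shares.
    post = onehot.mean(axis=1)
    for _ in range(n_iter):
        # M-step: re-estimate parameters from the current soft assignments.
        class_probs = post.mean(axis=0)
        confusions = np.einsum("ik,iwl->wkl", post, onehot) + smoothing
        confusions /= confusions.sum(axis=2, keepdims=True)
        # E-step: posterior over each item's true class, computed in log space.
        log_post = np.log(class_probs + 1e-12) + np.einsum(
            "iwl,wkl->ik", onehot, np.log(confusions)
        )
        log_post -= log_post.max(axis=1, keepdims=True)
        post = np.exp(log_post)
        post /= post.sum(axis=1, keepdims=True)
    return class_probs, confusions, post
```

Calling `simulate_labels` with a known class distribution and known confusion matrices, then passing the resulting labels to `dawid_skene_em`, lets the recovered parameters and `post.argmax(axis=1)` be checked against the ground truth.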
This was a project for the capstone course, in which I worked with students from Harvard and Politecnico di Milano to create a web application that connected companies looking to create revolutionary new products with the design community. We designed the product from idea to implementation, contacted real companies, and tested the platform for a data visualization class. I was in charge of the Django-based back end.