Thanks for visiting!

This site is intended to help explain who I am and who I want to be.

A little bit about me

My name is David (Dave) P. Van Anda. I’m currently based in New Jersey and am a graduate student at Indiana University Bloomington studying Data Science, with a particular interest in Machine Learning, Network Science, and Complex Systems.

I’m a Research Fellow at IU, working on a project at the Kelley School of Business, where I model and analyze social contagion in corporate and professional environments. I’m also working on a project that uses network science to evaluate the effectiveness of the transaction strategies of professional sports franchises. By the end of the summer, I’ll be sharing an interactive visualization of this transaction network.

I also work full-time at a textile manufacturer, where I program knitting machines. I’m currently rebuilding my Java image processing application in Python. The Python version is much faster and more comprehensive. I’ve just about programmed myself out of a job.

Graduate Coursework

Applied Machine Learning I526 with Dr. James Shanahan – Logistic Regression and regularization. Decision trees and pruning, implementation of decision trees. Support vector machines and making them work in practice. Boosting – implementing different boosting methods with decision trees. Using the algorithms for several tasks – how to set up the problem, debug, select features and develop the learning algorithm. Unsupervised learning – k-means, PCA, hierarchical clustering. Implementing the clustering algorithms. Parallelizing the learning algorithms. 

Network Science I606 with Dr. Santo Fortunato – Models and algorithms used in network science. Programming for the analysis of networks of various types and for simulating the dynamics of processes running on them, such as epidemic spreading and opinion dynamics.

Data Visualization DS590 with Dr. YY Ahn – Understand, explain, and manipulate different types of data, analyze them by applying exploratory visualization techniques, and create explanatory web-based visualizations. Evaluate the effectiveness of data visualizations based on the principles of human perception, design, types of data, and visualization techniques.

Statistics S520 with Dr. Jianyu Wang – Discrete and continuous random variables, estimation, hypothesis testing, 1- and 2-sample location problems, ANOVA, and linear regression.

MOOCs and Certificates

  • Neural Networks and Deep Learning – deeplearning.ai (Coursera)
  • How Google Does Machine Learning – Google (Coursera)
  • Anatomy: The Life of a Cell – HACC (iTunes U)
  • Introduction to Complexity – Santa Fe Institute
  • Mathematics for Machine Learning – Imperial College London (Coursera)

Books I’m currently reading

  • The Fall by Albert Camus
  • Brief Answers to Big Questions by Stephen Hawking

What else is on my mind?

  • Data Science PhD programs
  • M-Theory/Branes
  • Self-Similarity
  • Asset Bubbles