Introduction to Machine Learning


Welcome to my Data Science with Python course!

You can find all the Jupyter notebooks on my GitHub page here.

Course Objectives

This class aims to give social science students a better understanding of the theoretical concepts behind machine learning techniques, as well as the practical implementation of these methods to answer real-world questions. The class covers regression, classification, ensemble methods, clustering, neural networks, and related topics. Students will also gain hands-on skills implementing these methods in R or Python.

As for the class format, concepts are first introduced through assigned readings and short videos. During in-class sessions, I will help students summarize the major ideas and put key concepts into practice. Lab sessions and homework will help students practice and sharpen their coding skills. I hope this course lays a foundation for students who are interested in machine learning techniques.


  • Introduction to Machine Learning
    • The origins of machine learning
    • Types of machine learning algorithms
    • Understanding data
  • Data processing
    • Python basics
  • Evaluating Model Performance
    • Confusion matrices
    • Sensitivity and specificity
    • Precision and recall
    • Cross-validation
  • Lazy Learning – Classification Using Nearest Neighbors
    • Understanding nearest neighbor classification
    • Measuring similarity with distance
    • Preparing data for KNN
    • Code example: MNIST handwritten digits
  • Probabilistic Learning – Classification Using Naive Bayes
    • Understanding Naive Bayes
    • Classification with Naive Bayes
  • Divide and Conquer – Classification Using Decision Trees and Rules
    • Understanding decision trees
    • Understanding classification rules
    • Pruning the decision tree
    • Random Forest
  • Artificial Neural Network (ANN)
    • Understanding neural networks
    • Activation functions
    • Network topology
    • Training neural networks with backpropagation
  • Support Vector Machines
    • Classification with hyperplanes
    • Using kernels for nonlinear spaces
  • Finding Patterns – Market Basket Analysis Using Association Rules
    • Understanding association rules
  • Finding Groups of Data – Clustering with k-means
    • Understanding clustering
    • The k-means clustering algorithm
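
As a small taste of the Evaluating Model Performance unit listed above, here is a minimal sketch of how a confusion matrix and the metrics derived from it (sensitivity, specificity, precision, and recall) can be computed in plain Python. The function name classification_metrics and the toy labels are assumptions made up for illustration; they are not taken from the course notebooks.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Count confusion-matrix cells for a binary task and derive common metrics.

    Hypothetical helper for illustration only; `positive` marks the label
    treated as the positive class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    tn = sum(1 for t, p in pairs if t != positive and p != positive)  # true negatives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives

    def ratio(numerator, denominator):
        # Return NaN instead of raising when a class never occurs.
        return numerator / denominator if denominator else float("nan")

    sensitivity = ratio(tp, tp + fn)  # share of actual positives that were found
    return {
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),  # share of actual negatives that were found
        "precision": ratio(tp, tp + fp),    # share of predicted positives that are correct
        "recall": sensitivity,              # recall is another name for sensitivity
    }


# Toy labels, invented for this example.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In practice, a library such as scikit-learn provides these metrics directly; writing them by hand once simply makes clear what each number measures.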