Course Syllabus

Instructor:

Professor Chandrajit Bajaj

  • Lecture hours: Mon, Wed 9:30-11:00 am; PAR 101
  • Office hours: Tue, Thu 2:00-3:00 pm, POB 2.324A, or by appointment
  • Contact: bajaj@cs.utexas.edu

NOTE: Most questions should be submitted via Canvas rather than by emailing the instructor. Please reserve an office-hour slot a day in advance to avoid conflicts.

 

Teaching Assistant

Bo Sun

  • Office hours – Fri. 3pm-4pm, TA Station 3
  • Contact: bosun@cs.utexas.edu

Note: Please reserve an office-hour slot a day in advance to avoid conflicts.

Course Motivation and Synopsis

This course covers the fundamental algorithmic and computational aspects of data science, machine (deep) learning, and statistical inference. The topics span scalable data analysis and geometric optimization, weaving together discrete and continuous mathematics, computer science, and statistics. Students will delve, with breadth and depth, into dimensionality, sparsity, resolution, resolvability, recovery, and prediction for a variety of data (sequence, stream, graph-based, time-series, image, video, hyper-spectral) emanating from multiple sensors (big and small, slow and fast) and accumulated via the interactive WWW. Issues of measurement error, noise, and outliers are central to bounding the precision, bias, and accuracy of the data analysis. The geometric insight and characterization gained provide the basis for designing and improving approximation algorithms for NP-hard problems with better accuracy/speed tradeoffs.

An initial listing of lecture topics is given in the outline below. It is subject to modification, depending on the background of the class and the speed at which we cover ground. Homework exercises will be given roughly bi-weekly. Assignment solutions turned in late suffer a 10% per day reduction in credit, and a 100% reduction once solutions are posted. There will be a mid-term exam in class, with content similar to the homework exercises. A list of topics will also be assigned as individual (or pair/group) data science projects, with a written/oral presentation at the end of the semester. The project will be graded and is in lieu of a final exam.
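For concreteness, here is a minimal sketch of how the late policy translates into credit; the function is hypothetical, purely illustrative, and not an official grading script.

    # Hypothetical illustration of the late-submission policy described above;
    # not an official grading script.
    def late_credit(raw_score, days_late, solutions_posted=False):
        """10% credit reduction per day late; zero credit once solutions are posted."""
        if solutions_posted:
            return 0.0
        penalty = min(0.10 * days_late, 1.0)
        return raw_score * (1.0 - penalty)

    # Example: a 90-point solution turned in 2 days late earns 72 points.
    print(late_credit(90, 2))  # 72.0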

The course is aimed at graduate students. Students in the 5-year master's programs, especially in CS, CSEM, ECE, STAT, and MATH, are welcome. You will need mathematics and statistics at the first-year graduate level, plus linear algebra and geometry, together with introductory functional analysis and numerical optimization (e.g., for CS and ECE students) or combinatorial optimization (e.g., for CSEM and MATH students).

Course Material

  1. [B1] Chandrajit Bajaj. A Mathematical Primer for Computational Data Sciences (frequently updated).
  2. [BHK] Avrim Blum, John Hopcroft, and Ravindran Kannan. Foundations of Data Science.
  3. [CVX] Stephen Boyd and Lieven Vandenberghe. Convex Optimization.
  4. [ZLLS] Aston Zhang, Zachary C. Lipton, Mu Li, and Alexander J. Smola. Dive into Deep Learning, v0.7.1.
  5. [JK] Prateek Jain and Purushottam Kar. Non-Convex Optimization for Machine Learning.
  6. [MU] Michael Mitzenmacher and Eli Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis.
  7. [SD] Shai Shalev-Shwartz and Shai Ben-David. Understanding Machine Learning: From Theory to Algorithms.
  8. [GBC] Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning, 2016.
  9. [HF] Hermann Flaschka. Principles of Analysis.
  10. [KW19] Diederik P. Kingma and Max Welling. An Introduction to Variational Autoencoders.
  11. Extra reference materials.

 

TENTATIVE COURSE OUTLINE (in flux)

Each entry below lists the date, lecture topics, reading, and assignments.

Wed 01-22-2020
1. Introduction to Data Science; Geometry of Data; High-Dimensional Spaces [notes]
   Perceptrons and Deep Learning: Models, Applications [notes] [DL-models]
Reading: [BHK] Ch 1, Ch 12 (Appendix); [B1] Ch 1-2; [CVX], [MRT] Appendix; [ZLLS] Introduction

Mon 01-27-2020
2. Learning Linear Regression Models, Applications [notes]
   Geometry of Vector, Matrix, and Functional Norms and Approximations [notes]
Reading: [HF] Sec 1-3; [BHK] Ch 12 (Appendix); [ZLLS] Preliminaries (Ch 2), Linear Neural Networks (Ch 3)
Assignment: [A1] out today; due before 02-12-2020, 11:59pm

Wed 01-29-2020
3. Probability Theory, Noisy Regression, Bayes' Theorem, Maximum Likelihood [notes]
   Softmax and Cross-Entropy Loss Functions; Linear Neural Networks [notes]
Reading: [MU] Ch 1-4; [B1] Appendix; [ZLLS] Linear Neural Networks (Ch 3)

Mon 02-03-2020
4. Markov, Chebyshev, and Chernoff Bounds [notes]
   Naive Bayes Classifier [ZLLS Appendix]
Reading: [MRT] Ch 1-2; [MU] Ch 1

Wed 02-05-2020
5. Multi-Layer (Non-Linear) Perceptron Learning [ZLLS Ch 4]
   Convex and Non-Convex Optimization [notes]
Reading: [ZLLS] Ch 4; [CVX] Ch 1-5

Mon 02-10-2020
6. Monte Carlo Sampling [notes]; Low-Discrepancy Quasi-Monte Carlo Sampling [slides]; Integration Error and the Koksma-Hlawka Inequality [notes]
Reading: see references in notes and slides

Wed 02-12-2020
7. Sampling Multivariate Gaussians in High Dimensions; Separating Mixtures of Gaussians I [notes]
Reading: [BHK] Ch 2, Appendix; see references in notes
Assignment: [A1] solution posted; [A2] out today, due before 02-26-2020, 11:59pm

Mon 02-17-2020
8. Statistical Machine Learning I: Separating Mixtures of Gaussians II, Expectation Maximization [notes]
Reading: [BHK] Ch 2-3; see references in notes

Wed 02-19-2020
9. Random Projections and the Johnson-Lindenstrauss Lemma [notes]
   Expectation Maximization II (Latent Variable Models, Soft Clustering, Mixed Regression) [notes]
Reading: [BHK] Ch 3; [CVX] Ch 1-3; see references in notes

Mon 02-24-2020
10. Low-Rank Matrix Approximation with Applications [notes]
    Convex and Non-Convex Optimization [notes]
Reading: [B2] Ch 5; [CVX] Ch 5, 8

Wed 02-26-2020
11. Matrix Sampling and Matrix Sketching Algorithms [notes]
Reading: [CVX] Ch 5, 8; [B2] Ch 5
Assignment: [A2] solution posted; [A3] out today, due before 03-09-2020, 11:59pm

Mon 03-02-2020
12. Deep Autoencoders I: Learning the Latent Variable Space [notes]
    Modern Convolutional Neural Networks I [notes]
Reading: [KW19]; [ZLLS] Ch 6

Wed 03-04-2020
13. Variational Inference: Deep Variational Autoencoders II [notes]
Reading: [K17]; see references in notes

Mon 03-09-2020
14. Spectral Methods for Learning: PCA, Kernel PCA [notes]
    Multi-Hidden-Layer Perceptron Networks [notes]
Reading: see references in notes
Assignment: [A3] solution posted

Wed 03-11-2020
MIDTERM in class
Assignment: [A4] out today, due before 03-30-2020, 11:59pm

Mon-Fri, 03-16-2020 to 03-29-2020
Spring Break: March 16-29, 2020

Mon 03-30-2020
15. Convex and Non-Convex Optimization [notes]
    Lagrange Multipliers and the Lagrangian Dual [notes]
Reading: [CVX]; [ZLLS]

Wed 04-01-2020
16. Spectral Methods for Learning: Fisher LDA, KDA [notes]
    Connections to Variational Autoencoders [notes]
Reading: see references in notes

Mon 04-06-2020
17. Non-Convex Optimization: Convex Relaxations [notes]
    Compressive Sensing and Tensor Sketching [notes]
Final Project, Part 1: first report due before 04-20-2020, 9:59pm

Wed 04-08-2020
18. Robust Sparse Recovery; Alternating Minimization [notes2]
Reading: [JK] Ch 3-4
Assignment: [A4] due by Apr 9; [A4] solution posted

Mon 04-13-2020
19. Energy-Based Optimization Loss Functions I [notes]
    VC Dimension and PAC Learning Revisited [notes]

Wed 04-15-2020
20. Energy-Based Optimization Loss Functions II [notes]
    Stochastic Gradient Descent: Simulated Annealing, Fokker-Planck [notes]

Mon 04-20-2020
21. Modern Convolutional Neural Networks II: Back Propagation [notes]
    Modern CNNs, Residual Networks: the Adjoint Method [notes]
    Connections to Deep Encoder-Decoder Networks [notes]
Reading: [CVX] Ch 7
Part 1 of project due; final project report due May 09, 2020

Wed 04-22-2020
22. Stochastic Optimization [notes]
    Supervised "Soft" Classification: SVM, Kernel SVM [notes]
    Geometry of Unsupervised Soft Clustering I: Spectral and Normalized Cuts [notes]
Reading: [JK] Ch 5

Mon 04-27-2020
23. Geometry of Unsupervised Clustering II: Variational Inference [notes]
    Connections to Deep Variational Autoencoders
Reading: [JK] Ch 5

Wed 04-29-2020
24. Geometry of Deep Learning: Recurrent Neural Networks, Transformers, Attention [notes]

Mon 05-04-2020
25. Geometry of Deep Learning: Generative Adversarial Networks [notes]
Reading: [BHK] Ch 5; see references in notes

Wed 05-06-2020
26. Geometry of Deep Learning: Reinforcement Learning (RL) I [notes]
Reading: [BHK] Ch 5; [GBC] Ch 6, 9

27. Geometry of Deep Learning: RL and Value Iteration II [notes]
Reading: [GBC] Ch 7-8; see also references in notes

28. Geometry of Deep Learning: RL and Policy Gradients III [notes]
Reading: [GBC] Ch 10-12

Fri 05-08-2020
Final project presentations (schedule TBD)
Part 2: final report due before 05-09-2020, 11:59pm
Final project report due on May 15

 

Project FAQ

1. How long should the project report be?

Answer: See the directions in the Class Project List. For full points, please address each of the evaluation questions as succinctly as possible. Note that the deadline for the report is May 11, midnight. You will receive feedback on your presentation, which should also be incorporated into your final report.

Tests

There will be one in-class midterm exam and one final project. The important deadline dates are:

  • Midterm: Wednesday, March 11, 9:30-11:00 am, PAR 101
  • Final Project Written Report, due: May 15, 11:59pm

 

Assignments

There will be four written HW assignments and one final project report. Please refer to the schedule above for assignment and final project report due dates.

Course Requirements and Grading

Grades will be based on the following factors:

  • In-class attendance and participation (5%)
  • HW assignments (44%, with the potential for extra credit)

There will be four assignments. Some may include extra questions for additional points, as specified in each assignment sheet.

  • In-class midterm exam (16%)
  • First Presentation & Report (10%)
  • Final Presentation & Report (25%); a worked example of the weighting follows this list
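The components combine as a simple weighted sum, as in the minimal sketch below; the component scores are hypothetical and serve only to illustrate the weighting.

    # Hypothetical illustration of how the grade components combine.
    # The scores (out of 100) are made up for this example.
    weights = {
        "participation": 0.05,
        "homework": 0.44,
        "midterm": 0.16,
        "first_report": 0.10,
        "final_report": 0.25,
    }
    scores = {
        "participation": 100,
        "homework": 88,
        "midterm": 75,
        "first_report": 90,
        "final_report": 85,
    }
    final = sum(weights[k] * scores[k] for k in weights)
    print(f"Final grade: {final:.2f}/100")  # Final grade: 85.97/100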

Students with Disabilities. Students with disabilities may request appropriate academic accommodations from the Division of Diversity and Community Engagement, Services for Students with Disabilities, 471-6259, http://www.utexas.edu/diversity/ddce/ssd.

 

Accommodations for Religious Holidays. By UT Austin policy, you must notify the instructor of your pending absence at least fourteen days prior to the date of observance of a religious holiday. If you must miss a class or an examination in order to observe a religious holiday, you will be given an opportunity to complete the missed work within a reasonable time before or after the absence, provided proper notification is given.

 

Statement on Scholastic Dishonesty. Anyone who violates the rules for the HW assignments, or who cheats on in-class tests or the final project, is in danger of receiving an F for the course. Additional penalties may be levied by the Computer Science Department, CSEM, and the University. See http://www.cs.utexas.edu/academics/conduct/
