Fa23 - GEOMETRIC FNDTNS OF DATA SCI (53050)

Instructor:

Professor Chandrajit Bajaj

  • Lecture Hours: Mon, Wed, 2:00 - 3:15 pm, GDC 4.302
  • Office hours: Tuesday, 1:00 - 3:00 p.m. or by appointment (Zoom or POB 2.324)
  • Contact: bajaj@cs.utexas.edu, bajaj@oden.utexas.edu

NOTE: Please do not send messages (questions or concerns) through Canvas, because I rarely check messages there. All questions related to the class should be posted on Piazza or brought to office hours. You can register for Piazza through the course's registration link, or join via the Piazza tab on the Canvas course page.

Teaching Assistant

Shubham Bhardwaj

Note: Please try to reserve an office-hours slot a day in advance to avoid conflicts.

Course Motivation and Synopsis

This course offers reinforced learning of the geometric foundations of the data sciences and of classical, deep and modern machine learning. In particular, we shall dive deep into the mathematical, statistical and computational optimization fundamentals that are the basis of computational deep learning frameworks (e.g. classification, clustering, recommendation, prediction, generative modeling) and Markov decision-making processes (single and multi-player game playing, sequential and repeated forecasting). We shall thus learn how data-driven and continual learning is harnessed to learn the Hamiltonians (governing energetic equations) underlying dynamical systems and multi-player games. These latter topics are foundational to AI, as they lead to the training of multiple neural networks (agents) that learn cooperatively and in adversarial scenarios to help solve computational problems better.

An initial listing of lecture topics is given in the syllabus below. This is subject to modification, depending on the class's background and the speed at which we cover ground. Homework exercises shall be given roughly every two weeks. Assignment solutions that are turned in late shall suffer a 10% per day reduction in credit, and a 100% reduction once solutions are posted. There will be an in-class midterm exam, whose content will be similar to the homework exercises. A list of topics will also be assigned as take-home final projects, to train, cross-validate and test the best machine-learned decision-making agents. The projects will involve ML programming, an oral presentation, and a written report submitted at the end of the semester. The project shall be graded and is in lieu of a final exam.

The course is aimed at junior and senior undergraduate students. Students in the 5-year master's program, especially in CS, CSEM, ECE, STAT and MATH, are welcome if they would like to bolster their foundational knowledge. You'll need algorithms, data structures, numerical methods and programming experience (e.g. Python) at the level of a CS senior; mathematics and statistics at the level of CS, Math, Stat or ECE; linear algebra and computational geometry; and introductory functional analysis and combinatorial and numerical optimization (for CS, ECE, CSEM, Stat and Math students).

Late Policy

Submissions 1 day past the deadline receive a 25% deduction; submissions 2 days late receive a 50% deduction. Assignment solutions will be released on the 3rd day, so submissions on or after the 3rd day receive a 100% deduction.
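The deduction schedule in this section can be sketched as a small function. This is only an illustration: the function names are hypothetical, and it assumes lateness is counted in whole days, which the policy does not spell out.

```python
def late_credit_fraction(days_late: int) -> float:
    """Fraction of credit kept for a submission `days_late` days past the deadline.

    Assumption: lateness is counted in whole days; how partial days
    are rounded is not stated in the policy.
    """
    if days_late <= 0:
        return 1.0   # on time: no deduction
    if days_late == 1:
        return 0.75  # 25% deduction
    if days_late == 2:
        return 0.50  # 50% deduction
    return 0.0       # 3rd day or later: 100% deduction


def apply_late_policy(raw_score: float, days_late: int) -> float:
    """Scale a raw assignment score by the late-credit fraction."""
    return raw_score * late_credit_fraction(days_late)
```

For example, under these assumptions an assignment scored 80 and submitted 2 days late would count as 40.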

Course Material.

  1. [B1] Chandrajit Bajaj. A Mathematical Primer for Computational Data Sciences (frequently updated)
  2. [BHK] Avrim Blum, John Hopcroft and Ravindran Kannan. Foundations of Data Science
  3. [BV] Stephen Boyd and Lieven Vandenberghe. Convex Optimization
  4. [B] Christopher Bishop. Pattern Recognition and Machine Learning
  5. [M] Kevin Murphy. Machine Learning: A Probabilistic Perspective
  6. [SB] Richard Sutton and Andrew Barto. Reinforcement Learning
  7. [SD] Shai Shalev-Shwartz and Shai Ben-David. Understanding Machine Learning: From Theory to Algorithms
  8. Extra reference materials.

COURSE OUTLINE 

Date Topic Reading Assignments

Mon

08-21-2023

1. Introduction to Data Science, Geometry of Data, High Dimensional Spaces,  Belief Spaces   [Lec1]

[BHK] Ch 1,2

 

 

Wed

08-23-2023

2. Learning High-Dimensional Linear Regression Models  [Lec2]

Geometry of Vector, Matrix, Functional Norms  and Approximations [notes];

Introductory functional analysis [notes];

[SD] Ch 9, Appendix C

[BHK] Chap 12.2,12.3

[A1] with [latex solution template] out today; due by 09-06-2023, 11:59 pm

Mon

08-28-2023

3. Learning Theory and Model Selection [Lec3]

Probability, Information and Probabilistic Inequalities [notes]

[MU] Ch 1-3

[B] Chap 1

Wed

08-30-2023

4. Stochastic Machine Learning I: Cross, Conditional and Relative Entropy [Lec4]

Log-Sum-Exponential-Stability [notes]

 

[MU] Chap 4, 24.2

[BHK] Chap 12.4,12.6

 

Wed

09-06-2023

5. Probabilistic Distribution Sampling in High Dimensional  Spaces [Lec5]

Concentration of Measure  [notes]

 

[M] Chap 23

[A1] due by midnight.

[A2] will be out on 09-10-2023; due by 09-24-2023, 11:59 pm.

 

Mon

09-11-2023

6. Statistical Machine Learning using Monte Carlo and Quasi-Monte Carlo [Lec6]

 

[M] Chap 24

 

Wed
09-13-2023

7. Quasi-Monte Carlo Methods, Integration Error, H-K Inequality [notes]

 

[M] Chap 24

 

Mon

09-18-2023

8.  Learning Dynamics I  - Markov Chain Monte Carlo Sampling [Lec7]

MCMC and Bayesian Inference [Notes]

[BHK] Chap 4

[MU] Ch 7, 10

 

 

Wed

09-20-2023

9. Learning Dynamics II - Random Walk  [notes]

 

 

[SD] Chap 24

[BV] Chap 1-5

[A2] due by 09-24-2023, 11:59 pm.

Mon

09-25-2023

10. Convex Optimization for Machine Learning [notes]

 

 

[BHK] Chap 2.7

[SD] Chap 23,24

Wed

09-27-2023

11.  SVM via Stochastic Optimization [notes]

 

[M] Chap 11

[A3] will be out on 09-28-2023; due by 10-12-2023, 11:59 pm.

Mon

10-02-2023

12. Statistical Machine Learning I : Separating Mixture of Gaussians  - Expectation Maximization   [notes]

 

[M]  Ch 2, 5

 

 

Wed

10-04-2023

13. Statistical Machine Learning II: Bayesian Modeling

[notes]

 

[M]  Ch 4

 

 

Mon

10-09-2023

 14.  Statistical Machine Learning III: Bayesian Inference,  Multivariate Gaussians [notes1] [notes]

 

[M]  Ch 15

 

Wed

10-11-2023

15. Statistical Machine Learning IV: Gaussian Processes I [notes]

 

 

[M]  Ch 15

[A3] due by 10-12-2023, 11:59 pm.

Mon

10-16-2023

16. Statistical Machine Learning V: Non-Gaussian Processes, Conjugate Priors  [notes]

 

Wed

10-18-2023

 MIDTERM in Class

 

 

 

[BHK] Chap 5

Mon

10-23-2023

 16. Learning Dynamics,  Lyapunov Stability  and connections to Training Deep Networks [notes]

 

[M] Chap 14

[A4] will be out today; due by 11-01-2023, 11:59 pm.

Wed

10-25-2023

17. Learning Dynamics with Neural ODEs: ResNets, Adjoint Method for Backprop [notes]; Implicit Euler, Convergence [notes]

 

Mon

10-30-2023

18. Learning Stochastic Dynamics (Neural SDE): Ito Processes, Euler-Maruyama [notes]

Energy Based Optimization Loss Functions II: SGD, Adagrad, RMSProp, Adam, ... [notes]

 

 

 

 

Wed

11-01-2023

 19.  Learning Dynamics with Control and Optimality [notes]

Non-convex Projected Gradient Descent [notes-references]

[A4] due today, i.e., 11-01-2023, 11:59 pm.

Mon

11-06-2023

20. The role of Sensors and Optimal Sensor Fusion:

Basics of Kalman Filters [notes]

Illustrated Kalman Filters [notes]

 

See references cited in notes

 

Project details will be out on 11-07-2023; Part (I) of the project due by 11-26-2023, 11:59 pm.

Wed

11-08-2023

21. Reinforcement Learning II: Learning with Trajectory (Stochastic) Optimization: iLQR, iLQG [notes]

Reinforcement Learning I: Optimal Control, Hamilton-Jacobi-Bellman Optimality Principle, LQR, Closed Form: Algebraic Riccati [notes]

 

See references cited in notes

 

Mon

11-13-2023

22. Reinforcement Learning III:  MDP, POMDP, Optimal Control  [notes]

[notes]

See references cited in notes and paper

 

 

 

Wed

11-15-2023

 23. Learning Stochastic Dynamics and Optimal Control:  Dynamics with Stability, LQR [notes]

See references cited in [notes]

Part (I) of the project due by 11-30-2023, 11:59 pm.

Final report and presentation Video due by 12-12-2023; 11:59 pm.

Mon

11-27-2023

24. Learning Reinforcement Learning with Optimal Control: iLQR, iLQG [notes]

 

 

See references cited in [notes]

[Basar] See Lectures 1, 2, 3 

Wed

11-29-2023

25. Reward Reshaping with Optimal Control [notes]

 [SB]  See Chap 3 

[Basar] See Lectures 1, 2, 3 

Part (I) of the project due by 11-30-2023, 11:59 pm.

Mon

12-04-2023

26.   Principled Reinforcement Learning with Hamiltonian-Dynamics-PMP-OCF  [notes]

 

[Basar] Lectures 8, 9

Final report  and Presentation Video  due by 12-12-2023; 11:59 pm.

Addtl. Material

Non-convex Optimization , Projected Gradient Descent [Notes]

Random Projections, Johnson-Lindenstrauss, Compressive Sensing [notes]

 

Addtl. Material

Spectral Methods in Dimension Reduction: KPCA [notes]

Spectral Methods for Learning: KSVM [notes], Fisher LDA, KDA [notes]

Addtl. Material

Stochastic Gradient Descent: Simulated Annealing, Fokker-Planck [notes]

Robust Sparse Recovery; Alternating Minimization  [notes2]

Connections to Variational AutoEncoders [notes]

 

 

Geometry of Game Theoretic Learning I: Actionable Learning [notes], Nash Equilibrium [notes]

Geometry of Game Theoretic Learning II: Stackelberg Equilibrium [notes]

 Games & MARL  II [notes]

 

 

 

Project FAQ

1. How long should the project report be?

Answer: See the directions in the Class Project List. For full points, please address each of the evaluation questions as succinctly as possible. Note that the deadline for the report is 12-12-2023, 11:59 pm. You will get feedback on your presentation, which should also be incorporated in your final report.

Assignments, Exam, Final Project

There will be four take-home bi-weekly assignments, one in-class midterm exam, and one take-home final project (in lieu of a final exam). The important deadline dates are:

  • Midterm: 16th of October, 2pm - 3:30pm, in class
  • Final Project Written Report, PPT, Part 1, Due: November 30, 11:59pm
  • Final Project Written Report, PPT, Part 2, Due: December 12, 11:59pm

 

Assignments

There will be four written take-home HW assignments and one take-home final project report. Please refer to the schedule above for the due dates of the assignments and the final project report.

Course Requirements and Grading

Grades will be based on these factors:

  • In-class attendance and participation (5%)
  • HW assignments (50%, with potential for extra credit)

There are 4 assignments. Some may include extra questions for extra credit; these will be specified on each assignment sheet.

  • In-class midterm exam (15%) 
  • First Report (10%)
  • Final Presentation Video & Report (20%)  
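As a rough illustration of how the weights listed above combine, here is a minimal sketch. The component names and the 0-100 score scale are assumptions made for the example, and extra credit is not modeled.

```python
# Weights taken from the grading breakdown above; the key names are hypothetical.
WEIGHTS = {
    "participation": 0.05,
    "homework": 0.50,
    "midterm": 0.15,
    "first_report": 0.10,
    "final_presentation_and_report": 0.20,
}


def course_score(scores: dict) -> float:
    """Weighted course score, assuming each component is scored on a 0-100 scale."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


example = {
    "participation": 100,
    "homework": 90,
    "midterm": 80,
    "first_report": 85,
    "final_presentation_and_report": 95,
}
total = course_score(example)  # 5 + 45 + 12 + 8.5 + 19 = 89.5, up to rounding
```

The weights sum to 100%, so a student who scores 100 on every component ends with 100 overall.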

Students with Disabilities. Students with disabilities may request appropriate academic accommodations from the Division of Diversity and Community Engagement, Services for Students with Disabilities, 471-6259, http://www.utexas.edu/diversity/ddce/ssd . 

 

Accommodations for Religious Holidays. By UT Austin policy, you must notify the instructor of your pending absence at least fourteen days prior to the date of observance of a religious holiday. If you must miss a class or an examination in order to observe a religious holiday, you will be given an opportunity to complete the missed work within a reasonable time before or after the absence, provided proper notification is given.

 

Statement on Scholastic Dishonesty. Anyone who violates the rules for the HW assignments or who cheats in in-class tests or the final exam is in danger of receiving an F for the course. Additional penalties may be levied by the Computer Science department,  CSEM  and the University. See http://www.cs.utexas.edu/academics/conduct/

Public Domain. This course content is offered under a Public Domain license. Content in this course can be considered under this license unless otherwise noted.