# Resources 📚

## Lecture Videos

These are the lecture videos, which you should watch asynchronously by the date listed. We'll use our scheduled class time for further engagement with the material through groupwork, extra practice, and office hours for homework help.

Video | Watch by | Topics |
---|---|---|
Video 1 | Sunday, January 9 | learning from data, mean absolute error |
Video 2 | Sunday, January 9 | minimizing mean absolute error |
Video 3 | Sunday, January 9 | mean squared error |
Video 4 | Sunday, January 9 | empirical risk minimization, general framework, 0-1 loss |
Video 5 | Sunday, January 16 | UCSD loss |
Video 6 | Sunday, January 16 | gradient descent |
Video 7 | Sunday, January 16 | gradient descent demo, convexity |
Video 8 | Sunday, January 16 | spread |
Video 9 | Sunday, January 23 | linear prediction rule |
Video 10 | Sunday, January 23 | least squares solutions |
Video 11 | Sunday, January 23 | regression interpretation |
Video 12 | Sunday, January 23 | nonlinear trends |
Video 13 | Sunday, January 30 | linear algebra for regression |
Video 14 | Sunday, January 30 | gradient, normal equations |
Video 15 | Sunday, January 30 | polynomial regression, nonlinear trends |
Video 16 | Sunday, January 30 | multiple regression |
Video 17 | Sunday, February 6 | k-means clustering |
Video 18 | Sunday, February 6 | k-means clustering, cost function, practical considerations |
Video 19 | Sunday, February 13 | probability, basic rules |
Video 20 | Sunday, February 13 | conditional probability |
Video 21 | Sunday, February 13 | probability, random sampling, sequences |
Video 22 | Sunday, February 20 | combinatorics, sequences, sets, permutations, combinations |
Video 23 | Sunday, February 20 | counting and probability practice |
Video 24 | Sunday, February 20 | law of total probability, Bayes' Theorem |
Video 25 | Sunday, February 27 | independence, conditional independence |
Video 26 | Sunday, February 27 | naive Bayes |
Video 27 | Sunday, February 27 | text classification, spam filter, naive Bayes |

## Course Notes

The notes for this class were written by me and Justin Eldridge. These notes cover the material from the first half of the course and align very closely with the material you'll see in the lecture videos.

## Probability

Unlike the first half of the course, where we had course notes written specifically for this class, we don't have DSC 40A-specific notes for the second half of the class, because there are many high-quality resources available online that cover the same material. Below, you'll find links to some of these resources.

### Readings and Sources of Practice Problems

Open Intro Statistics: Sections 2.1, 2.3, and 2.4 cover the probability we are learning in this course at a good level for undergraduates. This is a good substitute for a textbook, similar to the course notes that we had for the first part of the course. It goes through the definitions, terminology, probability rules, and how to use them. It's succinct and highlights the most important things.

Probability for Data Science: Chapters 1 and 2 of this book have many good examples demonstrating standard problem-solving techniques, so it is primarily useful as a source of additional practice problems. It is written at a good level for students in this class and is used at UC Berkeley in their Probability for Data Science course. Our course only covers material from the first two chapters, but if you want to extend your learning of probability as it applies to data science, this is a good book to help you do that.

Theory Meets Data: Chapters 1 and 2 of this book cover similar content to Chapters 1 and 2 of the Probability for Data Science book, but with different prose and examples. It is used at UC Berkeley for a more introductory Probability for Data Science course.

Grinstead and Snell's Introduction to Probability: Chapters 1, 3, and 4.1 of this book cover the material from our class. This book is a lot longer and more detailed than the others, and it uses more formal mathematical notation. It should give you a very thorough understanding of probability and combinatorics, but because of its length, the more abbreviated resources above will likely be more useful. With that said, this book is written at a good level for undergraduates and is used in other undergraduate probability classes at UCSD, such as CSE 103.

Introduction to Mathematical Thinking: This course covers topics in discrete math, some of which are relevant to us (in particular, set theory and counting). In addition to the lecture videos linked on the homepage, you may want to look at the notes section.

Khan Academy: Counting, Permutations, and Combinations: This Khan Academy unit should be quite helpful for the combinatorics we are learning in this class. A particularly useful aspect is the practice questions that mix permutations and combinations together. Most students find that the hardest part of these counting problems is knowing when to use permutations and when to use combinations, so working through mixed questions gives you real practice deciding which technique applies to which situation.
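As a quick illustration of that distinction (a sketch of my own, not part of any of the resources above), here is a small Python example using the standard library's `math.perm` and `math.comb`. The rule of thumb: use permutations when order matters, combinations when it doesn't.

```python
from math import comb, perm

# Order matters: how many ways can gold, silver, and bronze
# be awarded among 5 runners? This is a permutation:
# 5 * 4 * 3 = 60.
podiums = perm(5, 3)

# Order doesn't matter: how many 3-person committees can be
# chosen from 5 people? This is a combination: 60 / 3! = 10.
committees = comb(5, 3)

print(podiums)     # 60
print(committees)  # 10
```

Note that the two counts differ exactly by a factor of 3! = 6, the number of ways to reorder each chosen group, which is why every combination corresponds to several permutations.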

### Probability Roadmap

I wrote a "Probability Roadmap" that aims to guide students through the process of solving probability problems. I hope you'll find it useful! It comes in three versions:

- Examples: This document consists of strategies followed by example problems that employ those strategies. If you're looking to gain additional practice, start here.
- Solutions: This document contains solutions and explanations for all of the example problems in the first document. After you've attempted the problems on your own, read through this full document. Even if you've solved all the questions, you're likely to learn how to do some problems in new ways.
- Summary: This document is a concise summary and contains only the strategies themselves.

## Past Exams

Below, you'll find some exams (and in some cases, their solutions) from previous offerings of the course. You must be logged into your @ucsd.edu Google account to access these.

Some things to keep in mind:

- Certain offerings of the course had one midterm and others had two. Usually, Midterm 1 covered empirical risk minimization, and Midterm 2 covered probability.
- Topic coverage and ordering have changed over time, so the content of our exams won't exactly match that of these past exams.
- Some of these exams were given as closed-book exams and others allowed the use of resources.

Quarter | Instructor(s) | Midterm/Midterm 1 | Midterm 2 | Final |
---|---|---|---|---|
Fall 2021 | Suraj Rampure | Blank, Solutions | – | Blank, Solutions |
Spring 2021 | Janine Tiefenbruck | Blank, Solutions, Videos 🎬 | – | Part 1: Blank, Solutions |
Winter 2021 | Gal Mishne | Blank, Solutions | Blank, Solutions | Part 1: Solutions; Part 2: Solutions |
Fall 2020 | Janine Tiefenbruck, Yian Ma | Blank, Solutions | – | Part 1: Blank, Solutions |
Spring 2020 | Janine Tiefenbruck | Blank, Videos 🎬 | – | Part 1: Blank |
Winter 2020 | Justin Eldridge | Solutions | Solutions | Solutions |

## Other Resources

- Other lectures on Loss Functions and Simple Linear Regression. These are from a different course for a different audience and use different notation and terminology, but the high-level ideas are similar to those in the first few weeks of our course.
- Gradient Descent visualizer.

If you find another helpful resource, let us know and we can link it here!