- About This Course 🧐
- Communication 💬
- Technology 💻
- Course Structure 🍎
- Exams 🧪
- Policies ✏️
- Support 🤝
- Acknowledgements 🙏
- How do we know if an avocado is going to be ripe before we eat it?
- How do we predict the salary of a future data scientist?
- How do we teach a computer to read handwriting?
The world is increasingly recognizing the value of data in solving complex and open-ended problems such as these. Instead of explicitly telling the computer exactly how to differentiate between the letters of the alphabet, we instead give the computer many examples of each letter and let it learn the differences automatically. Similarly, by identifying patterns in data, we can learn which factors combine to make an avocado ready-to-eat or a person likely to be a successful data scientist. The explosive growth of data science is largely due to the fact that this approach of learning from data often works remarkably well.
But how do we learn from data? This is the central question of DSC 40A. We will see that virtually every rigorous learning method involves two steps: 1) turning the abstract problem of learning into a concrete math problem; and 2) solving that math problem. This quarter, we will see how to apply this fundamental approach in a variety of contexts. After this class, you will understand the basic theoretical principles underlying almost every machine learning and data science method — from simple linear regression to deep neural networks. You’ll also be better prepared to tackle the math you’ll see in your upper-division courses, like vector calculus, linear algebra, and probability.
This quarter, we’ll be using Campuswire as our course message board. You will be added to Campuswire automatically; email Janine right away if you’re not able to access it, as we’ll be making all course announcements through it.
If you have a question about anything to do with the course — if you’re stuck on a problem, didn’t understand something from lecture, want clarification on course logistics, or just have a general question about data science — you can make a post on Campuswire. We only ask that if your question includes some or all of an answer (even if you’re not sure it’s right), please make your post private so that others cannot see it. You can also post anonymously if you would prefer.
Course staff will regularly check Campuswire and try to answer any questions that you have. You’re also encouraged to answer questions asked by other students. Explaining something is a great way to solidify your understanding of it!
We won’t be using the direct messaging (DM) functionality of Campuswire, nor will we use email to answer questions about the course. Please don’t DM or email staff members; make a private or public Campuswire post instead!
We will be using several websites this quarter. Here’s what they’re all used for:
- Course Website: where all content will be posted.
- Campuswire: discussion forum for announcements and communication.
- Gradescope: platform for submitting assignments and viewing grades. You should be automatically added to Gradescope; let us know if not.
- Datahub: UCSD’s data science and machine learning platform, for coding in Jupyter notebooks.
- Zoom: for remote course meetings and office hours. You should already have an account through UCSD; see the Zoom guide for more help.
If you will not have reliable access to a computer this quarter, please reach out to us ASAP, as the university may be able to accommodate you.
The learning environment for this course will be “flipped.” This means you’ll be introduced to new content asynchronously, through lecture videos, so that our time together is spent more actively learning and practicing. We’ll use our scheduled class time for further engagement with the material through groupwork, extra practice, office hours for homework help, and exam reviews. When campus resumes in-person instruction, we’ll move all or most of these activities to in-person meetings, though you will still be responsible for watching the lecture videos ahead of time.
To prepare for each week of class, you will need to watch a few lecture videos, where you’ll be introduced to the important concepts that will form the basis of that week’s in-class activities. All lecture resources (videos, slides, code if necessary, and readings) will be posted on the course website; these will be your primary resources in this class, as there is no formal textbook.
Lecture videos include multiple choice questions that we ask you to answer as you watch; these are not required but are strongly encouraged to check your understanding and keep you engaged.
Certain class meetings and discussion sections will be used to facilitate small group work with peers. We will provide a worksheet of problems, which should help reinforce concepts from the lecture videos and prepare you to do that week’s homework assignment. You will complete the worksheet in groups of two to four students.
There will be several groupwork sessions available each day. Just show up at any one of them, and the instructor or TA will pair you with other students. If you want to organize your own group, you may; just make sure you all come to the same groupwork session.
Submit your worksheet to Gradescope by 11:59pm on the day of the groupwork. Only one member of each group should submit the worksheet, and they should indicate the names of all group members on Gradescope. Worksheets won’t be graded on correctness, but rather on good-faith effort. Even if you don’t solve any of the problems, you should include some explanation of what you thought about and discussed, so that you can get credit for spending time on the assignment. In order to receive credit, you must work in a group of two to four students for at least 50 minutes, at one of the scheduled groupwork sessions. You may not do the groupwork alone or meet outside of the scheduled sessions.
Aside from groupwork, we’ll use class meeting times for additional opportunities for you to engage with the material. Some class meetings will be extra practice sessions, where you will work on and see solutions to additional problems, many of them homework problems from previous iterations of the course. During exam weeks, we will have class meetings dedicated to exam review and, afterwards, exam solutions. On the days that homework assignments are due, I will host office hours during class time and answer questions on the homework problems. See the Calendar tab of the course webpage for information about what is happening when.
The homework assignments for this class are quite challenging and most students are not able to successfully complete the homework from watching the lecture videos alone. These additional class meetings are likely to be the most valuable part of the course for you, and you are highly encouraged to not only attend, but also actively participate. Even if you don’t have specific questions, you will likely get a lot out of conversing about the material. If you are sick, however, please stay home!
Extra practice sessions, exam prep reviews, and exam solution reviews will be recorded and posted on our course website. Office hours and groupwork sessions will not be recorded.
This class will have weekly homework assignments, which will be due to Gradescope on Fridays at 11:59pm. We will aim to release homework assignments the weekend before.
Homeworks should be written up and turned in by each student individually. You may talk to other students in the class about the problems and discuss solution strategies, but you should not share any written communication. You can tell someone how to do a homework problem, but you cannot show them how to do it. One way to tell if you are respecting this boundary is to ask yourself whether your collaboration could take place over the phone. Additionally, the content of your verbal communication should involve the problem-solving strategy and approach, and you should not directly compare answers with classmates.
Talking through homework problems with other students can be very valuable for many reasons:
- You will connect with other students in the class, which is especially important in a remote format where much of the learning is done in isolation.
- You will learn about someone else’s thought process and learn new ways of solving problems that you may not have thought of.
- You will get practice explaining your ideas, which is a useful life skill, and important for job interviews.
- You will get practice thinking critically about whether someone’s proposed solution actually works, and you will learn how to poke holes in shaky arguments.
As a result of this collaboration policy, students may have similar approaches to problems, but their written solutions should not be similar in presentation, such as in word choice.
For each problem you submit, you should cite your sources by including a list of names of other students with whom you discussed the problem. Instructors do not need to be cited.
We also encourage you to come to instructor and staff office hours for help on homework questions. You may also post homework-related questions on Campuswire, though your questions (and answers) should be about approaches, not answers. If your question includes some or all of an answer (even if you’re not sure it’s right), you must make your post private so that others cannot see it. We are not able to tell you whether your answer is correct.
To get help on assignments and concepts, course staff will be hosting several office hours per week. When in-person instruction resumes, we will hold some office hours remotely and others in-person. See the Calendar tab of the course website for the most up-to-date schedule and instructions.
To summarize, here’s what a typical week will look like in the course (note that there are some deviations from this due to holidays and exams; the most up-to-date deadlines will be on the course homepage):
|Groupwork (during class time or discussion)||Extra Practice Session (during class time)||Janine’s Office Hours (during class time)|
|Finish Lecture Videos||Groupwork due 11:59pm||Homework due 11:59pm|
There will be two midterm exams (not cumulative) and a final exam broken into two separate parts.
- Midterm 1: Wednesday, February 9, in-person during lecture (10am for A00, 11am for B00)
- Midterm 2: Tuesday, March 8, asynchronously for 90 minutes, available for 24 hours
- Final, Part 1: Saturday, March 12, in-person from 8:10-9am
- Final, Part 2: Saturday, March 12, in-person from 9:20-10:50am
The final exam for this course will consist of two parts, which will be graded separately: part one will cover the material of the first midterm, and part two will cover the material of the second midterm. If you do better on either part of the final than the corresponding midterm, then your score on that part will replace your score on the midterm. If you do better on both parts, then both scores can be replaced. This gives you two chances to demonstrate understanding of the course material, once during the quarter and once after the quarter. This also allows you to miss one or both midterms if necessary, and it makes the final exam optional if you’ve taken both midterms. You can take both parts of the final, just one part, or neither. If you are happy with both midterm scores, for example, you don’t need to take the final at all.
At this time, I am planning on having Midterm 1 and both parts of the Final Exam be in-person synchronous exams. I am planning on Midterm 2 being a remote asynchronous exam available for a 24-hour period, for you to take during any 90-minute time window that is convenient. I will update you with further details as the exam dates approach.
Exams should be taken completely alone, with no collaboration or communication with any other person. We may utilize randomization and multiple versions to ensure the integrity of exams.
Here’s how we will compute your grade.
|Exams||50%||MEAN(MAX(Midterm 1, Final Part 1), MAX(Midterm 2, Final Part 2))|
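As a rough illustration of the exam formula above, here is a minimal Python sketch. The function name `exam_component` and the choice to treat an exam that was not taken as a score of 0 are assumptions made for this example; they are not spelled out in the syllabus.

```python
from statistics import mean


def exam_component(midterm1, midterm2, final_part1=None, final_part2=None):
    """Exam score: MEAN(MAX(Midterm 1, Final Part 1), MAX(Midterm 2, Final Part 2)).

    Scores are percentages. None marks an exam that was not taken, which
    this sketch ASSUMES counts as 0 (hypothetical convention).
    """
    def score(x):
        return 0.0 if x is None else float(x)

    # Each final part can only replace the corresponding midterm.
    first_half = max(score(midterm1), score(final_part1))
    second_half = max(score(midterm2), score(final_part2))
    return mean([first_half, second_half])


# Midterm 1 = 70, Midterm 2 = 85, Final Part 1 = 80, Final Part 2 skipped.
print(exam_component(70, 85, final_part1=80))  # 82.5
```

In this example, Final Part 1 replaces Midterm 1 because 80 is higher than 70, while Midterm 2 stands on its own. Exams are worth 50% of the grade, so this student's exams would contribute 0.5 × 82.5 = 41.25 percentage points.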
Each student has three slip days to use throughout the quarter. A slip day can be used to extend the deadline of a homework assignment by 24 hours. You can use at most one slip day on any single homework assignment. Slip days can only be used for homework assignments.
Slip days are applied automatically at the end of the quarter, and you don’t need to ask in order to use one. It’s your responsibility to keep track of how many you have left. If you run out of slip days and submit a homework late, it may still be graded so that you’ll see what questions you missed, but the grade will be changed to a zero at the end of the quarter. If you use more than three slip days, we will count the first three late assignments, and any late assignments after that will get zero scores.
Slip days are designed to be a transparent and predictable source of leniency in deadlines. You can use a slip day if you are too busy to complete a homework on its original due date. But slip days are also meant for things like the internet going down at 11:58pm just as you go to submit your homework.
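If it helps to see the slip day rules as a procedure, here is a minimal Python sketch. The function name `apply_slip_days` and its input format are hypothetical. The sketch also assumes that slip days are used up in due-date order, and that a homework submitted more than 24 hours late gets a zero without using a slip day; the actual end-of-quarter process may differ in these details.

```python
SLIP_DAYS_PER_QUARTER = 3        # from the syllabus
MAX_SLIP_HOURS_PER_HOMEWORK = 24  # at most one slip day (24 hours) per homework


def apply_slip_days(hours_late):
    """Decide which homework scores count.

    hours_late: hours past the deadline for each homework, in due-date
    order (0 means on time). Returns (counted, remaining), where counted[i]
    is True if homework i keeps its score and remaining is the number of
    unused slip days.
    """
    remaining = SLIP_DAYS_PER_QUARTER
    counted = []
    for hours in hours_late:
        if hours <= 0:
            counted.append(True)   # on time, no slip day needed
        elif hours <= MAX_SLIP_HOURS_PER_HOMEWORK and remaining > 0:
            remaining -= 1         # one slip day covers up to 24 hours
            counted.append(True)
        else:
            counted.append(False)  # too late, or out of slip days: zero
    return counted, remaining


print(apply_slip_days([0, 5, 30, 2, 1, 23]))
# ([True, True, False, True, True, False], 0)
```

Here the third homework is more than 24 hours late, so it cannot be covered, and the sixth homework is late after all three slip days have been used, so it also gets a zero.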
If you have something going on in your life that is impeding your ability to do your classwork on time, please reach out to us as soon as possible so we can work something out.
In addition to providing you with slip days, we will drop your lowest homework and lowest groupwork. This gives you some additional flexibility for unforeseen circumstances.
Students on the waitlist or who join the class late are expected to keep up with the work and submit assignments by the deadlines.
The stated policies will be strictly enforced out of fairness for all students.
You can ask for a regrade on any assignment if you believe that the grader made a mistake. Remember that clarity is a part of your score — if you had the right idea but were unable to clearly communicate it, you may still not deserve full credit. We ask that you please submit your regrade requests within one week of the assignment grade being released; you can submit regrade requests directly on Gradescope.
In the unfortunate circumstance that you become sick, suffer a loss, or otherwise experience a significant setback that is outside of your control, you may be eligible for an Incomplete grade, which allows you to complete the rest of the work at a later time. If you are experiencing challenges due to circumstances outside your control, please contact me ASAP and we can discuss the best course of action. Note that an Incomplete does not allow you to re-do work that has already been completed, only to do work that hasn’t been completed, so it’s best to reach out right away.
In this class, we expect that you will work hard, utilize allowed resources to master the course material, and act with integrity. Learning partially remotely presents new challenges for academic integrity, making it more important than ever to act honorably and make sure that the work you are submitting is reflective of your knowledge and abilities.
The UCSD Policy on Integrity of Scholarship and this syllabus list some of the standards by which you are expected to complete your academic work, but your good ethical judgment is also expected. Ignorance of the rules will not excuse you from any violations.
For this class, the following activities, among others, are considered cheating and are not allowed:
- Sharing written homework solutions with other students, or viewing written homework solutions from another student.
- Looking or asking for answers to homework problems in other texts or sources, including the internet.
- Collaborating on exams, checking answers on exams, or communicating with any other person while taking an exam.
- Using unauthorized resources on homeworks or exams, including solutions from past iterations of this course.
The following activities are examples of things that are allowed in this class:
- Discussing homework problems with classmates and the instructional staff.
- Reading about concepts from lecture in outside texts, including the internet, without looking for answers to specific homework questions. If you accidentally find related material in another source, you must cite the source on your homework and write up your answer without consulting the source. To do otherwise is plagiarism.
Remember, Academic Integrity is about doing your part to act with honesty, trust, fairness, respect, responsibility and courage. If you are suspected of dishonest conduct, you will be reported to the Academic Integrity Office. Violations of the academic integrity policy will result in failing the course, and the Dean of your college may place you on academic probation or suspend or dismiss you from UCSD. Academic integrity violations are serious and the risk is not worth it!
The following is adapted from CSE 160 at the University of Washington.
Grading for this class is not curved in the sense that the average is set at (say) a B+ and half of the class must receive a grade lower than that. If everyone does well and shows mastery of the material, everyone can receive an A (this would be awesome!). If no one does well (this is unlikely), then everyone can receive a C.
Grading for this class is curved in the sense that we do not have a pre-defined mapping from homework and exam scores to a final GPA. There is no pre-determined score (e.g., 90% of all possible points) that earns an A or a B or a C or any other grade. To determine the final grade, we will ask questions like “Did this student master the material?”. With that said, grades will not be any stricter than the standard grading scale (where an A+ is a 97+, A is 93+, A- is 90+, etc). For instance, the threshold for an “A” will never be higher than 93%.
Try your best not to worry about grades, and we’ll reciprocate by being fair. We’re in this together 😎.
OSD works with students with documented disabilities to review documentation and determine reasonable accommodations. Disabilities can occur in these areas: psychological, psychiatric, learning, attention, chronic health, physical, vision, hearing, and acquired brain injuries, and may occur at any time during a student’s college career. We encourage you to contact the OSD as soon as you become aware of a condition that is disabling so that we can work with you.
If you already have accommodations via OSD, please make sure that we receive your Authorization for Accommodation (AFA) letter by the end of Week 3 so that we can make arrangements for accommodations. Share your AFA letter with the instructor and the Data Science OSD Liaison, who can be reached at firstname.lastname@example.org.
We are committed to creating an inclusive learning environment in which individual differences are respected and all students feel comfortable. If you have any suggestions as to how we could create a more inclusive setting, please let us know. We also expect that you, as a student in this course, will honor and respect your classmates, abiding by the UCSD Principles of Community. Please understand that others’ backgrounds, perspectives and experiences may be different from your own, and help us to build an environment where everyone is respected and able to thrive.
Thanks to the following other instructors who have made contributions to this course: Justin Eldridge, Suraj Rampure, Yian Ma, Gal Mishne, and Giorgio Quer. Thanks also to the tutors and TAs who have supported this course!