17-445: Introduction and Motivation

Instructors

Eunsuk Kang, Christian Kaestner, Vaithyanathan Narayanan

< brief introductions >

Communication

Email us, preferably at se-ai@lists.andrew.cmu.edu

Announcements through Canvas

No fixed office hours, but will stick around after lecture and recitation. Email us for extra meetings.

You are welcome to ask questions publicly on Canvas.

Materials on GitHub. Pull requests encouraged!

Software engineering class

Focused on engineering judgment
Arguments, tradeoffs, and justification, rather than single correct answer
"it depends..."
Practical engagement, building systems, testing, automation
Strong teamwork component
Not focused on formal guarantees or machine learning fundamentals (modeling, statistics)

Prerequisites

Some machine-learning experience required

Basic understanding of the data science process, including data cleaning, feature engineering, learning
High-level understanding of machine-learning approaches
- supervised learning
- regression, decision trees, neural networks
- accuracy, recall, precision, ROC curve
Ideally some experience with notebooks and sklearn or other frameworks

No software-engineering knowledge required

Basic programming skills will be useful
Teamwork experience in a product team is useful but not required
No required exposure to requirements, software testing, software design, continuous integration, containers, process management, etc.
- if you are familiar with these, there will be some redundancy -- sorry

Active lecture

Case study driven
Discussion highly encouraged
Contribute own experience
Regular active in-class exercises
In-class presentation
Discussions over definitions

Textbook

Building Intelligent Systems: A Guide to Machine Learning Engineering

by Geoff Hulten

https://www.buildingintelligentsystems.com/

Most chapters assigned at some point in the semester

Supplemented with research articles, blog posts, videos, podcasts, ...

Electronic version in the library

Readings and Quizzes

Reading assignments for most lectures
- Preparing in-class discussions
- Background material, case descriptions, possibly also podcasts, videos, Wikipedia
- Complement with own research
Short and easy online quizzes on readings, with partner, due before start of lecture
Planned effort: about 30-45 min for reading, 15 min for discussing and answering the quiz

Assignments

Series of 6 small to medium-sized individual assignments (mostly in first half)
- engage with practical challenges
- analyze risks, fairness
- reason about tradeoffs and justify your decisions
- mostly written reports, a little modeling, limited coding
- Pandemic option: may be done with partner
Large team project with 4 milestones (mostly in second half)
- Build and deploy prediction service
- Testing in production
- Monitoring
- Final presentation
Usually due Wednesday night, see schedule.

Recitations

Typically hands-on exercises: use tools, analyze cases

Often designed to prepare for assignments

First recitation on Friday: remote work and collaboration + Git

Grading

40% individual assignments
30% group project with final presentation
10% midterm
10% participation
10% reading quizzes
no final exam
expected grade cutoffs: 81-90% B, 91-100% A
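
As a rough illustration of how the weights above combine, here is a minimal sketch. The component names and the example scores are made up for illustration; only the percentages come from the list above, and the sketch deliberately stops short of mapping the result to a letter grade.

```python
# Weights in percent, taken from the grading breakdown above.
# The dictionary keys are hypothetical names chosen for this sketch.
WEIGHTS = {
    "individual_assignments": 40,
    "group_project": 30,
    "midterm": 10,
    "participation": 10,
    "reading_quizzes": 10,
}


def course_score(scores):
    """Return the weighted course score (0-100).

    `scores` maps each component name to a percentage between 0 and 100.
    """
    assert sum(WEIGHTS.values()) == 100, "weights must add up to 100%"
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical student, for illustration only.
example = {
    "individual_assignments": 90,
    "group_project": 85,
    "midterm": 80,
    "participation": 100,
    "reading_quizzes": 95,
}
print(course_score(example))  # 89.0
```

Because the individual assignments carry the largest weight, a few points there move the total more than the same number of points on the midterm.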

Grading Philosophy

Specification grading, grounded in adult learning theory
Giving you choices in what to work on or how to prioritize your work
We are making every effort to be clear about expectations (specifications)
Assignments broken down into expectations with point values, each graded pass/fail
You should be able to tell what grade you will get for an assignment when you submit it, depending on what work you chose to do

[Example]

Participation

Participation is important
- Participation in in-class discussions
- Active participation in recitations
- Alternative arrangements if you cannot attend classes live
Participation != Attendance
Grading:
- 100%: Participates at least once in most lectures through chat or audio, or
- 100%: Participates in 25% of lectures and actively contributes to discussions in most recitations
- 90%: Participates at least once in over half of the lectures
- 70%: Participates at least once in 25% of the lectures
- 40%: Participates at least once in at least 3 lectures or recitations.
- 0%: No participation in the entire semester.

Flexibility and Accommodations

(details in syllabus)

7 tokens per student:
- Submit individual assignment 1 day late for 1 token (after running out of tokens, 15% penalty per late day)
- Redo individual assignment for 3 tokens
- Resubmit or submit reading quiz late for 1 token
- Remaining tokens count toward participation
7 tokens per team:
- Submit milestone 1 day late for 1 token (no late submissions accepted when out of tokens)
- Redo milestone for 3 tokens
Exceptions and accommodations on request, email us.
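
The late-day rule for individual assignments can be sketched as follows. This is an illustration under stated assumptions, not the official policy (the syllabus has the details): it assumes available tokens are spent first, one per late day, that each remaining late day adds a flat 15 percentage points of penalty, and that the penalty is capped at 100%. The function name is hypothetical.

```python
LATE_PENALTY_PER_DAY = 15  # percent per late day once tokens run out (from the slide)


def individual_late_cost(days_late, tokens_left):
    """Return (tokens_spent, penalty_percent) for a late individual assignment.

    Assumptions made for this sketch: tokens are used before any penalty
    applies, penalties add up linearly, and the total is capped at 100%.
    """
    tokens_spent = min(days_late, tokens_left)
    uncovered_days = days_late - tokens_spent
    penalty_percent = min(100, uncovered_days * LATE_PENALTY_PER_DAY)
    return tokens_spent, penalty_percent


print(individual_late_cost(2, 7))  # (2, 0): both late days covered by tokens
print(individual_late_cost(3, 1))  # (1, 30): one token, two penalized days
```

Team milestones work differently: per the list above, no late submissions are accepted once a team is out of tokens, so there is no penalty to compute in that case.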

Teamwork

Teams stay together for project throughout semester, starting next week
Please fill out survey after class
Some advice in lecture + we'll help with debugging team issues
Peer grading on all milestones (based on citizenship on team)

Academic honesty

See web page

In a nutshell: do not copy, do not lie, do not share or publicly release your solutions

In group work, be honest about contributions of team members, do not cover for others

If you feel overwhelmed or stressed, please come and talk to us (see syllabus for other support opportunities)

Class Overview

Aside: AI vs ML

Artificial intelligence is an umbrella term covering symbolic AI (problem solving, reasoning) as well as machine learning (statistical learning from data)
This course focuses mostly on statistical machine learning and supervised learning (extrapolating from data, inductive reasoning)
We will cover symbolic AI (expert systems, probabilistic reasoning, ...) selectively, often for contrast