Eunsuk Kang & Christian Kaestner
If you can hear me, open the participant panel in Zoom and check "yes"
Expect:
Talk to us about accommodations of any kind
Discussions and interactions are important. We'll have regular in-class discussions and exercises
This is hard. We know.
This is a fairly new class in a rapidly evolving field.
Many experiments for online teaching.
We are software engineers.
and Domain specialists + Operators + Business team + Project managers + Designers, UI Experts + Safety, security specialists + Lawyers + Social scientists + ...
Take audio or video files and produce text.
State of the art: Manual transcription, often via Mechanical Turk ($1.50/min)
PhD research on domain-specific speech recognition that can detect technical jargon
DNN trained on public PBS interviews + transfer learning on smaller manually annotated domain-specific corpus
Research has shown amazing accuracy for talks in medicine, poverty and inequality research, and talks at Ruby programming conferences; published at top conferences
Idea: Let's commercialize the software and sell to academics and conference organizers
17-445/17-645, Fall 2020, 12 units
Mondays/Wednesdays 1:30-2:50pm EDT, here on Zoom
Recitation Fridays 9:50-11:10am EDT, on Zoom
Eunsuk Kang, Christian Kaestner, Vaithyanathan Narayanan
< brief introductions >
Email us, preferably at se-ai@lists.andrew.cmu.edu
Announcements through Canvas
No fixed office hours, but will stick around after lecture and recitation. Email us for extra meetings.
Welcome to ask questions publicly on Canvas.
Materials on GitHub. Pull requests encouraged!
Some machine-learning experience required
No software-engineering knowledge required
Building Intelligent Systems: A Guide to Machine Learning Engineering
by Geoff Hulten
https://www.buildingintelligentsystems.com/
Most chapters assigned at some point in the semester
Supplemented with research articles, blog posts, videos, podcasts, ...
Electronic version in the library
Series of 6 small to medium-sized individual assignments (mostly in first half)
Large team project with 4 milestones (mostly in second half)
Usually due Wednesday night, see schedule.
Typically hands-on exercises: use tools, analyze cases
Often designed to prepare for assignments
First recitation on Friday: remote work and collaboration + Git
Participation is important
Participation != Attendance
Grading:
(details in syllabus)
7 tokens per student:
7 tokens per team:
Exceptions and accommodations on request, email us.
See web page
In a nutshell: do not copy, do not lie, do not share or publicly release your solutions
In group work, be honest about contributions of team members, do not cover for others
If you feel overwhelmed or stressed, please come and talk to us (see syllabus for other support opportunities)
Let's go around the "room" for introductions:
Algorithms.shortestDistance(g, "Tom", "Anne");
> ArrayIndexOutOfBoundsException
Algorithms.shortestDistance(g, "Tom", "Anne");
> -1
class Algorithms {
    /**
     * This method finds the shortest distance between two
     * vertices. It returns -1 if the two nodes are not
     * connected.
     */
    int shortestDistance(…) {…}
}
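One way to satisfy the contract above can be sketched as follows. This is only an illustration, not the course's implementation: it assumes an unweighted graph stored as an adjacency map, measures distance in edges, and treats vertices missing from the map as isolated. The class name `ShortestDistanceSketch` and the parameter types are hypothetical, since the slide elides the real signature.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

class ShortestDistanceSketch {
    /**
     * Returns the number of edges on a shortest path from source to target,
     * or -1 if the two vertices are not connected.
     * Assumption: the graph is unweighted and given as an adjacency map.
     */
    static int shortestDistance(Map<String, List<String>> graph,
                                String source, String target) {
        Map<String, Integer> distance = new HashMap<>();
        Queue<String> frontier = new ArrayDeque<>();
        distance.put(source, 0);
        frontier.add(source);

        // Breadth-first search: vertices are visited in order of distance.
        while (!frontier.isEmpty()) {
            String current = frontier.remove();
            if (current.equals(target)) {
                return distance.get(current);
            }
            for (String neighbor : graph.getOrDefault(current, List.of())) {
                if (!distance.containsKey(neighbor)) {
                    distance.put(neighbor, distance.get(current) + 1);
                    frontier.add(neighbor);
                }
            }
        }
        // The documented behavior for unconnected vertices.
        return -1;
    }
}
```

With this contract a caller can rely on -1 as a signal and never has to catch an exception for unconnected vertices.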
class Algorithms {
    /**
     * This method finds the shortest distance between two
     * vertices. Method is only supported
     * for connected vertices.
     */
    int shortestDistance(…) {…}
}
/*@ requires amount >= 0;
ensures balance == \old(balance)-amount &&
\result == balance;
@*/
public int debit(int amount) {
...
}
(JML specification in Java, pre- and postconditions)
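Without a JML toolchain, the same pre- and postconditions can be approximated with runtime checks. The sketch below is illustrative only: the class name `Account`, its constructor, and the choice of exception types are assumptions, and integer overflow is ignored.

```java
class Account {
    private int balance;

    Account(int initialBalance) {
        this.balance = initialBalance;
    }

    /** Debits the account and returns the new balance. */
    int debit(int amount) {
        // Precondition from the specification: amount >= 0.
        if (amount < 0) {
            throw new IllegalArgumentException("precondition violated: amount >= 0");
        }
        int oldBalance = balance; // plays the role of \old(balance)

        balance = balance - amount;
        int result = balance;

        // Postcondition: balance == \old(balance) - amount && \result == balance.
        if (balance != oldBalance - amount || result != balance) {
            throw new IllegalStateException("postcondition violated");
        }
        return result;
    }
}
```

The checks only detect violations on the executions that actually run; a JML verifier aims to establish the contract for all inputs.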
/**
* Calls the <code>read(byte[], int, int)</code> overloaded [...]
* @param buf The buffer to read bytes into
* @return The value returned from <code>in.read(byte[], int, int)</code>
* @exception IOException If an error occurs
*/
public int read(byte[] buf) throws IOException
{
return read(buf, 0, buf.length);
}
(textual specification with JavaDoc)
Math.sqrt(-5);
> 0
/**
????
*/
String transcribe(File audioFile);
/**
????
*/
List<Product> suggestedPurchases(List<Product> pastPurchases);
/**
????
*/
Boolean predictRecidivism(int age,
List<Crime> priors,
Gender gender,
int timeServed,
...);
(Daniel Miessler, CC SA 2.0)
From deductive reasoning to inductive reasoning...
From clear specifications to goals...
From guarantees to best effort...
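To make the shift from specifications to goals concrete: instead of proving that a component such as `transcribe` is correct, one typically measures how often it agrees with labeled examples and compares that to a target. The sketch below is a hypothetical illustration; the names `GoalCheckSketch`, `Example`, `accuracy`, and `meetsGoal`, the use of exact string equality, and any threshold value are assumptions, not part of the course material.

```java
import java.util.List;
import java.util.function.Function;

class GoalCheckSketch {
    /** A labeled example: an input and the output a human considers correct. */
    static final class Example {
        final String input;
        final String expected;

        Example(String input, String expected) {
            this.input = input;
            this.expected = expected;
        }
    }

    /** Fraction of examples for which the model output equals the expected output. */
    static double accuracy(Function<String, String> model, List<Example> examples) {
        if (examples.isEmpty()) {
            throw new IllegalArgumentException("need at least one example");
        }
        long correct = examples.stream()
                .filter(e -> model.apply(e.input).equals(e.expected))
                .count();
        return (double) correct / examples.size();
    }

    /** A goal, not a guarantee: the model may still be wrong on any single input. */
    static boolean meetsGoal(Function<String, String> model,
                             List<Example> examples, double targetAccuracy) {
        return accuracy(model, examples) >= targetAccuracy;
    }
}
```

A check like this says nothing about any individual prediction, which is why the questions about correctness and safety arise.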
What does this mean for software engineering?
For decomposing software systems?
For correctness of AI-enabled systems?
For safety?
For design, implementation, testing, deployment, operations?
While it is possible to formally specify programs and prove them correct, this is rarely ever done.
In practice, specifications are often textual, local, weak, vague, or ambiguous, if they exist at all. Some informal requirements and some tests might be the only specifications available.
Software engineers have long developed methods to deal with uncertainty, missing specifications, and unreliable components.
AI may raise the stakes, but the problem and solutions are not entirely new.
Survey helps us to form teams (link on Canvas)