Christian Kaestner
Required reading: Eric Breck, Shanqing Cai, Eric Nielsen, Michael Salib, D. Sculley. The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction. Proceedings of IEEE Big Data (2017)
Danger of "silent" mistakes in many phases
Source: Eric Breck, Shanqing Cai, Eric Nielsen, Michael Salib, D. Sculley. The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction. Proceedings of IEEE Big Data (2017)
How to test for these? How automatable?
class Algorithms {
/**
* This method finds the shortest distance between two
* vertices. It returns -1 if the two nodes are not
* connected.
*/
int shortestDistance(…) {…}
}
class Algorithms {
/**
* This method finds the shortest distance between two
* vertices. Method is only supported
* for connected vertices.
*/
int shortestDistance(…) {…}
}
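The two Javadoc variants above promise different behavior for unconnected vertices, and a test can only check what the specification promises. A minimal sketch, assuming the first variant (return -1 when unconnected) and a hypothetical graph stored as an adjacency map with unit-weight directed edges. The class name `ShortestDistanceSketch` and the breadth-first search are illustrative and not from the original:

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

class ShortestDistanceSketch {
    /**
     * Returns the number of edges on a shortest path from source to target,
     * or -1 if target cannot be reached (first specification variant).
     * The adjacency-map graph type is an assumption made for this sketch.
     */
    static int shortestDistance(Map<String, List<String>> graph, String source, String target) {
        Map<String, Integer> distance = new HashMap<>();
        Queue<String> frontier = new ArrayDeque<>();
        distance.put(source, 0);
        frontier.add(source);
        while (!frontier.isEmpty()) {
            String current = frontier.remove();
            if (current.equals(target)) {
                return distance.get(current);
            }
            for (String next : graph.getOrDefault(current, List.of())) {
                if (!distance.containsKey(next)) {
                    distance.put(next, distance.get(current) + 1);
                    frontier.add(next);
                }
            }
        }
        return -1; // documented behavior for unconnected vertices
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = Map.of(
                "A", List.of("B"),
                "B", List.of("C"),
                "D", List.of());
        System.out.println(shortestDistance(graph, "A", "C")); // 2
        System.out.println(shortestDistance(graph, "A", "D")); // -1
    }
}
```

Under the second variant, the call with the unreachable vertex "D" would be outside the specification, so a test asserting -1 for it would be testing unspecified behavior.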
import org.junit.Test;
import static org.junit.Assert.assertEquals;
public class AdjacencyListTest {
@Test
public void testSanityTest(){
// set up
Graph g1 = new AdjacencyListGraph(10);
Vertex s1 = new Vertex("A");
Vertex s2 = new Vertex("B");
// check expected results (oracle)
assertEquals(true, g1.addVertex(s1));
assertEquals(true, g1.addVertex(s2));
assertEquals(true, g1.addEdge(s1, s2));
assertEquals(s2, g1.getNeighbors(s1)[0]);
}
// use abstraction, e.g. common setups
private int helperMethod…
}
testShouldReturnLargestNumberInArrayWithoutRepeats()
A Stack
- should pop values in last-in-first-out-order
+ Given a non-empty stack
+ When pop is invoked on the stack
+ Then the most recently pushed element should be returned
+ And the stack should have one less item than before
- should throw NoSuchElementException if an empty stack is popped
+ Given an empty stack
+ When pop is invoked on the stack
+ Then NoSuchElementException should be thrown
+ And the stack should still be empty
import org.scalatest.FunSpec
import scala.collection.mutable.Stack
class StackSpec extends FunSpec {
describe("A Stack") {
it("should pop values in last-in-first-out order") {
val stack = new Stack[Int]
stack.push(1); stack.push(2)
assert(stack.pop() === 2); assert(stack.pop() === 1)
}
it("should throw NoSuchElementException if an empty stack is popped") {
val emptyStack = new Stack[String]
intercept[NoSuchElementException] {
emptyStack.pop()
}
}
}
}
How to avoid?
DataTable getData(Stream stream, DataCleaner cleaner) { ... }
@Test void test() {
Stream stream = openKafkaStream(...);
DataTable output = getData(stream, new DefaultCleaner());
assert(output.length==10);
}
DataTable getData(Stream stream, DataCleaner cleaner) { ... }
@Test void test() {
Stream testStream = new Stream() {
int idx = 0;
// hardcoded or read from test file
String[] data = { ... };
public void connect() { }
public String getNext() {
return data[idx++]; // return the current row, then advance
}
};
DataTable output = getData(testStream, new DefaultCleaner());
assert(output.length==10);
}
DataTable getData(KafkaStream stream, DataCleaner cleaner) { ... }
@Test void test() {
DataCleaner dummyCleaner = new DataCleaner() {
boolean isValid(String row) { return true; }
...
};
DataTable output = getData(testStream, dummyCleaner);
assert(output.length==10);
}
DataTable getData(KafkaStream stream, DataCleaner cleaner) { ... }
@Test void test() {
DataCleaner dummyCleaner = new DataCleaner() {
int counter = 0;
boolean isValid(String row) {
counter++;
return counter!=3;
}
...
};
DataTable output = getData(testStream, dummyCleaner);
assert(output.length==9);
}
Mocking frameworks provide infrastructure for expressing such tests compactly.
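Mocking frameworks such as Mockito generate test doubles like the hand-written ones above at runtime. As a rough illustration of the underlying idea only (this is not any framework's actual API), the sketch below uses the JDK's `java.lang.reflect.Proxy` to build a `DataCleaner` stub that rejects the n-th row and records how often it was called. The single-method `DataCleaner` interface, the `StubFactory` helper, and the `countValidRows` function are assumptions made for this example:

```java
import java.lang.reflect.Proxy;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical interface, modeled on the DataCleaner used in the tests above.
interface DataCleaner {
    boolean isValid(String row);
}

class StubFactory {
    /** Builds a DataCleaner that rejects only the n-th row and counts calls. */
    static DataCleaner rejectNth(int n, AtomicInteger calls) {
        return (DataCleaner) Proxy.newProxyInstance(
                DataCleaner.class.getClassLoader(),
                new Class<?>[] { DataCleaner.class },
                (proxy, method, args) -> {
                    if (method.getName().equals("isValid")) {
                        return calls.incrementAndGet() != n;
                    }
                    // Any other method is outside the scope of this sketch.
                    throw new UnsupportedOperationException(method.getName());
                });
    }
}

class MockSketch {
    /** Hypothetical code under test: counts the rows the cleaner accepts. */
    static int countValidRows(List<String> rows, DataCleaner cleaner) {
        int valid = 0;
        for (String row : rows) {
            if (cleaner.isValid(row)) {
                valid++;
            }
        }
        return valid;
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        DataCleaner stub = StubFactory.rejectNth(3, calls);
        List<String> rows = List.of("r1", "r2", "r3", "r4");
        System.out.println(countValidRows(rows, stub)); // 3
        System.out.println(calls.get());                // 4
    }
}
```

The call counter is what turns the stub into a simple mock: the test can check not only the result but also how the code under test interacted with its dependency.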
Think like an attacker: The tester’s goal is to find bugs!
Black-box
White-box
Test larger units of behavior
Often based on use cases or user stories -- customer perspective
@Test void gameTest() {
Poker game = new Poker();
Player p = new Player();
Player q = new Player();
game.shuffle(seed);
game.add(p);
game.add(q);
game.deal();
p.bet(100);
q.bet(100);
p.call();
q.fold();
assert(game.winner() == p);
}
Track quality indicators over time, e.g.,
Known as emergent properties and feature interactions
Failure in compositionality: Components are developed and tested independently, but they are not fully independent
Detection and resolution challenging:
Recommended reading: Nhlabatsi, Armstrong, Robin Laney, and Bashar Nuseibeh. Feature interaction: The security threat from within software systems. Progress in Informatics 5 (2008): 75-89.
Examples?
More in a later lecture
@Test void test() {
DataTable data = new DataTable();
try {
Model m = learn(data);
Assert.fail();
} catch (NoDataException e) { /* correctly thrown */ }
}
DataTable getData(Stream stream, DataCleaner cleaner) { ... }
@Test void test() {
Stream testStream = new Stream() {
public void connect() {
throw new IOException("cannot establish connection");
}
public String getNext() {
throw new IOException("connection dropped");
}
};
try {
DataTable output = getData(testStream, new DefaultCleaner());
Assert.fail();
} catch (DataTableException e) { /* correctly handled */ }
}
@Test void test() {
Stream testStream = new Stream() {
int idx = 0;
public void connect() {
if (++idx < 3)
throw new IOException("cannot establish connection");
}
public String getNext() { ... }
};
DataLoader loader = new DataLoader(testStream, new DefaultCleaner());
ModelBuilder model = new ModelBuilder(loader, ...);
// assume all exceptions are handled correctly internally
assert(model.accuracy > .91);
}
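The last test above assumes that the loader recovers internally from two failed connection attempts. A self-contained sketch of that idea, in which a hypothetical `Connection` interface stands in for `Stream` and a `connectWithRetry` helper stands in for the elided `DataLoader` internals (neither is from the original):

```java
import java.io.IOException;

// Hypothetical stand-in for the Stream interface used above.
interface Connection {
    void connect() throws IOException;
}

class RetrySketch {
    /**
     * Tries to connect up to maxAttempts times and returns the number of
     * attempts used; rethrows the last IOException if every attempt fails.
     */
    static int connectWithRetry(Connection connection, int maxAttempts) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                connection.connect();
                return attempt;
            } catch (IOException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }

    /** Test double that fails the first `failures` connection attempts. */
    static Connection failingFirst(int failures) {
        return new Connection() {
            int calls = 0;
            public void connect() throws IOException {
                if (++calls <= failures) {
                    throw new IOException("cannot establish connection");
                }
            }
        };
    }

    public static void main(String[] args) throws IOException {
        // Two injected failures, then success on the third attempt.
        System.out.println(connectWithRetry(failingFirst(2), 5)); // 3
        try {
            connectWithRetry(failingFirst(2), 2);
            throw new AssertionError("expected IOException");
        } catch (IOException e) {
            System.out.println("gave up: " + e.getMessage()); // gave up: cannot establish connection
        }
    }
}
```

Injecting the failure through a test double keeps the test deterministic: the same two failures occur on every run, which a real flaky network cannot guarantee.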
class MyNotificationService extends NotificationService {
public boolean receivedNotification = false;
public void sendNotification(String msg) { receivedNotification = true; }
}
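@Test void test() throws InterruptedException {
Server s = getServer();
MyNotificationService n = new MyNotificationService();
Monitor m = new Monitor(s, n);
s.stop();
s.request();
s.request();
Thread.sleep(1000); // give the monitor time to notice the outage; the duration is an assumed placeholder
assert(n.receivedNotification);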
}
Chaos Monkey: randomly disable production instances
Latency Monkey: induces artificial delays in our RESTful client-server communication layer
Conformity Monkey: finds instances that don’t adhere to best-practices and shuts them down
Doctor Monkey: monitors other external signs of health to detect unhealthy instances
Janitor Monkey: ensures that our cloud environment is running free of clutter and waste
Security Monkey: finds security violations or vulnerabilities, and terminates the offending instances
10–18 Monkey: detects problems in instances serving customers in multiple geographic regions
Chaos Gorilla is similar to Chaos Monkey, but simulates an outage of an entire Amazon availability zone.
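The common idea behind these tools is to inject a failure deliberately and then check that the system still behaves acceptably. A toy, in-memory sketch of one Chaos Monkey style experiment follows. The `Cluster` class, its round-robin routing, and the instance names are hypothetical, and real tools act on production infrastructure rather than on a simulation:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical in-memory cluster that routes requests to healthy instances. */
class Cluster {
    private final List<String> healthy = new ArrayList<>();
    private int next = 0;

    Cluster(List<String> instances) {
        healthy.addAll(instances);
    }

    /** Chaos action: disable one randomly chosen instance and return its name. */
    String disableRandomInstance(Random random) {
        return healthy.remove(random.nextInt(healthy.size()));
    }

    /** Serves a request from the next healthy instance, or fails if none is left. */
    String handleRequest() {
        if (healthy.isEmpty()) {
            throw new IllegalStateException("no healthy instance");
        }
        String instance = healthy.get(next % healthy.size());
        next++;
        return instance;
    }
}

class ChaosSketch {
    /** Steady-state hypothesis: every one of `requests` requests is served. */
    static boolean steadyState(Cluster cluster, int requests) {
        try {
            for (int i = 0; i < requests; i++) {
                cluster.handleRequest();
            }
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Cluster cluster = new Cluster(List.of("i-1", "i-2", "i-3"));
        Random random = new Random(42); // fixed seed keeps the experiment repeatable
        System.out.println("before: " + steadyState(cluster, 10));
        String victim = cluster.disableRandomInstance(random);
        System.out.println("disabled " + victim + ", after: " + steadyState(cluster, 10));
    }
}
```

The experiment has the same shape as the tools above: state a steady-state hypothesis, inject a failure, and check whether the hypothesis still holds.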
{
"version": "1.0.0",
"title": "What is the impact of an expired certificate on our application chain?",
"description": "If a certificate expires, we should gracefully deal with the issue.",
"tags": ["tls"],
"steady-state-hypothesis": {
"title": "Application responds",
"probes": [
{
"type": "probe",
"name": "the-astre-service-must-be-running",
"tolerance": true,
"provider": {
"type": "python",
"module": "os.path",
"func": "exists",
"arguments": {
"path": "astre.pid"
}
}
},
{
"type": "probe",
"name": "the-sunset-service-must-be-running",
"tolerance": true,
"provider": {
"type": "python",
"module": "os.path",
"func": "exists",
"arguments": {
"path": "sunset.pid"
}
}
},
{
"type": "probe",
"name": "we-can-request-sunset",
"tolerance": 200,
"provider": {
"type": "http",
"timeout": 3,
"verify_tls": false,
"url": "https://localhost:8443/city/Paris"
}
}
]
},
"method": [
{
"type": "action",
"name": "swap-to-expired-cert",
"provider": {
"type": "process",
"path": "cp",
"arguments": "expired-cert.pem cert.pem"
}
},
{
"type": "probe",
"name": "read-tls-cert-expiry-date",
"provider": {
"type": "process",
"path": "openssl",
"arguments": "x509 -enddate -noout -in cert.pem"
}
},
{
"type": "action",
"name": "restart-astre-service-to-pick-up-certificate",
"provider": {
"type": "process",
"path": "pkill",
"arguments": "--echo -HUP -F astre.pid"
}
},
{
"type": "action",
"name": "restart-sunset-service-to-pick-up-certificate",
"provider": {
"type": "process",
"path": "pkill",
"arguments": "--echo -HUP -F sunset.pid"
},
"pauses": {
"after": 1
}
}
],
"rollbacks": [
{
"type": "action",
"name": "swap-to-valid-cert",
"provider": {
"type": "process",
"path": "cp",
"arguments": "valid-cert.pem cert.pem"
}
},
{
"ref": "restart-astre-service-to-pick-up-certificate"
},
{
"ref": "restart-sunset-service-to-pick-up-certificate"
}
]
}