Eunsuk Kang
Required Reading: Practical Solutions for Machine Learning Safety in Autonomous Vehicles. S. Mohseni et al., SafeAI Workshop@AAAI (2020).
Understanding Machine Learning, Bhogavalli (2019)
Model Inversion Attacks that Exploit Confidence Information and Basic Countermeasures, M. Fredrikson et al. in CCS (2015).
Biscotti: A Ledger for Private and Secure Peer-to-Peer Machine Learning, M. Shayan et al., arXiv:1811.09904 (2018).
More miles tested => safer?
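Mileage alone is weak evidence of safety. A back-of-the-envelope sketch (the human fatality rate of ~1.1 per 100 million miles and the statistical "rule of three" for a 95% upper confidence bound after zero failures are assumed inputs, not from the slides) shows how many failure-free test miles would be needed just to match human drivers:

```python
# Rough estimate of failure-free test miles needed to claim, with ~95%
# confidence, a fatality rate no worse than human drivers.
# "Rule of three": zero failures in n trials gives an approximate 95%
# upper confidence bound on the failure rate of 3 / n.

human_fatality_rate = 1.1e-8   # ~1.1 fatalities per 100 million miles (assumed)

miles_needed = 3 / human_fatality_rate
print(f"{miles_needed:.2e} failure-free miles")  # on the order of 10^8 miles
```

Hundreds of millions of failure-free miles, before accounting for software updates that invalidate earlier testing, which is why mileage-based arguments alone do not settle the safety question.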
(system level!)
Software is not unsafe on its own; the control signals it sends to actuators in the physical world may be
The root of unsafety usually lies in wrong requirements & incorrect environmental assumptions
| Component | Failure Mode | Failure Effects | Detection | Mitigation |
|---|---|---|---|---|
| Perception | ? | ? | ? | ? |
| Perception | ? | ? | ? | ? |
| Lidar Sensor | Mechanical failure | Inability to detect objects | Monitor | Switch to manual control mode |
| ... | ... | ... | ... | ... |
| Component | Failure Mode | Failure Effects | Detection | Mitigation |
|---|---|---|---|---|
| Perception | Failure to detect an object | Risk of collision | Human operator (if present) | Deploy secondary classifier |
| Perception | Detected but misclassified | " | " | " |
| Lidar Sensor | Mechanical failure | Inability to detect objects | Monitor | Switch to manual control mode |
| ... | ... | ... | ... | ... |
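The detection and mitigation columns of an FMEA table can be made operational at runtime. A minimal sketch, assuming a hypothetical lidar health report and an invented point-count threshold, of the lidar row above (mechanical failure detected by a monitor, mitigated by switching to manual control):

```python
from dataclasses import dataclass

@dataclass
class LidarStatus:
    """Hypothetical health report from a lidar driver."""
    spinning: bool          # is the motor still rotating?
    points_per_scan: int    # returns in the most recent sweep

MIN_POINTS = 1000  # assumed threshold below which a scan is unusable

def lidar_failed(status: LidarStatus) -> bool:
    """Detection: mechanical failure or a near-empty point cloud."""
    return not status.spinning or status.points_per_scan < MIN_POINTS

def select_mode(status: LidarStatus) -> str:
    """Mitigation from the FMEA row: fall back to manual control."""
    return "manual" if lidar_failed(status) else "autonomous"

print(select_mode(LidarStatus(spinning=False, points_per_scan=0)))     # manual
print(select_mode(LidarStatus(spinning=True, points_per_scan=50000)))  # autonomous
```

The point of the exercise is that each FMEA row should translate into a concrete detection check and a concrete fallback behavior, not remain a paper artifact.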
Image: An abstract domain for certifying neural networks. Gagandeep et al., POPL (2019).
Image: David Silver. Adversarial Traffic Signs. Blog post, 2017
Image: Automated driving recognition technologies for adverse weather conditions. Yoneda et al., IATSS Research (2019).
How do we demonstrate to a third-party that our system is safe?
Build a safety case to argue that your movie recommendation system provides at least 80% availability. Include evidence to support your argument.
(See Mitigation Strategies from the Lecture on Risks)
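One concrete piece of evidence for such a safety case is measured availability from operational logs. A minimal sketch (the observation window and outage durations are made up for illustration):

```python
# Evidence sketch: compute availability from logged outage durations
# over an observation window, then check it against the 80% claim.

observation_hours = 24 * 30          # one month of operation (assumed)
downtime_hours = [2.5, 0.5, 6.0]     # hypothetical outage durations

availability = 1 - sum(downtime_hours) / observation_hours
print(f"availability = {availability:.1%}")
print("claim supported" if availability >= 0.80 else "claim not supported")
```

In a safety case, this measurement would be one leaf of evidence under a sub-claim (e.g., "observed availability meets the target"), alongside arguments about monitoring coverage and failover.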
Amodei, Dario, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman, and Dan Mané. "Concrete problems in AI safety." arXiv preprint arXiv:1606.06565 (2016).
PlayFun algorithm pauses the game of Tetris indefinitely to avoid losing
When about to lose a hockey game, the PlayFun algorithm exploits a bug to make one of the players on the opposing team disappear from the map, thus forcing a draw.
Self-driving car rewarded for speed learns to spin in circles
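Reward hacking is easy to reproduce in a toy setting. In the sketch below (an entirely made-up environment, not from the paper), an agent rewarded +1 per timestep of "not having lost yet" discovers that pausing maximizes return without ever playing, mirroring the Tetris example:

```python
# Toy illustration of reward hacking: the reward proxies "survival time",
# and pausing freezes the game clock, so pausing is a degenerate optimum.

def reward(action: str, pieces_left: int) -> int:
    if action == "pause":
        return 1            # game frozen: still "not lost"
    # playing: survive (+1) unless the board is about to top out
    return 1 if pieces_left > 0 else 0

def greedy_policy(pieces_left: int) -> str:
    # Pick the action with the highest immediate reward (ties favor "pause").
    return max(["pause", "play"], key=lambda a: reward(a, pieces_left))

# The agent pauses even when it could still play safely:
print(greedy_policy(pieces_left=10))  # pause
```

The proxy reward, not the agent, is at fault: "time survived" is not the same objective as "play Tetris well", and optimization reliably finds the gap.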
Amodei, Dario, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman, and Dan Mané. "Concrete problems in AI safety." arXiv preprint arXiv:1606.06565 (2016).
Examples?
Look at apps on your phone. Which apps have a safety risk and use machine learning?
Consider safety broadly, including stress, mental health, discrimination, and environmental pollution