(Part 2: Deep Learning, Symbolic AI)
Christian Kaestner
Required Reading: 🕮 Géron, Aurélien. "Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow", 2nd Edition (2019), Ch 1.
Recommended Readings: 🕮 Géron, Aurélien. "Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow", 2nd Edition (2019), Ch 10 ("Introduction to Artificial Neural Networks with Keras"), 🕮 Flasiński, Mariusz. "Introduction to Artificial Intelligence." Springer (2016), Chapter 1 ("History of Artificial Intelligence") and Chapter 2 ("Symbolic Artificial Intelligence"), 🕮 Pfeffer, Avi. "Practical Probabilistic Programming." Manning (2016), Chapter 1 or 🎬 Kevin Smith's recorded tutorial on Probabilistic Programming
Artificial Intelligence:
computers acting humanly / thinking humanly / thinking rationally / acting rationally -- Russell and Norvig, 2003
Machine Learning:
A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E. -- Tom Mitchell, 1997
Deep Learning:
specific learning technique based on neural networks
Russell and Norvig. "Artificial Intelligence: A Modern Approach.", 2003
(zooming out from the last lecture)
For a nontechnical introduction: Pedro Domingos. The Master Algorithm. Basic Books, 2015
Simulating biological neural networks of neurons (nodes) and synapses (connections), popularized in 60s and 70s
Basic building blocks: Artificial neurons, with n inputs and one output; output is activated if at least m inputs are active
(assuming at least two activated inputs needed to activate output)
computing weighted sum of inputs + step function
z = w1x1 + w2x2 + ... + wnxn = x^T w
e.g., step: ϕ(z) = if (z<0) 0 else 1
o1=ϕ(b1+w1,1x1+w1,2x2) o2=ϕ(b2+w2,1x1+w2,2x2) o3=ϕ(b3+w3,1x1+w3,2x2)
fW,b(X)=ϕ(W⋅X+b)
(W and b are parameters of the model)
fWh,bh,Wo,bo(X) = ϕ(Wo⋅ϕ(Wh⋅X+bh)+bo)
(matrix multiplications interleaved with step function)
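The two-layer formula above can be sketched in a few lines of NumPy. The weights below are made up purely for illustration; the step function ϕ is applied elementwise:

```python
import numpy as np

def step(z):
    # step activation: phi(z) = 0 if z < 0 else 1, applied elementwise
    return (z >= 0).astype(float)

def forward(X, Wh, bh, Wo, bo):
    # f(X) = phi(Wo . phi(Wh . X + bh) + bo)
    hidden = step(Wh @ X + bh)
    return step(Wo @ hidden + bo)

# tiny example: 2 inputs, 2 hidden neurons, 1 output (hypothetical weights)
Wh = np.array([[1.0, 1.0], [1.0, -1.0]])
bh = np.array([-1.5, -0.5])
Wo = np.array([[1.0, 1.0]])
bo = np.array([-0.5])

print(forward(np.array([1.0, 1.0]), Wh, bh, Wo, bo))  # [1.]
```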
Intuition:
Learning with gradient descent (backpropagation) works efficiently only for differentiable ϕ -- the step function is not differentiable -- typically the logistic function ϕ(z)=1/(1+exp(−z)) or ReLU: ϕ(z)=max(0,z).
See Chapter 10 in 🕮 Géron, Aurélien. ”Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow”, 2nd Edition (2019) or any other book on deep learning
model = keras.models.Sequential([
keras.layers.Flatten(input_shape=[28, 28]),
keras.layers.Dense(300, activation="relu"),
keras.layers.Dense(100, activation="relu"),
keras.layers.Dense(10, activation="softmax")
])
How many parameters does this model have?
model = keras.models.Sequential([
keras.layers.Flatten(input_shape=[28, 28]),
# 784*300+300 = 235500 parameters
keras.layers.Dense(300, activation="relu"),
# 300*100+100 = 30100 parameters
keras.layers.Dense(100, activation="relu"),
# 100*10+10 = 1010 parameters
keras.layers.Dense(10, activation="softmax")
])
Total of 266,610 parameters in this small example! (At 4 bytes per 32-bit float, that's about 1 MB)
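The per-layer counts above follow from simple arithmetic: each Dense layer has one weight per (input, unit) pair plus one bias per unit. A quick check:

```python
def dense_params(n_inputs, n_units):
    # one weight per (input, unit) pair plus one bias per unit
    return n_inputs * n_units + n_units

# (inputs, units) for the three Dense layers after Flatten(28x28)
layers = [(28 * 28, 300), (300, 100), (100, 10)]
total = sum(dense_params(i, u) for i, u in layers)
print(total)  # 235500 + 30100 + 1010 = 266610
```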
(Aphex34 CC BY-SA 4.0)
(Good Old-Fashioned Artificial Intelligence)
Given a propositional formula over boolean variables, is there an assignment such that the formula evaluates to true?
(a∨b)∧(¬a∨c)∧¬b
Decidable, NP-complete; many effective search heuristics (modern SAT solvers)
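For a formula this small, satisfiability can be checked by brute force over all 2^3 assignments (real SAT solvers use clever search and clause learning instead):

```python
from itertools import product

def formula(a, b, c):
    # (a or b) and (not a or c) and not b
    return (a or b) and ((not a) or c) and (not b)

# brute-force SAT: enumerate all assignments, keep the satisfying ones
sat = [(a, b, c) for a, b, c in product([False, True], repeat=3)
       if formula(a, b, c)]
print(sat)  # [(True, False, True)]: a=true, b=false, c=true satisfies it
```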
Configuration dialog. Some options are mutually exclusive. Some depend on other options.
config HAVE_BOOTMEM_INFO_NODE
def_bool n
# eventually, we can have this option just 'select SPARSEMEM'
config MEMORY_HOTPLUG
bool "Allow for memory hot-add"
depends on SPARSEMEM || X86_64_ACPI_NUMA
depends on ARCH_ENABLE_MEMORY_HOTPLUG
select NUMA_KEEP_MEMINFO if NUMA
config MEMORY_HOTPLUG_SPARSE
def_bool y
depends on SPARSEMEM && MEMORY_HOTPLUG
config MEMORY_HOTPLUG_DEFAULT_ONLINE
bool "Online the newly added memory blocks by default"
depends on MEMORY_HOTPLUG
help
This option sets the default policy setting for memory hotplug
onlining policy (/sys/devices/system/memory/auto_online_blocks) which
determines what happens to newly added memory regions. Policy setting
can always be changed at runtime.
See Documentation/admin-guide/mm/memory-hotplug.rst for more information.
Say Y here if you want all hot-plugged memory blocks to appear in
'online' state by default.
Say N here if you want the default policy to keep all hot-plugged
memory blocks in 'offline' state.
config MEMORY_HOTREMOVE
bool "Allow for memory hot remove"
select MEMORY_ISOLATION
select HAVE_BOOTMEM_INFO_NODE if (X86_64 || PPC64)
depends on MEMORY_HOTPLUG && ARCH_ENABLE_MEMORY_HOTREMOVE
depends on MIGRATION
Describe configuration constraints:
config MEMORY_HOTPLUG
bool "Allow for memory hot-add"
depends on SPARSEMEM || X86_64_ACPI_NUMA
depends on ARCH_ENABLE_MEMORY_HOTPLUG
select NUMA_KEEP_MEMINFO if NUMA
(MEMORY_HOTPLUG => SPARSEMEM || X86_64_ACPI_NUMA)
&&
(MEMORY_HOTPLUG => ARCH_ENABLE_MEMORY_HOTPLUG)
&&
(MEMORY_HOTPLUG && NUMA => NUMA_KEEP_MEMINFO)
...
given configuration constraints ϕ :
Which option can never be selected?
dead(o) = ¬SAT(ϕ∧o)
Which option must always be selected?
mandatory(o) = TAUT(ϕ⇒o)
= ¬SAT(¬(ϕ⇒o))
= ¬SAT(ϕ∧¬o)
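These analyses reduce directly to SAT calls. A minimal sketch using brute-force enumeration in place of a SAT solver -- the three options and the constraint ϕ here are made up for illustration, not taken from the kernel model:

```python
from itertools import product

OPTIONS = ["HOTPLUG", "SPARSEMEM", "DEAD_OPT"]

def phi(cfg):
    # toy constraint: HOTPLUG => SPARSEMEM, and DEAD_OPT is never allowed
    return (not cfg["HOTPLUG"] or cfg["SPARSEMEM"]) and not cfg["DEAD_OPT"]

def sat(pred):
    # brute force: does any assignment satisfy pred?
    return any(pred(dict(zip(OPTIONS, vals)))
               for vals in product([False, True], repeat=len(OPTIONS)))

def dead(o):
    # dead(o) = not SAT(phi and o)
    return not sat(lambda cfg: phi(cfg) and cfg[o])

def mandatory(o):
    # mandatory(o) = not SAT(phi and not o)
    return not sat(lambda cfg: phi(cfg) and not cfg[o])

print(dead("DEAD_OPT"), mandatory("SPARSEMEM"))  # True False
```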
Any options that can never be selected together?
mutuallyExclusive(o, p) = ?
given configuration constraints ϕ and already made selections of a and b
Which other options need to be selected?
mustSelect(o) = ?
Generalization beyond boolean options: numbers, strings, arithmetic, optimization (SMT solvers)
Example: Job Scheduling
Tasks for assembling a car: { t1, t2, t3, t4, t5, t6 }; each value denotes a start time
max 30 min: ∀n. tn < 30
t2 needs to come after t1, and t1 takes 10 min: t1 + 10 ≤ t2
t3 and t4 need to come after t2, which takes 2 min: (t2 + 2 ≤ t3) ∧ (t2 + 2 ≤ t4)
t5 and t6 (5 min each) must not overlap: (t5 + 5 ≤ t6) ∨ (t6 + 5 ≤ t5)
Goal: find valid assignment for all start times, or find valid assignment minimizing the latest start time
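The constraints translate directly into a checker. An SMT solver such as Z3 would search for (or optimize over) satisfying assignments automatically; here we just verify one hand-picked schedule against the constraints above:

```python
def valid(t):
    # t maps task name -> start time in minutes
    return (all(v < 30 for v in t.values())              # max 30 min
            and t["t1"] + 10 <= t["t2"]                  # t1 takes 10 min
            and t["t2"] + 2 <= t["t3"]                   # t2 takes 2 min
            and t["t2"] + 2 <= t["t4"]
            and (t["t5"] + 5 <= t["t6"]                  # t5, t6 don't overlap
                 or t["t6"] + 5 <= t["t5"]))

# one valid assignment (illustrative, not unique)
schedule = {"t1": 0, "t2": 10, "t3": 12, "t4": 12, "t5": 0, "t6": 5}
print(valid(schedule))  # True
```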
(reasoning with probabilities)
🕮 Pfeffer, Avi. "Practical Probabilistic Programming." Manning (2016), Chapter 1
val burglary = Flip(0.01)
val earthquake = Flip(0.0001)
val alarm = CPD(burglary, earthquake,
(false, false) -> Flip(0.001),
(false, true) -> Flip(0.1),
(true, false) -> Flip(0.9),
(true, true) -> Flip(0.99))
val johnCalls = CPD(alarm,
false -> Flip(0.01),
true -> Flip(0.7))
...
println("Probability of burglary: " +
alg.probability(burglary, true))
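What an inference algorithm like Figaro's computes can be approximated by enumerating all worlds of the network (the probabilities below are copied from the model above). Here we assume, as an illustration, that johnCalls has been observed true and compute the posterior:

```python
from itertools import product

P_B, P_E = 0.01, 0.0001                       # priors from the model above
P_ALARM = {(False, False): 0.001, (False, True): 0.1,
           (True, False): 0.9, (True, True): 0.99}
P_JOHN = {False: 0.01, True: 0.7}

def pr(flag, p):
    return p if flag else 1 - p

# enumerate all worlds, weighting each by its probability,
# keeping only worlds consistent with the evidence johnCalls = True
num = den = 0.0
for b, e, a in product([False, True], repeat=3):
    w = pr(b, P_B) * pr(e, P_E) * pr(a, P_ALARM[(b, e)]) * P_JOHN[a]
    den += w
    if b:
        num += w
print(num / den)  # posterior P(burglary | johnCalls), roughly 0.37
```

Observing the call raises the burglary probability from the 0.01 prior to about 0.37 -- the essence of probabilistic inference.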
class Person {
val smokes = Flip(0.6)
}
def smokingInfluence(pair: (Boolean, Boolean)) =
if (pair._1 == pair._2) 3.0; else 1.0
val alice, bob, clara = new Person
val friends = List((alice, bob), (bob, clara))
clara.smokes.observe(true)
for { (p1, p2) <- friends }
^^(p1.smokes, p2.smokes).setConstraint(smokingInfluence)
...
println("Probability of Alice smoking: " +
alg.probability(alice.smokes, true))
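The friends-and-smokers model can likewise be queried by weighted enumeration: each pair of friends contributes factor 3.0 if the two agree and 1.0 otherwise (the smokingInfluence constraint above), and clara is observed smoking:

```python
from itertools import product

P_SMOKES = 0.6  # prior from Flip(0.6)

def weight(alice, bob, clara):
    # prior for each person times the pairwise friendship factors
    w = 1.0
    for s in (alice, bob, clara):
        w *= P_SMOKES if s else 1 - P_SMOKES
    for a, b in [(alice, bob), (bob, clara)]:   # the friends list
        w *= 3.0 if a == b else 1.0
    return w

# clara is observed smoking; sum weights over alice/bob assignments
num = den = 0.0
for alice, bob in product([False, True], repeat=2):
    w = weight(alice, bob, True)
    den += w
    if alice:
        num += w
print(num / den)  # ~0.74: evidence propagates to Alice via the friendship chain
```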
See GitHub p2t2/figaro and many other probabilistic programming languages
Answering queries about probabilistic models
println("Probability of burglary: " +
alg.probability(burglary, true))
println("Probability of Alice smoking: " +
alg.probability(alice.smokes, true))
(CC-BY-SA-3.0 Sevard)
Russell and Norvig. "Artificial Intelligence: A Modern Approach.", 2003
Learn more: