1. Intro
2. Runtime checker (aka. detector/monitor)
3. Importance of runtime checker
4. Current checking practice
5. Complex internals of modern software
6. Common to exhibit gray failures
7. A real-world gray failure
8. Failure root cause
9. Ideal runtime checkers
10. A new approach
11. Panorama: capture in-situ observability
12. Convert a program into in-situ observer
13. Identify observation boundary and identities
14. Extract evidence
15. Example of analysis
16. Detecting real-world gray failures
17. Timeline of detecting failure case f1
18. Latency overhead to observers
19. Program reduction approach
20. Why doing reduction?
21. Identify long-running regions
22. Select checking target candidates
23. Reduce long-running methods
24. Encapsulate checkers
25. Insert watchdog hooks
26. Prevent side effects
27. Watchdog generation
28. Failure detection evaluation setup
29. Detecting real-world failures
30. Silent semantic violations
31. Real-world failure study
32. Oathkeeper: detect silent semantic violation
33. How to express semantics?
34. Oathkeeper workflow
35. Emitting semantic event traces
36. General semantic rule templates
37. Extracted semantic rules
38. Runtime overhead
39. Conclusions
Description:
Explore three systematic techniques for automatically generating effective, customized runtime checkers for large distributed systems in this 40-minute Strange Loop Conference talk. Learn about Panorama's approach to capturing in-situ observability, a program reduction method for identifying long-running regions and inserting watchdog hooks, and Oathkeeper's strategy for detecting silent semantic violations. Discover how these techniques can help detect and localize unexpected subtle failures in complex production environments, improving the reliability and availability of modern distributed systems. Gain insights from real-world failure studies and performance evaluations presented by Ryan Huang, an Assistant Professor at Johns Hopkins University specializing in computer systems research.

Automatic Generation of Runtime Checkers for Production Distributed Systems

Strange Loop Conference