1. Intro
2. This is a true story
3. Active probing system requires explicit and efficient probing
4. Observation vs. inference: from path probing to failures
5. Real-world constraints complicate path selection
6. Device failure detection
7. Link failure inference: an optimization problem
8. Real-world data inconsistency induces false positives
9. Evaluation questions
10. Real cases: spine router gray failure
11. Accuracy comparison with previous algorithms
12. NetBouncer algorithm performance
13. NetBouncer has negligible overheads on the server side
Description:
This presentation covers NetBouncer, an active failure localization system for data center networks. NetBouncer leverages IP-in-IP techniques to detect both device and link failures, which helps keep data center services highly available. The talk explains why accurately localizing failures among millions of servers and network devices is hard, and how NetBouncer's algorithm integrates troubleshooting domain knowledge with machine learning to overcome inconsistencies in real-world data. It also covers the system's deployment in Microsoft Azure's data centers, its performance in detecting spine router gray failures, and its negligible overheads on the server side. Topics include active probing, path selection, device failure detection, and link failure inference.

NetBouncer - Active Device and Link Failure Localization in Data Center Networks

USENIX