Explore a conference talk on scaling machine learning models to improve SRE and security efficacy. Learn how millions of ML models operating on petabytes of operational and user data can strengthen zero trust security frameworks and infrastructure diagnosis. Discover the challenges of implementing machine learning and anomaly detection on Kubernetes nodes and an Envoy-based service mesh, along with the solutions. Gain insights into collecting data from hundreds of thousands of nodes, handling a high cardinality of models, and distributing inference models to K8s nodes. Understand how open-source technologies such as Kubernetes, Prometheus, Cortex, Apache Spark, and Apache Arrow fit together in a production deployment. Delve into the architecture, infrastructure scaling, and Databricks integration. Examine code snippets and practical applications of the Group pandas API and Apache Arrow.
Scaling to Millions of ML Models to Solve the Problems of SRE and Security
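
To make the "high cardinality of models" idea concrete, here is a minimal sketch of training one small anomaly-detection model per time series. It is not the speakers' implementation: the z-score model, the function names (`fit_models`, `is_anomaly`), the `(node, metric, value)` row layout, and the sample data are all assumptions chosen for illustration. Plain Python stands in for the Spark job; in PySpark a similar per-group pattern is typically expressed with `groupBy(...).applyInPandas(...)`, which uses Apache Arrow to move each group into pandas.

```python
from collections import defaultdict
from statistics import mean, pstdev


def fit_models(rows):
    """Fit one tiny model, a (mean, std) pair, per (node, metric) key.

    `rows` is an iterable of (node, metric, value) tuples (hypothetical layout).
    """
    grouped = defaultdict(list)
    for node, metric, value in rows:
        grouped[(node, metric)].append(value)
    # One model per group: this is where the model count grows with cardinality.
    return {key: (mean(vals), pstdev(vals)) for key, vals in grouped.items()}


def is_anomaly(models, node, metric, value, threshold=3.0):
    """Flag a value whose z-score against its own series exceeds `threshold`."""
    mu, sigma = models[(node, metric)]
    if sigma == 0:
        # A constant series: any deviation at all is treated as anomalous.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical sample data: CPU utilization for two nodes.
rows = [
    ("node-a", "cpu", 0.40), ("node-a", "cpu", 0.42), ("node-a", "cpu", 0.41),
    ("node-b", "cpu", 0.90), ("node-b", "cpu", 0.88), ("node-b", "cpu", 0.92),
]
models = fit_models(rows)
```

The same reading of 0.90 is anomalous for `node-a` but normal for `node-b`, which is the motivation for keeping a separate model per series rather than one global model. The resulting `models` dictionary is also small enough per key that, in principle, only the entries relevant to a given node would need to be shipped to it for local inference.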