Explore pattern matching at scale using finite state machines in this conference talk from Strange Loop. Dive into the challenge of locating data that fits a pattern within large datasets drawn from heterogeneous sources, with a focus on how Netflix uses experimentation to improve its sign-up experience. Learn about a framework in which user journey patterns are expressed and then translated into a nondeterministic finite state machine (NFA), inspired by Ken Thompson's 1968 CACM paper. Discover how this state machine is applied across billions of events using Spark, and how it is made accessible to data engineers, data scientists, and analysts. Gain insights into the development of the "Conduit" framework, including design decisions and challenges encountered along the way. The talk covers graph data models, wildcards, events in sequence, abstract syntax trees, regular expressions, Apache Spark optimizations, and matching multiple patterns simultaneously. Presented by Ajit Koti and Rashmi Shamprasad, engineers on Netflix's Growth Data Engineering team, this session is aimed at anyone interested in large-scale distributed systems, big data solutions, and data engineering.
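To make the idea of matching a user journey with a nondeterministic state machine concrete, here is a minimal Python sketch in the spirit of Thompson's approach, where a set of active states is advanced one event at a time. It is not the Conduit implementation: the pattern format (a list of event names with a "*" wildcard), the function names, and the event names are all assumptions made for illustration.

```python
from typing import Iterable, List, Set

# Hypothetical wildcard token: matches any run of zero or more events.
WILDCARD = "*"


def _closure(pattern: List[str], states: Set[int]) -> Set[int]:
    """Add states reachable without consuming an event (skipping a wildcard)."""
    result = set(states)
    stack = list(states)
    while stack:
        i = stack.pop()
        if i < len(pattern) and pattern[i] == WILDCARD and i + 1 not in result:
            result.add(i + 1)
            stack.append(i + 1)
    return result


def matches(pattern: List[str], events: Iterable[str]) -> bool:
    """Return True if the whole event sequence fits the pattern."""
    accept = len(pattern)  # state index reached once every step is satisfied
    current = _closure(pattern, {0})
    for event in events:
        nxt: Set[int] = set()
        for i in current:
            if i == accept:
                continue  # nothing left to consume from the accept state
            if pattern[i] == WILDCARD:
                nxt.add(i)  # wildcard absorbs the event and stays active
            elif pattern[i] == event:
                nxt.add(i + 1)  # literal step consumed, advance
        current = _closure(pattern, nxt)
        if not current:
            return False  # no active states remain, so no match is possible
    return accept in current


# Hypothetical sign-up journey: land, do anything, pay, then complete.
journey = ["landing", "*", "payment", "signup_complete"]
print(matches(journey, ["landing", "plan_select", "payment", "signup_complete"]))
print(matches(journey, ["landing", "payment"]))
```

Tracking a set of active states means the matcher never backtracks, so the work per event is bounded by the number of pattern steps. That property is what makes this style of matching attractive when the same pattern has to be run over very many event sequences.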
Pattern Matching at Scale Using Finite State Machine