Explore the development of Mnemosyne, a distributed indexing layer for big data, in this 52-minute Devoxx conference talk. Dive into the challenges faced by the Brandwatch Audiences product and learn how the team built a system capable of handling hundreds of millions of social network profiles, billions of posts, and tens of billions of follower graph edges in real time. Discover how succinct data structures, free-text search, and in-memory computing are combined with the JVM, CUDA, and Kafka to create a high-performance solution. Gain insights into CAP theorem trade-offs, brute-force approaches versus indexes, data structures for sorting billions of records in milliseconds, problem-solving on GPUs, and the JVM implementation. Understand the key design principles, data ingestion and storage techniques, in-memory computing concepts, and the use of bitmaps and sparse data structures. Evaluate the project's success and consider whether undertaking a similar endeavor in your own work would be feasible or advisable.
Mnemosyne - A Distributed Bitmapped Indexing Layer for Big Data