Explore LinkedIn's derived data storage system, Venice, in this conference talk from Strange Loop 2022. Discover how Venice provides high-throughput ingestion of data from batch and stream processing jobs while offering low-latency online serving. Learn about its production usage, hosting about 1,500 datasets that are rewritten daily and used for AI model inference workloads. Understand Venice's role in the "People you may know" feature, which performs online deep learning with millions of reads and computations per second. Examine how client applications can use Venice's data plane and APIs for both eager loading and network queries. Delve into Venice's architecture, designed for massive scale and operability, with support for self-healing, linear scalability, multi-tenancy, and multi-datacenter replication. Gain insights from Felix GV, Principal Staff Engineer at LinkedIn, as he shares his experience developing Venice from its inception to its current state as a crucial component of LinkedIn's data infrastructure.