Explore a data science architecture in this conference talk that uses Datomic, Spark, and Kafka for scalable real-time analysis of production data without traditional ETL techniques. Discover how immutability, consistent timelines, and multi-database querying support machine learning models with full traceability in a microservices architecture. Learn about modern stored procedures, pass-by-reference queries, horizontal read scalability, and an immutable messaging substrate. Gain insight into an alternative to the lambda and kappa architectures, and into how the approach addresses sensitive data encryption and information security concerns. Understand how doing away with ETL and database synchronization pipelines still preserves scalability and isolation for both transactional and analytical use cases.
Immutable Data Science with Datomic, Spark and Kafka