
Speaker "Ryan Wright" Details


Name
Ryan Wright
Company
thatDot, Inc.
Designation
CEO
Topic
Introducing Quine: a Streaming Graph for Modern Data Pipelines
Abstract
This talk introduces Quine, a new and freely available "streaming graph interpreter" designed as a fundamental infrastructure component that addresses major challenges in data engineering by simplifying enterprise data pipelines and extending their capabilities. Quine sits between databases and stream processing systems. As data streams in from Kafka, Kinesis, or other high-volume data sources, Quine builds it into a graph. Then, using "standing queries" (queries that live inside the graph and propagate efficiently), it finds matches to complex patterns in the graph and streams the results out right away.

Quine maintains a stateful representation of all data streamed through it, like a database, so complex results are built from the combination of new streaming data and potentially very old data, all without having to manage any time windows. Because the graph is fully versioned, you can always query for what the data used to be at any historical moment. Quine is meant to be a complete package of everything that lives between two Kafka topics: high-volume events stream in, and highly meaningful interpreted results stream out.

In this talk, I will explain how Quine works under the hood, discuss some of the interesting and brain-bending challenges we had to confront in order to create it, and show some use cases that illustrate why it's important for modern data pipelines. Quine implements a property-graph data model on top of an asynchronous graph computational model. It's like Pregel with Actors. Each node is capable of performing arbitrary computation, so we can bake powerful capabilities deep into the graph and then package them for easy use as user-contributed "recipes" available in the GitHub repo. Quine is free and open to all, available at https://quine.io, and actively supported by thatDot and the community.
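
To make the standing-query idea in the abstract more concrete, here is a minimal toy sketch in Python. It is not Quine's API or implementation: the class `ToyStreamingGraph`, its `ingest` method, and the edge labels are all hypothetical names invented for illustration. The sketch assumes a single fixed two-hop pattern and an in-memory store, and only shows how a match can be emitted the moment a new event completes a pattern that also involves much older data, with no time window.

```python
from collections import defaultdict


class ToyStreamingGraph:
    """Toy illustration only (not Quine): stores every edge it has seen and
    checks one registered two-hop pattern, a -first-> b -second-> c,
    each time a new edge arrives."""

    def __init__(self, first_label, second_label, on_match):
        self.first_label = first_label
        self.second_label = second_label
        self.on_match = on_match            # callback that receives each match
        self.out_edges = defaultdict(set)   # (src, label) -> {dst, ...}
        self.in_edges = defaultdict(set)    # (dst, label) -> {src, ...}
        self.emitted = set()                # matches already reported

    def _emit(self, match):
        # Report each distinct match exactly once.
        if match not in self.emitted:
            self.emitted.add(match)
            self.on_match(match)

    def ingest(self, src, label, dst):
        """Add one streamed edge, then emit any matches it completes."""
        if dst in self.out_edges[(src, label)]:
            return  # duplicate event, nothing new to match
        self.out_edges[(src, label)].add(dst)
        self.in_edges[(dst, label)].add(src)

        # The new edge may be the first hop: look for stored second hops.
        if label == self.first_label:
            for c in sorted(self.out_edges[(dst, self.second_label)]):
                self._emit((src, dst, c))

        # The new edge may be the second hop: look for stored first hops.
        if label == self.second_label:
            for a in sorted(self.in_edges[(src, self.first_label)]):
                self._emit((a, src, dst))


# Usage: match "user logged in from an IP that is on a blocklist".
matches = []
graph = ToyStreamingGraph("LOGGED_IN_FROM", "LISTED_ON", matches.append)

graph.ingest("ip-1", "LISTED_ON", "blocklist")   # old data, no match yet
graph.ingest("alice", "LOGGED_IN_FROM", "ip-2")  # no match yet
graph.ingest("bob", "LOGGED_IN_FROM", "ip-1")    # completes a match
graph.ingest("ip-2", "LISTED_ON", "blocklist")   # completes alice's match
```

In this toy, the order in which the two halves of a pattern arrive does not matter: the match is reported as soon as the second half shows up, however long ago the first half was stored. A real system such as Quine distributes this work across graph nodes and persists state, which the sketch does not attempt.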
Who is this presentation for?
Data engineers and data scientists.
Prerequisite knowledge:
Very basic familiarity with how data is processed in data pipelines.
What you'll learn?
You'll learn about Quine, a new open-source tool that can dramatically simplify data engineers' and data scientists' jobs and help them accomplish work they couldn't do before.