Our First Netflix Data Engineering Summit | by Netflix Technology Blog | Dec, 2023



Holden Karau, Elizabeth Stone, Pedro Duarte, Chris Stephens, Pallavi Phadnis, Lee Woodridge, Mark Cho, Guil Pires, Sujay Jain, Tristan Reid, Senthilnathan Athinarayanan, Bharath Mummadisetty, Abhinaya Shetty, Judit Lantos, Amanuel Kahsay, Dao Mi, Mick Dreeling, Chris Colburn, and Agata Gryzbek

Earlier this summer Netflix held our first-ever Data Engineering Forum. Engineers from across the company came together to share best practices on everything from Data Processing Patterns to Building Reliable Data Pipelines. The result was a series of talks which we are now sharing with the rest of the Data Engineering community!

You can find each of the talks below with a short description of each, or you can go straight to the playlist on YouTube.

The Netflix Data Engineering Stack

Chris Stephens, Data Engineer, Content & Studio, and Pedro Duarte, Software Engineer, Consolidated Logging, walk engineers new to Netflix through the building blocks of the Netflix Data Engineering stack. Learn more about how batch and streaming data pipelines are built at Netflix.

Data Processing Patterns

Lee Woodridge and Pallavi Phadnis, Data Engineers at Netflix, talk about how you can apply different processing strategies to your batch pipelines by implementing generic abstractions to help scale, be more efficient, handle late-arriving data, and be more fault tolerant.

Streaming SQL on Data Mesh using Apache Flink

Mark Cho, Guil Pires and Sujay Jain, Engineers from the Netflix Data Platform, talk about how a managed Streaming SQL using Apache Flink can help unlock new Stream Processing use cases at Netflix. Data Mesh is Netflix’s next-generation stream processing platform.

Building Reliable Data Pipelines

Holden Karau, OSS Engineer, Data Platform Engineering, talks about the importance of reliable data pipelines and how to build them, covering tools from testing to validation and auditing. The talk uses Apache Spark as an example, but the concepts generalize regardless of your specific tools.

Knowledge Management — Leveraging Institutional Data

Tristan Reid, software engineer, shares experiences about the Knowledge Management project at Netflix, which seeks to leverage language modeling techniques and metadata from internal systems to improve the impact of the >100K memos that circulate within the company.

Psyberg, An Incremental ETL Framework Using Iceberg

Abhinaya Shetty and Bharath Mummadisetty, Data Engineers from Netflix’s Membership Data Engineering team, introduce Psyberg, an incremental ETL framework. Learn about how Psyberg leverages Iceberg metadata to handle late-arriving data, and improves data pipelines while simplifying on-call life!

Start/Stop/Continue for optimizing complex ETL jobs

Judit Lantos, Data Engineer, Member Experience Data Engineering, shares a case study to demonstrate an effective approach for optimizing complex ETL jobs.

Media Data for ML Studio Creative Production

In the last 2 decades, Netflix has revolutionized the way video content is consumed; however, there is significant work to be done in revolutionizing how movies and TV shows are made. In this video, Sr. Data Engineers Amanuel Kahsay and Dao Mi showcase how data and insights are being utilized to accomplish such a vision.

We hope that our fellow members of the Data Engineering Community find these videos useful and engaging. Please follow our Netflix Data Twitter account for updates and notifications of future Data Engineering Summits!

Mick Dreeling, Chris Colburn


