In this lesson we will cover:
- Common patterns in stream processing platforms;
- Considerations in build vs buy.
Building A Streaming Data Platform
Over the coming years, many businesses are going to be building real-time streaming data platforms in order to deliver against their business objectives.
Components Of A Streaming Platform
Most streaming data platforms that businesses deploy will follow a similar pattern and use very similar technologies:
- Some mechanism for extracting data from source websites and applications, and turning this data into a stream of timestamped events;
- Some streaming data engine, which will usually be Kafka or occasionally a cloud-managed service such as Kinesis;
- A stream processing component, such as Kafka Streams, Flink or Spark Streaming, to pre-process, analyse and aggregate the streaming data;
- A data lake or warehouse to store the post-processed data and make it available for consumption;
- Various means of accessing and analysing the data, including notebooks, application APIs and reporting front-ends, ideally ones optimised for real-time, event-based streaming data.
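To make the flow between these components more concrete, here is a minimal sketch in plain Python. It is only an illustration under simplifying assumptions: in-memory generators and dictionaries stand in for the streaming engine (such as Kafka), the stream processor (such as Flink) and the warehouse, and every name in it (`Event`, `extract`, `aggregate`, `run_pipeline`, the `ts`/`app`/`action` record fields) is hypothetical rather than taken from any real platform.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Tuple


@dataclass(frozen=True)
class Event:
    """A timestamped event, as produced by the extraction step."""
    timestamp: float  # seconds, hypothetical unit for this sketch
    source: str       # which website or application emitted the event
    name: str         # what happened, e.g. "page_view"


def extract(raw_records: Iterable[dict]) -> Iterator[Event]:
    """Extraction: turn raw application records into timestamped events.

    In a real platform these events would be published to a streaming
    engine; here a generator stands in for the stream.
    """
    for record in raw_records:
        yield Event(record["ts"], record["app"], record["action"])


def aggregate(events: Iterable[Event],
              window_seconds: int = 60) -> Dict[Tuple[int, str], int]:
    """Stream processing: count events per fixed time window and event name."""
    counts: Dict[Tuple[int, str], int] = defaultdict(int)
    for event in events:
        # Round the timestamp down to the start of its window.
        window_start = int(event.timestamp // window_seconds) * window_seconds
        counts[(window_start, event.name)] += 1
    return dict(counts)


def run_pipeline(raw_records: Iterable[dict],
                 window_seconds: int = 60) -> Dict[Tuple[int, str], int]:
    """Wire the steps together; the returned dict plays the role of the
    warehouse table that notebooks, APIs or reports would query."""
    return aggregate(extract(raw_records), window_seconds)


if __name__ == "__main__":
    sample = [
        {"ts": 5.0, "app": "web", "action": "page_view"},
        {"ts": 42.0, "app": "web", "action": "page_view"},
        {"ts": 61.5, "app": "mobile", "action": "purchase"},
    ]
    for (window_start, name), count in sorted(run_pipeline(sample).items()):
        print(window_start, name, count)
```

In a production platform each function would typically be a separate system, with the streaming engine decoupling producers from consumers so that each stage can be scaled and replaced independently.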