aliases:  
tags: Type/Tech/Platform/Tool,area/tech
from: []
related: []
contra: []
to: []
for: 
dateCreated: 2021-12-21, 16:50
dateModified: 2023-12-06, 11:57
version: 1.0
publish: true

Apache Kafka

	Apache Kafka is a distributed event store and stream-processing platform. It is an open-source system developed by the Apache Software Foundation written in Java and Scala. The project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds. Kafka can connect to external systems via Kafka Connect, and provides the Kafka Streams libraries for stream processing applications. Kafka uses a binary TCP-based protocol that is optimized for efficiency and relies on a "message set" abstraction that naturally groups messages together to reduce the overhead of the network roundtrip. This "leads to larger network packets, larger sequential disk operations, contiguous memory blocks [...] which allows Kafka to turn a bursty stream of random message writes into linear writes."
	wikipedia:: Apache Kafka

url:: Apache Kafka

a distributed event store and stream-processing platform.
aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds.
uses a binary TCP-based protocol that is optimized for efficiency and relies on a "message set" abstraction that naturally groups messages together to reduce the overhead of the network roundtrip. This "leads to larger network packets, larger sequential disk operations, contiguous memory blocks [...] which allows Kafka to turn a bursty stream of random message writes into linear writes."
Jay Kreps named the software after the author Franz Kafka because it is "a system optimized for writing", and he liked Kafka's work. [6](https://en.wikipedia.org/wiki/Apache_Kafka#cite_note-6)
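
A rough sketch of the "message set" idea from the notes above: grouping messages so that one network round trip carries many of them. This is plain Python, not Kafka's wire protocol; `message_sets` and `max_batch` are hypothetical names made up for illustration.

```python
from typing import Iterable, Iterator, List


def message_sets(messages: Iterable[bytes], max_batch: int) -> Iterator[List[bytes]]:
    """Group messages into batches of at most max_batch items (hypothetical helper)."""
    if max_batch < 1:
        raise ValueError("max_batch must be at least 1")
    batch: List[bytes] = []
    for message in messages:
        batch.append(message)
        if len(batch) == max_batch:
            yield batch
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        yield batch


messages = [f"event-{i}".encode() for i in range(10)]
batches = list(message_sets(messages, max_batch=4))

# One round trip per message versus one round trip per batch.
round_trips_unbatched = len(messages)
round_trips_batched = len(batches)
```

With ten messages and a batch size of four, the sketch sends three batches instead of ten single messages, which is the overhead reduction the quote describes.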

Message Broker

Architecture

[Architecture](https://commons.wikimedia.org/wiki/File:Overview_of_Apache_Kafka.svg#/media/File:Overview_of_Apache_Kafka.svg)

Streaming: near-real-time, handling data as it comes in rather than in delayed chunks (batches).

Immutable: a message can't be changed once it is written
Append-only: new messages are added at the end of a partition of the topic
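
A minimal sketch of what "immutable" and "append-only" mean in practice, assuming an in-memory list stands in for the log. `Record` and `AppendOnlyLog` are hypothetical names, not Kafka classes.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Record:
    """A key-value message; frozen so it cannot be changed after creation."""
    key: Optional[bytes]
    value: bytes


class AppendOnlyLog:
    """Hypothetical in-memory log: the only write operation is append."""

    def __init__(self) -> None:
        self._records: List[Record] = []

    def append(self, record: Record) -> int:
        # New records always go at the end; return the position they landed at.
        self._records.append(record)
        return len(self._records) - 1

    def read(self) -> Tuple[Record, ...]:
        # Hand out a read-only snapshot rather than the internal list.
        return tuple(self._records)


log = AppendOnlyLog()
first = log.append(Record(b"user-1", b"signed_up"))
second = log.append(Record(b"user-1", b"logged_in"))

try:
    log.read()[0].value = b"edited"  # attempt to rewrite history
    mutated = True
except FrozenInstanceError:
    mutated = False
```

There is deliberately no update or delete method: correcting a record means appending a newer one.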

Producer
Kafka stores key-value messages that come from arbitrarily many processes called producers.
Consumer
A process that reads messages from partitions.
Topic
Partitions
Each topic is divided into one or more partitions.
Kafka runs on a cluster of one or more servers (called brokers), and the partitions of all topics are distributed across the cluster nodes. Additionally, partitions are replicated to multiple brokers. This architecture allows Kafka to deliver massive streams of messages in a fault-tolerant fashion and has allowed it to replace some of the conventional messaging systems like Java Message Service (JMS), Advanced Message Queuing Protocol (AMQP), etc.
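
To make the distribution and replication idea concrete, here is a simplified sketch that spreads partitions over brokers round-robin and checks which partitions stay available when one broker fails. Kafka's real replica placement is more involved; `assign_replicas`, `surviving_partitions` and the broker names are assumptions for illustration only.

```python
from typing import Dict, List


def assign_replicas(brokers: List[str], num_partitions: int,
                    replication_factor: int) -> Dict[int, List[str]]:
    """Round-robin placement (simplified assumption, not Kafka's actual algorithm)."""
    if not 1 <= replication_factor <= len(brokers):
        raise ValueError("replication_factor must be between 1 and the broker count")
    assignment: Dict[int, List[str]] = {}
    for partition in range(num_partitions):
        # Each replica of a partition lands on a different broker.
        assignment[partition] = [
            brokers[(partition + i) % len(brokers)]
            for i in range(replication_factor)
        ]
    return assignment


def surviving_partitions(assignment: Dict[int, List[str]],
                         failed_broker: str) -> List[int]:
    """Partitions that still have at least one replica on a live broker."""
    return [
        partition
        for partition, replicas in assignment.items()
        if any(broker != failed_broker for broker in replicas)
    ]


brokers = ["broker-0", "broker-1", "broker-2"]
assignment = assign_replicas(brokers, num_partitions=4, replication_factor=2)
still_available = surviving_partitions(assignment, failed_broker="broker-1")
```

With a replication factor of 2, losing any single broker in this sketch leaves every partition with one live copy, which is the fault tolerance the paragraph refers to.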
Message
Key-value
Offset
Within a partition, messages are strictly ordered by their offsets (the position of a message within a partition), and indexed and stored together with a timestamp.
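
A small sketch of offsets and per-partition ordering, assuming an in-memory model of a topic. The `Topic` class and its methods are hypothetical, and `zlib.crc32` is only a stand-in for the hash a real Kafka client would use to map a key to a partition.

```python
import time
import zlib
from typing import List, NamedTuple


class StoredMessage(NamedTuple):
    offset: int        # position of the message within its partition
    timestamp_ms: int  # stored together with the message
    key: bytes
    value: bytes


class Topic:
    """Hypothetical in-memory topic made of several partitions."""

    def __init__(self, num_partitions: int) -> None:
        self.partitions: List[List[StoredMessage]] = [[] for _ in range(num_partitions)]

    def partition_for(self, key: bytes) -> int:
        # Same key -> same partition (crc32 is an illustrative stand-in hash).
        return zlib.crc32(key) % len(self.partitions)

    def send(self, key: bytes, value: bytes) -> StoredMessage:
        partition = self.partitions[self.partition_for(key)]
        # The next offset is simply the current length of the partition.
        message = StoredMessage(len(partition), int(time.time() * 1000), key, value)
        partition.append(message)
        return message

    def read_from(self, partition: int, offset: int) -> List[StoredMessage]:
        # A consumer resumes by asking for everything at or after an offset.
        return self.partitions[partition][offset:]


topic = Topic(num_partitions=3)
for value in (b"created", b"paid", b"shipped"):
    topic.send(b"order-42", value)

p = topic.partition_for(b"order-42")
replay = topic.read_from(p, offset=1)
```

Because all three messages share a key, they land in one partition with offsets 0, 1 and 2, and reading from offset 1 returns the last two in order. Ordering is only guaranteed within a partition, not across the whole topic.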
Cluster
The cluster runs on one or more servers called brokers.
Kafka Streams API
For stream processing, Kafka offers the Streams API, which allows writing Java applications that consume data from Kafka and write results back to Kafka.
Apache Kafka also works with external stream processing systems such as Apache Apex, Apache Beam, Apache Flink, Apache Spark, Apache Storm and Apache NiFi.

Aiven Workshop

Kafka / Python Workshop - Meetups & Events - Aiven Community
