[ADP-609] Event Partitioning
Overview
This ADP defines a standard method for implementing event partitioning using the CloudEvents partitioning extension and discusses its application in message brokers such as Kafka.
Guidelines
Implementations MUST use the CloudEvents partitioning extension to implement event partitioning.
Event producers MUST set the partitionkey attribute for events that require partitioning.
The partitionkey attribute MUST be a non-empty string.
System designers SHOULD carefully choose partition keys to ensure load balancing and correct grouping of related events.
Event consumers SHOULD be able to handle partition-based event streams.
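The producer-side rules above can be checked mechanically before an event is published. The following is a minimal sketch, assuming events are represented as plain dictionaries; the function name validate_partitionkey and the required flag are hypothetical and not part of CloudEvents or this ADP.

```python
from typing import Any, Mapping


def validate_partitionkey(event: Mapping[str, Any], required: bool = True) -> None:
    """Raise ValueError if the event violates the partitionkey guidelines.

    `required` (hypothetical flag) marks events that need partitioning.
    """
    if "partitionkey" not in event:
        if required:
            raise ValueError("partitionkey is missing on an event that requires partitioning")
        return
    key = event["partitionkey"]
    # The guideline requires a non-empty string.
    if not isinstance(key, str) or key == "":
        raise ValueError("partitionkey MUST be a non-empty string")


event = {
    "specversion": "1.0",
    "type": "com.example.someevent",
    "source": "/mycontext",
    "id": "C234-1234-1234",
    "partitionkey": "user-123",
}
validate_partitionkey(event)  # a valid event passes without raising
```

A producer might call such a check just before serialization, so that invalid events fail fast instead of landing in an arbitrary partition.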
CloudEvents Partitioning Extension
The partitioning extension adds one attribute to CloudEvents:
partitionkey: A key used for event partitioning, typically used to define causal relationships or grouping among multiple events.
Example CloudEvent:
{
    "specversion" : "1.0",
    "type" : "com.example.someevent",
    "source" : "/mycontext",
    "id" : "C234-1234-1234",
    "time" : "2023-06-01T10:30:00Z",
    "partitionkey": "user-123",
    "data" : {
        "message" : "This event is partitioned"
    }
}
Application in Kafka
Kafka is a widely used distributed event streaming platform that natively supports event partitioning:
Kafka uses topics to organize event streams, and each topic can have multiple partitions.
When sending events to Kafka, the CloudEvents partitionkey can be used as the Kafka record key.
Kafka's default partitioner hashes the record key to determine which partition an event is written to.
Events with the same partitionkey are written to the same partition as long as the topic's partition count does not change, and Kafka preserves their order within that partition.
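The key-to-partition mapping described above can be illustrated without a broker. This is a sketch only: the function names are hypothetical, and zlib.crc32 is used as a stand-in hash, whereas Kafka's default Java partitioner uses a murmur2 hash, so the partition numbers computed here will not match a real cluster.

```python
import zlib
from typing import Any, Mapping


def kafka_record_key(event: Mapping[str, Any]) -> bytes:
    """Use the CloudEvents partitionkey as the Kafka record key (UTF-8 bytes)."""
    return event["partitionkey"].encode("utf-8")


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Hash the key and reduce it modulo the partition count.

    crc32 is an illustrative stand-in, not Kafka's actual hash function.
    """
    return zlib.crc32(key) % num_partitions


events = [
    {"id": "1", "partitionkey": "user-123"},
    {"id": "2", "partitionkey": "user-456"},
    {"id": "3", "partitionkey": "user-123"},
]
# Events 1 and 3 share a key, so they land in the same partition.
assignments = [choose_partition(kafka_record_key(e), num_partitions=6) for e in events]
```

Because the result depends on num_partitions, the same key can map to a different partition after a topic is repartitioned, which is why ordering guarantees hold only while the partition count stays fixed.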
Implementation Recommendations
Choose appropriate partitioning strategies, considering load balancing and event ordering requirements.
In Kafka producer configurations, map the CloudEvents partitionkey to Kafka's partition key.
Ensure consumers can process events from different partitions in parallel.
Monitor partition usage and adjust the number of partitions or partitioning strategy when necessary.
Consider implementing dynamic partition assignment to adapt to load changes.
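One way to make "monitor partition usage" concrete is to compare the busiest partition against an ideal even share. The sketch below assumes that per-event partition numbers have already been collected (for example from producer acknowledgements or consumer metrics); the function name partition_skew and the metric itself are illustrative choices, not something this ADP prescribes.

```python
from collections import Counter
from typing import Iterable


def partition_skew(partitions: Iterable[int], num_partitions: int) -> float:
    """Return the busiest partition's event count divided by the ideal even share.

    1.0 means perfectly balanced; num_partitions means all events hit one partition.
    Returns 0.0 when no events were observed.
    """
    counts = Counter(partitions)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    ideal = total / num_partitions
    return max(counts.values()) / ideal


balanced = partition_skew([0, 1, 2, 3], num_partitions=4)
hot_spot = partition_skew([0, 0, 0, 0], num_partitions=4)
```

A sustained value well above 1.0 suggests that a few keys dominate traffic, which is a signal to revisit the partition key choice rather than only adding partitions.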
Use Cases
User Activity Tracking: Use user ID as the partition key to ensure all events for a single user are processed in order.
IoT Data Processing: Use device ID as the partition key to group data from the same device.
Order Processing System: Use order ID as the partition key to ensure order-related events are processed sequentially.
Security Considerations
Ensure partition keys do not contain sensitive information, as they may be widely propagated in the system.
Be aware of potential performance issues and denial-of-service risks that could arise from uneven partitioning.
Implement appropriate access controls to prevent unauthorized partition operations.
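When the natural grouping value is itself sensitive (an email address, for instance), one option is to derive the partition key from a keyed hash so that the raw value never travels with the event. This is a minimal sketch under that assumption; the function name and the way the secret is obtained are hypothetical, and a real deployment would need its own key management.

```python
import hashlib
import hmac


def pseudonymous_partitionkey(sensitive_value: str, secret: bytes) -> str:
    """Derive a stable, non-reversible partition key using HMAC-SHA256.

    The same input and secret always yield the same key, so grouping and
    ordering per entity are preserved without exposing the raw value.
    """
    digest = hmac.new(secret, sensitive_value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


secret = b"example-secret-for-illustration-only"  # assumption: loaded from a secret store in practice
key_a = pseudonymous_partitionkey("alice@example.com", secret)
key_b = pseudonymous_partitionkey("alice@example.com", secret)
```

The derived key is a 64-character hexadecimal string, which satisfies the non-empty string requirement for partitionkey.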