
[ADP-609] Event Partitioning

Overview

This API Design Principle (ADP) defines a standard method for implementing event partitioning with the CloudEvents partitioning extension and describes how it applies to message brokers such as Kafka.

Guidelines

  1. Systems that partition events MUST use the CloudEvents partitioning extension to do so.

  2. Event producers MUST set the partitionkey attribute for events that require partitioning.

  3. The partitionkey attribute MUST be a non-empty string.

  4. System designers SHOULD carefully choose partition keys to ensure load balancing and correct grouping of related events.

  5. Event consumers SHOULD be able to handle partition-based event streams.
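
Guidelines 2 and 3 can be checked mechanically before an event is published. The following is a minimal sketch, assuming the event is held as a plain dictionary; the function name `validate_partitionkey` is hypothetical and not part of any CloudEvents SDK.

```python
from typing import Any, Mapping


def validate_partitionkey(event: Mapping[str, Any]) -> str:
    """Return the event's partitionkey, or raise ValueError if it is unusable."""
    key = event.get("partitionkey")
    # Guideline 2: the attribute must be set on events that require partitioning.
    if not isinstance(key, str):
        raise ValueError("partitionkey must be present and must be a string")
    # Guideline 3: the attribute must be a non-empty string.
    if key == "":
        raise ValueError("partitionkey must be a non-empty string")
    return key


event = {
    "specversion": "1.0",
    "type": "com.example.someevent",
    "source": "/mycontext",
    "id": "C234-1234-1234",
    "partitionkey": "user-123",
}
print(validate_partitionkey(event))  # prints "user-123"
```

A producer could run a check like this just before sending, so that events without a usable key are rejected early instead of landing in an arbitrary partition.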

CloudEvents Partitioning Extension

The partitioning extension adds one attribute to CloudEvents:

  • partitionkey: A key used for event partitioning, typically used to define causal relationships or grouping among multiple events.

Example CloudEvent:

```json
{
    "specversion": "1.0",
    "type": "com.example.someevent",
    "source": "/mycontext",
    "id": "C234-1234-1234",
    "time": "2023-06-01T10:30:00Z",
    "partitionkey": "user-123",
    "data": {
        "message": "This event is partitioned"
    }
}
```

Application in Kafka

Kafka is a widely used distributed event streaming platform that natively supports event partitioning:

  1. Kafka uses topics to organize event streams, and each topic can have multiple partitions.

  2. When sending events to Kafka, the CloudEvents partitionkey can be used as the Kafka record key.

  3. Kafka's default partitioner hashes the record key to determine which partition an event is written to.

  4. Events with the same partitionkey are therefore written to the same partition, as long as the number of partitions does not change. Kafka preserves order within a partition, so these events are consumed in the order they were written.
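
The key-to-partition mapping described above can be sketched as follows. This is an illustration under stated assumptions, not a Kafka client: `to_kafka_record` and `choose_partition` are hypothetical helper names, and CRC32 is used as a stand-in hash. The Java client's default partitioner uses murmur2, so the partition numbers computed here will not match a real cluster.

```python
import json
import zlib
from typing import Any, Mapping, Tuple


def to_kafka_record(event: Mapping[str, Any]) -> Tuple[bytes, bytes]:
    """Map a CloudEvent (as a dict) to a (record key, record value) pair."""
    # The CloudEvents partitionkey becomes the record key.
    key = event["partitionkey"].encode("utf-8")
    # The whole event is serialized as JSON for the record value.
    value = json.dumps(event).encode("utf-8")
    return key, value


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Pick a partition from a hash of the key (CRC32 stand-in, not murmur2)."""
    return zlib.crc32(key) % num_partitions


first = {"id": "1", "partitionkey": "user-123"}
second = {"id": "2", "partitionkey": "user-123"}
p1 = choose_partition(to_kafka_record(first)[0], 6)
p2 = choose_partition(to_kafka_record(second)[0], 6)
print(p1 == p2)  # prints "True"
```

Because the partition is the hash modulo the partition count, the same key can map to a different partition once the count changes, which is why ordering is only dependable while the partition count stays fixed.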

Implementation Recommendations

  1. Choose appropriate partitioning strategies, considering load balancing and event ordering requirements.

  2. In Kafka producers, map the CloudEvents partitionkey to the Kafka record key.

  3. Ensure consumers can process events from different partitions in parallel.

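  4. Monitor partition usage and adjust the number of partitions or the partitioning strategy when necessary. Changing the number of partitions can change which partition a key maps to, so ordering across the change is not guaranteed.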

  5. Consider dynamic assignment of partitions to consumers (for example, through Kafka consumer groups) so that consumption adapts to load changes.
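
Monitoring partition usage (recommendation 4) can start with a simple skew measure. The sketch below assumes the partition number of each observed event is already available, for example from consumer metadata; `partition_load` and `skew_ratio` are hypothetical names, and the ratio of busiest partition to mean load is only one of several reasonable metrics.

```python
from collections import Counter
from typing import Dict, Iterable


def partition_load(partitions: Iterable[int], num_partitions: int) -> Dict[int, int]:
    """Count events per partition, including partitions that received none."""
    counts = Counter(partitions)
    return {p: counts.get(p, 0) for p in range(num_partitions)}


def skew_ratio(load: Dict[int, int]) -> float:
    """Busiest partition's load divided by the mean load (1.0 means perfectly even)."""
    total = sum(load.values())
    if total == 0:
        return 1.0  # no traffic yet, so nothing is skewed
    mean = total / len(load)
    return max(load.values()) / mean


observed = [0, 0, 0, 0, 1, 2]  # partition number of each observed event
load = partition_load(observed, 3)
print(load)              # prints "{0: 4, 1: 1, 2: 1}"
print(skew_ratio(load))  # prints "2.0"
```

A ratio that stays well above 1.0 suggests a hot key or a poorly distributed key space, which is a signal to revisit the partition key choice before adding partitions.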

Use Cases

  1. User Activity Tracking: Use user ID as the partition key to ensure all events for a single user are processed in order.

  2. IoT Data Processing: Use device ID as the partition key to group data from the same device.

  3. Order Processing System: Use order ID as the partition key to ensure order-related events are processed sequentially.

Security Considerations

  1. Ensure partition keys do not contain sensitive information, as they may be widely propagated in the system.

  2. Be aware of potential performance issues and denial-of-service risks that could arise from uneven partitioning.

  3. Implement appropriate access controls to prevent unauthorized operations on topics and partitions, such as producing, consuming, or changing partition counts.
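
One way to keep sensitive identifiers out of partition keys (consideration 1) is to derive the key with a keyed hash. This is a minimal sketch, assuming a secret that is shared by all producers of the stream; `opaque_partition_key` and the sample secret are hypothetical. A keyed hash pseudonymizes the identifier but does not anonymize it, so the secret still needs protecting.

```python
import hashlib
import hmac


def opaque_partition_key(identifier: str, secret: bytes) -> str:
    """Derive a stable, non-reversible partition key from a sensitive identifier."""
    # HMAC-SHA256 is deterministic for a given secret, so the same identifier
    # always yields the same key and related events still group together.
    return hmac.new(secret, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


key = opaque_partition_key("alice@example.com", b"demo-secret")
print(len(key))  # prints "64"
```

The derived value is a non-empty string, so it remains a valid partitionkey, and grouping and ordering behave the same as with the raw identifier.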
