Design and Evolve a New Novel Framework of Semantic Load Shedding in Apache Kafka
Abstract
Semantic load shedding is a critical aspect of data stream processing systems, particularly in high-throughput environments such as Apache Kafka, where processing resources are finite. This thesis introduces a novel approach to semantic load shedding in Apache Kafka that uses a semantic understanding of data streams to prioritize processing tasks by importance and relevance. By dynamically adjusting the processing priority of incoming data streams using semantic metadata, the proposed approach optimizes resource utilization and ensures timely processing of critical data while handling overload conditions gracefully. The thesis presents a detailed framework for semantic load shedding, including mechanisms for semantic metadata extraction, priority assignment, and adaptive load shedding strategies. The approach is dynamic: automated algorithms search for the optimal sampling rate of the system, which makes it possible to assess the flexibility of the technique. The evaluation environment comprises the workload, the data flow, and the available resources. Experimental evaluation and performance analysis demonstrate the effectiveness and scalability of the proposed approach in Apache Kafka environments. The results show significant improvements in latency and throughput compared with traditional load shedding techniques: recall is 35% at a latency bound of 1 and 99% at a latency bound of 9, and throughput is 10000 at a latency of 1. These findings have the potential to improve the performance of machine learning algorithms and to optimize load shedding strategies across many applications, and they highlight the potential of semantic load shedding for enhancing the efficiency and reliability of data stream processing systems in real-world deployments.
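The priority assignment and adaptive shedding steps named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the `Record` and `AdaptiveShedder` names, the `priority` field in [0, 1], and the fixed-step threshold rule are all assumptions made for illustration, and no Kafka client is involved.

```python
from dataclasses import dataclass


@dataclass
class Record:
    key: str
    # Hypothetical semantic priority in [0, 1]; higher means more important.
    priority: float


class AdaptiveShedder:
    """Illustrative shedder: drops low-priority records under overload."""

    def __init__(self, latency_bound: float, step: float = 0.1):
        self.latency_bound = latency_bound  # target upper bound on latency
        self.step = step                    # how fast the threshold moves (assumed rule)
        self.threshold = 0.0                # records below this priority are shed

    def update(self, observed_latency: float) -> None:
        # Raise the threshold when latency exceeds the bound,
        # and relax it again when there is headroom.
        if observed_latency > self.latency_bound:
            self.threshold = min(1.0, self.threshold + self.step)
        else:
            self.threshold = max(0.0, self.threshold - self.step)

    def admit(self, record: Record) -> bool:
        # Keep the record only if its priority meets the current threshold.
        return record.priority >= self.threshold
```

In use, a consumer loop would call `update` with a latency measurement after each batch and `admit` on each incoming record. A looser latency bound leaves the threshold low, so more records are kept, which is consistent with the higher recall reported at larger latency bounds.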