Advanced Computing User Day

Energy Efficiency on Cloud-Based Distributed Big Data Processing: Insights from Remote Sensing
2024-12-12, Mission 1

The exponential growth in data volume and variety across domains has made it difficult to process, store, and manage large-scale datasets efficiently. Remote sensing, with its increasingly diverse and complex datasets and demanding computational requirements, is a prominent example of these challenges. Cloud computing offers a promising way to address the scalability and resource allocation needs of big data processing by providing a distributed environment where resources can be dynamically managed. A typical cloud-based big data processing platform encompasses infrastructure orchestration, distributed processing frameworks, data access mechanisms, and user interfaces. While this approach enables efficient handling of large-scale data, it also raises concerns about energy consumption and carbon footprint.

This presentation will cover the proposed methods and tools for optimizing the energy consumption of big data processing in a cloud environment, using remote sensing big data as a representative example. The discussion is organized around three interrelated topics: establishing an energy-aware benchmarking framework, optimizing infrastructure orchestration for energy efficiency, and implementing energy-efficient task scheduling. First, the benchmarking framework includes applications, data, and monitoring toolkits for collecting and analyzing performance, resource utilization, and, most importantly, energy metrics within distributed big data systems. Benchmarking these metrics makes it possible to identify key areas for improvement in energy efficiency. Second, optimizing infrastructure orchestration involves resource allocation strategies such as automatic cluster scaling, container consolidation, and workload prioritization, with energy efficiency as the main criterion. These strategies aim to reduce energy consumption while compromising performance as little as possible, so that the benefits can be applied across various big data applications without changes to the existing codebase. Third, a multi-objective task scheduling strategy is introduced to minimize energy consumption while maintaining acceptable execution times at the level of individual computing tasks.

The output of this research includes software components designed to be integrated into widely used remote sensing big data platforms to measure and improve the energy efficiency of distributed big data processing. In addition, the research will engage the broader community through workshops and mini-symposia to disseminate the findings and methodologies developed. By focusing on these strategies, we aim to advance the field of big data processing by providing tools that can be adapted across various domains and that promote sustainable practices in cloud-based big data applications.

See also: Photo (871.7 KB)

Adhitya is a PhD candidate at the Faculty of Geo-Information Science and Earth Observation (ITC), University of Twente, specializing in Geo-information Processing. His research, in collaboration with the Center of Expertise in Big Geodata Science (CRIB), focuses on the energy efficiency of Earth observation big data processing in cloud computing environments. He holds an M.Sc. in Logistics Information Technology from Pusan National University, Korea (2013), with a focus on vehicle-to-vehicle wireless communication. For the past 10 years, Adhitya has worked as a lecturer and researcher at the Faculty of Computer Science, Universitas Brawijaya, Indonesia. His experience includes collaborating on national and international projects such as the Indonesia Matching Fund Project, Erasmus+ Micro-Credential, and NICT ASEAN IVO Project. Beyond his primary research, Adhitya has developed a keen interest in the intersection of cloud computing service orchestration and scientific big data processing, particularly in the geospatial domain.
