[ENG] Enabling Scalable, Efficient, and Collaborative Scientific Workflows with modern data lake architecture SURF Netwerk & Cloud event

30-9-2025 13:50–14:25, Erik de Vries

As research becomes increasingly data-driven and collaborative, there is a critical need for modern, scalable infrastructure to manage vast and diverse datasets. Traditional data management systems, which often rely on rigid hierarchies and predefined schemas, are proving inadequate in the face of growing data volumes, variety, and velocity. To address these challenges, we are examining the concept of a data lake: an open, flexible, and powerful architecture for storing and analysing research data across disciplines and formats.

A data lake is a centralised repository that stores data in its raw form, accommodating structured, semi-structured, and unstructured formats. Unlike conventional data warehouses, it uses flat object storage combined with rich metadata tagging to enable efficient, scalable data access. This architecture supports a wide range of analytical and machine learning tools without requiring data to be moved or duplicated, thereby increasing cost-efficiency and reducing complexity.
Find out how our proposed architecture paves the way for more reproducible, transparent, and efficient scientific workflows, empowering researchers to derive deeper insights and drive innovation at scale.

This session is in ENGLISH

David Šálek

David Šálek is a cloud solutions architect at SURF with a background in scientific research. He helps research teams make the most of digital opportunities by designing and building secure, scalable environments across public, private, and hybrid clouds.
