2025-12-04 –, Expedition
Next year, SURF will start investigating the Data Lakehouse, among other things to explore its application in scientific workflows.
I will briefly discuss the concept of the Data Lakehouse, its architecture, and its components. One characteristic of a Data Lakehouse is that it offers functionality similar to a data warehouse, such as consistency, while also being able to process unstructured and semi-structured data. We have already done some investigations in a number of projects spanning scientific fields such as earth observation, sentiment analysis, and bio-imaging. I will share some preliminary insights into the kinds of scientific workflows it can be applied to, and I will show how the various Data Lakehouse components can be used in such a workflow. The talk will also touch upon commercial solutions such as Databricks, which offer a full stack including an MLOps component, and upon open-source solutions.
Robert Griffioen has a background in Artificial Intelligence. He did a Ph.D. on brain modelling with neural networks and a postdoc on large-scale agent-based simulations in a European project. After that he worked as a Business Intelligence consultant for a few years. He then worked at Statistics Netherlands (Centraal Bureau voor de Statistiek) in IT and research for almost 10 years. Finally, he landed at SURF, where his roles include project manager, product manager, and cloud solution architect for scientific projects.