Session
DE4DS 1: Workshop on Data Engineering for Data Science 1
Keynote: Laura Koesten, University of Vienna, Austria
Presentations
ALPINE: Abstract Language for Pipeline Integration and Execution
University of Hagen, Germany
When working with data, it is essential to ensure data quality and remove errors from the data. This is usually done with a data cleaning pipeline, which can be executed with a variety of tools. To increase reusability, however, pipelines should be describable independently of any particular technology, and so far no such technology-independent description exists. This paper therefore presents ALPINE, a language for describing data cleaning pipelines that abstracts from the concrete implementation. After introducing the individual components, the paper illustrates their usage with a running example.