Conference Agenda

Overview and details of the sessions of this conference. Please select a date or location to show only the sessions held on that day or at that location. Select a single session for a detailed view (with abstracts and downloads, if available).

 
 
Session Overview
Session
I1: Industry 1
Time:
Wednesday, 05/Mar/2025:
2:40pm - 3:30pm

Session Chair: Tilmann Rabl, HPI
Location: WE5/00.019

Lecture Hall 2

Presentations

Data-driven Database Engineering at Snowflake

Max Heimel

Snowflake, Germany

Snowflake’s cloud-native data platform processes billions of queries daily and scans petabytes of data, creating unique database engineering challenges. As a service-oriented platform, we rely on detailed query telemetry, workload replay, and staged rollouts to drive continuous engine innovation.

This talk highlights two projects from our Berlin engineering team that exemplify this data-driven approach to database engineering. First, we examine how query telemetry and workload analysis informed the transformation of Snowflake’s analytical execution engine to handle transactional-scale throughput. Second, we explore the redesign of Snowflake’s dynamic join strategy, replacing a static method with a holistic, adaptive algorithm. Through rigorous testing, workload replay, and incremental rollouts, we ensured a smooth production transition for customers.
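The staged rollouts mentioned above follow a common pattern: each account is deterministically assigned to a stable bucket, so widening the rollout percentage only ever adds accounts. The sketch below is a generic illustration of that pattern, not Snowflake's actual rollout mechanism (the function name, feature name, and bucket granularity are assumptions):

```python
import hashlib

def in_rollout(account_id: str, feature: str, pct: float) -> bool:
    """Deterministically assign an account to a rollout percentage.

    The (feature, account) pair hashes to a stable bucket in [0, 100),
    so raising pct only ever adds accounts, never removes them.
    """
    digest = hashlib.sha256(f"{feature}:{account_id}".encode()).hexdigest()
    bucket = (int(digest[:8], 16) % 10000) / 100.0
    return bucket < pct

# Gradually widen exposure of a hypothetical engine change
for stage in (1.0, 10.0, 50.0, 100.0):
    enabled = sum(in_rollout(f"account-{i}", "adaptive-join-v2", stage)
                  for i in range(1000))
    print(f"{stage:5.1f}% stage -> {enabled} of 1000 accounts enabled")
```

Because bucket assignment is stable, an account enabled at an early stage stays enabled at every later stage, which keeps customer experience consistent across the rollout.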

By examining these case studies, we provide practical insights into the intersection of database research and large-scale system engineering. The lessons learned highlight how data-driven methods can effectively address complex engineering challenges in modern database systems, offering perspectives relevant to both academic researchers and industry practitioners.

Heimel-Data-driven Database Engineering at Snowflake-299_b.pdf


GLLM: Self-Corrective G-Code Generation using Large Language Models with User Feedback

Mohamed Abdelaal1, Samuel Lokadjaja2, Gilbert Engert2

1Software GmbH, Germany; 2TU Darmstadt, Germany

This paper introduces GLLM, an innovative tool that leverages Large Language Models (LLMs) to automatically generate G-code from natural language instructions for Computer Numerical Control (CNC) machining. GLLM addresses the challenges of manual G-code writing by bridging the gap between human-readable task descriptions and machine-executable code. The system incorporates a fine-tuned StarCoder-3B model, enhanced with domain-specific training data and a Retrieval-Augmented Generation (RAG) mechanism. GLLM employs advanced prompting strategies and a novel self-corrective code generation approach to ensure both syntactic and semantic correctness of the generated G-code. The architecture includes robust validation mechanisms: syntax checks, G-code-specific verifications, and functional-correctness evaluations using the Hausdorff distance. By combining these techniques, GLLM aims to democratize CNC programming, making it more accessible to users without extensive programming experience while maintaining high accuracy and reliability in G-code generation.
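The Hausdorff-distance check for functional correctness can be illustrated with a small sketch: compare the toolpath implied by the generated G-code (here pre-extracted as 2-D points) against a reference path and accept only if the worst-case deviation stays under a tolerance. This is a generic illustration with an assumed tolerance, not GLLM's actual validator:

```python
import math

def hausdorff(path_a, path_b):
    """Symmetric Hausdorff distance between two point sets (toolpaths)."""
    def directed(u, v):
        # Largest distance from any point in u to its nearest point in v
        return max(min(math.dist(p, q) for q in v) for p in u)
    return max(directed(path_a, path_b), directed(path_b, path_a))

# Reference toolpath vs. a path recovered from generated G-code (illustrative)
reference = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
generated = [(0.0, 0.5), (10.0, 0.5), (10.0, 10.0)]

TOLERANCE_MM = 1.0  # assumed acceptance threshold
deviation = hausdorff(reference, generated)
print(f"max deviation: {deviation:.2f} mm")  # 0.50 mm
assert deviation <= TOLERANCE_MM
```

A bounded Hausdorff distance guarantees that no point on either path strays far from the other, which is a natural fit for judging whether generated G-code traces the intended geometry.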

Abdelaal-GLLM Self-Corrective G-Code Generation using Large Language Models with User-160_b.pdf


Guardrails for Code Assistants

Vincent Béraudier2, Jay Griffin3, Hugues Juillé2, Viu Long Kong2, Alexander Lang1

1IBM Germany; 2IBM France; 3IBM USA

Coding assistants that use generative AI have become tremendously popular over the last two years. IBM watsonx Code Assistant (WCA) is one such assistant: built on IBM's Granite code models, it helps customers generate, migrate, maintain, explain, and document code. We define three guardrails for code assistants that are important in an enterprise context: code similarity, HAP detection, and detection of non-programming-related (NPR) questions. Code similarity detects when generated code is very similar to existing open-source code and may require attribution when used. HAP detection ensures that the assistant neither responds to nor creates content that can be considered Hateful, Abusive or Profane. Finally, NPR detection ensures that the assistant only answers the questions it was created and optimized for: questions about software engineering and programming.

We describe how we implemented these guardrails in WCA, including a fast and relevant similarity search across 110 million files of source code and sub-second HAP and NPR detection tailored to code assistants, all in a multi-tenant system with thousands of concurrent users.
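Similarity search at that scale typically relies on compact fingerprints rather than exhaustive file comparison. As a generic illustration (not WCA's actual implementation), a MinHash signature over token shingles estimates how similar two source files are; the tokenization and signature size below are assumptions:

```python
import hashlib

def shingles(code: str, k: int = 5) -> set[str]:
    """k-token shingles of a source file (whitespace tokenization)."""
    toks = code.split()
    return {" ".join(toks[i:i + k]) for i in range(max(1, len(toks) - k + 1))}

def minhash(sh: set[str], num_hashes: int = 64) -> list[int]:
    """MinHash signature: per seed, the minimum hash over all shingles."""
    return [
        min(int(hashlib.sha1(f"{seed}:{s}".encode()).hexdigest(), 16) for s in sh)
        for seed in range(num_hashes)
    ]

def estimated_jaccard(sig_a: list[int], sig_b: list[int]) -> float:
    """Fraction of matching signature slots approximates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)
```

Signatures are tiny compared to the files they summarize, and banding them into a locality-sensitive-hashing index narrows the candidate set before any exact comparison, which is what makes sub-second lookups over a corpus of this size plausible.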

Béraudier-Guardrails for Code Assistants-176_b.pdf


 
Conference: BTW 2025 Bamberg
Conference Software: ConfTool Pro 2.6.153+TC
© 2001–2025 by Dr. H. Weinreich, Hamburg, Germany