REWE Group — one of Europe's largest retail groups. Production AI and enablement.
REWE Group — production AI delivery, internal GenAI products, and hands-on enablement.
AI Engineer · Cologne, Germany
I build production AI systems that make messy operational work measurable, faster, and easier to trust.
I build eval-first LLM systems with structured outputs, tracing, and agentic tooling that stay controllable in production.
Hours spent on thesis work, research, implementation, problem solving, and writing, removing, and reviewing code.
Estimated from daily routines: roughly 6 hours on weekdays and 2 hours on weekends, counted from Dec 1, 2023. Not machine-tracked, so treat it as an honest approximation.
Practical AI systems that fit real workflows and stay measurable.
Agentic engineering, evaluation, tracing, and practical adoption.
Peer-reviewed work on context-aware emotion analysis with LLMs.
Research on operationalizing context-aware emotion analysis with LLM-guided pipelines.
About
I build LLM systems that stay boring, measurable, and controllable — because that's where they actually create business value. Research background, production mindset.
Based in Cologne. Currently at REWE Group. Working across production AI, evaluation, and applied LLM systems.
Production AI applications, agentic engineering, evaluation, and enablement.
Agentic systems, evaluation, and proof-of-concept architectures.
LLM systems, evaluation, and context-aware emotion analysis.
Foundation in business systems, software, and applied AI.
Work
Switch between Business and Dev in the header to see the same work at the right level of detail.
Let AI handle the boring parts of software delivery.
Build real products powered by language models.
Help teams actually adopt AI, not just experiment.
Tech Stack
Cloud-native stack for building and running LLM applications at production scale, with a strong bias toward evaluation and observability.
Capabilities
Personal Focus
Understand where LLMs actually create value. Prefer simple, controllable usage over unnecessary complexity. Maintain and improve systems with a mix of human control and agentic harnesses. Make quality evaluable and quantifiable through custom evaluations.
Featured Case Study
A production system at REWE Group that turns product packaging photos into structured master data — so the online shop can legally match what customers see in store, and new products reach the shelf faster.
The master data team came to us: manual label transcription wasn't keeping up. Consumer-protection law requires the online shop to match the physical packaging (country of origin, allergens, nutrition). Trend products need to be listed fast. And supplier data is fragmented — there's no universal API for "what is printed on this package". Photos and videos are.
Two tools, shipped by a team of two in six months:
The hard part wasn't the model. The stakeholders own the problem; I own the solution. The real work is the translation layer between them — explaining error modes, accuracy trade-offs, and why "96% with a good review UI" beats "100% someday". Good engineering here means knowing which decisions are mine to make and which ones aren't.
Architecture details, evaluation methodology, and lessons from production are best discussed in conversation.
Reach out on LinkedIn
Research
The research thread is the same one I use in production: make LLM behaviour legible, testable, and grounded in something more robust than intuition. That mindset shaped how I learned to build measurable systems.
A context-aware LLM pipeline based on Lisa Feldman Barrett's theory of constructed emotion.
Full operationalization via the "context sphere", a user-specific construct for nuanced emotion analysis.
Contact
Best reached via LinkedIn. Happy to discuss applied AI, agentic engineering, evaluation, or research collaborations.