William Kau.
AI & Software Engineer.

M2 engineering student at ESILV, building AI solutions around LLMs and synthetic data. Currently a Software Engineer intern at Aubay Solutec, with prior experience at BPCE SI and Manaos.

Based in: Paris, FR
Studying: ESILV - M2
Now: Aubay Solutec
Focus: LLMs · Data
Python
SQL
C#
Java
FastAPI
Angular
React
Next.js
PyTorch
TensorFlow
Hugging Face
LangChain
SDV
SynthCity
PostgreSQL
Oracle SQL
MinIO
GCP
Docker
Git
Power BI
Streamlit
About

Engineer driven by data and curiosity.


I'm William, a final-year Data Science & AI engineering student at ESILV Paris La Défense, graduating in June 2026. I'm currently a Software Engineer intern at Aubay Solutec in Lyon, where I train generative models (CTGAN, TVAE, ARGN, DDPM) to produce synthetic data faithful to real-world distributions, and ship them inside a FastAPI / Angular micro-services app.

My work lives at the intersection of machine learning, LLM systems and production engineering: generative models, RAG pipelines with traceable citations, and agentic loops, deployed with FastAPI, Docker and GCP. I care about AI systems people can actually trust, inspect, and run.

Away from the screen, I'm a setter on a competitive volleyball team, I run trails, climb, and shoot photographs. More often than I expected, these interests seep back into the way I think about code.

5+ years coding
3 internships
1st prize, ESILV PI2
TOEIC: 975 / 990
Experience

Where I've been.

Three internships at the intersection of data engineering, AI and product, each one a step deeper into building things that ship.

  1. Feb 2026 - Present

    Software Engineer - Internship @ Aubay Solutec

    Lyon, France

    Selecting and tuning four generative algorithms (CTGAN, TVAE, ARGN, DDPM) to produce synthetic data faithful to real-world distributions. Designed a persistence pipeline for trained models and generated datasets (MinIO + PostgreSQL), and built the full-stack micro-services app (FastAPI / Angular) with JWT authentication (OAuth2, bcrypt password hashing).

    FastAPI · Angular · PyTorch · CTGAN · TVAE · DDPM · MinIO · PostgreSQL · JWT
  2. Apr 2025 - Aug 2025

    Data Analyst - Internship @ BPCE SI

    Paris 13e, France

    Optimised SQL queries on Oracle databases, cutting execution time of recurring processes by up to 60%. Industrialised BI reports and contributed to the migration of the company's data assets to Google Cloud Platform.

    Oracle SQL · Power BI · GCP
  3. Sept 2024 - Apr 2025

    Fullstack Developer - School Project @ MANAOS - BNP Paribas

    Paris 8e, France
    1st prize - ESILV PI2 2024-2025

    Built an ESG data management application in Python / Streamlit (team of 6) with an integrated open-source LLM (Hugging Face) to query the data in natural language.

    Python · Streamlit · Hugging Face · LLM · ESG
Selected work

Projects I'm proud of.

A mix of school, internship and personal work, usually somewhere between AI research and shipping software.

Photo composition analyzer interface: rule of thirds, leading lines and depth of field scores with a Grad-CAM heatmap overlay
2025

Photographic Composition Analysis

Computer vision for visual aesthetics

A vision model that scores photographs on composition rules (rule of thirds, leading lines, depth of field), with explainability via Grad-CAM. Trained on a personal dataset I annotated from my own photographs.

Python · PyTorch · FastAPI · React · Grad-CAM
ESG_RAG interface: question-answer panel in French with retrieval settings (reranking, hallucination detection, self-consistency) and document filters
2025

Financial & ESG RAG

Retrieval-augmented Q&A with source traceability

A retrieval-augmented system for querying financial and ESG reports in natural language. 100% local stack with semantic retrieval, source citations and a hallucination-detection layer (NLI + grounding + self-consistency).

Python · Ollama · Chroma · FastAPI · Streamlit
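
To give a feel for the self-consistency part of that layer, here is a minimal sketch, not the project's actual code. It assumes the model can be sampled several times for the same question and treats agreement between samples as a confidence signal. The names `self_consistency`, `normalise` and `fake_sampler`, the 0.6 threshold and the canned answers are all hypothetical, and exact string matching stands in for the semantic comparison a real system would need.

```python
from collections import Counter
from typing import Callable, Dict, Union


def normalise(answer: str) -> str:
    """Lowercase, collapse whitespace and drop a trailing full stop."""
    return " ".join(answer.lower().split()).rstrip(".")


def self_consistency(
    question: str,
    sample_answer: Callable[[str], str],
    n_samples: int = 5,
    threshold: float = 0.6,  # hypothetical cut-off, to be tuned per use case
) -> Dict[str, Union[str, float, bool]]:
    """Sample the model n_samples times and measure how often it agrees with itself."""
    samples = [normalise(sample_answer(question)) for _ in range(n_samples)]
    answer, votes = Counter(samples).most_common(1)[0]
    agreement = votes / n_samples
    return {
        "answer": answer,
        "agreement": agreement,
        "consistent": agreement >= threshold,
    }


# Stand-in for a sampled LLM call: replays made-up answers for illustration.
_canned = iter([
    "Net zero by 2050.",
    "net zero by 2050",
    "Net zero by  2050",
    "Net zero by 2040",
    "net zero by 2050.",
])


def fake_sampler(question: str) -> str:
    return next(_canned)


check = self_consistency("What is the stated target?", fake_sampler)
print(check)
```

A low agreement score does not prove a hallucination; it only flags an answer worth checking against the retrieved sources.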
NLP → SQL Agent interface: natural-language question, generated SQL with validation status, and result table
2025

NLP → SQL Agent

Natural language to validated SQL, end-to-end

A full-stack app where users ask questions in natural language and an Anthropic-powered agent generates, validates and executes SQL against a database. The UI surfaces the final SQL, validation status, the agent's retry attempts and the result table.

Python · Anthropic · SQL · Agents
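
The generate, validate, execute and retry loop described above can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the app's implementation: the database is SQLite, the LLM call is replaced by a hypothetical `generate_sql` callable (here a stub named `fake_generate`), and `validate_sql` and `run_agent` are names invented for the example.

```python
import sqlite3
from typing import Callable, Optional


def validate_sql(conn: sqlite3.Connection, sql: str) -> Optional[str]:
    """Return None if the query looks safe and compiles, else an error message."""
    stripped = sql.strip().rstrip(";")
    if not stripped.lower().startswith("select"):
        return "only SELECT statements are allowed"
    if ";" in stripped:
        return "multiple statements are not allowed"
    try:
        # Compiles the query without running it; unknown tables/columns fail here.
        conn.execute(f"EXPLAIN QUERY PLAN {stripped}")
    except sqlite3.Error as exc:
        return str(exc)
    return None


def run_agent(
    question: str,
    conn: sqlite3.Connection,
    generate_sql: Callable[[str, Optional[str]], str],
    max_attempts: int = 3,
) -> dict:
    """Ask for SQL, validate it, and feed any error back until it passes."""
    attempts = []
    feedback = None
    for _ in range(max_attempts):
        sql = generate_sql(question, feedback).strip().rstrip(";")
        error = validate_sql(conn, sql)
        attempts.append({"sql": sql, "error": error})
        if error is None:
            rows = conn.execute(sql).fetchall()
            return {"sql": sql, "valid": True, "attempts": attempts, "rows": rows}
        feedback = error  # the next generation sees what went wrong
    return {"sql": None, "valid": False, "attempts": attempts, "rows": []}


# Demo with an in-memory database and a stub in place of the LLM.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("Ada",), ("Linus",)])


def fake_generate(question: str, feedback: Optional[str]) -> str:
    # First attempt has a typo; the "retry" fixes it once feedback arrives.
    if feedback is None:
        return "SELECT nme FROM users"
    return "SELECT name FROM users ORDER BY id;"


result = run_agent("List all user names", conn, fake_generate)
print(result["sql"], result["valid"], len(result["attempts"]), result["rows"])
```

The returned dictionary carries the same four pieces the UI surfaces: the final SQL, the validation status, the retry attempts and the result rows.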
2026

Synthetic Data Generation Platform

Generative models, productionised

Ongoing at Aubay Solutec: a full-stack micro-services platform that picks and tunes generative algorithms (GAN, VAE, ARGN, DDPM) based on the input dataset. FastAPI + Angular, MinIO and PostgreSQL on Nexus.

FastAPI · Angular · GAN · VAE · DDPM · PostgreSQL
Off-screen

What keeps me sharp.

The hours I spend away from a keyboard, and why they end up shaping my engineering work more than I expected.

Volleyball

Setter for a competitive amateur team in Lognes. The position taught me a lot about anticipating, reading patterns, and making quick calls under pressure.

Running & Trail

Long runs and trail outings: the rhythm I rely on to think through hard problems, away from any screen.

Climbing

On the wall I get to optimise something different: balance, route reading, body tension. Debugging a route is a lot like debugging code.

Photography

I shoot mostly in available light. Photography is what got me into computer vision, and the source of the dataset for my composition-analysis project.
