PyData Berlin 2025 Notes
I just came back from the PyData conference in Berlin and, apart from meeting a lot of great people, I took away a few tidbits:
-
Docker’s cache mounts solve the problem of a single changed pip dependency invalidating the whole install layer, which forces every package to be downloaded again on the next build. So instead of doing:
RUN pip install -r requirements.txt
You can do:
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt
Or, equivalently, for uv:
RUN --mount=type=cache,target=/root/.cache/uv \
    uv sync
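This keeps the pip (or uv) download cache around between builds, so when a dependency changes, only the new or updated packages have to be fetched. Cache mounts are a BuildKit feature, so the build has to run with BuildKit enabled.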
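To show where the cache mount sits in a complete build, here is a minimal sketch of a Dockerfile. The base image, the /app working directory, and the main.py entry point are assumptions made for illustration, not something from the talk. The target path assumes pip runs as root with its default cache location.

```dockerfile
# syntax=docker/dockerfile:1
# Assumed base image; any Python image where pip runs as root should behave similarly.
FROM python:3.12-slim

# Hypothetical working directory for the application.
WORKDIR /app

# Copy only the dependency manifest first, so that source-code changes
# do not invalidate the install layer below.
COPY requirements.txt .

# The cache mount persists pip's cache directory across builds.
# When requirements.txt changes, this layer reruns, but pip can reuse
# packages already in the cache instead of downloading them again.
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

# Copy the rest of the project (hypothetical layout with a main.py entry point).
COPY . .
CMD ["python", "main.py"]
```

The contents of the cache mount are not part of the final image, so this should not make the image larger. The cache lives on the build host, so a fresh CI runner without a persisted build cache will still start cold.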
-
You can, through the magic of WebAssembly, run DuckDB in the browser: https://shell.duckdb.org
-
For documentation, people (e.g. GitHub, Cloudflare, basically everyone) are more or less following the Diátaxis framework. I have a feeling of déjà vu, as if I have seen this somewhere before under a different name. The idea is to split documentation into four categories: tutorials, how-to guides, explanations, and references. Tutorials and how-to guides are practical (learning-oriented and task-oriented, respectively), while explanations and references are theoretical (understanding-oriented and information-oriented, respectively), the kind of material you consult on a need-to-know basis.
-
How did I miss WebLLM before?
-
For PDF/HTML parsing and text extraction, docling is the new hotness and already very promising.