Lessons from managing a SLURM cluster and containerized ML workloads for a research group.
What a small lab actually needs
Fair-share scheduling, reasonable defaults, and a container story that doesn't require sudo.
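To make the fair-share idea concrete, here is a minimal sketch of the classic fair-share factor, F = 2^(-(usage/shares)/damping), as described in SLURM's multifactor priority documentation. It is an illustration of the intuition only: recent SLURM releases default to the Fair Tree algorithm, which ranks accounts differently, and the user names and numbers below are hypothetical.

```python
def fair_share_factor(normalized_usage: float,
                      normalized_shares: float,
                      damping: float = 1.0) -> float:
    """Classic fair-share factor in [0, 1].

    normalized_usage  : fraction of recent cluster usage consumed by the user
    normalized_shares : fraction of the cluster the user is entitled to
    damping           : values above 1 soften the penalty for over-use
    """
    if normalized_shares <= 0:
        # No allocation at all: lowest possible factor.
        return 0.0
    return 2.0 ** (-(normalized_usage / normalized_shares) / damping)


def rank_by_fair_share(users: dict) -> list:
    """Return user names ordered from highest to lowest fair-share factor.

    `users` maps a name to a (normalized_usage, normalized_shares) tuple.
    """
    return sorted(users, key=lambda name: fair_share_factor(*users[name]), reverse=True)


if __name__ == "__main__":
    # Hypothetical lab: equal shares, very unequal recent usage.
    lab = {
        "alice": (0.60, 0.25),  # heavy recent user
        "bob": (0.25, 0.25),    # used exactly their share
        "carol": (0.00, 0.25),  # idle recently
    }
    for name in rank_by_fair_share(lab):
        print(name, round(fair_share_factor(*lab[name]), 3))
```

A user who has consumed exactly their share gets a factor of 0.5, an idle user gets 1.0, and heavy users decay toward 0, which is what lets a student who has been idle jump ahead of someone who has been saturating the GPUs all week.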
Tools
SLURM for scheduling, Podman for rootless containers, Ansible to keep config drift in check.
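As a rough illustration of the no-sudo container story, the sketch below builds a rootless `podman run` command line that a job script could execute. The helper name, image name, and paths are hypothetical; `--rm`, `--userns=keep-id`, and `--volume` are standard Podman options, but whether `keep-id` suits your storage setup is an assumption to verify locally.

```python
import shlex


def rootless_run_argv(image, command, mounts=None):
    """Build an argv list for a rootless `podman run` invocation.

    image   : container image reference
    command : sequence of strings to run inside the container
    mounts  : optional dict mapping host paths to container paths
    """
    # --userns=keep-id maps the calling user's UID/GID into the container,
    # so files written to bind mounts stay owned by that user on the host.
    argv = ["podman", "run", "--rm", "--userns=keep-id"]
    for host_path, container_path in (mounts or {}).items():
        argv += ["--volume", f"{host_path}:{container_path}"]
    argv.append(image)
    argv += list(command)
    return argv


if __name__ == "__main__":
    # Hypothetical image and paths, for illustration only.
    argv = rootless_run_argv(
        "localhost/train:latest",
        ["python", "train.py"],
        mounts={"/scratch/data": "/data"},
    )
    print(shlex.join(argv))
```

Building the command as a list rather than a string keeps quoting out of the picture if it is later passed to `subprocess.run`.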
Data science work is part craft and part discipline: the best models are simple, well-validated, and easy to explain. In this post we walk through the intuition, just enough math to be useful, and a clean implementation you can drop into your own pipeline.