We're looking for an Infrastructure Engineer to lead the scaling of our machine learning inference system. You'll be responsible for architecting and maintaining infrastructure that serves 150+ biological ML models, scaling our platform several orders of magnitude to meet rapidly growing demand. You'll work closely with the founders to design around the constraints of customer needs, unpredictable workloads, and unique Bio-ML models. You'll work with Kubernetes and other tools to orchestrate containerized workloads, optimize resource allocation, and ensure high availability across our model serving infrastructure.

Most importantly, you should thrive in a fast-paced startup environment where you'll wear multiple hats, learn new technologies quickly, and help solve novel technical challenges. We value engineering judgment, problem-solving ability, and the capacity to build systems that can evolve with our growing needs.

Requirements
- Solid programming and automation skills
- Experience with containerization and orchestration concepts
- Cloud platform knowledge (AWS/GCP/Azure)
- Located in the SF Bay Area or able to relocate to the Bay Area
- Onsite expectation: team is currently onsite in SF ~5 days/week

Preferred
- Experience scaling production systems
- Kubernetes experience
- Infrastructure-as-code tools (Terraform, Pulumi)
- Monitoring and observability tools
- Experience with GPU workloads
First seen: 2026-01-08 17:48
Last seen: 2026-01-08 20:49