Job Summary
A company is looking for a Senior Applied AI Software Engineer, Distributed Inference Systems.
Key Responsibilities
- Build and optimize the Kubernetes deployment and workload management stack for scalable inference
- Develop production-grade inference workload management systems that scale from a few GPUs to thousands
- Enhance intelligent routing and manage distributed KV caches for efficient data movement
Required Qualifications
- BS/MS or higher in computer engineering, computer science, or related field (or equivalent experience)
- 5+ years of proven experience in a related field
- Strong proficiency in systems programming (Rust and/or C++) and experience with Python
- Deep understanding of distributed systems, parallel computing, and GPU architectures
- Experience with cloud-native deployment and container orchestration (Kubernetes, Docker)