Job Summary
A company is looking for a Senior Software Engineer, Compute Infrastructure for Robotics Research.
Key Responsibilities
- Develop mechanisms to launch and manage large compute jobs for multi-modal foundation models in robotics
- Optimize GPU and cluster utilization for efficient model training and evaluation
- Develop observability tools and collaborate with researchers to integrate compute technologies into training pipelines
Required Qualifications
- Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent experience
- 12+ years of industry experience in large-scale MLOps and AI infrastructure
- Experience with ML frameworks like PyTorch, JAX, or TensorFlow
- Deep understanding of Kubernetes and experience with Ray
- Strong programming skills in Python and C++ for system development