Job Summary
A company is looking for a Senior AI Infrastructure Engineer - DGX Cloud.
Key Responsibilities
- Design, build, deploy, and run internal tooling for large-scale AI training and inference platforms
- Conduct performance characterization and analysis on large multi-GPU and multi-node clusters
- Maintain and monitor live services, ensuring system health and reliability
Required Qualifications
- BS degree in Computer Science or a related technical field, or equivalent experience
- 6+ years of relevant experience
- Experience with infrastructure automation and distributed systems design
- Proficiency in one or more programming languages such as Python, Go, C/C++, or Java
- In-depth knowledge of Linux, networking, storage, and container technologies