Job Summary
A company is looking for a Senior DGX Cloud Software Engineer specializing in Infrastructure Automation and Distributed Systems.
Key Responsibilities
- Design, build, and run cloud infrastructure services to meet business goals
- Define internal service level objectives and error budgets as part of the observability strategy
- Automate processes to eliminate toil and participate in incident prevention and response
Required Qualifications
- Proficiency in Python or Go
- BS degree in Computer Science or a related technical field, or equivalent experience
- 5+ years of experience in infrastructure and fleet management engineering
- Experience with infrastructure automation and distributed systems design for large-scale cloud systems
- In-depth knowledge of Linux, Slurm, Kubernetes, local and distributed storage, and systems networking