Job Summary
A company is looking for a Senior Site Reliability Engineer who will enhance the stability and reliability of their global SaaS platform.
Key Responsibilities
- Collaborate with development and engineering teams to integrate SRE practices into the Continuous Delivery model for SaaS applications
- Design and implement tools and automation for managing cloud infrastructure securely and reliably
- Establish and maintain a 24x7 incident response process to meet SLA requirements for SaaS products
Required Qualifications
- Minimum of 5 years of experience in a global multi-tenant production environment
- Hands-on experience with Kubernetes, AWS/GCP/Azure, and Terraform/CloudFormation/Ansible
- Strong knowledge of Linux fundamentals and experience troubleshooting production issues
- Experience in a 24x7 production environment with a solid understanding of SRE principles
- Exposure to programming languages such as Go, Python, C, or C++ is a plus