Job Summary
A company is looking for a Site Reliability Engineering Manager to lead a 24x7 site reliability team.
Key Responsibilities:
- Lead and mentor a site reliability team in monitoring, incident response, and resolution of critical systems
- Oversee incident management processes to ensure timely detection and resolution in accordance with SLA targets
- Design and conduct disaster recovery drills to validate system resilience and compliance with security standards
Required Qualifications:
- 8-15 years of relevant experience managing site reliability teams in high-availability environments
- Bachelor of Science degree or equivalent work experience preferred
- Expertise in incident management for mission-critical applications
- Experience with SOC and ISO compliance controls
- Strong understanding of NIST frameworks and cloud platforms such as Microsoft Azure or AWS