Job Summary
A company is looking for a Site Reliability Engineer to ensure the reliability and performance of its infrastructure.
Key Responsibilities
- Ensure the reliability, uptime, and performance of production systems across multiple cloud environments
- Lead incident response and continuous improvement initiatives
- Design and maintain monitoring and alerting systems using Datadog
Required Qualifications
- At least 6 years of relevant experience as a Site Reliability Engineer or Full Stack Engineer
- Extensive hands-on experience with full-stack development using React, Node.js, and TypeScript
- Strong proficiency in GraphQL and experience working in multi-cloud environments (AWS, GCP)
- Proven experience with incident response and root cause analysis
- Proficiency in Infrastructure as Code (IaC) using Terraform