Job Summary
A company is looking for a Site Reliability Engineer.
Key Responsibilities
- Design and evolve systems, tooling, and processes to enhance reliability and performance
- Collaborate with product and infrastructure teams to ensure services are production-ready and resilient
- Define and measure SLIs/SLOs to guide reliability improvements
Qualifications
- Experience operating and scaling production systems in cloud environments (AWS preferred)
- Familiarity with service reliability concepts such as monitoring and incident response
- Comfort working across infrastructure layers including compute and networking
- Strong debugging and systems-thinking skills
- Ability to work independently and drive projects from discovery to resolution