Job Summary
A company is looking for a Site Reliability Engineer.
Key Responsibilities
- Build and run large-scale, distributed, fault-tolerant systems while solving operational problems with a software engineering mindset
- Define, measure, and monitor Service Level Objectives (SLOs) and Service Level Indicators (SLIs) in collaboration with product and engineering teams
- Scale systems sustainably through automation, and practice sustainable incident response and blameless postmortems
Required Qualifications
- 3-5 years of experience as a Site Reliability Engineer or Software Developer
- Advanced experience with programming/scripting languages such as JavaScript/Node.js, Go, or Python
- Knowledge of Linux monitoring, troubleshooting, and administration
- Experience with container orchestration platforms such as Kubernetes or Nomad
- Experience working with at least one major cloud provider (AWS/Azure/GCP)