Job Summary
A company is looking for a Senior Site Reliability Engineer - Midnight.
Key Responsibilities
- Design, build, and maintain scalable systems on AWS, optimizing Kubernetes clusters and automating deployments
- Implement monitoring solutions and lead incident response efforts, collaborating with development teams to define SLOs/SLIs
- Evaluate and adopt new technologies while documenting processes and best practices for continuous improvement
Required Qualifications
- 7+ years of experience in SRE, DevOps, or a related role
- Strong programming proficiency in Python, Golang, or JavaScript; Rust experience is advantageous
- Demonstrated experience with AWS, modern cloud architectures, and Kubernetes/EKS
- Proficiency with Helm, Terraform, and CI/CD tools such as GitHub Actions and ArgoCD
- Experience with monitoring tools like Prometheus and familiarity with the LGTM stack (Loki, Grafana, Tempo, Mimir)