Job Summary
A company is looking for a Senior Site Reliability Engineer.
Key Responsibilities:
- Monitor and maintain mission-critical production services to ensure maximum uptime
- Design and implement scalable distributed systems for self-driving vehicles
- Develop an incident management framework and promote a culture of continuous learning
Required Qualifications:
- Expertise in at least one scripting language (e.g. Bash, Python)
- Fundamental understanding of Linux operating system internals, TCP/IP networking, and storage subsystems
- Experience scaling and securing services in cloud environments (AWS, GCP)
- Experience with infrastructure-as-code principles and tools (e.g. Terraform, CloudFormation)
- Strong experience with cloud-native and open-source tools such as Kubernetes and Prometheus