Job Summary
A company is looking for a Site Reliability Engineer.
Key Responsibilities
- Write, configure, and deploy code to improve service reliability and set standards for code quality
- Implement and manage monitoring solutions to ensure visibility and proactive issue detection
- Collaborate with development teams to enhance system reliability and performance, and participate in on-call rotations
Required Qualifications
- Experience with large-scale, distributed systems
- Proficiency in cloud infrastructure, particularly GCP
- Knowledge of monitoring and observability tools such as Dynatrace, Splunk, and OpenTelemetry
- Experience in debugging, troubleshooting, and system architecture analysis
- Familiarity with automated solutions for operational tasks