Job Summary
A company is looking for a Site Reliability Engineer - Observability.
Key Responsibilities
- Define and maintain an observability framework across products, ensuring coverage for various services and establishing SLIs/SLOs
- Build alerting rules and integrate observability into incident management workflows while maintaining reliability metrics
- Deliver reliability analysis, translating data into executive-level insights and trends
Required Qualifications
- 5-8 years of experience in SRE, Observability, or Reliability roles, ideally in fintech, SaaS, or data platforms
- Strong knowledge of observability tools such as Grafana, Prometheus, and OpenTelemetry
- Experience with distributed systems, APIs, and data pipelines, along with strong automation skills (e.g., Kubernetes)
- Proficiency in at least one programming language, preferably C# or Go
- Systems thinker with a proactive mindset and a collaborative approach across teams