Job Summary
A company is looking for a Site Reliability Engineer (SRE).
Key Responsibilities
- Champion reliability by collaborating with architects and engineers to build resilient systems
- Lead the observability overhaul using modern APM tooling and actionable alerts
- Develop and implement auto-scaling strategies and deployment automation
Required Qualifications
- 5+ years in SRE, DevOps, or Production Engineering roles, preferably in a SaaS or cloud-native environment
- Deep experience with cloud platforms (preferably Azure or AWS) and Infrastructure-as-Code tools
- Hands-on experience with Azure DevOps is strongly preferred
- Proficiency with observability tools such as New Relic, Datadog, or Prometheus
- Ability to code in at least one modern scripting or systems language (e.g., Python, PowerShell, Go, Bash)