Job Summary
A company is looking for a Site Reliability Engineer to manage and improve their platform.
Key Responsibilities
- Maintain the global Tyk Cloud platform and define its SLAs, SLIs, and SLOs
- Identify reliability issues and collaborate with the team to resolve them
- Automate common tasks and document operational knowledge
Required Qualifications
- Experience with production-scale Kubernetes clusters
- Proficiency in designing and operating infrastructure on AWS and other providers
- Experience operating MongoDB and Redis clusters
- Strong background in administering Linux servers
- Experience with monitoring tools such as Prometheus and Grafana