Senior Site Reliability Engineer

7/8/2025

No location specified

Job Summary

A company is looking for a Senior Site Reliability Engineer, AI Infrastructure.

Key Responsibilities
  • Develop and maintain large-scale systems for AI Infrastructure, ensuring reliability and scalability
  • Implement SRE fundamentals, including incident management and automation tools to enhance operational efficiency
  • Establish frameworks for operational maturity and lead incident response protocols to improve system resilience

Required Qualifications
  • Degree in Computer Science or related field, or equivalent experience with 12+ years in Software Development, SRE, or Production Engineering
  • Proficiency in Python and at least one additional programming language (C/C++, Go, Perl, Ruby)
  • Expertise in systems engineering within Linux or Windows environments and cloud platforms (AWS, OCI, Azure, GCP)
  • Strong understanding of SRE principles, including error budgets and Infrastructure as Code tools
  • Hands-on experience with observability platforms and CI/CD systems
