Job Summary:
A company is looking for a Site Reliability Engineer to ensure the availability, reliability, and performance of customer-facing software applications.
Key Responsibilities:
- Monitor system health and performance to ensure high availability and reliability of the production environment
- Facilitate incident resolution and provide primary operational support for large-scale distributed software applications
- Develop and maintain documentation on best practices, troubleshooting, and training materials
Required Qualifications:
- Bachelor's Degree in an engineering-related discipline; Master's Degree preferred
- Minimum 5 years of experience supporting network and AV operations
- Experience with cloud-based infrastructure, databases, and applications
- Proficiency with enterprise system monitoring software
- Knowledge of disaster recovery planning and execution