Remote Jobs

Site Reliability Engineer

7/11/2025

No location specified

Job Summary

A company is looking for a Site Reliability Engineer.

Key Responsibilities

Deploy clusters of 1,000+ GPUs and modify tools for customer solutions
Validate and optimize compute, storage, and networking infrastructure
Debug production issues and build internal tooling to enhance deployment efficiency

Required Qualifications

2+ years of experience in SRE, DevOps, Sysadmin, or HPC engineering
Experience deploying and operating Kubernetes and/or SLURM clusters
Proficiency in Go, Python, and Bash programming languages
Familiarity with automation tools like Ansible and Terraform
Strong engineering background in Computer Science, Software Engineering, Math, or related fields
