Platform Engineer

7/29/2025

Remote

Job Summary

A company is looking for a Platform Engineer - AI/ML Infrastructure.

Key Responsibilities
  • Architect and maintain core computing platforms using Kubernetes on AWS and on-premises
  • Develop and manage infrastructure using Infrastructure-as-Code (IaC) principles with Terraform
  • Design, build, and optimize AI/ML job scheduling and orchestration systems integrating Slurm with Kubernetes clusters
Required Qualifications
  • 5+ years of experience in Platform Engineering, DevOps, or Site Reliability Engineering (SRE)
  • Hands-on experience building and managing production infrastructure with Terraform
  • Expert-level knowledge of Kubernetes architecture and operations in large-scale environments
  • Experience with high-performance computing (HPC) job schedulers, specifically Slurm
  • Experience managing bare metal infrastructure, including server provisioning and lifecycle management
