Job Summary
A company is looking for a Machine Learning Engineer, Ads Training Platform.
Key Responsibilities
- Design, build, and maintain large-scale distributed training infrastructure for Ads ML models
- Develop tools and frameworks on top of the Ray platform
- Collaborate with ML engineers to reduce model training time and GPU training costs and to improve training efficiency
Required Qualifications
- 3+ years of experience in infrastructure/platform engineering or large-scale distributed systems
- 2+ years of hands-on experience with the Ray platform
- Strong understanding of distributed computing principles
- Experience with distributed storage systems and large-scale data processing
- Experience with deep learning frameworks (PyTorch, TensorFlow) is a plus