Job Summary
A company is looking for a Member of Technical Staff, Integration/RL Team (Research Engineer).
Key Responsibilities
- Design and write high-performance, scalable software for training models
- Develop new tools to support and accelerate research and LLM training
- Coordinate with engineering and scientific teams to create an integrated post-training ecosystem
Required Qualifications
- Extremely strong software engineering skills
- Proficiency in Python and related ML frameworks such as JAX, PyTorch, and/or XLA/MLIR
- Experience using and debugging large-scale distributed training strategies
- Bonus: Experience with distributed training infrastructures (Kubernetes) and associated frameworks (Ray)
- Bonus: Hands-on experience with the post-training phase of model training