Job Summary
A company is looking for a Principal Engineer, Inference Service.
Key Responsibilities
- Design and implement an inference platform for serving large language models optimized for various GPU platforms
- Develop and manage AI and cloud engineering projects through the entire product development lifecycle
- Optimize runtime and infrastructure layers of the inference stack for enhanced model performance
Required Qualifications
- 10+ years of software engineering experience, including 2+ years in AI/ML technologies related to LLM hosting and inference
- Deep expertise in cloud computing platforms and modern AI/ML technologies
- Experience with modern LLMs, particularly in hosting, serving, and optimizing them
- Proficiency in programming languages such as Python and Go
- Experience with infrastructure as code (IaC) tools like Terraform or Ansible