Job Summary
A company is looking for a Senior AI Research Engineer, Model Inference (100% Remote).
Key Responsibilities
- Implement and optimize custom inference and fine-tuning kernels for language models across multiple hardware backends
- Design and customize Vulkan compute shaders for quantized operators and fine-tuning workflows
- Collaborate with research and engineering teams to prototype, benchmark, and scale new model optimization methods
Required Qualifications
- Proficiency in C++ and GPU kernel programming
- Expertise in GPU acceleration with the Vulkan framework
- Strong background in quantization and mixed-precision model optimization
- Experience with mobile GPU acceleration and model inference
- Familiarity with large language model architectures