Job Summary
A company is looking for a Senior Research Engineer - Multimodal & Video Foundation Model.
Key Responsibilities
- Pioneer multimodal and video-centric research, contributing to usable prototypes and scalable systems
- Design and implement novel AI architectures for multimodal language models
- Engineer scalable training and inference pipelines optimized for large-scale multimodal datasets
Required Qualifications
- Bachelor's degree in Computer Science, Computer Engineering, or a related technical field, or equivalent practical experience
- Expertise in Python and PyTorch, with experience across the full development pipeline
- Experience working with large-scale text data or interleaved data spanning audio, video, image, and/or text
- Direct hands-on experience developing or benchmarking LLMs, vision-language models, or generative video models
- PhD in a relevant field is a plus