Data Engineer

6/26/2025

Remote

Job Summary

A company is looking for a Member of Technical Staff, Pre-Training Data Engineer.

Key Responsibilities
  • Design and build scalable data pipelines for diverse datasets, ensuring effective ingestion, cleaning, filtering, and optimization
  • Conduct data ablations to assess quality and experiment with data mixtures to enhance model performance
  • Develop robust data modeling techniques to structure datasets for optimal training efficiency

Required Qualifications
  • Strong software engineering skills with proficiency in Python
  • Familiarity with data processing frameworks such as Apache Spark, Apache Beam, or Pandas
  • Experience working with large-scale datasets, including web and multilingual data
  • Knowledge of data quality assessment techniques
  • A passion for bridging research and engineering in AI model training
