Job Summary
A company is looking for a Lead Data Engineer.
Key Responsibilities
- Design and build large-scale data pipelines using Python, PySpark, and Scala
- Optimize Spark jobs for performance in GCP environments
- Develop and maintain production-ready workflows using orchestration tools like KFP or Airflow
Required Qualifications
- Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent experience
- 5-20+ years of experience in data engineering with a focus on Python, PySpark, and/or Scala
- Expertise in Spark performance tuning and optimization in cloud-based environments
- Hands-on experience with workflow orchestration tools like KFP or Airflow
- Proficiency with Docker and container-based deployment strategies