About the job

We are looking for a talented Data Engineer with a focus on scaling efficient distributed workloads. You will work alongside a growing multidisciplinary team of talented research scientists and machine learning engineers to improve and scale the efficiency of our computing capacity. Stability AI operates a very large HPC cluster for training foundational AI models across several modalities. In this role, you will contribute to groundbreaking projects that redefine visual storytelling through advanced generative modeling techniques, and you will optimize and manage large-scale distributed workloads to drive project efficiency and success.


  • Clean, normalize, and preprocess data in a scalable, parallelizable way to prepare it for ingestion into our machine learning model training pipelines while ensuring data quality
  • Build and maintain highly scalable distributed workloads
  • Build data pipelines to ingest and process data (e.g. images and text) for feeding into ML models
  • Manage AWS resources
  • Keep up to date with papers and methods on improving data quality and curating data for image, video, and LLM training


  • Proven background in large-scale distributed workloads
  • Experience with large-scale data loading for machine learning training runs
  • Experience with cloud storage and file systems. AWS (S3) is strongly preferred, but we are open to other cloud platforms
  • Experience with Python and PyTorch, deep learning, and computer vision
  • Experience with multiprocessing and multithreading in Python workloads
  • Experience with parallel dataframe manipulation using PySpark or Ray
  • Proficiency in HPC cluster management tools and technologies
  • Excellent communication skills to collaborate effectively with users, solve issues, and provide guidance
  • Attention to detail and the ability to document processes and solutions effectively
  • Nice to have: Experience with the data loading stack (WebDataset, TorchData, fsspec, AIStore)

Equal Employment Opportunity:

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.
