Python Developers - US

Remote, USA Full-time Posted 2025-11-24
Work Location: Remote, within the US
Engagement Model: Freelancer/Independent Contractor
Start Date: ASAP

DataForce by TransPerfect is looking for skilled Python Developers to architect, build, and own the data pipelines that power large language model (LLM) development. Your primary mission will be to build scalable, automated systems that transform massive raw datasets into clean, model-ready formats. While your focus will be on data engineering, your expertise will also be valuable in collaborating on model training runs and experiments.

You are a strong fit for this role if you are a Python expert who thrives on solving large-scale data challenges and enjoys working at the intersection of data engineering and machine learning.

Role Responsibilities
• Design, develop, and own robust, scalable, and automated ETL/ELT pipelines in Python to ingest and process terabyte-scale text datasets.
• Implement rigorous data cleaning, deduplication, filtering, and normalization strategies, and define and enforce data quality standards to ensure high integrity for model training.
• Efficiently structure and format diverse datasets (e.g., JSON, Parquet) for consumption by LLM training frameworks.
• Work closely with AI researchers and ML engineers to understand data requirements, define metrics, and support the model training lifecycle.
• Continuously optimize data processing workflows for performance, cost efficiency, and reliability.
• Occasionally assist with launching, monitoring, and debugging data-related issues during model training runs.

Role Requirements
• 5–10 years of professional experience in Python development, data engineering, data processing, or backend software engineering.
• Expert-level proficiency in Python and its data ecosystem (e.g., Pandas, NumPy, Dask, Polars).
• Proven experience building and maintaining large-scale data pipelines.
• Deep understanding of data structures, data modeling, and software engineering best practices (Git, CI/CD, testing).
• Experience handling and parsing diverse data formats (JSON, CSV, XML, Parquet) at scale.
• Excellent problem-solving skills and meticulous attention to detail.
• Strong communication and collaboration skills, with experience working in a team environment.

Preferred Role Requirements
• Hands-on experience with the data preprocessing pipeline for an LLM (e.g., LLaMA, BERT, GPT-family).
• Experience with big data frameworks like Apache Spark or Ray.
• Experience with Hugging Face libraries (Transformers, Datasets, Tokenizers).
• Familiarity with ML frameworks like PyTorch or TensorFlow.
• Proficiency with cloud platforms (AWS, GCP, Azure) and their data/storage services.

DataForce by TransPerfect is part of the TransPerfect family of companies, the world’s largest provider of language and technology solutions for global business, with offices in more than 100 cities worldwide. We offer high-quality data for Human-Machine Interaction to some of the most prestigious technology companies in the world. Our department focuses on gathering, enriching and processing data for Machine Learning in different AI domains. To learn more about DataForce please visit us at https://www.transperfect.com/dataforce.

TransPerfect provides equal employment opportunity to all individuals regardless of their race, color, creed, religion, gender, age, sexual orientation, national origin, disability, veteran status, or any other characteristic protected by state, federal, or local law.

For more information on the TransPerfect Family of Companies, please visit our website at www.transperfect.com.
