Data Engineer - Healthcare

Remote, USA Full-time Posted 2025-11-24
Who we are

Percepta’s mission is to transform critical institutions with applied AI. We care that industries that power the world (e.g. healthcare, manufacturing, energy) benefit from frontier technology. To make that happen, we embed with industry-leading customers to drive AI transformation.

We bring together:
• Forward-deployed expertise in engineering, product, and research
• Mosaic, our in-house toolkit for rapidly deploying agentic workflows
• Strategic partnerships with Anthropic, McKinsey, AWS, companies within the General Catalyst portfolio, and more

Our team is a quickly growing group of Applied AI Engineers, Embedded Product Managers, and Researchers motivated by diffusing the promise of AI into improvements we can feel in our day-to-day lives. Percepta is a direct partnership with General Catalyst, a global transformation and investment company.

About the role

We’re hiring a Data Engineer to partner closely with our product and AI engineering teams to lead AI transformations of large health systems and healthcare providers (e.g., Summa Health). You will help design, build, and operationalize the data systems that power AI applications, while informing decisions on data models, infrastructure, and pipeline architecture. You’ll work hands-on in a complex provider data environment (Epic, claims, scheduling, imaging metadata, SDOH, payer data), helping bring order to messy systems and enabling AI engineers to build high-impact AI workflows. If you enjoy building in ambiguity, forming technical opinions, and shipping value quickly inside complex environments, this role is for you.

In this role, you will:
• Shape how data powers AI applications across large health systems by designing the pipelines and models that drive real clinical and operational improvements.
• Influence technical strategy for how Percepta deploys AI across dozens of complex health system environments.
• Build foundational data systems that become reusable patterns for multiple health-system transformations.
• Work directly with clinicians and operational leaders to turn high-value use cases into production-ready data workflows.

What you’ll do

Build and operate production-grade pipelines
• Develop end-to-end pipelines across Epic, claims, financial, scheduling, imaging metadata, and other clinical datasets
• Ingest from FHIR APIs, HL7 feeds, SFTP drops, flat files, and streaming sources
• Build on Databricks or similar platforms for ingestion, transformation, and feature creation
• Work across a mix of Azure and AWS systems, keeping pipelines running smoothly through migration periods

Enable forward-deployed AI builds
• Structure, normalize, and model noisy datasets to support rapid ML/AI development
• Navigate fragmented hospital schemas (Epic Clarity/Caboodle, claims tables, ADT feeds, scheduling data) to identify correct sources and relationships
• Shape how data is delivered to models, including features, retrieval schemas, context construction, and embeddings
• Build pipelines that support both batch and streaming or near-real-time workflows

Be a technical partner in architecture and design
• Inform decisions on data models, storage, orchestration, and infra tradeoffs
• Diagnose data quality issues, missingness, and schema inconsistencies; propose fixes or alternative approaches
• Balance architectural thinking with forward-deployed delivery, moving quickly while making decisions that scale

Collaborate with cross-functional stakeholders
• Work directly with clinicians, operations leaders, IT, and product teams
• Translate business and clinical needs into technical data solutions
• Contribute to a repeatable data playbook for Percepta’s AI deployments across health systems

What we’re looking for

Strong technical foundations
• Hands-on experience building pipelines on Databricks or similar cloud data platforms
• SQL and Python proficiency
• Experience with streaming tools (Kafka or comparable)
• Experience with both relational databases (Postgres, MySQL) and NoSQL/columnar stores (MongoDB, Dynamo, etc.)
• Solid understanding of ETL and ELT patterns, orchestration, CI/CD for data, and schema design

Healthcare data experience
• Familiarity with FHIR, HL7, Epic data structures, and payer and claims datasets
• Comfort working with DICOM metadata, scheduling/ADT feeds, SDOH sources, and operational hospital datasets

AI and ML intuition
• Understanding of what ML systems need: features, embeddings, context windows, and retrieval patterns
• Experience structuring data for RAG, retrieval workflows, or agent-style systems
• Experience partnering with ML engineers or supporting ML-adjacent pipelines

Thrives in ambiguity
• Comfortable working inside hybrid cloud environments or messy enterprise systems
• High ownership and ability to operate without perfect requirements
• Strong communication skills with comfort being embedded with customer teams

Bonus if you have
• Experience working on AWS
• Prior startup or forward-deployed data engineering experience
• Data engineering inside a hospital or payer
