Big Data Engineer || Remote || W2 & C2C

Remote, USA Full-time Posted 2025-11-24
Job Description: We are seeking an experienced Big Data Engineer (EL3 level) with strong expertise in Apache Spark and Scala to design, develop, and optimize large-scale data processing solutions in the Healthcare domain. The ideal candidate will build scalable data pipelines, integrate diverse healthcare datasets (claims, EMR/EHR, provider, and member data), and enable analytics and reporting solutions while ensuring HIPAA compliance and data security. This is a fully remote opportunity supporting enterprise healthcare data platforms.

Key Responsibilities

Big Data Engineering
• Design and develop distributed data processing pipelines using Apache Spark with Scala.
• Build batch and real-time data pipelines using Spark Core, Spark SQL, and Spark Streaming.
• Optimize Spark jobs through performance tuning (partitioning, caching, broadcast joins, memory management).

Healthcare Data Integration
• Process and transform healthcare datasets, including:
  • Claims data (837/835)
  • EHR/EMR data
  • Member and provider data
  • HL7/FHIR formats
• Ensure data quality, validation, and compliance with healthcare regulations (HIPAA).

Cloud & Data Platform
• Work on cloud-based big data platforms (AWS, Azure, or Google Cloud Platform).
• Develop data pipelines using:
  • Data lakes (S3/ADLS)
  • Hive, Delta Lake, or Iceberg
  • Kafka for streaming
• Implement CI/CD for data pipelines.

Data Modeling & Warehousing
• Design scalable data models (star/snowflake schema).
• Implement ETL/ELT frameworks.
• Support analytics and reporting teams with optimized datasets.

Governance & Security
• Implement data masking, encryption, and PHI protection strategies.
• Collaborate with compliance teams to ensure regulatory standards are met.

Required Qualifications
• 12+ years of IT experience in Big Data Engineering.
• Strong hands-on experience with Apache Spark and Scala.
• Experience with distributed data processing and cluster tuning.
• Strong SQL knowledge and data modeling experience.
• Experience working with healthcare datasets.
• Experience with the Hadoop ecosystem (Hive, HDFS), Kafka, and Airflow.
• Cloud experience (AWS, Azure, or Google Cloud Platform).

Preferred Qualifications
• Experience with Databricks and Delta Lake.
• Experience implementing Data Lakehouse architecture.
• Knowledge of FHIR/HL7 standards.
• Experience in real-time healthcare analytics.
• Certification in AWS or Azure data engineering.

Apply to this Job
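For candidates gauging fit, the broadcast-join tuning technique named above can be illustrated in plain Scala. This is a minimal sketch of the map-side join idea only (in Spark the small table would be wrapped with `broadcast(...)` so each executor joins locally without a shuffle); the dataset names and fields are illustrative, not taken from this posting.

```scala
object BroadcastJoinSketch {
  // Small "provider" dimension table; in Spark this is the side you broadcast.
  val providers: Map[String, String] = Map(
    "P1" -> "Mercy Hospital",
    "P2" -> "Lakeside Clinic"
  )

  // Large "claims" fact records: (claimId, providerId, amount).
  val claims: Seq[(String, String, Double)] = Seq(
    ("C100", "P1", 250.0),
    ("C101", "P2", 75.5),
    ("C102", "P1", 120.0)
  )

  // Map-side join: each claim row looks up its provider in the local map,
  // dropping claims whose provider is unknown (an inner-join semantics).
  def joined: Seq[(String, String, Double)] =
    claims.flatMap { case (claimId, providerId, amount) =>
      providers.get(providerId).map(name => (claimId, name, amount))
    }

  def main(args: Array[String]): Unit =
    joined.foreach(println)
}
```

The payoff in Spark is the same as in this sketch: because the small side is available on every worker, the large side never has to be repartitioned by the join key.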
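The data-masking responsibility under Governance & Security can likewise be sketched: one common tactic is deterministic salted hashing of member identifiers, so records remain joinable across datasets without exposing the raw ID. This is a hedged illustration of that general pattern, not this employer's actual masking scheme; the salt and field names are hypothetical.

```scala
import java.security.MessageDigest

object PhiMaskingSketch {
  // Deterministically mask an identifier with SHA-256 plus a salt.
  // Same input always yields the same token, so joins still work;
  // the raw ID is never stored downstream.
  def maskId(rawId: String, salt: String = "demo-salt"): String = {
    val digest = MessageDigest.getInstance("SHA-256")
    digest
      .digest((salt + rawId).getBytes("UTF-8"))
      .map(b => f"${b & 0xff}%02x") // hex-encode each byte
      .mkString
  }

  def main(args: Array[String]): Unit =
    println(maskId("MEMBER-12345"))
}
```

In production the salt would come from a secrets manager, and reversible encryption (rather than hashing) would be used where re-identification by authorized systems is required.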
