Big Data Engineer || Remote || W2 & C2C
Job Description:
We are seeking an experienced Big Data Engineer (EL3 Level) with strong expertise in Apache Spark and Scala to design, develop, and optimize large-scale data processing solutions in the Healthcare domain. The ideal candidate will work on building scalable data pipelines, integrating diverse healthcare datasets (claims, EMR/EHR, provider, member data), and enabling analytics and reporting solutions while ensuring HIPAA compliance and data security.
This is a fully remote opportunity supporting enterprise healthcare data platforms.
Key Responsibilities
Big Data Engineering
• Design and develop distributed data processing pipelines using Apache Spark with Scala.
• Build batch and real-time data pipelines using Spark Core, Spark SQL, and Spark Streaming.
• Optimize Spark jobs for performance tuning (partitioning, caching, broadcast joins, memory management).
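To illustrate the tuning points above, here is a minimal, hedged Spark/Scala sketch showing key-based repartitioning, a broadcast join, and selective caching. The dataset paths, table names (claims, providers), and the join key provider_id are hypothetical placeholders, not part of any actual platform referenced in this posting.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

object ClaimsEnrichmentJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("claims-enrichment")
      .getOrCreate()

    // Hypothetical inputs: a large claims fact table and a small provider dimension.
    val claims    = spark.read.parquet("s3://datalake/claims/")
    val providers = spark.read.parquet("s3://datalake/providers/")

    // Repartition the large side on the join key to reduce shuffle skew,
    // and broadcast the small dimension to avoid a shuffle join entirely.
    val enriched = claims
      .repartition(200, claims("provider_id"))
      .join(broadcast(providers), Seq("provider_id"))

    // Cache only because the enriched result is reused by downstream actions.
    enriched.cache()

    enriched.write.mode("overwrite").parquet("s3://datalake/claims_enriched/")
    spark.stop()
  }
}
```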
Healthcare Data Integration
• Process and transform healthcare datasets, including:
  • Claims data (837/835)
  • EHR/EMR data
  • Member and provider data
  • HL7/FHIR formats
• Ensure data quality, validation, and compliance with healthcare regulations (HIPAA).
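As a hedged illustration of the data-quality step above, the Scala sketch below splits a claims extract into valid and invalid rows. The rules and column names (claim_id, member_id, service_date, billed_amount) are hypothetical examples of 837-style checks, not a prescribed standard.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, to_date}

object ClaimsValidation {
  // Hypothetical quality rules: required identifiers present,
  // the service date parses, and the billed amount is non-negative.
  def splitValidInvalid(claims: DataFrame): (DataFrame, DataFrame) = {
    val withDate = claims.withColumn(
      "service_date", to_date(col("service_date"), "yyyyMMdd"))
    val isValid =
      col("claim_id").isNotNull &&
      col("member_id").isNotNull &&
      col("service_date").isNotNull &&
      col("billed_amount") >= 0
    (withDate.filter(isValid), withDate.filter(!isValid))
  }
}
```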
Cloud & Data Platform
• Work on cloud-based big data platforms (AWS/Azure/Google Cloud Platform).
• Develop data pipelines using:
  • Data lakes (S3/ADLS)
  • Hive/Delta Lake/Iceberg
  • Kafka for streaming (see the sketch after this list)
• Implement CI/CD for data pipelines.
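A minimal sketch of the streaming path referenced above, assuming Spark Structured Streaming with the Kafka source and the Delta Lake connector on the classpath. The broker address, topic name (member-eligibility), and S3 paths are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

object EligibilityStreamIngest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("eligibility-stream-ingest")
      .getOrCreate()

    // Hypothetical topic and paths; assumes spark-sql-kafka and Delta Lake are available.
    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "member-eligibility")
      .load()

    // Decode the Kafka key/value into typed columns for downstream parsing.
    val events = raw.selectExpr(
      "CAST(key AS STRING) AS member_id",
      "CAST(value AS STRING) AS payload",
      "timestamp")

    events.writeStream
      .format("delta")
      .option("checkpointLocation", "s3://datalake/_checkpoints/member-eligibility/")
      .start("s3://datalake/bronze/member_eligibility/")
      .awaitTermination()
  }
}
```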
Data Modeling & Warehousing
• Design scalable data models (star/snowflake schema).
• Implement ETL/ELT frameworks (see the sketch after this list).
• Support analytics and reporting teams with optimized datasets.
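A small ELT sketch for the modeling work above: staged claims are resolved against star-schema dimensions and loaded into a fact table. The warehouse.dim_member, warehouse.dim_provider, and warehouse.fact_claims tables and their surrogate keys are hypothetical and assume a configured metastore.

```scala
import org.apache.spark.sql.SparkSession

object ClaimsFactElt {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("claims-fact-elt").getOrCreate()

    // Hypothetical staged extract registered as a temporary view.
    spark.read.parquet("s3://datalake/staging/claims/")
      .createOrReplaceTempView("stg_claims")

    // ELT step: resolve natural keys to surrogate keys and load the fact table.
    spark.sql(
      """
        |INSERT OVERWRITE TABLE warehouse.fact_claims
        |SELECT m.member_sk,
        |       p.provider_sk,
        |       c.service_date,
        |       c.billed_amount,
        |       c.paid_amount
        |FROM stg_claims c
        |JOIN warehouse.dim_member   m ON c.member_id   = m.member_id
        |JOIN warehouse.dim_provider p ON c.provider_id = p.provider_id
        |""".stripMargin)

    spark.stop()
  }
}
```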
Governance & Security
• Implement data masking, encryption, and PHI protection strategies (illustrated in the sketch below).
• Collaborate with compliance teams to ensure adherence to regulatory standards.
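As an illustration of the PHI protection work above, here is a hedged Scala sketch that salts and hashes a direct identifier and redacts or drops other PHI columns. The column names (member_id, member_name, ssn) and the specific rules are hypothetical; actual masking policies would be defined with the compliance team.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, concat, lit, sha2}

object PhiMasking {
  // Hypothetical PHI handling: hash the direct identifier with a salt so joins
  // still work on the hashed key, redact the name, and drop raw identifiers.
  def maskPhi(df: DataFrame, salt: String): DataFrame =
    df.withColumn("member_id_hash",
        sha2(concat(col("member_id").cast("string"), lit(salt)), 256))
      .withColumn("member_name", lit("***REDACTED***"))
      .drop("member_id", "ssn")
}
```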
Required Qualifications
• 12+ years of IT experience in Big Data Engineering.
• Strong hands-on experience with Apache Spark and Scala.
• Experience with distributed data processing and cluster tuning.
• Strong SQL knowledge and data modeling experience.
• Experience working with healthcare datasets.
• Experience with the Hadoop ecosystem (Hive, HDFS), Kafka, and Airflow.
• Cloud experience (AWS/Azure/Google Cloud Platform).
Preferred Qualifications
• Experience with Databricks and Delta Lake.
• Experience implementing Data Lakehouse architecture.
• Knowledge of FHIR/HL7 standards.
• Experience in real-time healthcare analytics.
• Certification in AWS/Azure Data Engineering.
Apply to this job