Urgent Role: Databricks Engineer [Hybrid] - [Education Background]
Position: Databricks Engineer
Location: Adelphi, Maryland
Visa: Any visa except CPT, TN, and GC (passport number is mandatory to fetch the I-94)
Interview: Webcam
Responsibilities:
1. Data & AI Platform Engineering (Databricks-Centric):
• Design, implement, and optimize end-to-end data pipelines on Databricks, following the Medallion Architecture principles.
• Build robust and scalable ETL/ELT pipelines using Apache Spark and Delta Lake to transform raw (bronze) data into trusted, curated (silver) and analytics-ready (gold) data layers (see the sketch after this list).
• Operationalize Databricks Workflows for orchestration, dependency management, and pipeline automation.
• Apply schema evolution and data versioning to support agile data development.
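For illustration, a minimal PySpark sketch of such a bronze-to-silver-to-gold flow; the paths, columns, and aggregation are hypothetical placeholders, not a prescribed implementation:

from pyspark.sql import SparkSession, functions as F

# On Databricks a `spark` session is provided; the builder is shown for self-containment.
spark = SparkSession.builder.getOrCreate()

# Bronze: land raw source data as-is in Delta.
raw = spark.read.json("/mnt/raw/enrollments/")
raw.write.format("delta").mode("append").save("/mnt/bronze/enrollments")

# Silver: cleanse and deduplicate into a trusted layer.
silver = (
    spark.read.format("delta").load("/mnt/bronze/enrollments")
    .dropDuplicates(["enrollment_id"])
    .filter(F.col("student_id").isNotNull())
)
silver.write.format("delta").mode("overwrite").save("/mnt/silver/enrollments")

# Gold: analytics-ready aggregate for reporting.
gold = silver.groupBy("term").agg(F.countDistinct("student_id").alias("students"))
gold.write.format("delta").mode("overwrite").save("/mnt/gold/enrollment_by_term")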
2. Platform Integration & Data Ingestion:
• Connect and ingest data from enterprise systems such as PeopleSoft, D2L, and Salesforce using APIs, JDBC, or other integration frameworks.
• Implement connectors and ingestion frameworks that accommodate structured, semi-structured, and unstructured data.
• Design standardized data ingestion processes with automated error handling, retries, and alerting, as sketched below.
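As a sketch of this ingestion pattern, a hypothetical JDBC pull with simple retry logic; the connection string, source table, and target path are placeholders:

import time
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

def ingest_jdbc(url, table, target, attempts=3, delay_s=30):
    """Pull one source table over JDBC into the bronze layer, retrying transient failures."""
    for attempt in range(1, attempts + 1):
        try:
            df = (spark.read.format("jdbc")
                  .option("url", url)
                  .option("dbtable", table)
                  .load())
            df.write.format("delta").mode("append").save(target)
            return
        except Exception:
            if attempt == attempts:
                raise  # let the workflow's failure alerting fire
            time.sleep(delay_s)

# Hypothetical PeopleSoft table landing in bronze.
ingest_jdbc("jdbc:oracle:thin:@ps-host:1521/PSPROD",
            "PS_STDNT_ENRL", "/mnt/bronze/ps_stdnt_enrl")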
3. Data Quality, Monitoring, and Governance:
• Develop data quality checks, validation rules, and anomaly detection mechanisms to ensure data integrity across all layers (see the sketch after this list).
• Integrate monitoring and observability tools (e.g., Databricks metrics, Grafana) to track ETL performance, latency, and failures.
• Implement Unity Catalog or equivalent tools for centralized metadata management, data lineage, and governance policy enforcement.
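A minimal sketch of rule-based validation on a silver table; the thresholds, path, and column names are assumptions:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
silver = spark.read.format("delta").load("/mnt/silver/enrollments")  # hypothetical path

total = silver.count()
null_keys = silver.filter(F.col("student_id").isNull()).count()
dupes = total - silver.dropDuplicates(["enrollment_id"]).count()

# Fail fast so the workflow's alerting surfaces the breach.
if null_keys > 0:
    raise ValueError(f"{null_keys} rows missing student_id")
if total and dupes / total >= 0.01:
    raise ValueError(f"duplicate rate too high: {dupes}/{total}")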
4. Security, Privacy, and Compliance:
• Enforce data security best practices including row-level security, encryption at rest/in transit, and fine-grained access control via Unity Catalog.
• Design and implement data masking, tokenization, and anonymization for compliance with privacy regulations (e.g., GDPR, FERPA); a masking sketch follows this list.
• Work with security teams to audit and certify compliance controls.
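For example, Unity Catalog supports SQL-defined column masks; a sketch issued from Python, where the function, group, and table names are hypothetical:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Mask SSNs for everyone outside a privileged group (Unity Catalog column mask).
spark.sql("""
    CREATE OR REPLACE FUNCTION main.gov.ssn_mask(ssn STRING)
    RETURN CASE WHEN is_account_group_member('hr_admins') THEN ssn
                ELSE 'XXX-XX-XXXX' END
""")
spark.sql("ALTER TABLE main.gov.students ALTER COLUMN ssn SET MASK main.gov.ssn_mask")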
5. AI/ML-Ready Data Foundation:
• Enable data scientists by delivering high-quality, feature-rich data sets for model training and inference.
• Support AIOps/MLOps lifecycle workflows using MLflow for experiment tracking, model registry, and deployment within Databricks (see the sketch after this list).
• Collaborate with AI/ML teams to create reusable feature stores and training pipelines.
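As a sketch of MLflow experiment tracking within Databricks; the experiment path, parameter, and metric are placeholders:

import mlflow

mlflow.set_experiment("/Shared/enrollment-forecast")  # hypothetical experiment path

with mlflow.start_run(run_name="baseline"):
    mlflow.log_param("lookback_terms", 8)
    mlflow.log_metric("rmse", 12.4)
    # A trained model would be logged (and optionally registered) here, e.g.:
    # mlflow.sklearn.log_model(model, "model", registered_model_name="enrollment_forecast")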
6. Cloud Data Architecture and Storage:
• Architect and manage data lakes on Azure Data Lake Storage (ADLS) or Amazon S3, and design ingestion pipelines to feed the bronze layer.
• Build data marts and warehousing solutions using platforms like Databricks.
• Optimize data storage and access patterns for performance and cost-efficiency, as sketched below.
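For instance, Delta tables can be compacted and clustered on frequently filtered columns; the path and column below are hypothetical:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Compact small files and co-locate rows by a common filter column.
spark.sql("OPTIMIZE delta.`/mnt/gold/enrollment_by_term` ZORDER BY (term)")

# Remove files no longer referenced by the table (default retention window applies).
spark.sql("VACUUM delta.`/mnt/gold/enrollment_by_term`")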
7. Documentation & Enablement:
• Maintain technical documentation, architecture diagrams, data dictionaries, and runbooks for all pipelines and components.
• Provide training and enablement sessions to internal stakeholders on the Databricks platform, Medallion Architecture, and data governance practices.
• Conduct code reviews and promote reusable patterns and frameworks across teams.
8. Reporting and Accountability:
• Submit a weekly schedule of hours worked and progress reports outlining completed tasks, upcoming plans, and blockers.
• Track deliverables against roadmap milestones and communicate risks or dependencies.
Required Qualifications:
• Hands-on experience with Databricks, Delta Lake, and Apache Spark for large-scale data engineering.
• Deep understanding of ELT pipeline development, orchestration, and monitoring in cloud-native environments.
• Experience implementing Medallion Architecture (Bronze/Silver/Gold) and working with data versioning and schema enforcement in enterprise-grade environments.
• Strong proficiency in SQL, Python, or Scala for data transformations and workflow logic.
• Proven experience integrating enterprise platforms (e.g., PeopleSoft, Salesforce, D2L) into centralized data platforms.
• Familiarity with data governance, lineage tracking, and metadata management tools.
Apply to this job