Data Engineering Architect / Remote (Vancouver, WA), 3+ Months Contract
Job Description
In this data engineering role, you will help enhance and maintain the Instant Ink Business Intelligence system. You will drive your work to completion with hands-on development responsibilities and partner with Data Engineering leaders to implement data engineering pipelines that deliver trusted, reliable data to customers.
Responsibilities
• Design and implement distributed data processing pipelines using Spark, Python, SQL and other tools and languages prevalent in the Big Data/Lakehouse ecosystem.
• Analyzes designs and determines coding, programming, and integration activities required based on general objectives.
• Reviews and evaluates designs and project activities for compliance with architecture, security and quality guidelines and standards.
• Writes and executes complete testing plans, protocols, and documentation for assigned portion of data system or component; identifies defects and creates solutions for issues with code and integration into data system architecture.
• Collaborates and communicates with project team regarding project progress and issue resolution.
• Works with the data engineering team for all phases of larger and more-complex development projects and engages with external users on business and technical requirements.
• Collaborates with peers, engineers, data scientists and project team.
• Typically interacts with high-level Individual Contributors, Managers and Program Teams on a daily/weekly basis.
What you bring:
• Bachelor's or Master's degree in Computer Science, Information Systems, Engineering or equivalent.
• 6+ years of relevant experience with detailed knowledge of data warehouse technical architectures, infrastructure components, ETL/ELT and reporting/analytic tools.
• 3+ years of experience with cloud-based data warehouses such as Redshift, Snowflake, etc.
• 3+ years' experience in Big Data distributed ecosystems (Hadoop, Spark, Hive & Delta Lake)
• 3+ years' experience with workflow orchestration tools such as Airflow, etc.
• 3+ years' experience in Big Data Distributed systems such as Databricks, AWS EMR, AWS Glue etc.
• Experience leveraging monitoring tools/frameworks such as Splunk, Grafana, CloudWatch, etc.
• Experience with container management frameworks such as Docker, Kubernetes, ECR, etc.
• 3+ years' experience working with multiple Big Data file formats (Parquet, Avro, Delta Lake)
• Experience with CI/CD tools such as Jenkins, Codeway, etc., and source control tools such as GitHub
• Strong experience in coding languages like Python, Scala & Java
Knowledge and Skills
• Fluent in relational based systems and writing complex SQL.
• Fluent in complex, distributed and massively parallel systems.
• Strong analytical and problem-solving skills with ability to represent complex algorithms in software.
• Strong understanding of database technologies and management systems.
• Strong understanding of data structures and algorithms
• Database architecture testing methodology, including execution of test plans, debugging, and testing scripts and tools.
Nice to Have
• Experience with transformation tools such as dbt.
• Experience building real-time streaming data pipelines.
• Experience with pub/sub streaming technologies such as Kafka, Kinesis, Spark Streaming, etc.
Apply to this job