**The candidate will need to come onsite on the first day to collect equipment.** **All candidates must be local to the Triangle region of North Carolina, and the position may require 1-2 days per month in a Triangle area office for meetings.**
Description:
The Transportation Web Systems Team seeks an Azure Databricks Engineer who will work with existing staff to plan and design ETL pipelines and product solutions using Azure Databricks.
The person filling this role will create resilient processes to ingest data from a variety of on-prem and cloud transactional databases and APIs. Responsibilities will also include developing business requirements, facilitating change management documentation, and actively collaborating with stakeholders.
This individual will work closely with a development technical lead and discuss all aspects of the design and planning with the development team.
Roles and Responsibilities:
Research and engineer repeatable and resilient ETL workflows using Databricks notebooks and Delta Live Tables for both batch and stream processing
Collaborate with business users to develop data products that align with business domain expectations
Work with DBAs to ingest data from cloud and on-prem transactional databases
Contribute to the development of the Data Architecture for NC DIT - Transportation:
By following practices for keeping sensitive data secure
By streamlining the development of data products for use by data analysts and data scientists
By developing and maintaining documentation for data engineering processes
By ensuring data quality through testing and validation
By sharing insights and experiences with stakeholders and engineers throughout DIT - Transportation
Skills:
Excellent interpersonal skills as well as written and verbal communication skills
Able to write clean, easy-to-follow Databricks notebook code
Deep knowledge of data engineering best practices, data warehouses, data lakes, and the Delta Lake architecture
Good knowledge of Spark and Databricks SQL/PySpark
Technical experience with Azure Databricks and cloud providers like AWS, Google Cloud, or Azure
In-depth knowledge of OLTP and OLAP systems, Apache Spark, and streaming products like Azure Service Bus
Good practical experience with Databricks Delta Live Tables
Knowledge of object-oriented languages like C#, Java, or Python