Senior Data Engineer
Job details
£576 per day (Umbrella)
London – Hybrid (2/3 days per week)
6-Month Contract

Our client is currently searching for a Senior Data Engineer to support their team in London.

Responsibilities
• Execute migration of raw and derived datasets between on-prem and cloud data locations (e.g. GCP, Azure, AWS). Dataset sizes vary from small scale (GB) up to large scale (TB).
• Ensure consistency between the data ingested and the data manifests.
• Organise raw and derived data into appropriate hierarchies.
• Collaborate with AI/ML engineers and product managers to:
  o Develop data pipelines for incoming batch data and update existing pipelines where necessary.
  o Design and implement well-decoupled, modularised, reusable and scalable scripts and code for the retrieval and pre-processing of large-scale histopathology images (i.e. each on the order of gigabytes) into the AI/ML pipeline.
• Document data flows and ingestion pipelines, and data use and re-use.
• Implement data flows to connect operational systems, data for analytics, and business intelligence (BI) systems (e.g. Power BI).
• Ensure completion of requisite documentation, i.e. the ingestion form and any related IHD documentation.
• Track and report completion of data migration to AIML & Onyx stakeholders and raise blockers preventing migration.

[Non-comp path requirements]
• Migrate ML pipelines from on-prem HPC solutions to the cloud.
• Migrate ML pipelines between cloud environments and across cloud computing providers.
• Optimise and parallelise these ML pipelines for scalability, speed and cost efficiency.

Experience
• 5 years of work experience as a professional data/software engineer.
• Machine learning experience or background.
• CI/CD experience.
• Expert-level, industry experience in the design, development and deployment of data engineering pipelines.
• Advanced programming expertise in Python and in developing and delivering robust software solutions.
• Advanced programming expertise in SQL and/or similar database languages.
• Experience with cloud platforms such as Google Cloud Platform, Azure and AWS (GCP preferred).
• Experience in handling big data at scale.
• Experience with large images and data formats for computational pathology (e.g. .svs, .tiff, .h5) would be a plus.
• Experience with business intelligence platforms, e.g. Power BI.