Lead AI Dataloop and Release Engineer

Merlin Labs

Software Engineering, Data Science

Boston, MA, USA

Posted on Jun 26, 2026

About Merlin:

Merlin is a venture-backed aerospace startup building a non-human pilot to enable both reduced-crew and uncrewed flight. Backed by some of the world’s leading investors, Merlin is scaling alongside our customers to begin leveraging autonomy today to solve some of aviation’s biggest challenges.

About you:

You are a software leader who thrives on enabling the deployment of next-gen AI models through a comprehensive data strategy for AI model training, simulation, and deployment. You have a thorough, technically grounded appreciation that, in autonomous systems, the quality of AI model training, simulation, and release infrastructure is inseparable from the performance and safety of what you ship. You've built data flywheels and worked closely with teams provisioning training clusters, as well as with data-driven, physics-based, and high-fidelity simulators. You know what it takes to engineer the data pipeline that makes simulated environments realistic enough to deploy AI models confidently in safety-critical domains such as aviation and automotive. You are organized, methodical, and skilled at building systems that other engineers rely on every day.

Responsibilities:

Define and execute a comprehensive data strategy that spans AI model training, simulation, and production deployment across safety-critical autonomous systems.

Own the end-to-end data pipeline — from raw collection and labeling through curation, versioning, and delivery — ensuring the reliability and scale that training and simulation workflows demand.

Build and maintain data flywheels that continuously improve model performance by closing the loop between deployed system behavior and future training iterations.

Collaborate closely with teams provisioning and operating large-scale GPU/TPU training clusters to align data delivery with compute capacity and training schedules.

Drive the design and integration of data pipelines that feed data-driven, physics-based, and high-fidelity simulators, ensuring simulated environments are realistic enough to support confident AI model validation.

Partner with safety, validation, and certification teams to establish data quality standards and traceability practices that satisfy regulatory requirements in aviation and/or automotive domains.

Lead, mentor, and grow a team of data and infrastructure engineers, setting technical direction and fostering a culture of rigor, ownership, and continuous improvement.

Define and track KPIs for data pipeline health, simulation fidelity, and model readiness, using these metrics to prioritize investments and communicate progress to senior leadership.

Qualifications:

10+ years of engineering experience, with at least 4 years in a technical leadership role owning data infrastructure, MLOps, or AI platform engineering at scale.

Demonstrated experience building and operating data pipelines for AI/ML model training, including dataset management, labeling workflows, and data versioning at production scale.

Hands-on experience integrating data systems with large-scale distributed training infrastructure (e.g., GPU/TPU clusters, job orchestration, experiment tracking).

Deep understanding of simulation pipelines — including data-driven, physics-based, or sensor-realistic simulators — and how data quality directly impacts simulator fidelity and model transferability.

Experience working in or alongside safety-critical domains (autonomous vehicles, aviation, robotics, or similar) with an understanding of what rigor, traceability, and validation mean in that context.

Strong systems-thinking mindset: you reason about data quality, pipeline reliability, and infrastructure design as interconnected constraints, not isolated problems.

Track record of building platforms and tooling that other engineering teams depend on day-to-day, with a high bar for reliability, documentation, and developer experience.

Excellent cross-functional communication skills — able to translate technical data strategy into clear priorities for product, safety, and executive stakeholders.

Nice to Have:

Experience with domain randomization, synthetic data generation, or sensor simulation techniques used to bridge the sim-to-real gap in autonomous systems.

Familiarity with aviation-specific standards (e.g., DO-178C, DO-254) or automotive safety frameworks (e.g., ISO 26262, SOTIF) as they relate to data and software validation.

Prior experience building or scaling data flywheel systems — closed-loop pipelines that feed real-world deployment signals back into training and labeling workflows.

Hands-on background with perception, planning, or control model development in autonomous vehicles or UAV/UAS systems.

Experience with formal data governance, lineage tracking, or provenance tooling in regulated environments.

Contributions to open-source tooling in the MLOps, data engineering, or simulation space.

This position is based on-site at Merlin HQ in Boston, MA.

Once you’re here, you’ll enjoy a variety of on-site perks designed to make your workday enjoyable and convenient. These include catered lunches featuring a rotating menu of delicious options, an assortment of snacks to keep you fueled throughout the day, and a selection of beverages, including coffee, tea, and other drinks, to keep you refreshed.

Our goal is to create an environment where you can thrive both professionally and personally.

Merlin Labs offers an innovative, entrepreneurial, and team-focused startup environment. We also offer a top-notch benefits package (health, dental, life, unlimited vacation, and 401(k) with match) and work/life integration. Joining Merlin means becoming part of a small team that supports professional development while working together to achieve our mission.

Merlin Labs is an equal opportunity employer and values diversity. We do not discriminate on the basis of race, religion, color, national origin, genetic information, sex (including pregnancy), gender, gender identity and expression, sexual orientation, age, marital status, military service or obligation or disability status, or any other characteristic protected by law. All job offers are contingent upon the candidate passing background and reference checks.

At this time, we are unable to provide visa sponsorship or consider candidates who require visa transfers. Applicants must be authorized to work in the United States without the need for visa sponsorship now or in the future.

In compliance with federal law, all persons hired will be required to verify identity and eligibility to work in the United States and to complete the required employment eligibility verification form upon hire.

If you require reasonable accommodation in completing an application, interviewing, completing any pre-employment testing, or otherwise participating in the employee selection process, please direct your inquiries to: people@merlinlabs.com

Merlin Labs does not accept unsolicited resumes from any source other than directly from candidates.