Staff Software Engineer, AI Foundation Model
Software Engineering, Data Science
Boston, MA, USA
About you:
You are a senior individual contributor with deep expertise in building the systems that make large-scale machine learning possible in production. You have selected foundation models suited to a target application and adapted and tuned them to its needs, designed data pipelines that handle real-world messiness, built training infrastructure that scales reliably, and created evaluation frameworks that give engineers genuine confidence in their development work. You care about the craft of software engineering as much as you care about AI models, because you know that great AI systems are only as good as the infrastructure underneath them. You enjoy operating with significant autonomy, influencing technical decisions across multiple teams, and mentoring engineers who are earlier in their careers. Most importantly, you want to work on a problem that genuinely matters: putting safe autonomous aircraft into the skies.
Responsibilities:
Set the technical strategy for foundation model selection, and refactor, design, build, and maintain Merlin's core AI models and its training and inference infrastructure, including distributed training pipelines, experiment tracking, and model registry systems.
Define and drive standards for model evaluation, benchmarking, and regression testing to ensure AI systems meet safety and performance thresholds before deployment.
Architect and own the foundational data pipeline that ingests, processes, labels, and versions flight data for use across all autonomy and AI teams.
Identify and resolve systemic technical bottlenecks that slow down AI development velocity across Merlin's engineering organization.
Collaborate with AI infra, Simulation, and Flight Software teams to define interfaces and shared abstractions that make the broader stack more coherent and maintainable.
Mentor, develop, and technically guide junior and mid-level engineers on the AI Foundation team, conducting design reviews and raising the overall quality of the team's output.
Lead technical scoping and estimation for large AI foundation model projects, breaking ambiguous requirements into actionable engineering plans.
Evaluate and adopt relevant open-source tooling, frameworks, and research, contributing back where appropriate.
Document architecture decisions, system designs, and operational runbooks to a standard that supports safety review and long-term maintainability.
Contribute to hiring by conducting technical interviews and helping define the engineering bar for the AI Foundation team.
Qualifications:
8+ years of software engineering experience, with a substantial portion focused on state-of-the-art and next-generation AI model development, ML infrastructure, MLOps, data engineering, or AI platform development.
Experience developing an AI-first autonomy software stack that includes perception, behavior planning, prediction, and actuation (based on Transformer architectures).
Demonstrated ability to design and deliver large-scale, production-grade systems independently — from initial architecture through deployment and operation.
Deep proficiency in PyTorch, TensorFlow, and at least one systems language such as C++, C, Rust, or Go.
Hands-on experience with distributed training frameworks (Python extensions, JAX, or equivalent) and the infrastructure required to run them reliably at scale.
Experience deploying trained models to edge environments built on high-performance compute systems-on-chip (SoCs).
Strong background in data pipeline design, including streaming and batch processing, data versioning, and handling high-volume, heterogeneous sensor data.
Experience building model evaluation and validation frameworks on top of MLOps tooling (such as Weights & Biases or MLflow) that go beyond accuracy metrics to assess real-world reliability and safety-relevant behavior.
Proven track record of influencing technical direction across teams and driving alignment on shared infrastructure and standards.
Excellent written communication skills, with the ability to produce clear design documents and architecture proposals that hold up to rigorous review.
Comfort working in a fast-moving startup where the scope of problems and priorities evolve as the product matures.
Nice to Have:
Experience in aerospace/automotive autonomy, robotics, or other safety-critical real-time domains where software quality and reliability standards are exceptionally high.
Development and deployment of AI models on heterogeneous high-performance compute platforms such as NVIDIA Thor and Orin.
Background in simulation infrastructure, synthetic data generation, or domain randomization for training perception and control models using end-to-end ecosystems such as NVIDIA Cosmos.
Experience with onboard inference optimization — model quantization, TensorRT, hardware-aware compilation, or deployment on embedded accelerators.
Contributions to widely used open-source ML or data engineering projects.
Familiarity with flight data formats, avionics data buses (ARINC 429, MIL-STD-1553), or sensor modalities common in aviation (camera, LiDAR, radar, IMU).
Prior staff or principal engineer experience at a high-growth startup or research-driven