What MLOps Actually Is (Not Just DevOps with ML Bolted On)
The term MLOps emerged from the collision of machine learning research and software engineering operations. It describes the discipline of making ML systems reliable, reproducible, and maintainable in production. That sounds like DevOps. It is not.
DevOps engineers solve deterministic problems: code is either deployed or it is not; a service is either healthy or it is not. MLOps engineers solve probabilistic problems: a model might be deployed but silently producing worse predictions because the data distribution has shifted since training. The service is healthy. The model is not.
This distinction matters enormously in hiring. A DevOps engineer who has not worked with ML systems will not understand why model monitoring is different from application monitoring, why data pipelines require special handling beyond standard ETL, or why a model registry is not just an artifact store.
The Core Problem with Most MLOps Job Descriptions
They list Kubernetes, Docker, Terraform, and CI/CD (all DevOps skills) and add "experience with ML models preferred." This produces candidates who can deploy models but cannot detect when those models stop working.
The right MLOps engineer understands both sides: they can build reliable infrastructure AND reason about model-specific failure modes. This combination is rare, which is why the role commands a premium and why sourcing from pure DevOps backgrounds rarely works. See how MLOps fits within a broader AI team structure before you start interviewing.
Two Types of MLOps Engineers: Platform-Focused vs. Model-Focused
Before you write a job description, decide which type you need. Hiring the wrong type will not just fail to solve your problem — it will leave you with an engineer who is frustrated because the work does not match their strengths.
Platform-Focused MLOps Engineer
Builds the infrastructure and tooling that ML engineers use
Strengths ✓
- CI/CD pipelines for model training and deployment
- Feature store architecture and management
- Kubernetes and cloud infrastructure
- Developer experience for ML teams
Weaknesses ✗
- May not deeply understand model-specific problems
- Less likely to engage with model quality issues
Best for: Teams where ML engineers are blocked on infrastructure
Model-Focused MLOps Engineer
Owns model quality, monitoring, and the retraining lifecycle
Strengths ✓
- Model drift detection and alerting
- Retraining trigger logic and automation
- A/B testing and shadow deployment
- Feedback loop architecture
Weaknesses ✗
- Less likely to build robust infrastructure from scratch
- May not optimize for developer experience
Best for: Teams with production models suffering quality degradation
Most early-stage teams (first MLOps hire) benefit from a generalist who leans platform but can engage with model quality issues when needed. As the team scales past 10 models in production, the roles typically split. This is the same dynamic described in our guide on which ML roles to hire at each company stage.
Core Skills: What MLOps Engineers Must Know in 2026
These are the skills that distinguish a true MLOps engineer from a DevOps engineer who has worked with ML systems. The first four are non-negotiable. The last two depend on your team's needs.
CI/CD for ML
Required: Not just deploying code—deploying models. A pipeline that retrains on schedule, validates model quality against a holdout set, and promotes to production only if it passes. Most DevOps engineers can set up CI/CD. Few understand what a 'model validation gate' means in practice.
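In practice, a validation gate is just an explicit promotion decision encoded in the pipeline. A minimal pure-Python sketch (the metric names and thresholds are illustrative assumptions, not a specific tool's API):

```python
def validation_gate(candidate, production,
                    min_accuracy=0.85, max_regression=0.01):
    """Promote a candidate model only if it clears an absolute quality
    bar on the holdout set AND does not materially regress against the
    model currently serving traffic. Thresholds are illustrative."""
    if candidate["holdout_accuracy"] < min_accuracy:
        return False, "below absolute accuracy floor"
    regression = production["holdout_accuracy"] - candidate["holdout_accuracy"]
    if regression > max_regression:
        return False, "regresses against production model"
    return True, "promote"

ok, reason = validation_gate({"holdout_accuracy": 0.91},
                             {"holdout_accuracy": 0.90})
```

A candidate who has built one of these will immediately start asking what metric, what holdout set, and what happens on failure; that conversation is the signal.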
Feature Stores
Required: Understanding the training/serving skew problem and how feature stores solve it. Can they explain why the same feature can have different values at training time vs. serving time? Can they design a feature store schema for a real use case? This is the concept most often misunderstood by candidates from pure DevOps backgrounds.
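The skew problem is easiest to see in code. A minimal sketch of the discipline a feature store enforces: one shared feature definition consumed by both the training path and the serving path (the function name and timestamps are hypothetical):

```python
def days_since_last_purchase(last_purchase_ts: float, now_ts: float) -> float:
    """Single source of truth for the feature. Skew typically appears
    when training computes this in SQL while serving reimplements it in
    application code; a feature store routes both paths through one
    definition like this so the values cannot silently diverge."""
    return max(0.0, (now_ts - last_purchase_ts) / 86400.0)

# Training path: evaluated at the label's timestamp (point-in-time correct).
train_value = days_since_last_purchase(1_700_000_000.0, 1_700_172_800.0)

# Serving path: the same function, fed live timestamps at request time.
serve_value = days_since_last_purchase(1_700_000_000.0, 1_700_172_800.0)
```

A good interview follow-up: ask what goes wrong when the training path uses the label timestamp but serving uses wall-clock time with stale upstream data.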
Model Monitoring
Required: Not just 'is the model responding?' but 'is the model still producing good outputs?' This means understanding statistical measures of data drift (PSI, KL divergence, Wasserstein distance) and knowing when to alert vs. when to retrain. Candidates should be able to explain what they would monitor in a production recommender system.
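PSI is simple enough to sketch in a few lines, which makes it a fair thing to discuss in an interview. A minimal pure-Python version using quantile bins from the training baseline (a common rule of thumb reads PSI below 0.1 as stable and above 0.25 as a significant shift, but thresholds should be tuned per feature):

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between a baseline sample (e.g. the
    training data) and a production sample. Bins are baseline quantiles;
    a small smoothing term avoids log(0) on empty buckets."""
    expected = sorted(expected)
    edges = [expected[int(len(expected) * i / bins)] for i in range(1, bins)]

    def bucket_fracs(values):
        counts = [0] * bins
        for v in values:
            counts[sum(v > e for e in edges)] += 1
        return [(c + 1e-6) / (len(values) + bins * 1e-6) for c in counts]

    e_frac = bucket_fracs(expected)
    a_frac = bucket_fracs(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(e_frac, a_frac))

baseline = [i / 1000 for i in range(1000)]        # uniform on [0, 1)
shifted  = [0.5 + i / 2000 for i in range(1000)]  # uniform on [0.5, 1)
```

Strong candidates will point out that per-feature PSI misses joint-distribution shifts, and that the choice of binning and reference window matters as much as the statistic itself.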
Experiment Tracking
Required: Setting up and maintaining experiment tracking infrastructure (MLflow, W&B) so that researchers can compare runs reproducibly. More importantly: building a model registry with versioning, staging, and promotion workflows. Many candidates use experiment tracking but have never built the surrounding governance layer.
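The governance layer is the part worth probing. A toy in-memory registry illustrating the versioning and staged-promotion workflow (real registries such as MLflow's persist this alongside the artifacts; all names here are illustrative, not any library's API):

```python
class ModelRegistry:
    """Minimal sketch of a model registry: auto-incrementing versions
    per model name, plus a None -> Staging -> Production stage machine."""

    STAGES = ("None", "Staging", "Production")

    def __init__(self):
        self._versions = {}  # (name, version) -> {"stage": ..., "metrics": ...}
        self._latest = {}    # name -> latest version number

    def register(self, name, metrics):
        version = self._latest.get(name, 0) + 1
        self._latest[name] = version
        self._versions[(name, version)] = {"stage": "None", "metrics": metrics}
        return version

    def transition(self, name, version, stage):
        if stage not in self.STAGES:
            raise ValueError(f"unknown stage: {stage}")
        self._versions[(name, version)]["stage"] = stage

    def production_version(self, name):
        for (n, v), meta in self._versions.items():
            if n == name and meta["stage"] == "Production":
                return v
        return None

registry = ModelRegistry()
v1 = registry.register("churn-model", {"auc": 0.81})
registry.transition("churn-model", v1, "Staging")
registry.transition("churn-model", v1, "Production")
```

A candidate who has built this layer will immediately add questions this sketch ignores: who is allowed to transition stages, what demotes the old production version, and how rollbacks work.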
Distributed Training
Role-dependent: Understanding how to scale training jobs across multiple GPUs or nodes using Ray, Horovod, or native framework parallelism. This is table stakes for any team training large models. Candidates should know the difference between data parallelism and model parallelism, and when each is appropriate.
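The core of data parallelism fits in a few lines: shard the batch across workers, compute local gradients, average them (the all-reduce step), and apply one shared update. A dependency-free sketch with a one-parameter linear model (illustrative only; real systems use DDP, Horovod, or Ray Train):

```python
def grad_mse(w, batch):
    """Gradient of mean squared error for the 1-D linear model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def data_parallel_step(w, data, workers, lr=0.01):
    """One synchronous data-parallel SGD step. With equal-sized shards,
    averaging the local gradients reproduces the full-batch gradient,
    so every worker applies the identical update."""
    shards = [data[i::workers] for i in range(workers)]
    local_grads = [grad_mse(w, shard) for shard in shards]
    avg_grad = sum(local_grads) / workers   # the "all-reduce"
    return w - lr * avg_grad

data = [(x, 3.0 * x) for x in [1.0, 2.0, 3.0, 4.0]]
w = data_parallel_step(0.0, data, workers=2)
```

Model parallelism is the complementary case: the model itself is split across devices because it does not fit on one, which a toy like this cannot show; a good candidate can explain when each applies and how the two compose.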
Data Pipeline Engineering
Role-dependent: Building reliable, idempotent data pipelines with Airflow, Prefect, or similar. Understanding lineage, retries, and monitoring. This overlaps with data engineering, but MLOps engineers need to consume these pipelines and sometimes build the final transformation steps.
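"Idempotent" is worth unpacking with candidates: a retried or replayed task must not duplicate work or corrupt output. A minimal sketch of the pattern, keyed on a hash of the task's inputs (orchestrators get the same effect from deterministic task IDs and partitioned outputs; all names here are illustrative):

```python
import hashlib
import json

def run_idempotent(task_name, inputs, completed, compute):
    """Run a pipeline step at most once per (task, inputs) pair.
    On replay after a crash or retry, the stored result is reused
    instead of recomputing, so downstream state is unchanged."""
    key = task_name + ":" + hashlib.sha256(
        json.dumps(inputs, sort_keys=True).encode()).hexdigest()
    if key in completed:
        return completed[key], False   # replay: reuse prior result
    result = compute(inputs)
    completed[key] = result
    return result, True

completed = {}
calls = []

def transform(inputs):
    calls.append(1)                    # track how many times work ran
    return sum(inputs["values"])

r1, ran1 = run_idempotent("daily_agg", {"values": [1, 2, 3]}, completed, transform)
r2, ran2 = run_idempotent("daily_agg", {"values": [1, 2, 3]}, completed, transform)
```

The second call returns the cached result without re-executing the transform, which is the behavior a retry-safe pipeline step must have.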
The MLOps Tools Landscape in 2026
The MLOps tooling ecosystem has matured significantly. Candidates who joined the field before 2022 may have primarily used custom scripts and early Kubeflow. The 2026 landscape is richer and more opinionated. Here is what matters — and how to use this in interviews.
How to use this in interviews: Do not quiz candidates on specific tool APIs. Instead, ask them to compare two tools in a category (e.g., "Why would you choose Prefect over Airflow for this use case?"). This reveals understanding of trade-offs, not just familiarity.
Experiment Tracking & Model Registry
Must-know: Track hyperparameters, metrics, and artifacts across training runs. Manage model versions and promotion workflows.
Pipeline Orchestration
Must-know: Define and run reproducible ML workflows — data ingestion, preprocessing, training, evaluation, deployment.
Data Pipelines
Must-know: Schedule and monitor data workflows. MLOps engineers consume these pipelines; sometimes they build parts of them.
Feature Stores
Serve consistent features for training and inference. Solve the training/serving skew problem at scale.
Distributed Training
Scale training across multiple GPUs or nodes. Required for teams training large models.
Model Serving
Must-know: Serve models at low latency with batching, versioning, and A/B testing support.
Overrated: Kubernetes Alone
Kubernetes expertise is necessary but not sufficient. A candidate whose entire MLOps pitch is "I run everything on Kubernetes" will build infrastructure-heavy solutions that ML engineers cannot use independently. The goal is ML developer experience, not infrastructure complexity.
Overrated: Theoretical ML Knowledge
An MLOps engineer does not need to know how to train a transformer from scratch. They need to know what happens when a transformer serving latency spikes by 300ms, or when the embedding distribution shifts. Operational intuition matters more than theoretical depth.
Red Flags That Most Interviewers Miss
These are the signals that indicate a candidate is a DevOps engineer in disguise, or an ML engineer who has never operated systems at production scale.
Cannot explain model drift ✗
Ask: "Describe the difference between data drift and concept drift, and how you would detect each in production." A qualified candidate will distinguish between input distribution shift (data drift) and the change in the relationship between inputs and outputs (concept drift), and describe specific monitoring approaches for each. Vague answers like "the model gets stale" are a dealbreaker.
Never deployed to production ✗
This seems obvious, but many candidates have contributed to ML pipelines without owning end-to-end deployment. Ask: "Walk me through the last model you took from a training script to a live endpoint serving real traffic. What broke? How did you debug it?" Candidates without this experience will give process descriptions instead of war stories.
Treats retraining as the default fix ✗
When you describe a model performance degradation scenario, does their first instinct jump to "retrain the model"? Retraining is often the wrong answer — data pipeline failures, feature store staleness, or upstream schema changes are more common causes. An experienced MLOps engineer has a diagnostic runbook before they consider retraining.
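That diagnostic runbook can be made concrete. A sketch of the ordered checks an experienced engineer runs before reaching for retraining (the signal names, ordering, and actions are illustrative assumptions, not a standard):

```python
def diagnose_degradation(signals):
    """Ordered diagnostic checks for a model-quality incident.
    `signals` is a dict of booleans an on-call engineer would gather;
    cheap-to-rule-out and common causes come first, and retraining
    only appears once data-side causes are excluded."""
    checks = [
        ("upstream_schema_changed", "fix upstream schema / feature parsing"),
        ("pipeline_run_failed", "repair and backfill the data pipeline"),
        ("feature_store_stale", "refresh feature materialization"),
        ("input_distribution_shifted", "investigate data drift; consider retraining"),
        ("label_relationship_shifted", "concept drift; retrain on recent labels"),
    ]
    for signal, action in checks:
        if signals.get(signal):
            return action
    return "no known cause found; escalate before retraining"

action = diagnose_degradation({"pipeline_run_failed": True,
                               "input_distribution_shifted": True})
```

Note that even when drift is present, the pipeline failure wins: fixing the data often resolves the apparent drift, which is exactly the instinct this red flag tests for.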
No experience with training/serving skew ✗
Ask: "Give me an example where training and serving computed the same feature differently. What caused it and how did you fix it?" This is one of the most common and damaging failure modes in production ML. Engineers who have not encountered it will give theoretical answers.
The 3-Stage Technical Interview for MLOps Engineers
Standard software engineering interviews do not surface the skills that matter for MLOps. This 3-stage process is designed specifically for the role, mapping each stage to a core dimension of the job.
Infrastructure Design
Design an end-to-end ML platform for a given scenario. Evaluate how they think about the full lifecycle, not just deployment.
Green Flags ✓
- Asks about team size, model count, and deployment frequency
- Mentions feature store, monitoring, and registry — not just Kubernetes
- Designs for iteration speed, not just production stability
- Considers data quality and training pipeline reliability
Red Flags ✗
- Starts with Kubernetes cluster setup as step 1
- No mention of experiment tracking or model registry
- Treats MLOps as pure DevOps with ML bolted on
- No discussion of model monitoring or retraining triggers
Production Debugging
Give a real scenario: model accuracy dropped 8% in production after a stable 6-month period. Walk through their diagnostic process.
Green Flags ✓
- Distinguishes between data drift, concept drift, and model staleness
- Checks data pipeline health before blaming the model
- Has a structured runbook-style approach to debugging
- Mentions specific monitoring tools and what signals they look for
Red Flags ✗
- Jumps to retraining as the first response
- Cannot explain the difference between data drift and model staleness
- Has no structured debugging process — just 'check logs'
- No mention of upstream data pipeline failures as a cause
Code & Pipeline Review
Review a provided training pipeline with intentional issues. Evaluate their ability to spot MLOps-specific problems, not just coding style.
Green Flags ✓
- Spots training/serving skew in feature computation
- Identifies lack of model validation gate before promotion
- Questions reproducibility (random seeds, data versioning)
- Suggests adding monitoring hooks in the serving code
Red Flags ✗
- Only comments on code style and variable names
- Misses the missing model validation step
- Cannot explain why training/serving skew is a problem
- No suggestions for improving observability
The full process takes 4–6 hours of candidate time spread across 2–3 weeks. This is the right investment for a role that will sit between your ML engineers and your production infrastructure. For context on how we approach candidate vetting at speed, see how we source fast without cutting corners.
Compensation Benchmarks: US, UK, and Remote (2026)
MLOps engineers command a premium over general software engineers — typically 15–25% above equivalent seniority backend roles. Supply is constrained because the role requires both infrastructure depth and ML domain knowledge. Here are current market ranges.
| Level | US (Total Comp) | UK (Base) | Remote | YoE |
|---|---|---|---|---|
| Mid-Level MLOps Engineer | $160K–$220K | £85K–£130K | $110K–$150K | 3–6 |
| Senior MLOps Engineer | $220K–$300K+ | £130K–£175K | $140K–$190K | 6–10 |
| Staff / Principal MLOps | $300K–$450K+ | £175K–£240K | $180K–$240K | 10+ |
Ranges represent base salary plus equity for US; base salary for UK. Remote rates apply to roles outside major tech hubs (NYC, SF, London). Rates shift quarterly — validate against current market data before making offers.
When You Need a Full MLOps Team vs. One Generalist
One of the most expensive hiring mistakes we see: companies build a 3-person MLOps team before they have enough models in production to justify it. The team builds impressive infrastructure that no one uses at scale, and then gets reorganized or cut.
Hire One Generalist When:
- Fewer than 5 ML engineers on the team
- Fewer than 10 models deployed to production
- ML infrastructure still being established
- Models are similar in type (e.g., all tabular, all NLP)
- Deployment frequency is low (weekly or less)
- Budget constraints are real
Move to a Team When:
- 10+ models in active production
- Multiple ML teams blocked on shared infrastructure
- Model quality issues consuming 20%+ of ML engineers' time
- Mixed model types requiring different serving infrastructure
- Deployment frequency is daily or continuous
- Regulatory or compliance requirements for model governance
When you do scale to a team, the typical first split is: one platform lead (infrastructure, CI/CD, tooling) and one model ops specialist (monitoring, retraining, quality). The platform lead is usually the senior hire; the model ops specialist can be a strong mid-level engineer with ML background.
Writing an MLOps Job Description That Attracts the Right Candidates
The language you use in a job description signals to candidates what the role actually is. A job description that reads like a DevOps role will attract DevOps engineers. Here is what to include — and what to omit.
Include ✓
- "Model monitoring and drift detection"
- "Feature store design and maintenance"
- "ML experiment tracking and model registry"
- "Training pipeline reliability and reproducibility"
- "Experience deploying models, not just services"
- Specific tools: MLflow/W&B, Kubeflow, Airflow
- "Collaborate with ML engineers on production readiness"
Omit or Deprioritize ✗
- "5+ years of Kubernetes experience" as a top requirement
- Terraform expertise as a headline skill
- "Microservices architecture" (implies generic DevOps)
- "Strong CS fundamentals" (attracts theorists)
- Long lists of cloud certifications
- "ML research experience" (wrong archetype)
The single best signal a job description can send: include a specific scenario in the role description. Example: "You will own the retraining pipeline for our recommendation model, including drift detection and automated quality gates." This self-selects for engineers who have done this work — and deters those who have not.
Why MLOps Hiring Requires a Specialized Recruiter
MLOps is a thin market. There are fewer qualified candidates than for standard ML engineering roles because the combination of infrastructure depth and ML operational knowledge is genuinely rare. Generalist recruiters who post job descriptions and screen resumes for keyword matches will miss the best candidates and waste your time with DevOps engineers who self-report as MLOps.
At VAMI, we have recruited MLOps engineers across financial services, healthcare AI, and consumer tech. We maintain a pre-vetted network and run our own technical screening before any candidate reaches a client interview. Our qualification process is built around the exact skills described in this guide — not generic coding assessments.
Our 98% probation success rate on technical AI hires reflects a sourcing and vetting approach that most internal recruiting teams are not designed to run. When you need your first MLOps engineer — or your fourth — we can have a qualified shortlist in front of you within 3 days.
Talk to VAMI About Your MLOps Hire

Frequently Asked Questions
Q: What is the difference between a DevOps engineer and an MLOps engineer?
A DevOps engineer manages software deployment pipelines, infrastructure, and service reliability. An MLOps engineer does all of that—but also understands the unique lifecycle of machine learning models: training pipelines, feature stores, experiment tracking, model versioning, and production monitoring for data drift and model degradation. The overlap is real, but a pure DevOps engineer without ML context will struggle with model-specific problems like training/serving skew or the non-determinism of retraining jobs.
Q: Should my MLOps engineer be more platform-focused or model-focused?
It depends on your team's composition and stage. If you have strong ML engineers who can build models but struggle to deploy them reliably, you need a platform-focused MLOps engineer who builds the infrastructure and tooling they use. If your models are in production but quality is degrading and no one can explain why, you need a model-focused MLOps engineer who specializes in monitoring, retraining, and feedback loops. Early-stage teams (first hire) usually benefit more from a generalist who leans platform.
Q: What MLOps tools should a candidate know in 2026?
Core tools: MLflow or Weights & Biases for experiment tracking and model registry, Kubeflow or Vertex AI Pipelines for orchestration, Apache Airflow or Prefect for data pipelines, Ray or Dask for distributed training, and a feature store (Feast, Tecton, or a cloud-native option). Cloud-native fluency (AWS SageMaker, GCP Vertex, or Azure ML) matters more than any single open-source tool. Don't over-index on specific tools—prioritize candidates who understand why the tool ecosystem exists and can adopt new tools quickly.
Q: What's a fair salary for an MLOps engineer in 2026?
In the US, mid-level MLOps engineers typically earn $160K–$220K total compensation (base + equity), with senior engineers at $220K–$300K+ at top-tier companies. In the UK, mid-level ranges from £85K–£130K, senior from £130K–£175K. Remote-first positions outside major tech hubs often offer 10–20% below peak US rates but attract strong talent from Eastern Europe, Latin America, and Southeast Asia. These ranges shift quickly; we recommend treating them as a starting point and validating against current market data.
Q: What are the biggest red flags when interviewing an MLOps engineer?
Four critical red flags: (1) They cannot explain model drift or data drift—this is table stakes for any production MLOps role. (2) They have never deployed a model to production—only built training scripts or contributed to pipelines that someone else deployed. (3) They treat Kubernetes as the solution to every problem—infrastructure skills without ML context produce over-engineered systems that ML engineers can't use. (4) They cannot explain the difference between training-serving skew and model staleness—these require different remediation strategies.
Q: When should I hire a full MLOps team vs. one generalist?
One generalist MLOps engineer is the right call when: you have fewer than 5 ML engineers, you are in early production (under 10 models deployed), or your ML infrastructure is still being established. Move to a team (platform lead + model ops specialist) when: you have 10+ models in production, multiple teams are blocked waiting on shared infrastructure, or model quality issues are consuming more than 20% of your ML engineers' time. A common mistake is hiring a team too early and having them build infrastructure that no one uses.
Ready to Hire an MLOps Engineer Who Understands Models, Not Just Infrastructure?
Use this framework to improve your own hiring process. Or let VAMI run the search — we specialize in exactly this role, and our pre-vetted candidates have already passed the technical screen described above.
Get Vetted MLOps Engineer Candidates