NLP Engineer Hiring Guide: Skills, Salary, and How to Screen Candidates
The NLP engineering market has fractured since 2023. Posting the wrong job description attracts the wrong candidates — and costs 4–6 months of hiring time.
Before 2023, "NLP engineer" had a reasonably consistent meaning: someone who built text processing systems using a combination of rule-based approaches, classical ML, and pretrained language models like BERT. The rise of GPT-4 and the LLM ecosystem fractured that definition into two distinct roles that require different skills, attract different candidates, and need different interview processes.
Companies that post a generic "NLP Engineer" job description in 2026 get CVs from both populations — and then struggle to evaluate which type they actually need. The result is either a mis-hire or a prolonged search that filters out strong candidates for the wrong reasons.
This guide covers the two NLP specialisations, what each requires, how to write the job description correctly, and how to run the interview process for each.
The Two NLP Specialisations in 2026
Classical NLP Engineer
Classical NLP engineers build systems for structured text understanding — extracting information from text at scale, classifying documents, normalising and cleaning text data, and building pipelines that are reliable and maintainable without depending on expensive LLM APIs for every operation.
Core work includes:
- Named entity recognition (NER) and relation extraction
- Document classification and intent detection at scale
- Text normalisation, deduplication, and data cleaning pipelines
- Information extraction from semi-structured text (contracts, reports, forms)
- Search and retrieval systems using sparse (BM25) and dense (embedding-based) approaches
- Multilingual text processing pipelines
Core tools: spaCy, NLTK, Hugging Face Transformers (fine-tuning encoder models like BERT, RoBERTa), Elasticsearch, custom training pipelines.
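Sparse retrieval with BM25, mentioned above, is the kind of fundamental a classical NLP candidate should be able to explain from first principles. A minimal pure-Python sketch of Okapi BM25 scoring (standard parameters k1=1.5, b=0.75; the documents and query here are illustrative toys, and a real system would use Elasticsearch or similar):

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenised document against the query with Okapi BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    # Document frequency: how many documents contain each query term.
    df = {t: sum(1 for d in docs if t in d) for t in query_terms}
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in query_terms:
            if df[t] == 0:
                continue
            idf = math.log((N - df[t] + 0.5) / (df[t] + 0.5) + 1)
            num = tf[t] * (k1 + 1)
            den = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * num / den
        scores.append(score)
    return scores

docs = [
    "the contract terminates in december".split(),
    "quarterly earnings rose in december".split(),
    "the report covers contract terms".split(),
]
scores = bm25_scores("contract terms".split(), docs)
best = max(range(len(docs)), key=scores.__getitem__)  # document matching both terms wins
```

A good candidate can also articulate when this beats dense retrieval (exact terminology, rare entities) and when it loses (paraphrase, synonymy).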
Salary (US, senior): $160K–$220K base
LLM Engineer (NLP-focused)
LLM engineers build systems on top of foundation models: they design the architecture that determines how language models are prompted, combined with retrieval, fine-tuned, evaluated, and deployed in production. This is the more common "NLP" hire in 2026, though it is better described as LLM engineering.

Core work includes:
- RAG system design: chunking strategies, embedding models, retrieval evaluation
- Fine-tuning foundation models on domain-specific data (LoRA, QLoRA, full fine-tuning)
- Prompt system engineering and systematic evaluation at scale
- RLHF and preference data collection pipelines
- LLM output evaluation frameworks: automated metrics, human eval, model-based eval
- Latency optimisation for generative inference (streaming, caching, batching)
Core tools: OpenAI/Anthropic APIs, LangChain, LlamaIndex, vLLM, TGI, PEFT, Weights & Biases.
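The chunking strategies listed above are a frequent interview topic. A minimal sketch of the common baseline, fixed-size windows with overlap so a fact spanning a boundary survives intact in at least one chunk (the chunk size, overlap, and token list below are illustrative assumptions, not recommendations):

```python
def chunk_tokens(tokens, chunk_size=200, overlap=50):
    """Split a token list into fixed-size windows with overlap.

    Overlap means consecutive chunks share their boundary region,
    so content straddling a cut point is retrievable from one chunk.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break
    return chunks

tokens = [f"tok{i}" for i in range(500)]
chunks = chunk_tokens(tokens)
# Consecutive chunks share exactly `overlap` tokens at the boundary.
```

Strong candidates go beyond this baseline: structure-aware splitting (headings, paragraphs), and evaluating retrieval recall rather than guessing chunk sizes.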
Salary (US, senior): $180K–$260K base
Skills Both Types Share
Despite the split, strong NLP engineers of either type share a common foundation:
- Tokenisation and embeddings: Deep understanding of how text is represented numerically — subword tokenisation, embedding spaces, semantic similarity
- Evaluation fluency: Knowing when BLEU, ROUGE, BERTScore, or task-specific metrics apply — and crucially, knowing their limitations
- Production mindset: Awareness of latency, throughput, and cost constraints — the difference between a model that works in a notebook and one that serves 1M requests per day
- Data quality instincts: Text data is messy; strong NLP engineers spend more time on data cleaning and annotation quality than on model selection
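The embedding-space fluency listed first is easy to probe in a screen. A toy sketch of cosine similarity, the standard closeness measure in embedding space (the 4-dimensional "embeddings" here are made up for illustration; real models produce hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity: dot product of the vectors divided by their norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy vectors standing in for learned embeddings (hypothetical values).
emb = {
    "invoice": [0.9, 0.1, 0.0, 0.2],
    "receipt": [0.8, 0.2, 0.1, 0.3],
    "sunset":  [0.0, 0.9, 0.8, 0.1],
}
sim_related = cosine_similarity(emb["invoice"], emb["receipt"])    # semantically close
sim_unrelated = cosine_similarity(emb["invoice"], emb["sunset"])   # semantically distant
```

A candidate of either type should be able to explain why related terms cluster in this space and how subword tokenisation feeds into it.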
How to Write the Job Description
The most common mistake is writing a job description that lists requirements from both types: "experience with spaCy, BERT fine-tuning, RAG pipelines, RLHF, and production NLP systems." That is two different roles in one posting. It attracts few strong candidates, because specialists in either area will see requirements they do not match and self-select out.
Classical NLP JD — what to include
- What you will do: Build and maintain text extraction and classification systems; design annotation pipelines; own model performance in production
- What we look for: Proficiency with spaCy or similar; experience fine-tuning encoder models; strong data pipeline engineering; production experience at scale; evaluation methodology beyond accuracy
- What we are not looking for: LLM API experience is a bonus, not a requirement
LLM Engineering JD — what to include
- What you will do: Design and implement RAG pipelines; build evaluation frameworks for generative outputs; fine-tune models on domain data; own LLM integration architecture
- What we look for: Hands-on RAG system experience; fine-tuning with PEFT methods; LLM evaluation frameworks; production latency and cost optimisation experience
- What we are not looking for: Classical ML or rule-based NLP experience is not required
The Interview Framework
For Classical NLP Engineers
- Text pipeline design: "We need to extract all financial figures and the entities they refer to from 10,000 earnings call transcripts per day. Walk me through your approach." Evaluate: do they consider rule-based vs model-based trade-offs? Do they think about annotation budget and quality?
- Evaluation strategy: "You have built an NER model. How do you know it is good enough to ship?" Look for: span-level precision/recall, error analysis by entity type, production monitoring approach
- Production scenario: "Your document classifier performs well in offline evaluation but has high error rates on a specific document type in production. How do you debug this?"
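The span-level precision/recall mentioned in the evaluation question above is worth knowing cold as an interviewer, too. A minimal exact-match sketch (spans as `(start, end, label)` tuples; the gold and predicted spans below are invented for illustration, and libraries like seqeval handle this in practice):

```python
def span_prf(gold_spans, pred_spans):
    """Exact-match span-level precision, recall, and F1 for NER.

    A predicted span counts as correct only if start, end, AND label
    all match a gold span; partial overlaps count as errors.
    """
    gold, pred = set(gold_spans), set(pred_spans)
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1

gold = [(0, 2, "ORG"), (5, 7, "MONEY"), (10, 11, "DATE")]
pred = [(0, 2, "ORG"), (5, 6, "MONEY")]  # second span has a wrong boundary
p, r, f1 = span_prf(gold, pred)
```

A strong candidate will volunteer the exact-match vs partial-match distinction unprompted, and break errors down by entity type rather than quoting one aggregate number.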
For LLM Engineers
- RAG system design: "Design a RAG system for a legal firm that needs to query 500,000 case documents. What chunking strategy, embedding model, and retrieval approach do you use, and how do you evaluate it?"
- Evaluation framework: "How do you measure whether a fine-tuned model is better than a prompted baseline for a document summarisation task?" Look for: automated metrics, human eval design, LLM-as-judge trade-offs
- Production optimisation: "Our LLM endpoint is costing $40K per month. Walk me through how you would reduce this by 50% without unacceptable quality degradation."
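One lever a candidate should reach for in the cost question above is the caching mentioned earlier. A minimal sketch of exact-match prompt caching, with a hypothetical `fake_model` standing in for a paid API call (a real system would add TTLs, eviction, and possibly semantic matching on embeddings):

```python
import hashlib

class PromptCache:
    """Exact-match cache: repeated identical prompts hit the cache
    instead of paying for a fresh model call."""
    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get_or_call(self, prompt, call_model):
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        response = call_model(prompt)  # the expensive API call
        self._store[key] = response
        return response

# Hypothetical stand-in for an LLM API; records how often it is invoked.
calls = []
def fake_model(prompt):
    calls.append(prompt)
    return prompt.upper()

cache = PromptCache()
for prompt in ["summarise doc 1", "summarise doc 2", "summarise doc 1"]:
    cache.get_or_call(prompt, fake_model)
# Three requests, but only two paid model calls.
```

Good answers combine several levers (caching, smaller models for easy inputs, batching, shorter prompts) and quantify the quality trade-off of each rather than picking one blindly.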
Red Flags in Candidates
- Cannot discuss trade-offs between approaches. Strong NLP engineers know when a regex or a small fine-tuned model beats a GPT-4 call — and can explain why. Engineers who default to the most complex solution for every problem create expensive, fragile systems.
- Evaluation blind spots. Candidates who report accuracy without discussing class imbalance, or who evaluate generative models only on ROUGE scores, have not worked on production NLP systems, where quality judgements are rarely captured by a single automated metric.
- No production experience. NLP is full of researchers who can train models but have never dealt with inference latency, input distribution shift, or the annotation pipeline required to improve a model after deployment.
- English-only experience. For products with international users, engineers who have never worked with multilingual corpora underestimate the complexity significantly.
Where to Find NLP Engineers
- Academic conferences: ACL, EMNLP, NAACL, EACL author lists — particularly authors from industry labs or applied research teams
- Hugging Face: Model hub contributors and authors of popular models in the NLP category; frequent community forum contributors
- GitHub: Contributors to spaCy, Transformers, LlamaIndex, LangChain, Haystack
- arXiv cs.CL: Regular posters with industry affiliations
- Kaggle: Top performers in NLP competitions with detailed solution write-ups
Salary Benchmarks (2026, US market)
- Classical NLP Engineer (Senior): $160K–$220K base
- LLM Engineer / NLP-focused (Senior): $180K–$260K base
- Staff / Principal NLP Engineer: $230K–$300K+ base
- UK equivalents: Classical NLP £85K–£130K; LLM Engineering £100K–£150K
Working with VAMI on NLP Hiring
VAMI places NLP engineers and LLM engineers across fintech, legaltech, healthcare AI, and enterprise SaaS. Before sourcing, we clarify which of the two specialisations your role actually requires — so you get candidates matched to what you need. See our guides on the LLM engineer vs ML engineer comparison and generative AI engineer skills for additional context.
If you are hiring an NLP engineer and want to reach candidates outside job boards, talk to VAMI about your search.
Summary
- NLP engineering has split into classical NLP (extraction, classification, document processing) and LLM engineering (RAG, fine-tuning, generative systems) — these are different roles
- Writing a JD that mixes both attracts no one strong; be specific about which type you need
- Both types share: tokenisation/embedding knowledge, evaluation fluency, production mindset, data quality instincts
- Interview framework differs by type — classical NLP focuses on pipeline design and evaluation; LLM engineering focuses on RAG architecture and generation quality
- Best sourcing: ACL/EMNLP/NAACL proceedings, Hugging Face contributor lists, arXiv cs.CL, GitHub library contributors
- Salary: $160K–$260K+ base in the US; LLM engineering commands 10–20% premium over classical NLP