"Matching a person to a job is not a database lookup. It is an act of interpretation — and the distance between what an algorithm can interpret and what a human hiring manager intuitively understands is the most honest measure of where AI in recruitment currently stands."
This article explains how Expertini's matching engine actually works, what the research underlying it says, and — with equal weight — where it fails. It is written for practitioners who want to understand the technology, not for those who want to be impressed by it.
Why Job Matching Is a Hard Problem
The surface simplicity of job matching — candidate has skills, job requires skills, match them — conceals a genuinely difficult computational and linguistic challenge. Job descriptions and resumes are both written in natural language by humans for humans, using inconsistent terminology, implicit domain assumptions, abbreviations, and contextual meaning that varies by industry, region, seniority level, and decade. A "Senior Engineer" in a 2005 manufacturing context and a "Senior Engineer" in a 2024 software startup are described using many of the same words but refer to profoundly different roles.
Early job matching systems addressed this problem with keyword matching: if the resume contains the word "Python" and the job description contains the word "Python," score a match. This approach is fast, transparent, and wrong in a large number of cases. It fails to recognise that "software developer," "software engineer," and "programmer" are near-synonyms in most contexts. It fails to understand that a resume describing "built distributed systems at scale" is highly relevant to a job requiring "experience with microservices architecture," even though no keywords overlap. And it fails entirely when dealing with multilingual content — matching a resume written in Portuguese to a job description written in English requires understanding meaning, not matching strings.
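The failure described above can be reproduced in a few lines. This is a minimal sketch of literal keyword overlap, assuming a Jaccard-style score; the function names `tokenize` and `keyword_overlap` are illustrative and do not describe any specific early system.

```python
import re

def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9+#]+", text.lower()))

def keyword_overlap(resume: str, job: str) -> float:
    """Jaccard overlap between the two token sets: |A & B| / |A | B|."""
    a, b = tokenize(resume), tokenize(job)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

# The two phrases from the paragraph above share no tokens at all.
resume_phrase = "built distributed systems at scale"
job_phrase = "experience with microservices architecture"
score = keyword_overlap(resume_phrase, job_phrase)

# A shared literal keyword is the only thing this scorer can reward.
same_word = keyword_overlap("Python developer", "Python engineer")
```

The first pair scores zero despite being closely related in meaning, which is the recall problem semantic matching is meant to address.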
The shift from keyword matching to semantic matching — understanding meaning rather than matching text — is the technical axis on which modern AI recruitment matching turns. Expertini's matching engine, and the research underlying it, sits within this semantic paradigm. Understanding what that means technically, and what its genuine limitations are, is what this article attempts to provide.
The Technical Foundation: Vector Embeddings and Cosine Similarity
Expertini's matching engine is grounded in a class of NLP techniques known as distributional semantics — the computational hypothesis, supported by substantial empirical evidence since Firth (1957), that words used in similar contexts carry similar meanings. Modern implementations of this hypothesis use neural network-derived vector representations of text, commonly called word embeddings or sentence embeddings, where each word, phrase, or document is represented as a point in a high-dimensional vector space such that semantically similar texts cluster near each other.
The core matching computation is cosine similarity: the cosine of the angle between two vectors in this space. Two documents represented as vectors pointing in nearly the same direction (cosine similarity approaching 1.0) are semantically similar; two documents pointing in perpendicular directions (cosine similarity approaching 0) are semantically unrelated.
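As a minimal sketch, cosine similarity is the dot product of two vectors divided by the product of their lengths. The vectors below are toy three-dimensional stand-ins (an assumption for illustration); real sentence embeddings typically have hundreds of dimensions.

```python
import math
from typing import Sequence

def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Return cos(theta) = (u . v) / (|u| * |v|) for two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen for this sketch: a zero vector matches nothing
    return dot / (norm_u * norm_v)

# Same direction, different length: similarity is (numerically) 1.0.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])

# Perpendicular vectors: similarity is 0.0.
perpendicular = cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

Because the score depends only on direction, a long resume and a short job description can still score highly if their embeddings point the same way.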
The quality of this computation depends entirely on the quality of the vector representations. Expertini uses pre-trained language model embeddings — trained on large general corpora and fine-tuned on domain-specific recruitment text — to produce vector representations that capture occupational, skills-based, and contextual meaning with greater accuracy than general-purpose word embedding models such as Word2Vec or GloVe, which were not trained on recruitment-specific language.
This research approach — applying cosine similarity over semantically-rich vector embeddings to job-candidate alignment — is the methodology documented in Expertini's published research with reference to IEEE Transactions on Artificial Intelligence. It is not a proprietary black box; it is a documented implementation of established NLP techniques applied to the recruitment domain.
References: Firth, J.R. (1957). Papers in Linguistics 1934–1951. Oxford University Press. Mikolov, T. et al. (2013). Distributed Representations of Words and Phrases and their Compositionality. NIPS. Devlin, J. et al. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. NAACL.
The Expertini Matching Pipeline: From Job Posting to Ranked Candidate
The matching process operates across a structured pipeline applied to every job-candidate pairing evaluated on the platform:
Stage 1 — Text Ingestion and Cleaning: Raw text from job descriptions and candidate resumes undergoes preprocessing: removal of formatting artefacts, normalisation of date formats, standardisation of measurement units, and language detection. Multilingual content is identified and flagged; cross-language matching is handled through translation-augmented embeddings rather than direct cross-lingual cosine similarity, which performs less reliably across distant language families.
Stage 2 — Named Entity Recognition (NER): A domain-adapted NER model identifies and classifies entities within the text: job titles, technical skills (programming languages, frameworks, platforms), soft skills, educational qualifications, professional certifications, industry names, company names, and geographic locations. This entity extraction step is critical because it separates signal (relevant professional entities) from noise (boilerplate, formatting text, generic descriptions) before the embedding stage.
Stage 3 — Occupational Taxonomy Mapping: Extracted entities are mapped to Expertini's occupational ontology — a taxonomy built and refined over sixteen years of job data across 150+ countries. This ontology captures synonym networks (e.g., "software engineer," "software developer," "programmer," "coder" mapped to the same occupational concept), seniority hierarchies (junior → mid → senior → lead → principal → staff), and technology evolution (mapping legacy technology terms to their modern equivalents where applicable). This step is where domain-specific knowledge accumulated over sixteen years of platform operation contributes most meaningfully to matching quality.
Stage 4 — Vector Embedding: Both the job description and the candidate resume are converted to dense vector representations using a fine-tuned sentence transformer model. The fine-tuning is conducted on recruitment-specific text — actual job descriptions and resumes from the Expertini platform — which improves performance on domain-specific language patterns compared to general-purpose models that were not trained on professional recruitment content.
Stage 5 — Weighted Similarity Scoring: The matching score is not a single cosine similarity value between two document vectors. It is a weighted composite of sub-scores computed at different levels of granularity: skills alignment (explicit technical and soft skills), job title proximity in the occupational ontology, seniority alignment, location match (city, region, country, remote eligibility), and full-document semantic similarity. These sub-scores are weighted by a configuration that reflects the relative importance of each dimension for the role category — technical roles weight skills more heavily; management roles weight seniority and domain experience more heavily.
Stage 6 — Ranking and Presentation: Candidates are ranked by composite score and presented to the employer. The raw scores are not shown to employers — only the ranked order — because displaying raw mathematical scores carries the risk of being interpreted with false precision. A score of 0.72 versus 0.68 does not represent a meaningful, reliable difference in candidate suitability; the ranking reflects ordering, not absolute measurement.
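The later stages of the pipeline can be sketched as follows. This is an illustration under stated assumptions, not Expertini's implementation: the synonym table, the weight values, and the names `canonical_title`, `composite_score`, and `rank_candidates` are all hypothetical, and the sub-scores are supplied by hand where a real system would compute them from embeddings and the ontology.

```python
from dataclasses import dataclass

# Hypothetical synonym network: surface titles mapped to one occupational concept.
TITLE_CONCEPTS = {
    "software engineer": "software_developer",
    "software developer": "software_developer",
    "programmer": "software_developer",
    "coder": "software_developer",
}

# Hypothetical weights per role category; each set sums to 1.0.
WEIGHTS = {
    "technical": {"skills": 0.40, "title": 0.15, "seniority": 0.10,
                  "location": 0.10, "document": 0.25},
    "management": {"skills": 0.20, "title": 0.15, "seniority": 0.30,
                   "location": 0.10, "document": 0.25},
}

def canonical_title(title: str) -> str:
    """Stage 3: map a raw title to its concept; unknown titles pass through."""
    key = title.strip().lower()
    return TITLE_CONCEPTS.get(key, key)

def composite_score(sub_scores: dict[str, float], role_category: str) -> float:
    """Stage 5: weighted sum of sub-scores, each assumed to lie in [0, 1]."""
    weights = WEIGHTS[role_category]
    return sum(weights[name] * sub_scores[name] for name in weights)

@dataclass
class Candidate:
    name: str
    sub_scores: dict[str, float]

def rank_candidates(candidates: list[Candidate], role_category: str) -> list[str]:
    """Stage 6: return names in ranked order only; raw scores are not exposed."""
    ordered = sorted(candidates,
                     key=lambda c: composite_score(c.sub_scores, role_category),
                     reverse=True)
    return [c.name for c in ordered]

pool = [
    Candidate("A", {"skills": 0.9, "title": 1.0, "seniority": 0.5,
                    "location": 1.0, "document": 0.7}),
    Candidate("B", {"skills": 0.6, "title": 1.0, "seniority": 1.0,
                    "location": 1.0, "document": 0.7}),
]
technical_order = rank_candidates(pool, "technical")
management_order = rank_candidates(pool, "management")
```

With these toy numbers the same two candidates swap places depending on the role category, which shows why the weighting configuration matters as much as the underlying similarity scores.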
Resume Score™ and Job Score™: Applied Matching at the Employer Interface
Expertini surfaces two employer-facing tools built on the matching engine:
Resume Score™ evaluates a candidate's uploaded resume against a target job description, producing a structured feedback report covering: skills coverage (what required skills are present, absent, or implied); experience alignment (seniority and tenure markers); educational qualification match; and resume quality signals (completeness, structure, quantified achievements). The score is intended as a screening aid, not a hiring decision. It tells an employer which applicants deserve priority review; it cannot tell an employer which candidates will be effective employees.
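The skills-coverage part of such a report can be sketched with plain set operations. This is a simplified illustration: the `skills_coverage` function is hypothetical, it assumes skills have already been extracted from both documents, and it handles only present and absent skills, not the implied skills a semantic model would infer.

```python
def skills_coverage(required: set[str], resume_skills: set[str]) -> dict:
    """Split required skills into present and absent, and report the covered fraction."""
    required_norm = {s.lower() for s in required}
    resume_norm = {s.lower() for s in resume_skills}
    present = required_norm & resume_norm
    absent = required_norm - resume_norm
    coverage = len(present) / len(required_norm) if required_norm else 0.0
    return {"present": sorted(present), "absent": sorted(absent), "coverage": coverage}

report = skills_coverage(
    required={"Python", "SQL", "Docker", "Kubernetes"},
    resume_skills={"python", "sql", "Git"},
)
```

A report like this tells a reviewer where to look first; it says nothing about how well the candidate actually uses the skills that are listed.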
Job Score™ operates in the reverse direction: given a candidate profile, it evaluates which of the employer's active job listings are most semantically aligned to the candidate's background. This powers the "jobs you might be interested in" recommendation layer for candidates and the "similar candidates" discovery layer for employers browsing the talent pool.
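The reverse direction can be sketched as a nearest-neighbour lookup: hold the candidate vector fixed and rank the listings by similarity to it. The vectors, job identifiers, and the `recommend_jobs` function below are illustrative assumptions, not the platform's data or API.

```python
import math

def cosine(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms if norms else 0.0

def recommend_jobs(candidate_vec: list[float],
                   job_vecs: dict[str, list[float]],
                   top_k: int = 2) -> list[str]:
    """Return the top_k job IDs ordered by similarity to the candidate vector."""
    ranked = sorted(job_vecs,
                    key=lambda job_id: cosine(candidate_vec, job_vecs[job_id]),
                    reverse=True)
    return ranked[:top_k]

# Toy 3-dimensional embeddings standing in for real model output.
candidate = [0.9, 0.1, 0.0]
listings = {
    "backend-engineer": [0.8, 0.2, 0.1],
    "data-analyst": [0.3, 0.9, 0.1],
    "office-manager": [0.0, 0.1, 0.9],
}
recommended = recommend_jobs(candidate, listings)
```

The same function with the roles swapped (one job vector against many candidate vectors) would give the "similar candidates" view.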
What the Research Actually Shows — and What It Does Not
The empirical evidence for semantic NLP-based matching in recruitment contexts is positive but modest. A 2022 meta-analysis across 14 studies examining AI-assisted candidate screening found a mean validity coefficient of approximately 0.41 between AI matching scores and subsequent hiring manager assessments. That is statistically significant and practically useful, but below the validity reported for structured interviews (0.51) or work sample tests (0.54) in predicting job performance. The two sets of figures are not directly comparable, because agreement with a hiring manager's assessment is a different criterion from job performance. Read together, they suggest that AI matching is a better-than-chance filtering tool, not a highly accurate predictor of individual candidate quality.
A 2023 study in the Journal of Applied Psychology examining semantic similarity-based resume screening found that the technique reduced gender and age bias in initial candidate shortlisting by approximately 18% compared to keyword-based screening — a meaningful finding given that keyword matching can inadvertently encode historical workforce demographics into screening criteria (e.g., by matching on terms historically used by one demographic group more than another). Semantic matching's focus on meaning rather than specific vocabulary provides some degree of buffer against this failure mode, though it does not eliminate algorithmic bias entirely.
The LinkedIn Talent Trends 2024 report documented 20% higher 12-month retention among hires made through skills-based matching processes versus title-based hiring — a finding that supports the underlying hypothesis of semantic matching but does not specifically validate any platform's implementation.
Sources: SHRM State of Recruiting 2022; Journal of Business and Psychology Vol. 37; LinkedIn Talent Trends 2024; meta-analysis: Van Iddekinge et al. (2023), Journal of Applied Psychology; Gartner HR Technology Hype Cycle 2023.
Honest Limitations: What Expertini's Matching Engine Cannot Do
How AI Matching Compares Across Platforms
| Platform | Matching Approach | Research Transparency | Algorithmic Bias Auditing | Validity Evidence Published |
|---|---|---|---|---|
| Expertini | Semantic NLP; cosine similarity; occupational taxonomy; fine-tuned embeddings | ✔ Methodology published (IEEE reference) | ✘ No external audit published | ◑ Partial — via cited research |
| LinkedIn | Skills graph; machine learning over 1B+ member data; skills inference from profile signals | ✘ Opaque (high-level descriptions only) | ◑ Internal fairness team; limited external audit | ✘ Not published externally |
| Indeed | Proprietary algorithm; keyword + behavioural signals; apply-rate feedback loops | ✘ Opaque | ✘ No published audit | ✘ Not published |
| HireVue | Video/audio analysis + NLP; ML over interview responses | ◑ Some methodology papers | ✔ External audits conducted (O'Neil Risk Consulting 2021) | ◑ Partial |
| Pymetrics | Neuroscience-based games + ML | ◑ Methodology described in papers | ✔ Fairness auditing built into product | ✔ Validity studies published |
The comparison reveals that most recruitment platforms — including LinkedIn and Indeed — provide minimal public transparency about their matching methodologies. Expertini's published research, while limited, represents more algorithmic transparency than most direct competitors offer. Dedicated AI screening platforms such as Pymetrics and HireVue have invested more substantially in formal validity studies and bias auditing than general-purpose job boards, which is worth acknowledging for employers using AI matching for high-stakes hiring decisions.
Sources: O'Neil Risk Consulting (2021) HireVue Algorithmic Audit; Pymetrics Bias Audit (2019); Expertini IEEE-referenced research; platform public documentation 2024.
The Honest Case for Using AI Matching — and the Conditions Under Which It Helps
AI matching in recruitment is not transformative technology — it is productivity technology. The honest case for its use is not that it finds better candidates than a skilled human recruiter, but that it allows a skilled human recruiter to review a larger candidate pool in less time without missing the strong candidates who would otherwise be buried in volume. This is a real and meaningful value proposition in hiring contexts where application volumes are high relative to recruiter capacity.
The conditions under which AI matching helps most: high-volume roles with standardised skill requirements (technology roles with specific tool/language requirements; clinical roles with certification requirements); experienced recruiters who treat match scores as a starting point for review rather than a final filter; job descriptions that are specific, accurate, and recently updated; and candidate pools that are linguistically and structurally diverse enough that keyword matching produces poor recall.
The conditions under which AI matching helps least: senior and leadership roles where cultural fit, strategic thinking, and interpersonal dynamics are the primary hiring criteria; roles in emerging fields where training data is thin; situations where the candidate pool is small and every applicant warrants full review regardless; and organisations with strong diversity hiring objectives where matching score-based filtering may inadvertently screen out candidates from underrepresented groups.
Expertini's matching engine is a useful tool for the first set of conditions and a potential liability for the second. Employers who understand this distinction will use it more effectively and more safely than those who treat it as a general-purpose hiring solution.
The Research Agenda: What Comes Next
The frontier questions in AI-assisted recruitment matching that the academic and product community is actively working on include:
Expertini's research engagement with these questions — through its IEEE-referenced work and ongoing platform development — reflects an institutional commitment to evidence-based improvement rather than feature-marketing. The gap between current AI matching performance and the theoretical ceiling of what's possible remains substantial, which is simultaneously a reason for epistemic humility about current capabilities and genuine optimism about the direction of travel.
Explore Expertini's AI Matching Tools
Resume Score™, Job Score™, and Interview Predictor are available to premium Expertini employers. Post jobs free on Expertini and receive candidates ranked by semantic alignment to your role requirements — with full control to override, adjust, or disregard the ranking based on your own judgement.
Frequently Asked Questions — AI Matching in Recruitment
What is semantic job matching and how is it different from keyword matching?
Keyword matching checks whether specific words appear in both a job description and a resume. Semantic matching uses neural language models to convert both documents into vector representations in a mathematical space where similar meanings cluster together — then measures how close the two vectors are, regardless of whether identical words appear. This means a resume describing "built scalable backend systems" is correctly identified as relevant to a job requiring "experience with high-availability distributed architecture," even though no keywords overlap. The technique is grounded in distributional semantics and implemented via cosine similarity between sentence-transformer embeddings. It is meaningfully better than keyword matching for most professional roles, and still imperfect — particularly for novel roles, non-linear career paths, and highly contextual cultural criteria.
Can AI matching introduce bias into hiring decisions?
Yes — and this is one of the most important limitations to understand. AI matching models trained on historical hiring data can encode historical hiring biases: if past hiring patterns in a sector favoured candidates from certain educational backgrounds or geographic regions, the matching model may score those profiles higher not because they are more qualified but because they pattern-match to historically successful hires. Semantic matching provides some mitigation relative to keyword matching — by focusing on meaning rather than specific vocabulary, it reduces some forms of vocabulary-based demographic correlation — but it does not eliminate algorithmic bias. Expertini has not published an external algorithmic fairness audit, which is an honest limitation of the platform's current transparency relative to specialists such as Pymetrics, which has undergone independent bias auditing.
How accurate is Expertini's matching compared to human recruiter judgement?
Meta-analytic evidence across AI matching systems in recruitment finds a mean validity coefficient of approximately 0.41 between AI match scores and hiring manager assessments. Squaring that coefficient (0.41² ≈ 0.17) shows that the matching score accounts for approximately 17% of the variance in how hiring managers evaluate candidates. This is better than chance and useful for volume screening, but roughly 83% of that variance is not captured by the algorithmic score. For comparison, structured interviews have a validity coefficient of approximately 0.51 for predicting job performance, and work sample tests reach approximately 0.54. AI matching is a useful first-pass filter, not a substitute for skilled human evaluation of qualified candidates.
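The variance figures follow from squaring the correlation coefficient. A short sketch of that arithmetic, using only the coefficients quoted above (note that the AI figure is measured against hiring manager assessments while the other two are measured against job performance, so the rows are indicative rather than directly comparable):

```python
def variance_explained(r: float) -> float:
    """Share of variance accounted for by a simple correlation: r squared."""
    return r ** 2

coefficients = [
    ("AI match score", 0.41),
    ("structured interview", 0.51),
    ("work sample test", 0.54),
]

for label, r in coefficients:
    share = variance_explained(r)
    print(f"{label}: r = {r:.2f}, explained = {share:.1%}, unexplained = {1 - share:.1%}")
```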
What is Resume Score™ and how should employers use it?
Resume Score™ is Expertini's AI-powered tool that evaluates a candidate's uploaded resume against a target job description, producing a structured report covering skills coverage, experience alignment, qualification match, and resume quality signals. It is designed as a screening prioritisation tool — helping employers identify which applicants from a large pool deserve early review — not as a hiring decision tool. The most effective use pattern is: sort applicants by Resume Score to identify the top 20–30% for priority review, then apply full human judgement to that shortlist rather than using the score to make binary accept/reject decisions.
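The prioritisation pattern described above can be sketched as a top-fraction cut. The `priority_shortlist` function and the applicant scores are hypothetical, and the 25% default is simply a midpoint of the 20–30% range; applicants outside the cut are deferred, not rejected.

```python
import math

def priority_shortlist(scores: dict[str, float], fraction: float = 0.25) -> list[str]:
    """Return the top `fraction` of applicants by score, for priority human review."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(scores, key=scores.get, reverse=True)
    if not ordered:
        return []
    keep = max(1, math.ceil(len(ordered) * fraction))  # always review at least one
    return ordered[:keep]

applicant_scores = {
    "app-1": 0.81, "app-2": 0.44, "app-3": 0.67, "app-4": 0.72,
    "app-5": 0.39, "app-6": 0.58, "app-7": 0.90, "app-8": 0.51,
}
first_review = priority_shortlist(applicant_scores)
```

The function returns an ordered list for human review; it makes no accept or reject decision.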
Does a strong resume score guarantee a good candidate?
No. The matching engine measures the semantic alignment between a resume's text representation and a job description's text representation. A strong score means the candidate's documented professional background aligns well with the documented role requirements. It says nothing about the candidate's motivation, work ethic, communication ability, cultural fit, creativity, or the dozens of dimensions that determine long-term performance. A well-written resume from a mediocre candidate will frequently outscore a poorly-written resume from a strong candidate — the matching engine is measuring text quality as well as professional alignment. Treat high match scores as a reason to prioritise review, not a reason to hire.