Assessment-integrity · Jun 20, 2026 · 10 minute read

False Positives in Candidate Fraud Flags: A Defensible Runbook

A practical, audit-ready process for treating a flagged candidate as potentially honest until evidence proves otherwise, without slowing the funnel or increasing risk.

Elena Rostova

IO Psychologist & Assessment Lead

Elena designs fair, predictive coding assessments and calibration frameworks.

A fraud flag is a trigger for controlled resolution, not a verdict. Your defensibility comes from corroboration, documentation, and a consistent appeal path.

The day a clean candidate gets flagged anyway

The offer is drafted for a senior engineer working on regulated data access. Ten minutes before the final panel, an integrity check flips to "high risk" because the candidate's network looks like a corporate VPN exit node and their webcam feed shows brief compression artifacts. Recruiting wants to keep the slot. Security wants to stop a potential proxy. Legal wants to avoid a defamation-adjacent accusation. Audit wants to know which control made the call and where the evidence is. The fastest path that reduces risk is not a gut decision. It is a predefined step-up path: pause privileged steps, collect a minimal Evidence Pack, run a bounded set of additional checks, and make a documented decision that is consistent across candidates.

Route fraud flags into step-up verification vs manual adjudication with clear triggers.
Issue a consistent candidate message that does not disclose detection methods.
Produce an Evidence Pack that explains the decision without oversharing biometric data.

Why false positives are a security and legal problem, not just a UX issue

False positives create two failure modes: (1) you wrongfully reject strong candidates, and (2) you normalize sloppy decision-making that cannot be defended when a true fraud slips through. Both show up later as audit findings, escalation churn, and brand damage. One data point helps frame why you need a controlled process. Checkr reports that 31% of hiring managers say they have interviewed a candidate who later turned out to be using a false identity. Directionally, that implies identity deception is common enough to warrant controls in many orgs. It does not prove your specific funnel's fraud rate, nor does it specify role type, industry, or the effectiveness of any single detection method. You still need local measurement and adjudication discipline.

What is the documented standard for adverse action based on an integrity flag?
Can you reproduce the decision from logs and artifacts without relying on a single reviewer's memory?
What is the appeal or second-look mechanism, and how do you prevent bias from entering it?

Ownership, automation, and systems of truth

Put this in writing before you tune models or add more signals. When ownership is vague, reviewers improvise, and that is where false positives turn into inconsistent treatment. Recommended operating model: Recruiting Ops owns the workflow and SLAs, Security owns control design and monitoring, the Hiring Manager owns job-relevant assessment outcomes, and Legal or Compliance reviews the policy language and adverse-action thresholds. Automation should gate and route, not convict. Systems of truth should be explicit. The ATS is the system of record for status and decisions. The verification service is the system of record for identity and liveness outcomes. The interview and coding platforms are the systems of record for performance artifacts. Everything writes back to the ATS with immutable timestamps and reviewer identity.

Automated: risk scoring, step-up verification prompts, throttling repeated attempts, Evidence Pack assembly.
Manual: adjudication for high risk, appeal reviews, exceptions for accessibility or documented constraints.
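
One way to picture the write-back requirement is as an immutable event record with a stable key, so a retried write does not create a duplicate ATS entry. This is a minimal sketch, not a description of any particular ATS API: the `IntegrityEvent` fields, `make_event`, and `idempotency_key` are hypothetical names chosen for illustration.

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class IntegrityEvent:
    candidate_id: str
    event: str          # e.g. "integrity.step_up_requested"
    source_system: str  # which system of record produced the outcome
    reviewer_id: str    # "system" for automated routing
    occurred_at: str    # ISO 8601 UTC timestamp


def make_event(candidate_id, event, source_system, reviewer_id="system", now=None):
    """Build an event stamped with the current UTC time (or an injected time for tests)."""
    now = now or datetime.now(timezone.utc)
    return IntegrityEvent(candidate_id, event, source_system, reviewer_id, now.isoformat())


def idempotency_key(ev):
    """Hash of the full event, so identical retries map to the same key."""
    payload = json.dumps(asdict(ev), sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

The receiving side would need to reject or ignore a key it has already stored; that part depends on your ATS and is not shown here.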

What is a fraud flag in hiring workflows?

A fraud flag is not a single event. It is a classification produced from one or more integrity signals that indicate elevated likelihood of identity mismatch, proxy participation, or policy circumvention. Your governance task is to define what the flag can and cannot do in the funnel. The control objective is simple: do not allow unverified or high-uncertainty identities into privileged steps (live technical interviews, access to proprietary codebases, or offer stages) without either resolving uncertainty or logging an approved exception.

Flags should trigger least-cost resolution first (step-up), then human adjudication.
Candidates should not be told which signal triggered the flag.
Decisions must be explainable without exposing sensitive detection methods.
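
The control objective above amounts to a gate in front of privileged stages. A minimal sketch follows, assuming hypothetical stage names and a boolean that says whether identity uncertainty has been resolved; neither comes from a real ATS schema.

```python
# Hypothetical stage names standing in for the privileged steps named above.
PRIVILEGED_STAGES = {"live_technical_interview", "codebase_access", "offer"}


def may_enter_stage(stage, identity_resolved, approved_exception_id=None):
    """Return (allowed, reason). A flag never blocks a non-privileged stage."""
    if stage not in PRIVILEGED_STAGES:
        return True, "stage is not privileged"
    if identity_resolved:
        return True, "identity uncertainty resolved"
    if approved_exception_id:
        return True, f"approved exception {approved_exception_id}"
    return False, "unresolved identity uncertainty and no logged exception"
```

Returning the reason alongside the decision keeps the gate explainable in logs without referring to the underlying detection signal.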

A step-by-step runbook for handling false positives

Use a three-lane model that keeps the funnel moving while controlling risk. Start with the conclusion: most flags should not cause rejection. They should cause step-up verification or a short manual review with a standardized evidence template.

Step 1: Freeze privileged progression, not the entire candidacy. Move the candidate to a temporary "Integrity Review" stage in the ATS with an SLA (for example, same business day) and a clear internal owner.
Step 2: Confirm the basics automatically. Check for low-cost inconsistencies: document match result, face match score banding, liveness pass/fail, and whether the same identity was verified earlier in the pipeline. Record outcomes, not raw biometrics.
Step 3: Run step-up verification for medium risk. Examples: a second liveness check, a short voice phrase, or a re-capture under different lighting. Keep it bounded to 2-3 minutes so you do not create candidate drop-off.
Step 4: Build an Evidence Pack. Attach time-stamped events, decision points, and reviewer notes. Include what was observed and which policy threshold was met. Exclude raw biometric payloads if your posture is zero-retention biometrics or limited retention.
Step 5: Manual adjudication for high risk or unresolved step-up. Require two-person integrity review (Ops plus Security or Compliance) for adverse decisions. This is your control against single-reviewer bias and overconfidence.
Step 6: Offer a controlled appeal path. Provide a neutral message and allow a single re-attempt under stricter verification, unless you have strong corroborating evidence of deliberate deception.
Step 7: Close the loop. Tag the case as "false positive" or "confirmed fraud" and feed it back into tuning. Track reviewer fatigue signals like queue age and rework rate to prevent drift into blanket rejections.
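
The Evidence Pack from Step 4 can be sketched as a small record that stores outcomes and refuses raw biometric payloads. This is an illustrative shape only: the field names and the `RAW_BIOMETRIC_KEYS` set are assumptions, not a prescribed schema.

```python
from dataclasses import dataclass, field

# Assumed key names for raw biometric payloads; adjust to your own schema.
RAW_BIOMETRIC_KEYS = {"face_image", "voice_sample", "id_document_scan"}


@dataclass
class EvidencePack:
    candidate_id: str
    policy_version: str
    threshold_met: str                                  # which policy threshold applied
    events: list = field(default_factory=list)          # (timestamp, name, outcome)
    reviewer_notes: list = field(default_factory=list)  # (reviewer_id, note)

    def add_event(self, timestamp, name, outcome):
        """Record an outcome such as {"result": "pass"}; reject raw biometric payloads."""
        if isinstance(outcome, dict) and RAW_BIOMETRIC_KEYS.intersection(outcome):
            raise ValueError("Evidence Packs store outcomes, not raw biometrics")
        self.events.append((timestamp, name, outcome))
```

Rejecting raw payloads at write time is one way to make a zero-retention posture hard to violate by accident.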

Limit the response to step-up verification, not rejection, when the flag is based on a single ambiguous signal (for example, poor lighting causing liveness uncertainty).
Require corroboration for adverse action (for example, identity mismatch plus repeated failed liveness attempts plus inconsistent interview behavior).
Use time-boxed SLAs so reviews do not silently become de facto rejections.
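
The corroboration rule can be expressed as a single check that combines signal conditions with procedural requirements. The sketch below assumes normalized signal names like those in the example policy in this post, and treats "two-person signoff" as two distinct reviewer IDs; both are illustrative choices.

```python
def adverse_action_allowed(signals, evidence_pack_attached, signoffs, notice_sent):
    """True only when an identity failure is corroborated AND the paperwork is complete."""
    identity_failure = (
        signals.get("id_doc_match") == "fail" or signals.get("liveness") == "fail"
    )
    corroboration = (
        signals.get("assessment_integrity") == "blocked"
        or signals.get("attempt_count", 0) >= 3
    )
    # Two distinct reviewers are required; the same person signing twice does not count.
    procedural = bool(evidence_pack_attached) and len(set(signoffs)) >= 2 and bool(notice_sent)
    return identity_failure and corroboration and procedural
```

For example, a lone liveness failure returns `False` here regardless of how many reviewers sign, which is the intended behavior for single-source signals.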

Use neutral language: "We need to complete an additional verification step before proceeding."
Do not mention fraud, cheating, VPNs, or specific detection methods.
Document the exact message template used and when it was sent.
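
A simple lint on candidate-facing templates can catch accidental disclosure before a message is sent. This is a rough sketch: the term list extends the guidance above with a few extra words ("proxy", "liveness", "deepfake") that are assumptions, and a real list would need review by Legal.

```python
import re

NEUTRAL_TEMPLATE = "We need to complete an additional verification step before proceeding."

# "fraud", "cheat", and "vpn" come from the guidance above; the rest are assumed additions.
BANNED_TERMS = ("fraud", "cheat", "vpn", "proxy", "liveness", "deepfake")


def disclosed_terms(message):
    """Return banned terms that start a word in the message (case-insensitive)."""
    lowered = message.lower()
    return [t for t in BANNED_TERMS if re.search(r"\b" + re.escape(t), lowered)]
```

Because the match is on word prefixes, "cheating" and "VPNs" are caught along with the base terms.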

Policy config: false-positive handling and appeal controls

Below is an example policy-as-config artifact that Security, General Counsel (GC), and Recruiting Ops can jointly approve. It encodes routing, SLAs, Evidence Pack requirements, and an appeal path without embedding sensitive model internals.

Anti-patterns that make fraud worse

These three patterns increase legal risk and make attackers more effective by teaching them your thresholds:

Zero-tolerance auto-reject on any single flag, which guarantees false positives and trains candidates to mask behavior rather than comply.
Unstructured reviewer discretion in Slack, where decisions are inconsistent and the record is not audit-grade.
Over-sharing flag reasons with candidates, which leaks detection methods and drives adversarial adaptation.

How to measure false positives without inventing ROI

Measure process quality, not vanity conversion. Start with operational metrics you can defend: time-to-resolution for integrity reviews, percent of flags resolved by step-up verification, appeal rate, and adjudication overturn rate. Then add risk signals: number of hires with unresolved identity uncertainty, and counts of repeat attempts per candidate. If you need cost framing, keep it conservative and sourced. SHRM notes replacement cost estimates can range from 50% to 200% of annual salary depending on role. Directionally, that implies a bad hire is expensive enough to justify controls. It does not prove your org will hit the high end, and it does not attribute replacement cost to fraud specifically. Use it as a budgeting context, not a promised savings number.

Policy version in force and last approval date (Security and GC).
Counts by lane: cleared, step-up cleared, manual review cleared, adverse action.
Median and P95 time-to-resolution for integrity reviews.
Exception log with approver and justification.
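
These reporting items can be computed from a flat export of closed cases. The sketch below assumes each case is a dict with hypothetical `lane` and `hours_to_resolve` keys, and uses the nearest-rank definition for P95; other percentile definitions would give slightly different values on small samples.

```python
import math
from statistics import median


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def review_metrics(cases):
    """Summarize closed integrity reviews; expects a non-empty list of case dicts."""
    hours = [c["hours_to_resolve"] for c in cases]
    counts_by_lane = {}
    for c in cases:
        counts_by_lane[c["lane"]] = counts_by_lane.get(c["lane"], 0) + 1
    return {
        "median_hours": median(hours),
        "p95_hours": percentile_nearest_rank(hours, 95),
        "counts_by_lane": counts_by_lane,
        "step_up_cleared_share": counts_by_lane.get("step_up_cleared", 0) / len(cases),
    }
```

With only a handful of cases, P95 is effectively the slowest case, so treat it as a trend indicator rather than a precise figure until volume grows.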

Where IntegrityLens fits

IntegrityLens AI is the first hiring pipeline that combines a full Applicant Tracking System with advanced biometric identity verification, AI screening interviews, and technical assessments so you can manage the entire lifecycle in one secure platform. For false positives, it matters because you can route flags inside the ATS workflow, step-up verify in under 3 minutes (typical end-to-end document plus voice plus face), and attach Evidence Packs to each decision for audit. TA leaders and recruiting ops use it to keep the funnel moving; CISOs use it to enforce Risk-Tiered Verification and access controls; Legal and Audit get consistent, exportable records. Key capabilities tied to this runbook:

ATS workflow with gated stages and clear status transitions.
Biometric identity verification with step-up paths and Zero-Retention Biometrics options.
Fraud detection signals that drive routing, not silent rejections.
24/7 AI screening interviews and coding assessments (40+ languages) with defensible logs.
Integration hooks like Idempotent Webhooks for write-back and audit trails.

Sources

31% of hiring managers report interviewing someone later found using a false identity (Checkr, 2025): https://checkr.com/resources/articles/hiring-hoax-manager-survey-2025
1 in 6 applicants to remote roles showed signs of fraud (Pindrop): https://www.pindrop.com/article/why-your-hiring-process-now-cybersecurity-vulnerability/
Replacement cost estimates range 50% to 200% of annual salary depending on role (SHRM): https://www.shrm.org/in/topics-tools/news/blogs/why-ignoring-exit-data-is-costing-you-talent

Treat them as directional context for control design, not as predictive rates for your funnel.
Validate locally by instrumenting flags, outcomes, and adjudication quality over time.

Key takeaways

Treat fraud flags as risk signals, not verdicts, and require corroboration before adverse action.
Design an adjudication lane with clear owners, SLAs, and an appeal path to reduce legal and reputational risk.
Use step-up verification and Evidence Packs to resolve uncertainty fast without leaking attacker feedback.
Measure false positives and reviewer fatigue as first-class funnel health metrics, not anecdotal pain.

False-positive adjudication policy (Risk-Tiered Verification) (YAML)

A policy-as-code template that routes fraud flags into step-up verification or manual adjudication with SLAs, Evidence Pack requirements, and an appeal mechanism.

Designed for CISO and GC sign-off: minimal disclosure, clear adverse-action thresholds, and audit-ready decision logging.

version: "2026-06-20"
policy_name: "false-positive-handling"
system_of_record:
  candidate_status: "ATS"
  identity_results: "IntegrityLens-Verify"
  assessment_artifacts: "IntegrityLens-Assess"

slas:
  integrity_review:
    target_hours: 8
    max_hours: 24

risk_routing:
  # Inputs are normalized signals, not raw biometrics.
  inputs:
    - name: "id_doc_match"
      values: ["pass", "fail", "inconclusive"]
    - name: "liveness"
      values: ["pass", "fail", "inconclusive"]
    - name: "face_match_band"
      values: ["high", "medium", "low", "unknown"]
    - name: "attempt_count"
      type: "integer"
    - name: "assessment_integrity"
      values: ["clean", "suspicious", "blocked"]

  lanes:
    - name: "clear"
      when:
        all:
          - { id_doc_match: "pass" }
          - { liveness: "pass" }
      action:
        ats_stage: "Proceed"
        log_event: "integrity.cleared"

    - name: "step_up"
      when:
        any:
          - { liveness: "inconclusive" }
          - { face_match_band: "unknown" }
      guardrails:
        max_attempts: 2
        candidate_message_template: "additional-verification-neutral-v1"
      action:
        ats_stage: "Integrity Review"
        require_step_up:
          - "liveness_recheck"
          - "voice_phrase"
        evidence_pack_required: true
        log_event: "integrity.step_up_requested"

    - name: "manual_adjudication"
      when:
        any:
          - { id_doc_match: "fail" }
          - { liveness: "fail" }
          - { attempt_count: ">=3" }
          - { assessment_integrity: "blocked" }
      action:
        ats_stage: "Integrity Review"
        reviewer_quorum: 2
        reviewers:
          - "RecruitingOps"
          - "SecurityOrCompliance"
        evidence_pack_required: true
        log_event: "integrity.manual_review_required"

adverse_action:
  # Do not reject solely on a single ambiguous signal.
  allowed_when:
    all:
      - any:
          - { id_doc_match: "fail" }
          - { liveness: "fail" }
      - any:
          - { assessment_integrity: "blocked" }
          - { attempt_count: ">=3" }
  require:
    - "evidence_pack"
    - "two_reviewer_signoff"
    - "candidate_notice_sent"

appeal_flow:
  enabled: true
  max_appeals: 1
  appeal_requires:
    - "fresh_step_up_verification"
    - "new_reviewer_not_in_initial_decision"
  log_event: "integrity.appeal_opened"

data_controls:
  retention_days:
    evidence_pack: 90
    raw_biometrics: 0
  access:
    role_based_access_control: true
    export_requires: "SecurityApproval"
  encryption: "AES-256"
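
The lane conditions in this policy can overlap (for example, liveness "pass" together with assessment_integrity "blocked" matches both the clear and manual lanes), and the config does not state an evaluation order. The sketch below is one possible reading, not the behavior of any specific policy engine: it assumes the most restrictive lane wins and that any combination the lanes do not cover falls back to a human.

```python
def route(signals):
    """Return the lane name for a dict of normalized signals.

    Assumption: lanes are checked from most to least restrictive.
    """
    if (
        signals.get("id_doc_match") == "fail"
        or signals.get("liveness") == "fail"
        or signals.get("attempt_count", 0) >= 3
        or signals.get("assessment_integrity") == "blocked"
    ):
        return "manual_adjudication"
    if (
        signals.get("liveness") == "inconclusive"
        or signals.get("face_match_band") == "unknown"
    ):
        return "step_up"
    if signals.get("id_doc_match") == "pass" and signals.get("liveness") == "pass":
        return "clear"
    # Not covered by any lane (e.g. id_doc_match "inconclusive"): assumed to go to a reviewer.
    return "manual_adjudication"
```

Whatever precedence you choose, writing it down next to the policy avoids two reviewers reading the same config differently.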

Outcome proof: What changes

Before

Flags were handled ad hoc in email and chat, leading to inconsistent decisions, slow escalations, and weak audit trails for adverse actions.

After

Integrity flags were routed into step-up verification or manual adjudication with a standard Evidence Pack and an appeal path, all tied to ATS stage transitions.

Governance Notes: Legal and Security signed off because the process minimizes data collection (including a zero-retention option for raw biometrics), enforces role-based access controls, time-boxes retention for Evidence Packs, and documents an appeal flow with separation of duties. Decisions are logged back to the ATS as the system of record, limiting informal channels and making adverse action standards consistent.
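
Time-boxed retention can be enforced with a purge-eligibility check run on a schedule. This is a minimal sketch that mirrors the 90-day and 0-day values from the example policy; the function name and the idea of a scheduled sweep are assumptions, and your Legal requirements should set the actual windows.

```python
from datetime import datetime, timedelta, timezone

# Mirrors data_controls.retention_days in the example policy.
RETENTION_DAYS = {"evidence_pack": 90, "raw_biometrics": 0}


def is_due_for_purge(artifact_type, created_at, now=None):
    """True once an artifact has reached the end of its retention window."""
    now = now or datetime.now(timezone.utc)
    return now - created_at >= timedelta(days=RETENTION_DAYS[artifact_type])
```

A retention of 0 days means raw biometrics are eligible for deletion immediately, which matches a zero-retention posture.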

Implementation checklist

Define a single source of truth for candidate status (ATS) and write back every decision with timestamps.
Create a two-tier policy: automated step-up for medium risk, manual adjudication for high risk.
Standardize an Evidence Pack template for every flagged case (what, when, who, and why).
Implement an appeal flow with a second reviewer and documented rationale.
Tune controls to minimize candidate harm: least-privilege data access, short retention, and consistent messaging.
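
The appeal item in this checklist can be made mechanical with a small pre-check. The sketch below assumes a single allowed appeal and a reviewer who did not take part in the initial decision, following the example policy; the function and argument names are hypothetical.

```python
def appeal_check(prior_appeals, initial_reviewers, appeal_reviewer,
                 fresh_step_up_done, max_appeals=1):
    """Return the reasons an appeal cannot proceed; an empty list means it can."""
    problems = []
    if prior_appeals >= max_appeals:
        problems.append("appeal limit reached")
    if appeal_reviewer in initial_reviewers:
        problems.append("appeal reviewer took part in the initial decision")
    if not fresh_step_up_done:
        problems.append("fresh step-up verification not completed")
    return problems
```

Returning every unmet condition, rather than stopping at the first, gives the documented rationale the checklist asks for.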

Questions we hear from teams

How do you tell the difference between a false positive and a sophisticated attacker?

You do not rely on a single signal. Use corroboration plus step-up verification. If a candidate passes document, liveness, and a controlled recheck while producing consistent assessment artifacts, you treat the original flag as noise and record it as a false positive for tuning.

Should we ever auto-reject on an integrity flag?

Only when your policy defines corroborated conditions that meet an adverse-action standard, and you can produce an Evidence Pack plus two-person signoff. Auto-reject on ambiguous or single-source signals is the fastest way to create wrongful rejections and audit gaps.

What if a candidate refuses step-up verification?

Route to manual review and offer a single alternative path if reasonable (for example, a scheduled live verification with support). If they still refuse, document the refusal and apply the pre-approved policy consistently.

How long should we retain Evidence Packs?

Retain only what you need for audit and dispute resolution, and time-box it. Many teams choose a short retention window (for example, 60-90 days) with role-based access, but your Legal and regulatory requirements should set the final standard.
