Hiring may seem like a simple act - evaluate a candidate, match them to a role, and extend an offer. But underneath this human ritual is a set of decisions that are remarkably hard to automate. We’re not just matching resumes to job descriptions; we’re asking a fundamentally human question: Who can I trust to build the future with me?
At Spottable, we’re building AI agents that don’t just read or listen - they observe, infer, simulate, and collaborate. These are not passive systems generating plausible sentences. They are active agents trained to solve one of the most consequential problems in business: hiring the right people without relying on brittle interviews or noisy proxies.
But there’s a catch.
Current LLMs, even the most advanced ones, hit a ceiling when it comes to solving this. And that ceiling isn’t just about tokens, compute, or clever prompting. It’s about the limits of passive intelligence.
Why LLMs Alone Can’t Hire
LLMs are probabilistic next-word engines trained on human-generated data. They excel at synthesis, translation, and contextual conversation. But they make several assumptions that break down in real-world hiring:
1. They Assume the World Is Static
Hiring is not a frozen document. It’s an evolving context where candidates react, companies shift direction, and expectations change mid-process. LLMs, by default, don’t adapt to interactional drift - the subtle shifts in tone, intent, and priority that define a high-stakes decision like hiring.
2. They Confuse Coherence with Competence
Candidates can sound great on paper - or in a generated interview answer. But sounding coherent isn’t the same as being competent. Traditional LLM evaluations reward linguistic fluency, not situational reasoning. Hiring demands the ability to assess judgment, reaction, and decision-making under ambiguity.
3. They Lack Agency
LLMs are not agents. They do not choose what to observe next. They don’t explore, revise, or take actions based on feedback loops unless embedded into a more complex system. Real hiring requires back-and-forth interaction, environmental feedback, and behavioral cues - none of which are handled well by frozen, stateless models.
What Makes Hiring Hard Technically?
From a systems perspective, hiring is a sequential, multi-modal, multi-agent decision problem.
- It involves latent variables (true ability, hidden intentions).
- It offers only sparse signals (a few minutes of demo work, one recorded answer).
- It is judged on long-term outcomes (will this person perform well in 6 months?).
- It demands intervention-aware modeling (how the hiring process itself changes the candidate's behavior).
This makes it a Partially Observable Markov Decision Process (POMDP), not a simple Q&A task. That’s why we say: hiring is not a prompt-completion problem - it’s a game of inference under uncertainty.
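To make “inference under uncertainty” concrete, here is a minimal sketch of the belief update at the core of any POMDP formulation: a Bayesian filter over a latent skill level. The states, observation model, and numbers are hypothetical illustrations, not our production model.

```python
import numpy as np

# Hypothetical POMDP fragment: the latent state is the candidate's true
# skill level, and we only see noisy task outcomes. All probabilities
# below are invented for illustration.
states = ["junior", "mid", "senior"]
belief = np.array([1 / 3, 1 / 3, 1 / 3])  # uniform prior over skill

# P(observation | state): rows are states, columns are (fail, pass).
obs_model = np.array([
    [0.7, 0.3],  # a junior passes a hard task 30% of the time
    [0.4, 0.6],
    [0.1, 0.9],
])

def update_belief(belief, observation):
    """Bayes' rule: posterior is proportional to likelihood times prior."""
    posterior = obs_model[:, observation] * belief
    return posterior / posterior.sum()

belief = update_belief(belief, observation=1)  # the candidate passed
print(dict(zip(states, belief.round(2))))      # mass shifts toward "senior"
```

One passed task already moves meaningful probability mass; a sequence of well-chosen tasks is what turns sparse signals into a usable belief.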
What Spottable Is Building
We’re not building a smarter chatbot. We’re building a high-agency hiring agent trained to:
- Proactively seek evidence (real work, demos, behavior).
- Reason under partial information.
- Adapt its strategies across roles, contexts, and candidate personas.
- Learn from outcome feedback, not just token prediction.
This architecture borrows ideas from reinforcement learning, active learning, simulation-based testing, and social signal processing.
RL Loops Over Prompt Chaining
In our system, the AI agent is not stateless. It:
- Forms a hypothesis about a candidate’s capabilities.
- Selects a task or challenge to test those hypotheses.
- Observes the execution and behavior of the candidate.
- Revises its belief model and adapts the next interaction.
- Simulates team fit and predicts downstream risk.
This loop - observe, intervene, update - is fundamentally a reinforcement learning cycle. Not in the game-playing sense, but in the deeply human sense of observation-led judgment.
You can't build this with just better prompts. You need memory. You need belief states. You need action-selection policies that work under uncertainty.
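What does that look like in code? A stripped-down version of the loop might be the following sketch; BeliefState, propose_task, and run_task are hypothetical stand-ins for the components described above, not our actual interfaces.

```python
from dataclasses import dataclass, field

@dataclass
class BeliefState:
    """Structured candidate profile - not just a token history."""
    skill_estimate: float = 0.5                  # 0.5 = maximally uncertain
    evidence: list = field(default_factory=list)

    def uncertainty(self) -> float:
        # Highest (1.0) at an estimate of 0.5, lowest near 0 or 1.
        return 1.0 - abs(self.skill_estimate - 0.5) * 2

def propose_task(belief: BeliefState) -> str:
    # Intervene: probe harder when we already believe skill is high.
    return "design_review" if belief.skill_estimate > 0.6 else "bug_hunt"

def run_task(task: str) -> float:
    # Observe: in production this would be the candidate's actual
    # performance on the task; a fixed strong outcome stands in here.
    return 1.0

belief = BeliefState()
while belief.uncertainty() > 0.2:        # stop once confident enough
    task = propose_task(belief)          # intervene
    outcome = run_task(task)             # observe
    belief.evidence.append((task, outcome))
    # Update: move the estimate toward the observed outcome.
    belief.skill_estimate += 0.5 * (outcome - belief.skill_estimate)
```

In practice the update would be a full Bayesian posterior rather than a moving average, but the shape of the loop is the point: each observation changes what the agent does next.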
Measuring What Truly Matters
We train our agents not to look for perfection, but for evidence of promise - grit, clarity of thought, learning ability, cultural resonance. These are not checkboxes on a resume. They’re emergent traits revealed through proof-of-work tasks, real-time reasoning, and moments of struggle.
This is where current LLM-based systems fail: they collapse all human qualities into textual similarity. But great hires aren't similar to the training set. They're outliers - creative, nonlinear thinkers whose value is missed by shallow pattern matchers.
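For contrast, here is roughly what similarity-based screening reduces to. embed is a placeholder for any off-the-shelf text-embedding model; the point is the shape of the computation, not the specific model.

```python
import numpy as np

# What "textual similarity" screening boils down to: embed the resume and
# the job description, then rank candidates by cosine similarity.
# embed() is a placeholder, not a real embedding API.
def embed(text: str) -> np.ndarray:
    rng = np.random.default_rng(abs(hash(text)) % 2**32)
    return rng.normal(size=128)  # stand-in for a learned text embedding

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

score = cosine(embed("candidate resume"), embed("job description"))
# A high score means the resume *sounds like* the job description:
# fluent similarity, which is exactly the signal that misses outliers.
```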
The New Stack: LLMs + RL + Human Feedback Loops
So what’s the architecture behind Spottable’s agent?
- LLMs for language understanding: We still use LLMs as the backbone for parsing candidate input, generating responses, and simulating soft prompts.
- Memory and Belief Modules: Agents hold a candidate state - not just a token history, but a structured profile of what has been inferred, observed, and tested.
- Task Planning Engine: Rather than serve static questions, the agent selects the next task adaptively based on uncertainty and entropy in its current belief model (a simplified sketch follows this list).
- Feedback Integration: We close the loop by measuring real-world hiring outcomes, peer evaluations, and performance data (when available) to train reward models.
- Human-AI Symbiosis: Recruiters don’t disappear - they collaborate. The agent surfaces evidence, flags risk, and explains its rationale in human-understandable terms.
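Here is the simplified task-planner sketch promised above: it scores each candidate task by the expected posterior entropy of the belief from the earlier POMDP example and picks the most informative one. The task list and pass probabilities are invented.

```python
import numpy as np

def entropy(p: np.ndarray) -> float:
    """Shannon entropy (bits) of a discrete distribution."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Hypothetical tasks, each with P(pass | skill state) for the
# (junior, mid, senior) states of the earlier sketch.
tasks = {
    "easy_quiz":   np.array([0.90, 0.95, 0.99]),  # everyone passes: weak signal
    "code_review": np.array([0.20, 0.60, 0.90]),  # discriminates between states
}

def expected_posterior_entropy(belief, pass_probs) -> float:
    """Entropy we expect to be left with after seeing the task outcome."""
    total = 0.0
    for outcome_probs in (pass_probs, 1 - pass_probs):  # pass, then fail
        p_outcome = float((belief * outcome_probs).sum())
        posterior = belief * outcome_probs / p_outcome
        total += p_outcome * entropy(posterior)
    return total

belief = np.array([1 / 3, 1 / 3, 1 / 3])
best = min(tasks, key=lambda t: expected_posterior_entropy(belief, tasks[t]))
print(best)  # "code_review" - the task expected to shrink uncertainty most
```

Minimizing expected posterior entropy is the classic information-gain criterion from active learning; a production planner would also weigh candidate time and task cost.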
This Is Not Just “Assessment Tech”
We’re not out to replace MCQs or interviews with AI-flavored versions. We’re building an intelligent co-pilot that understands people in motion, not just on paper.
Hiring is the last frontier of human judgment in business. It requires taste, trust, and intuition. We’re building systems that don’t mimic these superficially but develop them over time - through iterative feedback, simulation, and alignment with organizational values.
Why This Is a Billion-Dollar Problem
Hiring is not just broken - it’s expensive, error-prone, and biased.
Companies spend billions on sourcing, interviewing, and assessing - with poor outcomes:
- 40% of new hires underperform.
- Hiring cycles stretch over 45 days.
- Interviews are often poor predictors of on-the-job success.
If we can compress signal collection, improve predictive accuracy, and surface high-agency candidates earlier, we don’t just improve hiring. We unlock human potential at scale.
Final Thoughts
Most AI systems today are passive. They respond. They reflect. They summarize.
But the future will be different.
The most valuable agents will be those that act, adapt, and intervene - not just in pixels or tokens, but in real-world human workflows. That’s what Spottable is building.
And that’s why hiring is not just a business function. It’s a hard AI problem. One that requires us to go beyond LLMs, beyond benchmarks, and into the messy, beautiful territory of understanding people - as they are, not as they present.
This is a long journey. But it’s one worth taking.
We’re betting on it.