
The case for ranking calls, not sampling them

Random sampling misses the calls that actually matter. We rebuilt the supervisor console around a risk score — and stopped pretending QA was a numbers game.

Table of Contents
  • 1. Introduction
  • 2. What changed in the past 18 months
  • 3. What goes into the score
  • 4. The supervisor console
  • 5. The objections we heard
  • 6. What the new system found
  • 7. What we will not do

Introduction

For two decades the contact-center industry agreed on a process: a supervisor would randomly sample a small percentage of recorded calls — typically 2 to 5 per agent per week — score them against a 30-row rubric, and roll the scores up into a coaching report. The whole industry was built around this practice. Vendors sold scorecards. Auditors validated rubrics. Conferences had tracks for it.

The math never worked. Take a 1,000-agent contact center handling 200,000 calls a week. Sampling 2% per agent gives you 4,000 scored calls, which sounds like a lot until you realise that the sample is uniformly random across calls that are 99% benign. Sampling 4,000 random conversations to find the 60 calls where the agent committed compliance violations is the definition of a needle in a haystack: at a 2% rate, the sample can be expected to contain about one of those 60. By the time the haystack has been sorted, the agent has moved on, the customer has churned, and the violation has compounded.
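A quick way to check that arithmetic is the sketch below. It assumes the figures from the example (200,000 calls a week, a 2% uniform random sample, 60 violating calls) and treats the sample as drawn without replacement; the function names are illustrative, not part of any Ajoxi API.

```python
from math import prod

def expected_hits(total_calls: int, bad_calls: int, sample_size: int) -> float:
    """Expected number of bad calls that land in a uniform random sample."""
    return bad_calls * sample_size / total_calls

def prob_sample_misses_all(total_calls: int, bad_calls: int, sample_size: int) -> float:
    """Chance that the sample contains none of the bad calls.

    Hypergeometric P(X = 0), written as a product over the bad calls:
    each bad call in turn must fall outside the sampled set.
    """
    unsampled = total_calls - sample_size
    return prod((unsampled - j) / (total_calls - j) for j in range(bad_calls))

TOTAL_CALLS = 200_000                 # weekly call volume from the example
SAMPLE_SIZE = TOTAL_CALLS * 2 // 100  # 2% sample, i.e. 4,000 calls
BAD_CALLS = 60                        # violating calls from the example

hits = expected_hits(TOTAL_CALLS, BAD_CALLS, SAMPLE_SIZE)
miss = prob_sample_misses_all(TOTAL_CALLS, BAD_CALLS, SAMPLE_SIZE)
print(f"sampled calls: {SAMPLE_SIZE}")
print(f"expected violations in sample: {hits:.1f}")
print(f"chance the sample contains none of them: {miss:.0%}")
```

Under these assumptions the sample is expected to contain barely more than one violating call, and in a meaningful share of weeks it contains none at all.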

The point of QA was never to score the average call. It was to find the calls that mattered — the failures, the saves, the edge cases, the conversations where the agent did something exceptional or alarming. Random sampling is structurally bad at finding any of those.

What changed in the past 18 months

Two things changed enough to make the old model obviously broken. First, transcription quality crossed a usefulness threshold — across most major languages, transcripts are now reliable enough to score with software, not just with human ears. Second, large language models got good enough at supervised classification that a custom-trained model can score a call on 30 dimensions in 4 seconds for less than a penny.

Together these meant that for the first time in contact center history, every single call could be scored against the rubric — not 2%, not 5%, but 100%. Once every call is scored, the question stops being "which calls do we sample?" and becomes "which calls deserve a human supervisor's attention?"

That is the question worth solving, and it is a ranking problem, not a sampling problem.

What goes into the risk score

We score every call on three independent dimensions, then combine them into a single rank that the supervisor sees in their console.

  • Compliance risk — did the call contain language that triggers regulatory exposure? Mini-Miranda, TCPA consent, debt-collection FDCPA boundaries, protected health disclosures. This dimension is binary-ish — most calls score near zero, a small tail scores high.
  • Outcome risk — is this customer likely to churn, complain, or escalate as a result of this call? Combines sentiment trajectory, unresolved-issue signals, explicit complaint language, and the customer's account-tier value.
  • Coaching value — would a supervisor watching this call learn something they can teach? High-rank coaching calls are the unusual saves, the clean handoffs, and the controlled de-escalations. They are not failures; they are exemplars.

The three dimensions are not weighted equally and the weights are not the same across customers. A debt-collection operator weights compliance risk heaviest. A high-touch enterprise SaaS support team weights outcome risk. A training-heavy onboarding team weights coaching value. The weights are exposed in the supervisor settings and we set sensible defaults per industry.
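One way to picture the combination step is a weighted average of the three sub-scores. This is a sketch under assumptions: the post does not say the combination is linear, the sub-scores are assumed to be normalised to the range 0 to 1, and the weight values and industry keys below are invented for illustration rather than being the shipped defaults.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SubScores:
    compliance: float  # regulatory exposure, assumed 0.0 to 1.0
    outcome: float     # churn / complaint / escalation risk, assumed 0.0 to 1.0
    coaching: float    # teaching value of the call, assumed 0.0 to 1.0

# Hypothetical per-industry defaults; the real defaults are not published here.
DEFAULT_WEIGHTS: dict[str, dict[str, float]] = {
    "debt_collection": {"compliance": 0.60, "outcome": 0.25, "coaching": 0.15},
    "saas_support":    {"compliance": 0.15, "outcome": 0.60, "coaching": 0.25},
    "onboarding":      {"compliance": 0.15, "outcome": 0.25, "coaching": 0.60},
}

def combined_score(scores: SubScores, weights: dict[str, float]) -> float:
    """Weighted average of the three sub-scores (weights need not sum to 1)."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted = (
        weights["compliance"] * scores.compliance
        + weights["outcome"] * scores.outcome
        + weights["coaching"] * scores.coaching
    )
    return weighted / total_weight

# The same call ranks very differently depending on who is looking at it.
call = SubScores(compliance=0.9, outcome=0.2, coaching=0.1)
for industry, weights in DEFAULT_WEIGHTS.items():
    print(industry, round(combined_score(call, weights), 3))
```

Dividing by the total weight keeps the combined score on the same scale as the sub-scores even when a supervisor edits one weight without rebalancing the others.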

The supervisor console: a ranked queue

The console looks deliberately different from the old "random sample queue" UI. Instead of a paginated list of recent calls, it is a single ranked queue, sorted by combined risk score, refreshing every two minutes. The top of the queue is the 20 or so calls that need attention today. Everything below is a long tail.

Each call card carries the three sub-scores, a 90-second summary of what the call was about, and the specific snippets the model flagged. A supervisor can listen to the snippet without listening to the full call. If the snippet is the whole story — and most of the time it is — the supervisor confirms or rejects the flag in 30 seconds and moves on.
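The queue and call card described above might be modelled roughly as follows. This is a minimal sketch, not the console's actual data model: the `CallCard` fields and the `top_of_queue` helper are hypothetical names, and the two-minute refresh is assumed to mean simply re-running the ranking on a timer.

```python
import heapq
from dataclasses import dataclass, field

@dataclass
class CallCard:
    call_id: str
    combined_score: float
    sub_scores: dict[str, float]        # compliance, outcome, coaching
    summary: str = ""                   # short summary of the call
    flagged_snippets: list[str] = field(default_factory=list)

def top_of_queue(cards: list[CallCard], limit: int = 20) -> list[CallCard]:
    """Return the highest-scoring cards, best first, without sorting the long tail."""
    return heapq.nlargest(limit, cards, key=lambda card: card.combined_score)

# Illustrative data only.
cards = [
    CallCard("call-001", 0.12, {"compliance": 0.0, "outcome": 0.2, "coaching": 0.1}),
    CallCard("call-002", 0.91, {"compliance": 0.9, "outcome": 0.8, "coaching": 0.1},
             summary="Customer disputes a fee", flagged_snippets=["03:12-03:40"]),
    CallCard("call-003", 0.47, {"compliance": 0.1, "outcome": 0.6, "coaching": 0.5}),
]

for card in top_of_queue(cards, limit=2):
    print(card.call_id, card.combined_score)
```

`heapq.nlargest` fits the shape of the problem: the supervisor only needs the top 20 or so, so there is no reason to fully sort everything below that.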

Random-sample QA used to take 8 minutes per call on average. The ranked queue averages 2 minutes 40 seconds, a third of that. The cost of QA per call has fallen. The coverage has gone from 2% to 100%. The supervisor is spending their time on the calls that actually moved the metric.

The objections we heard

When we showed the ranked queue to QA leaders at the design-partner customers, three objections came up consistently.

Objection 1: "The model will miss things a human would catch."

True in the abstract; mostly false in practice. The model misses things a human specialist would catch — a compliance auditor reviewing the same call could find subtleties the model does not flag. But the comparison is not against a specialist. It is against a supervisor randomly sampling 2% of calls. The model reviews 100% and flags the obvious 5%. The specialist reviews the 5%. Net coverage is far better than the old system.

Objection 2: "Agents will game the score."

Probably true. Any metric that gets attention gets gamed. The defence is twofold: the score is multi-dimensional, so gaming one axis pushes you up another; and the score is not the agent's performance review. The score is a triage signal for the supervisor. The performance review is what the supervisor concludes after they listen to the actual call. We sell the score as a queue, not a scoreboard.

Objection 3: "It changes the supervisor's job."

Also true, and we should be honest about it. The supervisor of a ranked-queue contact center spends less time scoring calls against a 30-row rubric and more time coaching, escalating, and intervening. The supervisors who liked the old job — the methodical, scorecard-driven part of it — are not necessarily thrilled with the new one. We talked openly to design-partner ops leaders about this before launch. It is a workflow change, not just a tool change.

What the new system found in the first 90 days

Across the four design partners who ran the ranked queue exclusively for 90 days, the supervisor teams escalated 4.7x more calls per week to retention or compliance than they had under random sampling. The increase was not a "more scrutiny" effect; it was almost entirely calls that random sampling had structurally missed.

One customer caught a debt-collection script drift — a single agent had started using language that crossed an FDCPA boundary, on roughly 9% of their calls — that random sampling had not surfaced for 11 weeks. The ranked queue surfaced it within 48 hours of the drift starting, because the compliance-risk score on those specific calls jumped two standard deviations above the agent's baseline.
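The "two standard deviations above the agent's baseline" test can be sketched as a per-agent z-score. The code below is an illustration under assumptions: it uses the sample standard deviation of an agent's recent compliance-risk scores as the baseline, and the function names, threshold parameter, and example numbers are all hypothetical.

```python
from statistics import mean, stdev

def z_score(value: float, baseline: list[float]) -> float:
    """How many standard deviations `value` sits above the baseline mean."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline scores")
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any increase at all is an outlier.
        return 0.0 if value == mu else float("inf") if value > mu else float("-inf")
    return (value - mu) / sigma

def is_drift(value: float, baseline: list[float], threshold: float = 2.0) -> bool:
    """Flag a call whose score is at least `threshold` deviations above baseline."""
    return z_score(value, baseline) >= threshold

# Illustrative baseline: an agent whose compliance-risk scores usually sit near zero.
agent_baseline = [0.02, 0.03, 0.01, 0.02, 0.04, 0.03, 0.02, 0.03]

print(is_drift(0.30, agent_baseline))  # a call far above the agent's usual range
print(is_drift(0.03, agent_baseline))  # a call within the agent's usual range
```

Comparing each call against the agent's own history, rather than a global cutoff, is what lets a drift in one agent's language stand out even when the absolute scores are still modest.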

Another customer found that the agents the random-sample system had flagged as the lowest performers were not, in fact, the lowest performers; they were just the loudest. The ranked queue, which ranks calls on outcome risk, identified two quiet but consistently underperforming agents whose calls had been sampled at the same rate as everyone else's and had landed in the "average" pile.

What we deliberately will not do with the score

Three things, on purpose.

We will not auto-score agents on the queue's output. The score is a triage signal. Agent performance ratings still go through a supervisor. We are deliberately keeping the human in the loop, not because the model cannot do it, but because we have watched enough autoscored systems erode trust to want to avoid the pattern entirely.

We will not surface the agent's real-time score to the agent during the call. There is a school of thought that says agents should see their own coaching scores in real time. There is a stronger school of thought that says doing so degrades the call. We side with the second school.

We will not export the score to performance-management systems by default. Customers who want to wire the score into a separate HR tool can do so with explicit configuration, but we make the default the safer choice. If you want a number on a spreadsheet, you should have to ask for it on purpose.
