REVIEW METHODOLOGY

How We Score Procurement AI Agents

Six weighted criteria designed by procurement professionals for procurement decision-makers. Not generic software scores — criteria calibrated to the real demands of a CPO evaluating a £1m platform investment.

6 Scoring Criteria · Procurement-Specific Weights · Updated Annually · No Pay-for-Play · Vendor-Verified Facts
SCORING FRAMEWORK

The six criteria at a glance

Each criterion is scored out of 10, then multiplied by its weight. Because the weights sum to 100%, the final score is also out of 10 and represents weighted aggregate performance across procurement-relevant dimensions.

Procurement Fit 25%
Features 20%
Pricing 15%
ERP Integration Depth 15%
Ease of Use 15%
Support Quality 10%
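The weighting above reduces to a simple calculation. A minimal sketch in Python — the criterion scores below are hypothetical, not drawn from any real review:

```python
# Weighted scoring: each criterion is scored 1-10 and multiplied by its
# weight; the weights sum to 100%, so the final score is also out of 10.
WEIGHTS = {
    "Procurement Fit": 0.25,
    "Features": 0.20,
    "Pricing": 0.15,
    "ERP Integration Depth": 0.15,
    "Ease of Use": 0.15,
    "Support Quality": 0.10,
}

def final_score(scores: dict[str, float]) -> float:
    """Weighted aggregate of per-criterion scores (each 1-10)."""
    assert set(scores) == set(WEIGHTS), "every criterion must be scored"
    return round(sum(scores[c] * WEIGHTS[c] for c in WEIGHTS), 1)

# Hypothetical example scores for an imaginary tool:
example = {
    "Procurement Fit": 8,
    "Features": 7,
    "Pricing": 6,
    "ERP Integration Depth": 9,
    "Ease of Use": 7,
    "Support Quality": 8,
}
print(final_score(example))  # → 7.5
```

Because the most heavily weighted criterion is Procurement Fit at 25%, a strong score there moves the final result more than an equal improvement in, say, Support Quality at 10%.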
CRITERIA IN DETAIL

What we look for in each dimension

01
Procurement Fit
25% of total score

Procurement Fit is our most heavily weighted criterion because a tool that is not purpose-built for procurement workflows will underperform regardless of its general capabilities. We assess whether the product's AI models were trained on procurement data, whether its terminology matches procurement practice (categories, commodity codes, contracts, POs, GRNs), and whether it integrates natively into P2P processes rather than being adapted from a generic workflow or document tool.

Domain-specific AI training on procurement data, contracts, supplier databases, and commodity taxonomies
Native support for procurement processes: RFx, auction, catalogue, spot buy, contract, PO, GRN, invoice
Procurement-specific reporting: spend under management %, maverick spend rate, savings delivered, supplier compliance
Evidence of procurement practitioner involvement in product design (not a generic tool adapted for procurement)
Track record with procurement teams at companies of comparable size and industry
02
Features
20% of total score

We evaluate the depth and breadth of procurement-relevant features. Breadth matters because procurement teams operate across multiple sub-processes; depth matters because a feature that exists in name but delivers 60% accuracy or requires significant manual correction provides limited value. We specifically test or verify: spend classification accuracy against UNSPSC/eCl@ss, three-way invoice matching rates, contract data extraction precision, supplier risk signal quality, and sourcing event automation completeness.

Spend classification: UNSPSC/eCl@ss accuracy rates, re-classification capability, taxonomy customisation
Contract management: clause extraction, obligation tracking, auto-renewal alerts, risk flagging
Invoice processing: OCR accuracy, three-way match automation rate, exception handling workflow
Supplier management: risk scoring methodology, financial health signals, sustainability/ESG data
Sourcing: RFx builder, e-auction types (reverse, combinatorial), bid analysis, award scenario modelling
AI / ML transparency: explainability, confidence scores, human-in-the-loop controls
03
Pricing
15% of total score

Hidden pricing is a procurement problem in itself. We score vendors on how clearly they communicate pricing, whether starting prices are publicly available, and whether the total cost of ownership (implementation, connectors, training, per-transaction charges) is disclosed upfront. Enterprise-only pricing is not penalised if it is clearly explained and a range is provided. "Contact sales" with no indication of scale is penalised.

Public pricing page with actual tier breakdowns (not just "contact us")
Transparent implementation costs, onboarding fees, and time-to-value estimates
Clear statement of what is and isn't included at each tier
Honest communication about connector costs, API limits, overage charges
Availability of a free tier, trial period, or sandbox environment
04
ERP Integration Depth
15% of total score

Most procurement teams operate within a broader ERP landscape. An AI tool that requires significant middleware or custom development to connect to SAP, Oracle, or Workday creates integration debt that negates much of the claimed efficiency gain. We assess whether integrations are native/certified, what data flows bidirectionally, and how synchronisation is handled. We specifically check the depth of integration with the six most common enterprise ERP and procurement platforms.

SAP S/4HANA and SAP Ariba: native connector quality, real-time vs. batch sync, data scope
Oracle Cloud ERP and Oracle E-Business Suite: certified status, supported modules
Workday Financial Management: integration depth, procurement module coverage
Microsoft Dynamics 365: F&O and Business Central connectivity
API quality: REST API availability, webhook support, developer documentation completeness
Middleware dependency: whether Boomi, MuleSoft, or similar is required and who bears that cost
05
Ease of Use
15% of total score

Adoption drives value. A tool that procurement analysts find difficult to use or that requires extensive training before delivering productivity gains scores poorly here, regardless of feature depth. We assess UX through demo testing, user feedback from procurement professionals, and proxy indicators including NPS data, implementation timelines reported by customers, and G2 / Gartner Peer Insights review patterns specifically from procurement roles.

Procurement analyst productivity: time-to-insight for spend dashboards, contract search, supplier lookup
Self-service configuration vs. IT dependency for workflow changes
Mobile experience for approvals and PO management
Onboarding timeline: time from contract signature to first live transaction
Training requirement: hours to baseline competency for a procurement analyst
06
Support Quality
10% of total score

Procurement platforms are mission-critical. Invoice processing failures, sourcing event outages, and contract search downtime have direct financial consequences. Support scoring covers the accessibility and competence of technical support, the quality of the customer success function, and the availability of procurement-domain expertise within the vendor's support organisation — not just generic software support.

SLA commitments: response times, uptime guarantees, incident severity classification
Dedicated customer success manager: available at which tier?
Procurement domain expertise within support team (not just software support)
Knowledge base quality: procurement-specific documentation, playbooks, integration guides
Community: peer user forums, procurement practitioner networks, user conferences
REVIEW PROCESS

How a review is produced

01
Research

We search for current pricing, features, and ERP integration data. We review vendor documentation, G2/Gartner peer reviews, customer case studies, and any available product changelog.

02
Demo / Testing

Where possible, we request a product demo or sandbox access. We test procurement workflows: spend classification, PO creation, invoice matching, and contract search with realistic procurement scenarios.

03
Scoring

We apply the six-criteria framework consistently. Each criterion is scored 1–10 with documented rationale. No criterion can be boosted by advertising or sponsorship relationships.

04
Vendor Fact-Check

Before publication we share factual claims (pricing, integrations, feature lists) with the vendor for accuracy verification. Vendors cannot change scores — only factual inaccuracies.

FREQUENTLY ASKED

Questions about our methodology

How often are reviews updated?
We update reviews when vendors make significant pricing, feature, or integration changes — typically within 30 days of a major announcement. All reviews are checked and refreshed at minimum every 12 months, with the review date shown at the top of each page.
Can vendors influence their score?
No. Vendors can submit factual corrections to pricing, integration lists, or feature descriptions through our contact form. We investigate each correction and update the review if the claim is accurate. They cannot adjust scores, change editorial framing, or request removal of negative observations.
Do affiliate or advertising relationships affect scores?
No. Affiliate links are clearly disclosed with rel="sponsored". Advertising packages (category sponsorship, newsletter placements) are disclosed where applicable. Scoring criteria and weights are fixed and applied identically to all tools regardless of commercial relationship. A vendor who advertises here will receive the same score as one who doesn't — their score is determined by procurement performance, not revenue.
Why does Procurement Fit carry the highest weighting?
Because procurement teams have learned the hard way that generic workflow, document management, or spend management tools that are "adapted for procurement" consistently underperform against purpose-built platforms. A tool with impressive features but shallow procurement domain understanding will fail in areas that matter — UNSPSC taxonomy, three-way matching, contract obligation tracking, and supplier risk nuance. Procurement Fit at 25% reflects this reality.
How do you handle tools that are pre-launch or in beta?
We do not publish full reviews of tools that have not shipped a generally-available product. We may publish brief "ones to watch" profiles noting the tool's category and claimed capabilities, clearly labelled as pre-launch. Full review scoring requires a production-ready product that procurement teams can actually purchase and deploy.
What does a score of 8.0+ indicate?
Scores of 8.0–10.0 indicate tools we consider best-in-class for procurement teams: strong ERP integration, accurate spend classification, transparent pricing, and genuine procurement domain expertise. Scores of 6.0–7.9 indicate capable tools with specific strengths, suitable for the right use case. Scores below 6.0 indicate tools with significant procurement-specific limitations that may be acceptable for some use cases but should be approached carefully. We've never published a review that's simply "marketing for a vendor" — if the tool doesn't reach 6.0, we say so clearly.
START COMPARING

Find the right procurement AI for your team

Browse our reviews and use the comparison tool to evaluate tools side-by-side against the criteria that matter most to your procurement stack.

Compare Tools Browse Categories