
Methodology

How we measure what you actually learned.

Most platforms don't share their measurement methodology. We do because every verified credential is only as trustworthy as the measurement underneath it. Procurement reviewers, accreditors, and serious L&D buyers should be able to read our methodology without an NDA.

Last updated 2026-05-07 · Permalink: https://my.aveluate.com/methodology

Three layers of measurement

No single framework does everything well. We layer three, so that each earns trust from a different audience without overreaching beyond what it's actually good at.

Layer 1

Curriculum design — Bloom's Taxonomy (revised 2001)

Live in production

Every quiz question Aveluate AI generates carries a bloom_level tag from the Anderson & Krathwohl 2001 revised taxonomy:

  1. Remember · Recall facts
  2. Understand · Explain concepts
  3. Apply · Use in new situations
  4. Analyze · Break down + relate parts
  5. Evaluate · Judge / critique
  6. Create · Produce new work
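
For concreteness, here is the shape of a generated item carrying its tag, as a minimal Python sketch. Only bloom_level is documented above; every other field name is an illustrative assumption, not the production schema.

  # Hypothetical generated quiz item; bloom_level is the documented tag,
  # the remaining field names are illustrative only.
  question = {
      "text": "Critique this password policy against NIST guidance.",
      "bloom_level": 5,             # Evaluate
      "difficulty": "intermediate",
      "learning_objective": "authentication-policy",
  }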

When the platform pairs a pre-assessment with a post-assessment, an automated validator (BloomEquivalenceService) scores their similarity using a documented weighted formula:

  • 40% — Bloom distribution similarity (do both assessments hit the same cognitive levels in the same proportions?)
  • 30% — Difficulty distribution similarity (beginner / intermediate / advanced mix)
  • 30% — Learning-objective coverage (same topics tested)

That score is what makes it honest to call the pair Bloom-equivalent: pre/post deltas reflect actual learning rather than easier questions on the second attempt.
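
To make the formula concrete, here is a minimal Python sketch. The weights are the documented ones above; the specific similarity functions (total-variation distance over tag distributions, Jaccard overlap over objective tags) are our illustrative assumptions, not necessarily what BloomEquivalenceService implements.

  from collections import Counter

  WEIGHTS = {"bloom": 0.40, "difficulty": 0.30, "coverage": 0.30}

  def distribution_similarity(a, b):
      # 1 minus total variation distance between two tag distributions.
      ca, cb = Counter(a), Counter(b)
      tvd = 0.5 * sum(abs(ca[k] / len(a) - cb[k] / len(b)) for k in set(ca) | set(cb))
      return 1.0 - tvd

  def objective_coverage(a, b):
      # Jaccard overlap of learning-objective topic tags.
      return len(set(a) & set(b)) / len(set(a) | set(b)) if (a or b) else 1.0

  def score_equivalence(pre, post):
      # pre/post are assumed to expose bloom_levels, difficulties, objectives.
      return (WEIGHTS["bloom"] * distribution_similarity(pre.bloom_levels, post.bloom_levels)
              + WEIGHTS["difficulty"] * distribution_similarity(pre.difficulties, post.difficulties)
              + WEIGHTS["coverage"] * objective_coverage(pre.objectives, post.objectives))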

Layer 2

Item scoring — Classical statistics today, IRT in progress

Partially shipped · IRT on roadmap

Today (verified-real): items are scored as percentage-correct against the threshold encoded in the quiz's passing_score (configurable per quiz; default 70%). Question difficulty is tracked as an authored field (beginner / intermediate / advanced) and surfaced on every result page.
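
A minimal sketch of that pass/fail logic; the function and argument names are illustrative, not the production API.

  def grade_attempt(num_correct, num_questions, passing_score=70.0):
      # Percentage-correct, compared against the quiz's configurable passing_score.
      pct = 100.0 * num_correct / num_questions
      return {"score_pct": round(pct, 1), "passed": pct >= passing_score}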

What we don't do yet: we do not currently run Item Response Theory (IRT) calibration. That means we don't compute or display percentile rankings against other learners, because percentiles without IRT calibration are statistically meaningless across non-equivalent items.

On the roadmap: 1PL Rasch IRT calibration as individual items reach ~200 attempts (the minimum for stable parameter estimation). Target public release: Q4 2026. Until then, we report raw scores + Bloom-level breakdowns only.
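
For readers unfamiliar with the model: 1PL (Rasch) estimates one difficulty parameter per item, and the probability of a correct answer follows a logistic curve in the gap between learner ability and item difficulty. A standard textbook sketch, not our production code:

  import math

  def rasch_p_correct(theta, b):
      # 1PL (Rasch) item characteristic function: probability that a learner
      # of ability theta answers an item of difficulty b correctly.
      return 1.0 / (1.0 + math.exp(-(theta - b)))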

Layer 3

Workforce framework alignment — per skill, per vertical

On roadmap

For B2B buyers (federal HR, EU education ministries, enterprise CIOs), the most trusted methodology is the one their own procurement teams already use. We're mapping verified skills to industry frameworks rather than inventing our own:

  • NICE Workforce Framework for Cybersecurity (NIST SP 800-181) — required by US federal cyber roles. Cybersecurity skill mapping in scoping; target release Q3 2026.
  • DigComp 2.2 — EU digital competence framework. Recognised by EU governments and universities. On roadmap.
  • SFIA 8 — Skills Framework for the Information Age. Used in 200+ countries by IT consultancies and enterprise HR. On roadmap.
  • OPM KSAs — US federal Knowledge / Skills / Abilities standard. On roadmap pending /governments segment demand.

Mappings will be documented per skill on the public skill page, not claimed wholesale. A skill is mapped or it isn't — no middle ground.
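
A hypothetical shape for one such per-skill mapping record; every field name here is an illustrative assumption, not the production schema.

  from dataclasses import dataclass

  @dataclass(frozen=True)
  class FrameworkMapping:
      skill_slug: str       # e.g. "network-defense"
      framework: str        # "NICE" | "DigComp 2.2" | "SFIA 8" | "OPM KSA"
      framework_code: str   # identifier within that framework (work role, competence, level)
      verified: bool        # surfaced on the public skill page only when True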

What we measure today (verified-real)

Each item below maps to working code in the backend. Audit trail available on request (procurement reviewers, see contact below).

  • Bloom-distribution similarity score per pre/post pair · 40% weight in BloomEquivalenceService.score_equivalence()
  • Difficulty distribution similarity score · 30% weight, beginner/intermediate/advanced bands
  • Learning-objective coverage similarity · 30% weight, topic-tag overlap
  • Pre→post raw score delta per learner · QuizAttempt model + computed delta
  • Per-skill breakdown of correct vs incorrect · Question.skill FK + tagged on every attempt
  • Mistake review with AI-curated remediation playlist · MistakeReviewSession service
  • Proctoring integrity score (dual-camera + behavioural signals) · ProctoringSession + violation point system
  • Verified-credential public verify URL · /badge/[badgeId] — auditable by employers
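
Two of these in sketch form, with assumed input shapes (the QuizAttempt and Question schemas are not public):

  from collections import defaultdict

  def score_delta(pre_pct, post_pct):
      # Raw pre→post score delta, reported per learner.
      return post_pct - pre_pct

  def per_skill_breakdown(answers):
      # answers: iterable of (skill_slug, is_correct) pairs (assumed shape).
      tally = defaultdict(lambda: {"correct": 0, "incorrect": 0})
      for skill, ok in answers:
          tally[skill]["correct" if ok else "incorrect"] += 1
      return dict(tally)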

What we don't claim (and why)

Visible boundaries are a trust signal. Here's what we explicitly are not — so you don't have to discover it in a security review.

"Adaptive testing" / Computerized Adaptive Testing (CAT)

Our quizzes adapt within difficulty bands but are not full IRT-driven CAT. We don't claim CAT capability.

"Industry-standard percentile rankings"

Without IRT calibration, percentiles aren't comparable across non-equivalent items. We don't compute or display them today.

"Predictive of job performance"

No predictive-validity studies have been completed. We don't make causal claims about hiring outcomes from verified scores.

"ISO 17024 / NCCA accredited certification"

We are not an accredited certification body. Aveluate AI badges are verified credentials with auditable provenance — not professional certifications. The distinction matters in regulated industries.

"SAML SSO / SCIM 2.0 / EU multi-region / SOC 2 Type II"

See /changelog. Today we offer email-domain auto-join and invite codes; our security posture is SOC 2-aligned but not formally audited. Each item is roadmapped honestly.

"Bloom-certified" or "Bloom-approved"

No body certifies in Benjamin Bloom's name (he died in 1999). We say Bloom-aligned and Bloom-equivalent because those are accurate.

Citations

  • Anderson, L. W., & Krathwohl, D. R. (Eds.). (2001). A Taxonomy for Learning, Teaching, and Assessing: A Revision of Bloom's Taxonomy of Educational Objectives. Longman.
  • Bloom, B. S. (1956). Taxonomy of Educational Objectives: The Classification of Educational Goals. Handbook I: Cognitive Domain. David McKay.
  • Webb, N. L. (1997). Criteria for Alignment of Expectations and Assessments in Mathematics and Science Education. Council of Chief State School Officers and National Institute for Science Education.
  • de Ayala, R. J. (2009). The Theory and Practice of Item Response Theory. Guilford Press.
  • National Institute of Standards and Technology. (2020). NIST SP 800-181 Rev. 1: Workforce Framework for Cybersecurity (NICE Framework).
  • European Commission, Joint Research Centre. (2022). DigComp 2.2: The Digital Competence Framework for Citizens.
  • SFIA Foundation. (2024). SFIA 8 — Skills Framework for the Information Age.

For procurement / RFP reviewers

This page is the canonical methodology source. Permalink it directly in your RFP responses — we don't move it. For the backend audit trail (BloomEquivalenceService source, item-statistics export, integrity-scoring details), get in touch: