Concepts Worth Having: Refining Concept Bottleneck Models with Minimal Annotations

May 13, 2026

arXiv 2605.16405 — with Andrea Passerini, Stefano Teso, Andrea Pugnana, Emanuele Marconato.

The Problem

Modern AI classifiers are powerful, but often opaque. Concept Bottleneck Models (CBMs) are a family of neural classifiers designed to fix this: instead of going straight from input to prediction, they first extract a set of human-interpretable concepts (e.g. "has stripes", "is fluffy") and then make a prediction based on those. This makes the reasoning transparent and auditable.
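To make the two-stage structure concrete, here is a minimal sketch of a CBM forward pass. Everything in it is an illustrative assumption rather than the architecture used in the paper: the class name `TinyCBM`, the random linear layers, and the dimensions are all made up, and a real CBM would use a trained neural backbone.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class TinyCBM:
    """Hypothetical two-stage classifier: input -> concepts -> label."""

    def __init__(self, n_features, n_concepts, n_classes, seed=0):
        rng = np.random.default_rng(seed)
        # Untrained random weights, for illustration only.
        self.W_c = rng.normal(size=(n_features, n_concepts))  # concept extractor
        self.W_y = rng.normal(size=(n_concepts, n_classes))   # label head

    def concepts(self, x):
        # Each column is the probability that one concept is present.
        return sigmoid(x @ self.W_c)

    def predict(self, x, concepts=None):
        # The label head only sees the concepts, never the raw input.
        # Passing `concepts` lets a human override the extracted values.
        c = self.concepts(x) if concepts is None else concepts
        return (c @ self.W_y).argmax(axis=-1), c

model = TinyCBM(n_features=8, n_concepts=3, n_classes=2)
x = np.random.default_rng(1).normal(size=(4, 8))
labels, c = model.predict(x)
```

Because the label head depends on the input only through `c`, inspecting or correcting `c` is enough to audit a prediction.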

The catch? Learning good concepts requires concept-level annotations — someone has to label the data at that finer level of detail, not just assign class labels. These annotations are expensive and rarely available.

The VLM Shortcut (and its Limits)

Recent work has tried to sidestep this by using Vision-Language Models (VLMs) like CLIP to automatically generate concept annotations. This works surprisingly well, but comes with a cost: VLM-generated concepts tend to be noisier and less faithful to what a human expert would say. The model becomes less interpretable — which defeats the whole purpose of a CBM.

Our Approach: VH-CBM

We introduce VH-CBM (Vision-plus-Human-guided CBM), a hybrid that gets the best of both worlds:

  • Use a VLM to provide broad coverage and avoid the cold-start problem
  • Collect a tiny amount of dense human annotations (as little as 1% of the dataset)
  • Use a Gaussian Process in the VLM's embedding space to propagate those expert labels across the entire dataset

The Gaussian Process is the key ingredient: because it operates in the VLM's embedding space, it can exploit the global structure of the domain and transfer the expert's feedback to unannotated data points, so the expert does not have to label everything.
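As a rough illustration of label propagation with a Gaussian Process, the sketch below fits a GP to two expert labels and predicts a single concept for 200 points. It rests on several assumptions not taken from the paper: an RBF kernel, plain GP regression on labels coded as +1/-1, a fixed noise level, and toy 2-D points standing in for real VLM embeddings. The function names `rbf_kernel` and `gp_propagate` are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    # Squared Euclidean distances between all rows of A and B.
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def gp_propagate(Z_lab, y_lab, Z_all, noise=1e-2, lengthscale=1.0):
    """GP regression posterior (mean, variance) for one concept."""
    K = rbf_kernel(Z_lab, Z_lab, lengthscale) + noise * np.eye(len(Z_lab))
    K_s = rbf_kernel(Z_all, Z_lab, lengthscale)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_lab))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = 1.0 - (v**2).sum(0)  # RBF prior variance k(z, z) is 1
    return mean, np.maximum(var, 0.0)

# Toy "embeddings": two well-separated clusters, one per concept value.
rng = np.random.default_rng(0)
Z_pos = rng.normal(loc=[2.0, 0.0], scale=0.3, size=(100, 2))
Z_neg = rng.normal(loc=[-2.0, 0.0], scale=0.3, size=(100, 2))
Z_all = np.vstack([Z_pos, Z_neg])
y_true = np.concatenate([np.ones(100), -np.ones(100)])

# The expert labels 2 of 200 points (1%), one in each cluster.
labeled_idx = np.array([0, 100])
mean, var = gp_propagate(Z_all[labeled_idx], y_true[labeled_idx], Z_all)

pred = np.sign(mean)             # propagated concept label
acc = (pred == y_true).mean()    # agreement with the held-back labels
```

The posterior variance `var` is small near the labeled points and grows with distance from them, which gives a natural measure of where the propagated labels are least trustworthy.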

Results

VH-CBM consistently outperforms VLM-only CBMs on concept accuracy and calibration, even with a small annotation budget. It also naturally supports active learning, allowing the system to ask the human expert for the most informative labels first.
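One simple way to pick "the most informative labels" is to query the points where the model is most uncertain. The sketch below assumes a per-point uncertainty score is already available (for example, a predictive variance) and uses a maximum-uncertainty rule. Both the rule and the helper name `select_queries` are assumptions for illustration, not necessarily the acquisition strategy used in the paper.

```python
import numpy as np

def select_queries(variance, already_labeled, k):
    """Return indices of the k most uncertain points not yet labeled."""
    scores = np.asarray(variance, dtype=float).copy()
    # Never ask the expert about a point that already has a label.
    scores[np.asarray(already_labeled, dtype=int)] = -np.inf
    order = np.argsort(-scores, kind="stable")  # most uncertain first
    return order[:k]

# Made-up uncertainty scores for five data points.
variance = np.array([0.01, 0.80, 0.35, 0.95, 0.10])
queries = select_queries(variance, already_labeled=[3], k=2)
```

Here point 3 has the highest score but is skipped because it is already labeled, so the expert is asked about points 1 and 2.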

The takeaway: you don't have to choose between scalability and interpretability. A small amount of expert guidance, applied strategically, goes a long way.

Nicola Debole