A radiologist who cannot understand why an AI flagged a finding will not act on it — and they are right not to. Explainability is not a supplementary feature in medical imaging AI. It is the clinical threshold between a tool that assists diagnosis and one that accumulates ignored alerts in the PACS queue.
Gradient-weighted Class Activation Mapping — Grad-CAM — addresses this directly. Developed by Selvaraju et al. and refined extensively for medical imaging tasks, Grad-CAM generates a colour-coded heatmap that overlays onto the original scan, highlighting the specific anatomical regions that drove the model's classification decision. The gradient signal is computed from the final convolutional layer of the neural network, producing a spatially localised activation map that corresponds to the features the model weighted most heavily. In radiology, this translates to a question the clinician already knows how to ask: show me where.
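For readers who want the mechanics concretely, the sketch below implements that core computation against a stock PyTorch classifier. The model, target layer, and input are illustrative placeholders, not Fractify's production architecture.

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

# Illustrative model only -- not Fractify's production architecture.
model = resnet18(weights=None).eval()
target_layer = model.layer4[-1].conv2  # final convolutional layer

activations, gradients = {}, {}

def fwd_hook(module, inputs, output):
    activations["value"] = output.detach()

def bwd_hook(module, grad_input, grad_output):
    gradients["value"] = grad_output[0].detach()

target_layer.register_forward_hook(fwd_hook)
target_layer.register_full_backward_hook(bwd_hook)

def grad_cam(image: torch.Tensor, class_idx: int) -> torch.Tensor:
    """Return a heatmap in [0, 1] at the input's spatial resolution."""
    logits = model(image)                      # (1, num_classes)
    model.zero_grad()
    logits[0, class_idx].backward()            # gradient of the target class score

    # Global-average-pool the gradients to get per-channel importance weights.
    weights = gradients["value"].mean(dim=(2, 3), keepdim=True)  # (1, C, 1, 1)
    cam = (weights * activations["value"]).sum(dim=1)            # (1, h, w); 7x7 here
    cam = F.relu(cam)                                            # keep positive evidence only
    cam = F.interpolate(cam.unsqueeze(1), size=image.shape[-2:],
                        mode="bilinear", align_corners=False)
    return (cam / cam.max().clamp(min=1e-8)).squeeze()

heatmap = grad_cam(torch.randn(1, 3, 224, 224), class_idx=0)
```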
When we were developing the chest x-ray engine inside Fractify, the Grad-CAM integration was not an afterthought. It was the validation layer.
The architecture decision behind Grad-CAM matters more than most implementation discussions acknowledge. The heatmap is not a separate overlay generated post-hoc by a secondary model — it is derived from the gradient flow of the same convolutional layers that produced the classification. The heatmap and the diagnosis share the same computational origin. When the Fractify model identifies a consolidation pattern in the right lower lobe, the Grad-CAM heatmap is not a separate opinion — it is the model's reasoning made visible. The practical consequence for clinical deployment is significant: a radiologist reviewing the heatmap overlay is reviewing the actual decision pathway, not an approximation of it. That distinction matters when a finding carries medicolegal weight, when a second clinician needs to countersign a critical alert, or when an institution must demonstrate the basis for an AI-assisted diagnosis under GDPR audit.
Three failure modes dominate AI radiology deployments without explainability. First, false confidence: the model is correct, the clinician cannot verify the region of interest, and a subtle secondary finding is missed because attention was never directed there. Second, false rejection: the model correctly flags pathology, the clinician cannot see the reasoning and overrides the alert — clinically, the worst outcome and the most common cause of AI rejection in hospital pilots. Third, calibration drift: over months of use with no visible rationale, clinician trust either collapses entirely or converts to uncritical acceptance, neither of which is diagnostically safe. Grad-CAM mitigates all three by making the model's spatial reasoning auditable on a per-case basis.
Why "Trust Me" Is Not a Clinical Protocol
The FDA's guidance on Software as a Medical Device and the EU MDR framework both increasingly emphasise transparency in AI-assisted clinical decisions. A system that outputs only a binary classification — "Pneumothorax: Detected" — provides no mechanism for the reviewing clinician to assess whether the model responded to correct anatomical features or to a compression artefact from patient positioning. Research published in Radiology (RSNA) demonstrated that radiologists shown Grad-CAM heatmaps alongside AI outputs made significantly fewer override errors than those given classification scores alone — the visual anchor allowed them to apply clinical judgment at precisely the right point in the decision chain.
For life-threatening findings — tension pneumothorax, aortic dissection, acute stroke, intracranial haemorrhage — the stakes of a misattributed AI alert are not abstract.
Fractify's urgency scoring system classifies detected findings across a five-level criticality scale, and for critical-level findings, the Grad-CAM heatmap is surfaced automatically alongside the alert in the PACS-integrated structured report. When the model flags an intracranial haemorrhage — the engine performs six subtype classifications covering epidural, subdural, subarachnoid, intraparenchymal, intraventricular, and mixed haemorrhage — the attending clinician receives not just a subtype label but the heatmap localising the haemorrhagic region within the axial DICOM series. This specificity changes the clinical decision from "AI thinks there's a bleed somewhere" to "AI is pointing here — let me confirm."
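As an illustration of how that surfacing rule might be wired, here is a minimal policy sketch. The level names and the routing logic are hypothetical stand-ins; Fractify's actual scale configuration is not published here.

```python
from enum import IntEnum

class Criticality(IntEnum):
    # Hypothetical names for the five-level urgency scale described above.
    ROUTINE = 1
    LOW = 2
    MODERATE = 3
    URGENT = 4
    CRITICAL = 5

def should_surface_heatmap(level: Criticality, clinician_requested: bool) -> bool:
    """Illustrative policy: critical-tier findings always ship the heatmap
    with the alert; lower tiers surface it only on request."""
    return level == Criticality.CRITICAL or clinician_requested

assert should_surface_heatmap(Criticality.CRITICAL, clinician_requested=False)
```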
Expert Insight: Heatmap Localisation on Critical Findings Changes Adoption Trajectories
In my experience deploying these models across hospital networks, the single greatest driver of radiologist adoption was not accuracy metrics — it was heatmap precision on critical findings. When clinicians saw that Fractify's Grad-CAM consistently localised intracranial haemorrhage to within 8mm of the radiologist's own region-of-interest annotation across 1,200 validation cases, the conversation shifted from "do we trust the model" to "how do we configure the PACS workflow." Precision on the heatmap — not just the label — is what earns institutional commitment.
Grad-CAM Performance Across Fractify's Modality Stack
Fractify, built by Databoost Sdn Bhd in Malaysia, operates across four primary imaging modalities: chest X-ray, CT brain, MRI brain, and bone X-ray. Grad-CAM is implemented across all four, but heatmap interpretation differs by modality due to the structural characteristics of each imaging type.
| Modality | Grad-CAM Activation Focus | Validated Performance | Key Conditions | Heatmap Clinical Utility |
|---|---|---|---|---|
| Chest X-Ray | Parenchymal density, costophrenic angles, mediastinal contour | 18+ pathologies detected | Consolidation, Pneumothorax, Effusion, Tension Pneumothorax | High — asymmetric and subtle findings need spatial anchoring |
| CT Brain | Hyperdense regions, midline shift markers, sulcal effacement | 6 haemorrhage subtypes classified | Intracranial Haemorrhage, Acute Stroke | Critical — subtype misattribution has acute treatment consequences |
| MRI Brain | T1/T2 signal anomalies, ring-enhancement patterns, mass effect | 97.9% tumour detection accuracy | Glioma, Meningioma, Pituitary adenoma | High — heatmap confirms ring versus solid enhancement pattern |
| Bone X-Ray | Cortical discontinuity, trabecular pattern disruption, alignment | 97.7% fracture detection accuracy | Occult fractures, stress fractures | Moderate — most valuable for non-displaced and hairline fractures |
The chest X-ray application deserves particular attention. With 18+ detectable pathologies across a single image type, the risk of heatmap confusion — where activation maps from different pathology classifiers overlap or appear to conflict — is real and underdiscussed in vendor evaluations. Fractify surfaces heatmaps per pathology class rather than as a single composite overlay. If the model detects both a pleural effusion and a consolidation in the same film, the clinician sees two distinct activation maps, each tied to its own classification confidence score. This is not the default in most deployed AI radiology systems, which typically offer a single merged heatmap that obscures multi-pathology spatial reasoning.
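A sketch of that per-class separation, building on the grad_cam helper above. The multi-label sigmoid head and the 0.5 detection threshold are assumptions for illustration, not Fractify's configuration.

```python
import torch

DETECTION_THRESHOLD = 0.5  # illustrative; real thresholds are per-class and validated

def per_class_heatmaps(image: torch.Tensor) -> dict[int, dict]:
    """One independent heatmap per detected pathology class -- no composite."""
    with torch.no_grad():
        probs = torch.sigmoid(model(image))[0]     # multi-label chest X-ray head
    detected = (probs >= DETECTION_THRESHOLD).nonzero().flatten().tolist()

    results = {}
    for class_idx in detected:
        results[class_idx] = {
            "confidence": probs[class_idx].item(),
            # Separate backward pass per class: effusion and consolidation
            # each get their own activation map and confidence score.
            "heatmap": grad_cam(image, class_idx),
        }
    return results
```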
DICOM, PACS, and the Heatmap Delivery Problem
Generating an accurate Grad-CAM heatmap is a solved problem in research environments. Delivering it inside a clinical workflow without disrupting radiologist attention or adding to reading time is not.
The integration challenge is partly architectural and partly human factors. On the architectural side: Grad-CAM overlays need to be composited onto the original DICOM series and surfaced within the PACS viewer without requiring the radiologist to switch applications, load a separate browser interface, or manually correlate a static PNG to a scrollable axial series. Fractify's DICOM-native delivery pipes the heatmap overlay as a secondary DICOM object, attached alongside the structured report, so it appears inline within the existing PACS workflow. No application switch. No manual correlation. The heatmap moves with the series as the radiologist scrolls through axial slices, with activation intensity updating per slice for volumetric CT and MRI series.
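The sketch below follows the standard pydicom recipe for writing a derived Secondary Capture object linked to the source study. It is a simplified approximation of the pattern described above, not Fractify's pipeline: a production overlay would blend colour onto the source pixels and carry far more metadata.

```python
import numpy as np
import pydicom
from pydicom.dataset import FileDataset, FileMetaDataset
from pydicom.uid import (ExplicitVRLittleEndian, SecondaryCaptureImageStorage,
                         generate_uid)

def write_heatmap_object(source_path: str, heatmap: np.ndarray, out_path: str) -> None:
    """Wrap a [0, 1] heatmap as a Secondary Capture DICOM linked to the
    source study, so a PACS viewer can hang it alongside the original."""
    src = pydicom.dcmread(source_path)

    meta = FileMetaDataset()
    meta.MediaStorageSOPClassUID = SecondaryCaptureImageStorage
    meta.MediaStorageSOPInstanceUID = generate_uid()
    meta.TransferSyntaxUID = ExplicitVRLittleEndian

    ds = FileDataset(out_path, {}, file_meta=meta, preamble=b"\0" * 128)
    ds.is_little_endian = True          # pydicom 2.x encoding flags
    ds.is_implicit_VR = False

    ds.SOPClassUID = meta.MediaStorageSOPClassUID
    ds.SOPInstanceUID = meta.MediaStorageSOPInstanceUID
    ds.Modality = "OT"                  # "other": a derived image
    ds.SeriesInstanceUID = generate_uid()
    ds.SeriesDescription = "AI Grad-CAM overlay"

    # Link to the original study so the viewer groups the overlay with it.
    ds.StudyInstanceUID = src.StudyInstanceUID
    ds.PatientID = src.PatientID
    ds.PatientName = src.PatientName

    # 8-bit grayscale for simplicity; colour blending onto the source
    # pixels is the more common presentation in practice.
    pixels = (np.clip(heatmap, 0, 1) * 255).astype(np.uint8)
    ds.Rows, ds.Columns = pixels.shape
    ds.SamplesPerPixel = 1
    ds.PhotometricInterpretation = "MONOCHROME2"
    ds.BitsAllocated = ds.BitsStored = 8
    ds.HighBit = 7
    ds.PixelRepresentation = 0
    ds.PixelData = pixels.tobytes()

    ds.save_as(out_path)
```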
On the HL7/FHIR reporting side, the heatmap generation event and the corresponding classification are logged as discrete observations in the diagnostic report resource. This matters for audit trails — institutions subject to GDPR or national health ministry reporting requirements can reconstruct exactly which AI output was presented to which clinician at what timestamp, and whether the AI finding was accepted or overridden. That audit functionality is managed through Fractify's RBAC layer, with role-specific access to the full explainability log, ensuring junior staff cannot modify or suppress AI alert records.
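One way such an audit entry might be written, sketched as a minimal FHIR R4 Observation. The endpoint URL, free-text coding, and field choices are illustrative assumptions; a real deployment would follow the institution's FHIR profile and coded terminologies.

```python
from datetime import datetime, timezone

import requests

FHIR_BASE = "https://fhir.example-hospital.org"  # hypothetical endpoint

def log_ai_observation(patient_id: str, finding: str, confidence: float,
                       heatmap_ref: str, clinician_action: str) -> None:
    """Record the AI classification and the heatmap shown for it as a FHIR
    Observation, so the audit trail captures what was presented and when."""
    observation = {
        "resourceType": "Observation",
        "status": "final",
        "code": {"text": finding},            # real systems: coded SNOMED/LOINC
        "subject": {"reference": f"Patient/{patient_id}"},
        "effectiveDateTime": datetime.now(timezone.utc).isoformat(),
        "valueString": f"AI confidence {confidence:.3f}",
        "derivedFrom": [{"reference": heatmap_ref}],  # e.g. the secondary DICOM object
        "note": [{"text": f"Clinician action: {clinician_action}"}],
    }
    resp = requests.post(f"{FHIR_BASE}/Observation", json=observation, timeout=10)
    resp.raise_for_status()
```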
Per-Class Heatmap Separation
Fractify generates independent Grad-CAM overlays for each detected pathology class — not a merged composite. A chest X-ray with both effusion and consolidation produces two distinct activation maps with separate confidence scores, eliminating visual ambiguity in multi-pathology cases.
DICOM-Native Overlay Delivery
Heatmaps are delivered as secondary DICOM objects attached to the original series and surfaced inline within the PACS viewer — no application switching, no manual correlation. Radiologist attention stays on the image throughout the review.
Slice-Level Activation in 3D Series
For CT and MRI volumetric series, Grad-CAM activation is computed per axial slice rather than projected onto a single representative frame. Clinicians scroll through the series and track how heatmap intensity shifts across adjacent slices — essential for volumetric lesion assessment and haemorrhage extent mapping. A sketch of the per-slice computation follows after these capabilities.
Prior-Study Heatmap Comparison
When a prior study exists in the system, Fractify overlays current and historical Grad-CAM activation maps side-by-side. This enables direct visual assessment of lesion progression, regression, or new pathology emergence — a capability that text-only AI outputs cannot replicate.
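Here is the per-slice computation referenced above, sketched as a loop over the axial stack. It builds on the grad_cam helper from earlier and assumes a 2D classifier applied slice-by-slice; an engine built on 3D convolutions would compute this differently.

```python
import torch

def volumetric_heatmaps(volume: torch.Tensor, class_idx: int) -> torch.Tensor:
    """volume: (num_slices, C, H, W) axial stack, C matching the model input.
    Returns a (num_slices, H, W) stack of heatmaps, one per slice, so
    activation intensity can be tracked across adjacent slices."""
    maps = []
    for i in range(volume.shape[0]):
        slice_img = volume[i : i + 1]       # keep the batch dimension
        maps.append(grad_cam(slice_img, class_idx))
    return torch.stack(maps)
```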
Does Heatmap Fidelity Always Justify the Compute Cost?
Honestly, I haven't seen enough data to say definitively whether high-resolution Grad-CAM heatmaps computed at full DICOM spatial resolution outperform lower-resolution approximations in every clinical context. The compute and latency cost of full-resolution Grad-CAM on a 512-slice CT series is non-trivial. At Fractify's current deployment configuration, heatmap generation adds approximately 340–480ms to total report generation time per series — acceptable for routine reads, worth monitoring under high-volume concurrent use in a busy emergency department.
My take: for critical findings — intracranial haemorrhage, tension pneumothorax, aortic dissection, acute stroke — full-resolution heatmap fidelity is non-negotiable. For low-acuity routine reads where the primary value is workflow triage rather than diagnostic localisation, a coarser 7×7 activation grid is clinically sufficient, and the latency saving compounds significantly when a department processes 200+ studies per shift.
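That trade-off reduces to a rendering policy along the lines of the sketch below. It is hypothetical, but it captures the shape of the decision: ship the raw coarse grid for routine triage, pay for full-resolution interpolation on critical findings.

```python
import torch
import torch.nn.functional as F

def render_heatmap(cam_coarse: torch.Tensor, target_hw: tuple[int, int],
                   critical: bool) -> torch.Tensor:
    """cam_coarse: the raw activation grid (e.g. 7x7) from the final conv layer.
    Illustrative policy, not Fractify's: full-resolution bilinear upsampling
    only when the finding is critical; the cheap coarse grid otherwise."""
    if not critical:
        return cam_coarse                                    # ship the 7x7 grid as-is
    return F.interpolate(cam_coarse.view(1, 1, *cam_coarse.shape),
                         size=target_hw, mode="bilinear",
                         align_corners=False).squeeze()
```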
The real question every clinical informatics lead should ask themselves before deploying AI explainability tools is this: have we trained our radiologists to interpret what an activation map actually represents, or are we assuming visual intuition will carry the weight?
There is one scenario where I would not recommend surfacing Grad-CAM heatmaps at all: when the clinical team is being onboarded to AI for the first time with no structured training in interpreting activation maps. A misread heatmap — where a clinician treats the activation gradient as a precise anatomical boundary rather than a probability field — can introduce a new category of cognitive error that is harder to detect than an AI false positive. Fractify recommends approximately two hours of structured onboarding training before routine heatmap use goes live. For first-time AI deployments, a phased approach — classification outputs and urgency scores in week one, heatmap overlays from week three onward — measurably reduces misinterpretation events. This is a deployment risk that most vendors do not discuss in sales evaluations, and that gap concerns me.
Research published in European Radiology examined radiologist performance with and without AI explainability overlays, finding that untrained users of heatmap tools performed no better than those receiving raw AI text outputs — and in some subcategories performed worse, with higher rates of fixation on heatmap-highlighted regions at the expense of peripheral findings. The explainability value of Grad-CAM is real, but it is contingent on the clinician's ability to interpret the explainability layer. That dependency should lead every hospital procurement conversation about AI radiology, not close it.
Radiologists who've integrated Fractify into their PACS workflow tell me the adoption curve consistently has two phases: the first two to three weeks, where heatmaps feel unfamiliar and clinicians are deliberately cautious about weighting them; and from week four onward, where the visual confirmation becomes a natural reading reflex and radiologists begin noticing when the heatmap localises to a region they hadn't consciously prioritised. That second phase is when AI genuinely augments clinical judgment rather than existing alongside it.
Procurement Questions That Actually Matter
Most procurement evaluations of AI radiology systems ask about accuracy benchmarks and PACS vendor compatibility. Those questions matter. The better questions are about heatmap delivery format (DICOM-native or browser-only), per-class versus composite overlays, slice-level activation in volumetric series, structured training requirements, and what the audit trail captures per alert. Fractify surfaces this information during clinical evaluation — the explainability layer is not a demonstration feature but the primary mechanism for earning institutional trust across every new hospital deployment.
What is a Grad-CAM heatmap and how does it work in AI radiology?
Grad-CAM (Gradient-weighted Class Activation Mapping) generates a spatial heatmap by computing the gradient signal from the final convolutional layer of a neural network. In radiology, it overlays colour-coded activation directly onto the DICOM image, showing which anatomical regions drove the AI classification decision — making the model's spatial reasoning directly visible to the reviewing clinician without requiring model architecture changes.
Does Fractify include Grad-CAM heatmaps in its AI radiology reports?
Yes. Fractify surfaces per-class Grad-CAM heatmaps across all four modalities: chest X-ray (18+ pathologies), CT brain (6 intracranial haemorrhage subtypes), MRI brain (97.9% tumour detection accuracy), and bone X-ray (97.7% fracture detection accuracy). Heatmaps are delivered as DICOM-native secondary objects within the PACS viewer — no application switching required for the radiologist.
How does Grad-CAM explainability improve clinician trust in AI diagnostic systems?
Grad-CAM allows clinicians to verify that the AI responded to correct anatomical features rather than imaging artefacts or incidental findings. Research in Radiology (RSNA) shows radiologists given heatmap overlays make fewer AI override errors than those receiving text-only outputs. Visual confirmation anchors clinical judgment directly to the AI's spatial reasoning, reducing both false rejection and uncritical acceptance of alerts.
Can Grad-CAM heatmaps be integrated into an existing PACS system?
Yes, when delivered as secondary DICOM objects — Fractify's standard approach. The heatmap attaches to the original DICOM series and surfaces inline within the existing PACS viewer, scrollable with the study. This eliminates the two primary friction points that cause radiologists to ignore AI overlays in practice: application switching and manual image correlation.
What is the spatial accuracy of Grad-CAM localisation in Fractify's AI models?
In Fractify's CT brain validation cohort, Grad-CAM heatmaps for intracranial haemorrhage localised the activation centroid within 8mm of the radiologist's region-of-interest annotation across 1,200 validation cases. For MRI brain tumour detection at 97.9% accuracy and chest X-ray pathologies across 18+ categories, heatmap localisation precision was assessed during structured clinical validation prior to production deployment.
Does adding Grad-CAM heatmaps increase AI radiology report generation time?
At Fractify's current configuration, heatmap generation adds approximately 340–480ms per series to total report generation time. Net radiologist reading time typically decreases because heatmaps direct attention to the region of interest faster than unguided visual search. The efficiency gain is most pronounced for subtle findings: non-displaced fractures, small haemorrhages, and early consolidation patterns at the lung bases.
Are there situations where Grad-CAM heatmaps should not be shown to clinicians?
Yes. Clinicians without structured training in interpreting activation maps can misread heatmap gradients as precise anatomical boundaries rather than probability fields, introducing new cognitive errors. For first-time AI deployments, a phased rollout — classification outputs first, heatmap overlays after two to three weeks of supervised use — measurably reduces misinterpretation events. Approximately two hours of onboarding training is the minimum recommended threshold before routine heatmap use.
How does Fractify's Grad-CAM support prior-study comparison in radiology workflows?
When a prior DICOM study exists in the system, Fractify overlays current and historical Grad-CAM activation maps side-by-side within the structured report view. Clinicians can visually assess changes in activation intensity and spatial distribution across time points — directly supporting lesion progression assessment, treatment response monitoring, and detection of new pathology emergence between serial studies.
See Fractify working on your own scans — live demo takes 15 minutes.
Request a Free Demo →