Computer-Aided Quality for Endoscopic AI

The Challenge

Computer-aided detection and diagnosis (CADe/CADx) systems are rapidly transforming gastrointestinal endoscopy by helping clinicians detect early-stage abnormalities such as Barrett's esophagus neoplasia. However, these deep learning models are typically trained on curated, high-quality datasets. In real-world clinical use, their performance often degrades under sub-optimal imaging conditions such as motion blur, poor lighting, inadequate mucosal cleaning, or collapsed tissue. Standalone Computer-Aided Quality (CADq) preprocessing modules can filter out these unusable frames, but running a separate deep learning feature extractor alongside an active CADe model adds a substantial computational burden that makes real-time, low-latency execution during live procedures difficult.

The Solution

To remove this bottleneck, Theta Vision collaborated on the development of a lightweight, real-time Computer-Aided Quality (CADq) gateway. Instead of introducing a new, parameter-heavy architecture, the system reuses the multi-level feature representations of a pre-trained, frozen CADe backbone (specifically, a CaFormer-S18 model). Acting as a lightweight preprocessing gate, the CADq module analyzes endoscopic imagery in real time and determines whether a frame meets diagnostic viability criteria before passing it downstream. Because the backbone is shared, its features are computed once per frame and serve both quality assessment and detection, which saves substantial computation, keeps inference fast, and lets the module fit into clinical workflows without extra hardware.
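The shared-backbone idea can be sketched in a few lines of NumPy. This is a minimal illustration, not the actual implementation: `frozen_backbone` is a hypothetical stand-in (fixed random projections) for the frozen CaFormer-S18 feature extractor, and the stage widths, head names, hidden size, and class counts are assumptions chosen for the example (only the three-level cleanliness scale and the two orientation views come from the text). The point it shows is that the backbone runs once per frame and every quality head reads the same features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the frozen CADe backbone: fixed random projections,
# one per "stage". A real system would pool CaFormer-S18 stage outputs instead.
STAGE_DIMS = (64, 128, 320, 512)  # assumed feature widths, illustration only
_FROZEN_WEIGHTS = [rng.standard_normal((3 * 32 * 32, d)) * 0.01 for d in STAGE_DIMS]


def frozen_backbone(frames):
    """Map (N, 3, 32, 32) frames to one concatenated multi-level feature vector each."""
    flat = frames.reshape(len(frames), -1)
    return np.concatenate([np.tanh(flat @ w) for w in _FROZEN_WEIGHTS], axis=1)


class MLPHead:
    """Small task head: Linear -> ReLU -> Linear -> softmax (the only trainable part)."""

    def __init__(self, in_dim, n_classes, hidden=64):
        self.w1 = rng.standard_normal((in_dim, hidden)) * 0.05
        self.b1 = np.zeros(hidden)
        self.w2 = rng.standard_normal((hidden, n_classes)) * 0.05
        self.b2 = np.zeros(n_classes)

    def __call__(self, feats):
        hidden = np.maximum(feats @ self.w1 + self.b1, 0.0)
        logits = hidden @ self.w2 + self.b2
        logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        return probs / probs.sum(axis=1, keepdims=True)


feat_dim = sum(STAGE_DIMS)
heads = {
    "image_quality": MLPHead(feat_dim, 3),  # class count assumed
    "cleanliness": MLPHead(feat_dim, 3),    # poor / adequate / good
    "expansion": MLPHead(feat_dim, 3),      # class count assumed
    "orientation": MLPHead(feat_dim, 2),    # forward / retroflexed
}


def assess(frames):
    feats = frozen_backbone(frames)  # backbone runs once; all heads share the result
    return {name: head(feats) for name, head in heads.items()}


frames = rng.random((4, 3, 32, 32))  # four dummy frames
scores = assess(frames)
```

In this arrangement the per-frame cost of quality assessment is only the small head computations, since the feature extraction is work the CADe model needs anyway. The heads here are untrained, so the probabilities are meaningless; only the data flow is the point.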

Real-Time Quality Assurance Pipeline

The framework evaluates five dimensions of image and domain integrity simultaneously, using a multi-head multi-layer perceptron (MLP) decoder built directly on top of the shared feature layers:

  • Overall Image Quality (OIQ) Gate: Evaluates the fundamental visual clarity, sharpness, and illumination of the frame. It automatically flags or rejects severely blurred or poorly lit inputs to eliminate downstream label noise and false predictions.

  • Mucosal Cleanliness Evaluator: Quantifies the presence of obscuring factors like bubbles, mucus, or fluid debris on a three-level ordinal scale (poor, adequate, good). This ensures that hidden mucosal details are not overlooked due to poor preparation.

  • Luminal Expansion Tracker: Measures the openness of the esophageal lumen. By ensuring the esophagus is properly distended rather than collapsed, it prevents the CADe system from drawing erroneous conclusions on uninterpretable tissue folds.

  • Procedural Orientation Classifier: Automatically detects the scope's positioning, distinguishing between a standard forward insertion view and a retrograde (retroflexed) view where the scope flips back to look at itself.

  • Feature-Based Out-of-Distribution (OOD) Guard: Operates directly in the deep feature space, using the Mahalanobis distance to compute a per-frame anomaly score. It flags out-of-domain inputs, such as accidental transitions into the stomach or non-medical artifacts, so that the core CADe model is not asked to interpret frames it was never trained on.
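To make the OOD guard concrete, the sketch below scores feature vectors by their Mahalanobis distance to a Gaussian fitted on in-domain features. It is a simplified illustration under stated assumptions: a single Gaussian (rather than, for example, class-conditional ones), synthetic 8-dimensional features in place of real backbone features, and a threshold set at the 99th percentile of in-domain scores. The function names `fit_gaussian` and `mahalanobis_score` are hypothetical, and none of these choices are confirmed details of the deployed system.

```python
import numpy as np


def fit_gaussian(train_feats, eps=1e-6):
    """Estimate the mean and inverse covariance of in-domain feature vectors."""
    mean = train_feats.mean(axis=0)
    cov = np.cov(train_feats, rowvar=False)
    cov = cov + eps * np.eye(cov.shape[0])  # small ridge keeps the inverse stable
    return mean, np.linalg.inv(cov)


def mahalanobis_score(feats, mean, precision):
    """Distance of each row of `feats` to the fitted Gaussian; larger = more anomalous."""
    diff = feats - mean
    return np.sqrt(np.einsum("ni,ij,nj->n", diff, precision, diff))


rng = np.random.default_rng(0)

# Synthetic stand-ins for backbone features (assumption: 8-D, Gaussian).
in_domain = rng.normal(0.0, 1.0, size=(500, 8))
mean, precision = fit_gaussian(in_domain)

# Assumed thresholding rule: flag anything beyond the 99th percentile
# of the in-domain scores.
threshold = np.quantile(mahalanobis_score(in_domain, mean, precision), 0.99)

# Features drawn far from the training distribution, e.g. a non-esophageal view.
shifted = rng.normal(4.0, 1.0, size=(5, 8))
is_ood = mahalanobis_score(shifted, mean, precision) > threshold
```

Because the score is computed on features the backbone already produces, the guard adds only a matrix-vector product per frame. The threshold trades off missed out-of-domain frames against falsely rejected esophageal ones and would need tuning on held-out data.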

Impact

This shared-backbone CADq architecture turns real-world endoscopic image assessment from a high-latency liability into a streamlined, edge-deployable asset. Evaluation on 6,276 annotated esophageal images showed that the frozen-backbone configuration achieves high accuracy across all evaluation tasks, matching the performance of a fully fine-tuned model while substantially reducing memory overhead and leaving the backbone weights untouched. The feature-based Mahalanobis OOD detector proved resilient to domain shift, successfully isolating non-esophageal anatomical regions. The framework gives endoscopists immediate, clinically meaningful feedback on image quality, which supports clinician performance, protects medical AI systems against real-world data degradation, and points toward efficient, trustworthy clinical integration.
