Artificial intelligence in the age of cognitive endoscopy
GI endoscopy has evolved from a simple diagnostic tool into an entire field incorporating highly complex technical and cognitive skills to perform an ever-increasing number of tasks. Key to high-quality endoscopy is the acquisition of images that contain useful information. For any given procedure, the endoscopic video stream contains thousands of single-frame images, most of which lack clinically relevant information. Thus, the endoscopist must possess the technical skill to obtain meaningful images and to sort through them (at a subconscious level), extracting relevant information by recognizing and interpreting features and patterns so as to ultimately make an appropriate clinical decision. Given the wide range of GI manifestations of disease, this can be a daunting task.
The endoscopic evaluation of Barrett’s esophagus (BE) is a particularly salient example of the interplay between cognitive and technical endoscopy. Upon encountering columnar mucosa in the distal esophagus, the endoscopist must quantify the configuration and extent of the abnormal mucosa before engaging in a detailed forward and retroflexed examination to detect subtle mucosal irregularities concerning for early neoplasia. If such lesions are detected, the endoscopist must delineate the extent of the lesion, assess the morphology, and decide whether to perform endoscopic resection. These steps are performed amid esophageal peristalsis and the need to constantly clear secretions from the esophageal lumen. This process requires both technical and cognitive proficiency. In this context, even a skilled endoscopist can miss a lesion if he or she fails to recognize features associated with early neoplasia. Given the low rate of progression of BE to esophageal adenocarcinoma, it is not unexpected that most endoscopists will have low exposure to early neoplasia in BE over the course of their careers.1 In fact, studies have shown that lesions are more likely to be recognized in centers with expertise in the diagnosis and management of BE.2 Furthermore, longer esophageal inspection times are associated with a higher rate of dysplasia detection, suggesting that the endoscopist will continue to assimilate data until a certain cognitive threshold is reached for diagnostic suspicion.3
How can we directly transfer the expertise shared by a few into the hands and minds of the many, and do so in a meaningful way? In recent years, the application of computer vision and artificial intelligence (AI) techniques has held the promise to do just that: to enhance the effectiveness and efficiency of endoscopy. Deep-learning algorithms are capable of feature extraction and pattern recognition across large datasets and can be implemented to provide real-time feedback. The development of these algorithms requires careful selection and annotation of a large image library that serves as the “ground truth” against which performance is tested. Under the premise of supervised learning, this annotated image library is used to transfer expert knowledge into the algorithm. The algorithm is trained first to recognize simple features (lines, curves) and builds upon them with escalating complexity. Importantly, the features recognized by the algorithm may not match those that humans consider pertinent for a given diagnosis. In some cases, this is seen as an advantage because it may lead to the discovery of new patterns with improved performance. However, the exact features extracted by the algorithm are often unknown or uninterpretable: the so-called “black box” of AI.
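To make the notion of “simple features” concrete, the sketch below applies a hand-set vertical-edge filter to a toy image, the kind of low-level operation a first convolutional layer performs. In a trained network the kernel weights are learned from annotated images rather than fixed by hand; the image and kernel here are illustrative, not from the study.

```python
import numpy as np

def convolve2d(image, kernel):
    """Valid-mode 2-D convolution: slide the kernel over the image
    and sum the elementwise products at each position."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Synthetic "image": dark left half, bright right half
image = np.zeros((5, 5))
image[:, 3:] = 1.0

# Sobel-like vertical-edge kernel; a CNN would learn such weights
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-2.0, 0.0, 2.0],
                   [-1.0, 0.0, 1.0]])

response = convolve2d(image, kernel)
print(response)  # strong response only where the vertical edge lies
```

Deeper layers combine many such filter responses, which is how a network builds from lines and curves toward the complex mucosal patterns relevant to diagnosis.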
Can AI augment technical and cognitive skills to enhance endoscopic quality? Several such machine-learning algorithms have been designed and tested in the performance of unique endoscopic tasks. Some algorithms provide automated quality control through real-time feedback on established metrics. Recent publications have shown the potential impact of these algorithms on the quality of colonoscopy and EGD.4,5 The clinical implementation of these algorithms may ultimately lead to improvement in technical skills by incorporating real-time feedback into practice. The development of algorithms designed to enhance diagnostic performance has generated even greater interest in endoscopic AI implementation. Algorithms for colorectal polyp diagnosis and characterization have led to improved adenoma detection rates, as was demonstrated by a recent randomized study.6 How these algorithms will be integrated into current practice and whether they may have a negative impact on endoscopists’ cognitive skills remain important topics of future research.
In this issue of Gastrointestinal Endoscopy, de Groof et al7 describe the clinical implementation of a deep-learning algorithm to detect early neoplasia in BE. The authors are commended for designing a high-quality algorithm, as described in a previous publication.8 The algorithm was first trained on a large dataset of luminal endoscopy images (close to half a million) and further refined by the use of multiple datasets of BE images with and without early neoplasia. Neoplastic cases were selected solely on the basis of histopathology and image quality, providing a dataset representative of the feature variability associated with this finding. Clinically relevant expertise was incorporated into the algorithm by careful annotation of images and delineation of lesions by endoscopists experienced in the diagnosis and management of BE. Finally, the algorithm was validated by the use of 2 unique external image datasets that used 1 image per patient (160 patients) to prevent overfitting. The model used the machine-learning techniques of transfer learning and ensemble learning for image segmentation and classification.
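The ensemble step can be sketched in a few lines: several independently trained classifiers score the same image, and their outputs are combined into a single classification score. The per-model scores below are hypothetical, and simple averaging is only one way to combine an ensemble; the actual architecture is described by de Groof et al.

```python
def ensemble_score(scores):
    """Average per-model neoplasia scores (each in [0, 1]) into a
    single ensemble score. In practice each score would come from a
    model pretrained on general endoscopy images (transfer learning)
    and fine-tuned on BE images."""
    return sum(scores) / len(scores)

# Hypothetical scores from 5 fine-tuned models for one image
per_model = [0.72, 0.81, 0.65, 0.78, 0.70]
print(f"ensemble score: {ensemble_score(per_model):.2f}")  # 0.73
```

Averaging over models trained from different starting points tends to smooth out individual models’ idiosyncratic errors, which is the usual motivation for ensembling.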
The described algorithm is designed to interpret single-frame images, although real-time video interpretation is under active development. The proposed workflow is to obtain 3 sequential images every 2 cm across the BE segment with the gastroscope in a central position. The images are automatically transferred to a standalone computer system that runs the AI algorithm and provides a classification score (0%–100%) for neoplasia, with an optimal diagnostic threshold set at >60%. If the image is classified as neoplastic, the algorithm outlines the area suggestive of neoplasia and indicates the preferred site for biopsy. The diagnostic performance of the system, as tested in this pilot study of 10 patients with nondysplastic BE and 10 patients with BE neoplasia, is impressive (accuracy 90%, sensitivity 91%, and specificity 89%) and is further improved by high-confidence predictions (classification score of <20% or >80%). The authors further describe how false-positive results can be reduced when ≥2 images per level show concordant results. These findings suggest that the algorithm is ready for large-scale testing.
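The decision logic described above can be summarized in a short sketch: each image receives a score from 0 to 100, scores above 60 are flagged as neoplastic, scores below 20 or above 80 count as high-confidence predictions, and a 2-cm level is called neoplastic only when at least 2 of its images concur. Function names and the example scores are illustrative assumptions, not from the study.

```python
NEOPLASIA_THRESHOLD = 60          # optimal diagnostic threshold (>60%)
HIGH_CONF_LOW, HIGH_CONF_HIGH = 20, 80  # high-confidence bands (<20% or >80%)

def is_neoplastic(score):
    """Flag a single image as suggestive of neoplasia."""
    return score > NEOPLASIA_THRESHOLD

def is_high_confidence(score):
    """High-confidence predictions fall outside the 20%-80% band."""
    return score < HIGH_CONF_LOW or score > HIGH_CONF_HIGH

def level_call(scores, min_concordant=2):
    """Call a 2-cm level neoplastic only if >= min_concordant of its
    images (3 are captured per level) agree, reducing false positives."""
    return sum(is_neoplastic(s) for s in scores) >= min_concordant

# Hypothetical scores for the 3 images captured at one level
scores = [85, 72, 40]
print(level_call(scores))      # True: 2 of 3 images exceed the threshold
print(is_high_confidence(85))  # True
print(is_high_confidence(72))  # False: flagged, but not high confidence
```

Requiring concordance across images at the same level is what the authors describe as reducing false-positive results, at the cost of demanding more than one informative image per level.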
The clinical implementation of this algorithm would be analogous to an expert opinion at the click of a button and would likely enhance the diagnosis of early neoplasia in BE, particularly in endoscopy practices that do not often encounter this condition. With the pace of AI development in endoscopy and studies such as this, there is reason for great hope and excitement for the near future. Recall, however, that the endoscopic evaluation of BE entails more than image interpretation and relies on an intricate interplay between technical and cognitive skills. Although safety mechanisms are incorporated into this algorithm to ensure quality of interpretation, they do not obviate the need for a high-quality endoscopic examination. As AI in endoscopy continues to evolve and inch closer to clinical implementation, we should be ever mindful that the outcome of an endoscopic examination will always depend on the endoscopist holding the instrument.