Abstract
Foundation models (FMs) are transforming medical imaging by shifting from task-specific algorithms to large-scale, generalizable systems that learn from broad multimodal data. Recent advances—transformer-based visual encoders, promptable segmentation architectures, vision–language models, and parameter-efficient fine-tuning—have improved segmentation, detection, classification, and report generation across modalities including MRI, CT, ultrasound, X-ray, endoscopy, and digital pathology. Domain-specific FMs (including prostate MRI, brain MRI, retinal, ultrasound, and pathology models) have demonstrated high label efficiency and performance that matches or exceeds mainstream deep learning models, particularly under low-annotation conditions. Research trends emphasize large-scale pretraining, multimodal integration, cross-task generalization, data-efficient learning, and the development of universal feature encoders. At the same time, extensive benchmarking and external validation reveal variability in performance, motivating the continued development of standardized evaluation protocols. Clinical adoption remains limited by concerns over interpretability, bias, workflow integration, computational requirements, and regulatory uncertainty. Emerging directions such as personalizable AI, continual learning, federated model adaptation, and imaging–genomics integration position FMs as central to the future of precision medicine. This article consolidates architectural developments, pioneering foundation models, clinical evaluations, and translational advances, surveying the current landscape and future directions of foundation models in medical imaging.
