HistAI Announces Launch of the SPIDER Initiative, the Largest Open-Source Supervised Pathology Dataset
San Francisco, CA, March 06, 2025 –(PR.com)– At the HIMSS Conference, HistAI introduced the SPIDER Initiative (Supervised Pathology Image-DEscription Repository), an open-source project aimed at advancing computational pathology through the development of a large-scale, supervised pathology dataset. Over the next year, the initiative will compile more than 50 million fully supervised pathology images, spanning 20 different organs and tissue types and 400 distinct morphologies. The dataset is intended to support global research efforts in clinical diagnostics, precision medicine, and drug development by providing researchers with access to high-quality, annotated pathology images.
Dataset Release: More than 4 Million Patches Covering 52 Morphologies
As an initial milestone, HistAI is releasing a dataset of more than 4 million manually annotated image patches derived from nearly 6,000 whole slide images (WSIs). This dataset includes pathology data from three human tissue types (skin, colorectal, and thorax) and is accompanied by three AI patch-classification models capable of identifying 52 distinct morphologies. The dataset has been curated for quality and consistency, providing a resource for researchers working in diagnostic development, AI model training, and drug discovery.
Future Expansion Plans
HistAI plans to expand the SPIDER repository in the coming months. An additional dataset and AI model for breast tissue, covering 25 morphologies, is scheduled for release in two weeks. This expansion is part of the broader initiative to cover 20 organs and tissue types, with the goal of enhancing the resources available to the research community.
Collaborative Research and Accessibility
The SPIDER Initiative is designed to make high-quality supervised pathology data widely accessible. By offering open access to datasets and AI models, HistAI aims to support collaboration among researchers, clinicians, and technologists working in pathology and related fields.
“By making these datasets openly available, we hope to contribute to advancements in AI-driven pathology research,” said Dmitry Nechaev, Chief AI Scientist at HistAI. “Access to diverse, high-quality datasets can improve the accuracy and efficiency of AI models, ultimately aiding in diagnostic improvements and drug development.”
HistAI invites researchers to explore the released dataset, contribute to the repository, and participate in the initiative. More information on accessing the dataset and collaboration opportunities is available on the company’s website.
About HistAI
HistAI is a computational pathology company specializing in the use of large datasets and artificial intelligence to support medical research, diagnostics, biomarker discovery, and drug development. Through the development of AI models and extensive datasets, HistAI aims to advance precision medicine and healthcare solutions.