Tools & Data
Open-source infrastructure for multimodal cancer AI.
All tools developed under the MEFINDER initiative are released under academic open-source licenses and designed for reproducible research.
01
Multimodal Fusion Models
MOSCARD
MICCAI 2025
Addresses bias in multimodal medical imaging by integrating causal reasoning.
Uses chest X-ray (CXR) as the primary modality and ECG as a complementary guiding modality. Employs a co-attention mechanism with a Vision Transformer (ViT)/MedCLIP encoder. Includes a structural causal model (SCM) for de-confounding. Supports four training modes: Baseline, Causal, Conf, and CaConf. Code available in repository.
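As a rough illustration of how one modality can guide another, the sketch below implements plain scaled dot-product cross-attention in which CXR token embeddings query ECG token embeddings. It is a generic textbook formulation and not MOSCARD's code: the function names, the toy vectors, and the absence of learned projection weights are all assumptions made for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys.

    queries: list of d-dimensional vectors (here: hypothetical CXR tokens)
    keys, values: lists of d-dimensional vectors (here: hypothetical ECG tokens)
    Returns (outputs, weights), with one row per query.
    """
    dim = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        w = softmax(scores)
        # Weighted average of the value vectors.
        out = [sum(w[j] * values[j][c] for j in range(len(values)))
               for c in range(len(values[0]))]
        weights.append(w)
        outputs.append(out)
    return outputs, weights

# Toy embeddings, made up for illustration (not real model features).
cxr_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ecg_tokens = [[1.0, 0.0], [0.0, 1.0]]

fused, attn = cross_attention(cxr_tokens, ecg_tokens, ecg_tokens)
```

A co-attention layer would typically run this in both directions (CXR attending to ECG and ECG attending to CXR) and combine the two results; how MOSCARD does so is documented in its repository.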
Availability:
GitHub
VLM for Mammography
MICCAI 2024
Knowledge-grounded strategy for adapting vision-language models (VLMs) to screening mammography.
Builds unique case-sets for screening mammography using mini-batch selective sampling for VLM adaptation. Evaluated with two VLMs: MedCLIP (in-domain) and ALBEF (out-of-domain). Validated in zero-shot, few-shot, and supervised settings on UW Madison datasets, with external validation on Mayo Clinic data. Authors: Aisha Urooj Khan et al. Model checkpoints available via download link.
Availability:
GitHub; model checkpoints via download link
MM-STGNN
IEEE Journal of Biomedical and Health Informatics, 2023
Multimodal spatiotemporal graph neural network for 30-day all-cause hospital readmission prediction.
Fuses longitudinal chest radiographs and EHR data using a GraphSAGE + GRU architecture. Achieved AUROC 0.79 on both evaluation datasets, including MIMIC-IV. Code available in repository.
Performance — AUROC 0.79 (MIMIC-IV)
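For readers less familiar with the metric, AUROC can be read as the probability that a randomly chosen positive case (here, a readmitted patient) receives a higher score than a randomly chosen negative case. The sketch below computes it directly from that definition; the labels and scores are hypothetical toy values, not data from the paper, and the quadratic loop is chosen for clarity and not for speed.

```python
def auroc(labels, scores):
    """AUROC as P(score of a positive > score of a negative); ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = readmitted within 30 days) and model scores.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.3, 0.2, 0.1]
toy_auroc = auroc(labels, scores)  # 5 of 6 positive/negative pairs are ranked correctly
```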
Availability:
GitHub
02
NLP & Clinical Text
BreastRecurrence_Transformer
Transformer-based NLP model that identifies breast cancer recurrence and its timing from electronic medical records (EMRs).
Adaptable to other cancer sites. Validated on Mayo, Stanford, Emory, and UC Davis. Released with an academic open-source license and packaged in Docker. Model weights available via Google Drive.
Availability:
Google Drive (model weights); Docker
Breast Cancer Treatment Extraction
Hybrid UMLS parser + fine-tuned LLM for extracting longitudinal treatment timelines from free-text clinical notes.
Combines a UMLS-based parser with fine-tuned language models (GPT-2, BioGPT, LLaMA) to extract structured treatment timelines from unstructured clinical notes. Validated on Mayo, Stanford, Emory, and UC Davis. Released with an academic open-source license and packaged in Docker.
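To make "structured treatment timeline" concrete, the sketch below shows one possible shape for the output: treatment mentions gathered from dated notes are collapsed into one row per treatment with first and last mention dates. The `TreatmentMention` class, the `build_timeline` function, the treatment names, and the dates are all hypothetical and stand in for whatever the parser and language models actually emit; the tool's real schema is defined in its repository.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class TreatmentMention:
    note_date: date   # date of the clinical note the mention came from
    treatment: str    # normalized treatment name (hypothetical vocabulary)

def build_timeline(mentions):
    """Collapse repeated mentions into (treatment, first_seen, last_seen) rows,
    ordered by first mention date, then by treatment name."""
    spans = {}
    for m in mentions:
        first, last = spans.get(m.treatment, (m.note_date, m.note_date))
        spans[m.treatment] = (min(first, m.note_date), max(last, m.note_date))
    rows = [(name, first, last) for name, (first, last) in spans.items()]
    return sorted(rows, key=lambda row: (row[1], row[0]))

# Made-up mentions, as if extracted from four separate notes.
mentions = [
    TreatmentMention(date(2020, 3, 1), "chemotherapy"),
    TreatmentMention(date(2020, 1, 15), "surgery"),
    TreatmentMention(date(2020, 5, 10), "chemotherapy"),
    TreatmentMention(date(2020, 6, 2), "radiation therapy"),
]
timeline = build_timeline(mentions)
```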
Availability:
GitHub; Docker
PCO Extraction
Fine-tuning framework for LLMs to extract patient-centered outcomes from breast cancer clinical notes.
Extracts treatment-related side effects including fatigue, depression, anxiety, nausea, and lymphedema from breast cancer clinical notes. Validated on Mayo, Stanford, Emory, and UC Davis. Released with an academic open-source license and packaged in Docker.
Availability:
GitHub; Docker
Recurrence Site Extraction (BioLinkBERT)
Fine-tuned BioLinkBERT model for extracting sites of distant recurrence from clinical, radiology, and pathology notes.
Fine-tuned on annotated clinical, radiology, and pathology notes to identify distant recurrence sites. Validated on Mayo, Stanford, Emory, and UC Davis. Released with an academic open-source license and packaged in Docker.
Availability:
GitHub; Docker
03
Data Quality & Infrastructure
Mammogram Implant Identifier
ResNet18 CNN that identifies breast implants in mammograms without relying on DICOM tags.
Developed on 6,250 mammograms (5,000 for training and validation, 1,250 for testing). Model weights available in repository.
Performance — AUROC 0.998 · Sensitivity 0.966 · Specificity 1.000
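Sensitivity is the fraction of implant cases the model flags, and specificity is the fraction of implant-free cases it correctly leaves unflagged. The sketch below computes both from binary labels and predictions; the toy labels are invented for illustration and are not the study's test set.

```python
def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity) for binary labels and predictions (1 = implant)."""
    tp = fn = tn = fp = 0
    for y, p in zip(labels, predictions):
        if y == 1 and p == 1:
            tp += 1
        elif y == 1 and p == 0:
            fn += 1
        elif y == 0 and p == 0:
            tn += 1
        else:
            fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical example: one implant case missed, no false alarms.
labels      = [1, 1, 1, 0, 0, 0, 0]
predictions = [1, 1, 0, 0, 0, 0, 0]
sens, spec = sensitivity_specificity(labels, predictions)
```

In this toy case the specificity is 1.0 because no implant-free image was flagged, while the single missed implant lowers the sensitivity to 2/3.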
Availability:
GitHub (model weights in repository)
Niffler
Journal of Digital Imaging (JDI), 2021
DICOM framework for machine learning pipelines enabling real-time and on-demand DICOM retrieval from PACS.
Also supports metadata extraction, anonymization, and processing workflows.
Availability:
github.com/Emory-HITI/Niffler
HITI-Preproc
Python package for DICOM preprocessing workflows.
Provides standardized DICOM preprocessing utilities installable directly via PyPI.
Availability:
PyPI: pip install hiti-preproc
RadPrompter
Tool for simplified and reproducible LLM prompting for structured radiology reporting and dataset relabeling.
Provides a reproducible interface for prompting large language models to generate structured radiology reports and relabel datasets. Installable via PyPI.
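After installing from PyPI, a quick way to confirm which version ended up in the current environment is to query the package metadata. The sketch below uses only the Python standard library; the distribution names `hiti-preproc` and `radprompter` are taken from the install commands on this page, and `installed_version` is a hypothetical helper. Neither package's own API is shown or assumed here.

```python
from importlib import metadata

def installed_version(distribution):
    """Return the installed version string for a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None

# Report the status of the two PyPI packages listed in this section.
for name in ("hiti-preproc", "radprompter"):
    version = installed_version(name)
    print(f"{name}: {version if version else 'not installed'}")
```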
Availability:
PyPI: pip install radprompter
04
Prostate Cancer Tools
Prostate Cancer Lesion Detection (nnU-Net)
nnU-Net-based framework for prostate cancer lesion detection.
Developed at Indiana University (Shiradkar Lab). Trained on the PI-CAI and Prostate-158 datasets. Model weights available on request from the authors.
Availability:
Request from authors (Indiana University, Shiradkar Lab)
PI-QUAL vs MRQy Comparison
Code and analysis for prostate MRI quality assessment comparing PI-QUAL and MRQy quality metrics.
The analysis uses UMAP clustering, V-Net segmentation, and VoxelMorph deformable registration.
Availability:
GitHub