LG-AI-EXAONE/EXAONE-Path-2.5
Language: Python
License: NOASSERTION
Stars: 5
Forks: 0
Open issues: 0
Created: 2025-12-15T09:37:18Z
Pushed: 2026-03-10T01:53:23Z
Default branch: main
Fork: no
Archived: no
README:
EXAONE Path 2.5
[`Github`] [`Hugging Face`] [`Paper`] [[Cite](#citation)]
## Introduction
EXAONE Path 2.5 is a biologically informed multimodal framework that enriches histopathology representations by aligning whole-slide images with *genomic, epigenetic, and transcriptomic data*. By enabling all-pairwise cross-modal alignment across multiple layers of tumor biology, the model captures coherent genotype-to-phenotype relationships within a unified embedding space. This domain-informed design improves resource efficiency, enabling the model to achieve competitive performance across diverse tasks while using substantially fewer training samples and parameters than existing approaches.
## Quickstart
Load EXAONE Path 2.5 and extract features.
### 1. Hardware Requirements
- NVIDIA GPU with 12GB+ VRAM
- NVIDIA driver version >= 525.60.13 required
Note: This implementation requires an NVIDIA GPU and drivers. The provided environment setup uses CUDA-enabled PyTorch, so an NVIDIA GPU is mandatory for running the model.
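As a rough pre-flight check of the requirements above, the sketch below compares a driver version string against the stated minimum and, when `nvidia-smi` is on the `PATH`, reads the driver version and total memory per GPU. The helper names (`parse_version`, `driver_ok`, `query_gpus`) are illustrative and not part of this repository, and the 12 GB threshold is approximated as 12 × 1024 MiB, which some cards may report slightly under.

```python
import shutil
import subprocess

MIN_DRIVER = "525.60.13"   # minimum driver version stated above
MIN_VRAM_MIB = 12 * 1024   # "12GB+" approximated in MiB (assumption)


def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version such as '525.60.13' into (525, 60, 13)."""
    return tuple(int(part) for part in version.strip().split("."))


def driver_ok(version: str, minimum: str = MIN_DRIVER) -> bool:
    """Return True if `version` is numerically at least `minimum`."""
    return parse_version(version) >= parse_version(minimum)


def query_gpus() -> list[tuple[str, int]]:
    """Return (driver_version, total_memory_MiB) per GPU, or [] if nvidia-smi is absent."""
    if shutil.which("nvidia-smi") is None:
        return []
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    ).stdout
    gpus = []
    for line in out.splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) == 2 and parts[1].isdigit():
            gpus.append((parts[0], int(parts[1])))
    return gpus
```

Calling `query_gpus()` and testing each entry with `driver_ok(driver) and vram >= MIN_VRAM_MIB` gives a quick yes/no per device; an empty list means no NVIDIA tooling was found.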
### 2. Environment Setup
First, install Micromamba if you haven't already. You can find installation instructions here. Then create and activate the environment using the provided configuration:
```bash
git clone https://github.com/LG-AI-EXAONE/EXAONE-Path-2.5.git
cd EXAONE-Path-2.5
micromamba create -n exaonepath python=3.12
micromamba activate exaonepath
pip install -r requirements.txt
```
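After installation, a quick sanity check of the environment can save a failed model load later. The sketch below is an assumption-laden helper, not part of the repository: the module list is inferred from the imports used in this README's snippets, and `requirements.txt` remains the authoritative source.

```python
import importlib.util
import sys

# Assumed module list, taken from the imports in this README's code snippets.
EXPECTED_MODULES = ["torch", "torchvision", "transformers", "PIL", "h5py"]


def missing_modules(names: list[str]) -> list[str]:
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_matches(major: int = 3, minor: int = 12) -> bool:
    """Check the interpreter against the version passed to `micromamba create`."""
    return sys.version_info[:2] == (major, minor)
```

Running `missing_modules(EXPECTED_MODULES)` inside the activated `exaonepath` environment should return an empty list if the install succeeded.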
### 3. Inference Workflow Overview
EXAONE Path 2.5 inference follows a two-stage pipeline. (1) Patch-level feature extraction: extract pretrained patch embeddings from either image patches or full WSIs. (2) Slide-level feature extraction: aggregate patch embeddings into slide representations aligned with genomics data. Sections 3.1 and 3.2 describe these steps in detail.
#### 3.1. Patch Feature Extraction
You can extract the pretrained patch features (without multimodal alignment) in two ways.
- 3.1.1 (Tensor output): for rapid prototyping or custom pipelines
- 3.1.2 (HDF5 file output): for full WSI processing, visualization, and downstream slide encoding
##### 3.1.1. Tensor output

Assuming you have an image, you can run the following code snippet to extract pretrained patch features.

```python
import torch
from PIL import Image
from torchvision import transforms
from transformers import AutoModel

repo_id = "LGAI-EXAONE/EXAONE-Path-2.5"
device = "cuda"

# Input
png_path = "path/to/your/sample_patch.png"

# Load patch encoder
patch_encoder_model = AutoModel.from_pretrained(
    repo_id,
    component="patch",
    trust_remote_code=True,
).to(device).eval()

# Image preprocessing (must match patch encoder training)
transform = transforms.Compose([
    transforms.Resize(224),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

img = Image.open(png_path).convert("RGB")
image_tensor = transform(img).unsqueeze(0).to(device)  # [1, 3, 224, 224]

with torch.no_grad():
    patch_encoder_embedding = patch_encoder_model(image_tensor)  # [B=1, C]
```

Outputs

- `patch_encoder_embedding`: a tensor of shape `[B=1, C]`, where `B` is the batch size and `C` is the embedding dimension
- This tensor can be passed directly to the slide encoder in Section 3.2.
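To make the effect of the `Resize(224)` + `CenterCrop(224)` preprocessing concrete, the sketch below reproduces the geometry in plain Python. It follows the commonly documented torchvision behaviour for an integer size (scale the shorter side, then crop around the center); treat it as an illustration of which pixels reach the encoder, not as a replacement for the transforms themselves. The function names are hypothetical.

```python
def resize_shorter_side(width: int, height: int, size: int = 224) -> tuple[int, int]:
    """Output (width, height) when the shorter side is scaled to `size`."""
    if width <= height:
        return size, int(size * height / width)
    return int(size * width / height), size


def center_crop_box(width: int, height: int, size: int = 224) -> tuple[int, int, int, int]:
    """(left, top, right, bottom) of a centered `size` x `size` crop."""
    left = int(round((width - size) / 2.0))
    top = int(round((height - size) / 2.0))
    return left, top, left + size, top + size


# A square 256 x 256 patch is only rescaled; nothing is cropped away.
w, h = resize_shorter_side(256, 256)
print((w, h), center_crop_box(w, h))  # (224, 224) (0, 0, 224, 224)

# A 512 x 256 image is scaled to 448 x 224 and loses 112 px on each side.
w, h = resize_shorter_side(512, 256)
print((w, h), center_crop_box(w, h))  # (448, 224) (112, 0, 336, 224)
```

In practice this means square inputs are seen in full, while non-square inputs lose their outer margins along the longer axis.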
##### 3.1.2. Full WSI patch-feature pipeline (HDF5 Output)

This step is broken into smaller steps.

(1) Generate patch coordinates (and contour indices) with the Python function `patchfy`, which you can import and call directly to:

- segment tissue regions
- extract patch coordinates
- (optionally) write an HDF5 file with `coords` + `contour_index`

```python
from exaonepath.patches import patchfy

wsi_path = "path/to/your/slide.svs"  # .svs/.tif/.tiff/.ndpi/.mrxs/...
out_dir = "path/to/output_dir"

h5_path, coords, contour_idx = patchfy(
    wsi=wsi_path,
    out=out_dir,
    patch_size=256,
    step_size=256,
    patch_level=0,
    save_h5=True,
    save_mask=True,
    auto_skip=True,
)
```
Outputs

- If `save_h5=True`, the patches are saved to: `/patches/.h5`
- If `save_mask=True`, a segmentation visualization is saved to: `/masks/.jpg`
- If the slide is skipped due to the segmentation safety cap, a reason is written to: `/skipped/.txt`
Note: the effective patch_size/step_size written to the HDF5 may be MPP-normalized internally (see the patchfy docstring for details).
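The relationship between `patch_size` and `step_size` can be illustrated with a plain grid over a rectangular region. This is a simplified sketch, not the `patchfy` implementation: the real function also restricts patches to segmented tissue and may normalize sizes by MPP, and `grid_coords` is a hypothetical helper.

```python
def grid_coords(width: int, height: int, patch_size: int = 256, step_size: int = 256):
    """Top-left (x, y) of every patch that fits fully inside a width x height region."""
    return [
        (x, y)
        for y in range(0, height - patch_size + 1, step_size)
        for x in range(0, width - patch_size + 1, step_size)
    ]


non_overlapping = grid_coords(1024, 512)             # step equals patch size
overlapping = grid_coords(1024, 512, step_size=128)  # neighbours overlap by 50%
print(len(non_overlapping), len(overlapping))  # 8 21
```

Setting `step_size` equal to `patch_size`, as in the call above, tiles the region without overlap; a smaller step yields more, overlapping patches.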
The returned arrays are:

- `coords`: `N x 2` int array of patch coordinates `(x, y)` in level-0 pixel space.
- `contour_idx`: int array of length `N` holding the tissue contour index of each patch.
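Since each patch carries the index of the tissue contour it came from, the two returned arrays can be combined to inspect patches per tissue region. The sketch below uses toy values in place of real `patchfy` output, and `group_by_contour` is an illustrative helper rather than a repository function; it accepts plain lists as well as NumPy arrays.

```python
from collections import defaultdict


def group_by_contour(coords, contour_idx):
    """Map each contour index to the list of (x, y) patch coordinates inside it."""
    if len(coords) != len(contour_idx):
        raise ValueError("coords and contour_idx must have the same length N")
    groups = defaultdict(list)
    for (x, y), idx in zip(coords, contour_idx):
        groups[int(idx)].append((int(x), int(y)))
    return dict(groups)


# Toy values standing in for the arrays returned by patchfy.
coords = [(0, 0), (256, 0), (4096, 2048)]
contour_idx = [0, 0, 1]
print(group_by_contour(coords, contour_idx))
# {0: [(0, 0), (256, 0)], 1: [(4096, 2048)]}
```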
Useful parameters

- `seg_downsample` (float): extra downsampling factor for segmentation only (speed vs accuracy).
- `max_seg_pixels` (float): skip very large slides at the chosen segmentation level (set …

```bash
… "/patches/.h5" \
  --out_h5_path "/patches/_features.h5" \
  --batch_size_per_gpu 32
```
##### Notes

- `coords_h5_path` must be the H5 produced by `patchfy` (`save_h5=True`). The slide encoder in Section 3.2 requires `coords`, ideally with `contour_index`.
- The output file (`out_h5_path`) will contain: `features` [N, C], `coords` [N, 2], `contour_index` [N].

#### 3.2. Slide Feature Extraction

Patch features, coordinates, and (ideally) contour indices must be available. Use the code snippet below if patch feature extraction was performed as in 3.1.2. `patch_features_h5_path` should be identical to `out_h5_path` from the previous step.
```python
import h5py
import torch
from transformers import AutoModel

device = "cuda"

# Load slide encoder (HF)
repo_id = "LGAI-EXAONE/EXAONE-Path-2.5"
slide_encoder = AutoModel.from_pretrained(
    repo_id,
    component="slide",
    trust_remote_code=True,
).to(device).eval()

# Load patch-level features exported as an HDF5 file
# Expected keys: features [N, C], coords [N, 2], contour_index [N]
patch_features_h5_path = "/patches/_features.h5"
with h5py.File(patch_features_h5_path, "r") as f:
    patch_features = torch.from_numpy(f["features"][:]).float()  # [N, C]
    patch_coords = torch.from_numpy(f["coords"][:]).long()
    # …
```
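Before passing a features file to the slide encoder, it can help to confirm that the datasets have the expected keys and shapes (`features` [N, C], `coords` [N, 2], `contour_index` [N]). The sketch below works on a plain mapping of dataset names to shapes, which could be built from an open `h5py.File` with `{k: f[k].shape for k in f.keys()}`. The helper name is hypothetical, and treating `contour_index` as optional is an assumption based on the note that the slide encoder needs `coords`, "ideally with" `contour_index`.

```python
def check_feature_shapes(shapes: dict) -> int:
    """Validate dataset shapes and return N, the number of patches."""
    for key in ("features", "coords"):
        if key not in shapes:
            raise KeyError(f"missing required dataset: {key}")
    if len(shapes["features"]) != 2:
        raise ValueError("features must have shape [N, C]")
    n = shapes["features"][0]
    if tuple(shapes["coords"]) != (n, 2):
        raise ValueError("coords must have shape [N, 2]")
    # Assumption: contour_index is optional but, if present, must have length N.
    if "contour_index" in shapes and tuple(shapes["contour_index"]) != (n,):
        raise ValueError("contour_index must have shape [N]")
    return n


# Toy shapes; the embedding width 768 is a placeholder, not the model's actual C.
print(check_feature_shapes({"features": (10, 768), "coords": (10, 2), "contour_index": (10,)}))
```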
Excerpt shown — open the source for the full document.