🚀 Cluster and Production Deployment

This page provides guidelines for deploying MMORE in a shared cluster environment and, if needed, adapting it for production deployment.

The examples below assume a Run:ai-based cluster setup with a shared persistent volume. You may need to adapt paths, scheduler options, and environment variables to match your infrastructure.

🐳 Docker Image Requirements

Important

Build your own Docker image with your specific user ID and group ID to avoid permission issues in the production environment.

1. Check your user and group IDs on the cluster

id -u  # Your user ID
id -g  # Your group ID
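If you prefer, you can capture both IDs in shell variables and reuse them in the build command in the next step (the variable names here simply mirror the Dockerfile build args; this is a convenience, not a requirement):

```shell
# Store your cluster IDs once so the docker build flags can reference them.
export USER_UID=$(id -u)
export USER_GID=$(id -g)
echo "Building with UID=$USER_UID GID=$USER_GID"
```

With these set, the option B build below can use `--build-arg USER_UID=$USER_UID --build-arg USER_GID=$USER_GID` instead of hand-typed IDs.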

2. Build Docker image with custom IDs

Choose one of the two options below:

Option A — CI build (recommended): Trigger the Build Student Image workflow manually from the GitHub Actions tab (Run workflow) and enter your user ID (UID) and group ID (GID). This builds a custom student GPU image and publishes it to GHCR, tagged as:

ghcr.io/swiss-ai/mmore:student-uid<user-id>-gid<group-id>-gpu

You can then use this image reference directly in the Run:ai commands below.

Option B — local build (replace <user-id> and <group-id> with your actual IDs):

sudo docker build -f Dockerfile --build-arg USER_UID=<user-id> --build-arg USER_GID=<group-id> -t mmore .

Log in to Docker Hub (option B only):

docker login docker.io

Push the image to the registry (option B only):
Replace <username> with your Docker Hub username.

docker tag mmore docker.io/<username>/mmore:latest
docker push docker.io/<username>/mmore:latest

3. Identify your image reference

All runai commands below use <image> as a placeholder. Replace it with:

  • Option A: ghcr.io/swiss-ai/mmore:student-uid<user-id>-gid<group-id>-gpu

  • Option B: docker.io/<username>/mmore:latest
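To avoid repeating the full reference in every command, you can store it once in a shell variable and substitute $IMAGE wherever <image> appears below (the variable name is just a local convention, and the IDs shown are illustrative placeholders):

```shell
# Pick exactly one of the two lines, filling in your real IDs or username.
export IMAGE="ghcr.io/swiss-ai/mmore:student-uid1000-gid1000-gpu"  # Option A (example IDs)
# export IMAGE="docker.io/<username>/mmore:latest"                 # Option B
echo "Using image: $IMAGE"
```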

For detailed installation instructions, see Installation.

🖥️ Running on a Cluster

Environment setup

First, define the main environment variables used for input, output, and cache data.

The examples below use the RCP path convention based on $GASPAR. Adapt these paths if your cluster uses a different shared storage layout.

export ROOT_OUT_DIR=/lightscratch/users/$GASPAR/mmore-data/out
export ROOT_IN_DIR=/lightscratch/users/$GASPAR/mmore-data/in
export XDG_CACHE_HOME=/lightscratch/users/$GASPAR/.cache
export HF_HOME=/lightscratch/users/$GASPAR/.cache/huggingface
export TORCH_HOME=/lightscratch/users/$GASPAR/.cache/torch

Directory structure initialization

Create the required directory structure on the shared storage volume:

mkdir -p $ROOT_IN_DIR
mkdir -p $ROOT_OUT_DIR
mkdir -p $ROOT_OUT_DIR/db
mkdir -p $ROOT_OUT_DIR/process/outputs/images
mkdir -p $ROOT_IN_DIR/sample_data
mkdir -p $XDG_CACHE_HOME
mkdir -p $HF_HOME
mkdir -p $TORCH_HOME
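Before submitting jobs, a quick sanity check (a sketch, assuming the environment variables above are set) confirms the tree was actually created on the shared volume:

```shell
# Report any required directory that is missing on the shared volume.
check_dirs() {
  missing=0
  for d in "$@"; do
    [ -d "$d" ] || { echo "MISSING: $d"; missing=1; }
  done
  [ "$missing" -eq 0 ] && echo "All directories present."
  return "$missing"
}
check_dirs "$ROOT_IN_DIR/sample_data" "$ROOT_OUT_DIR/db" \
  "$ROOT_OUT_DIR/process/outputs/images" \
  "$XDG_CACHE_HOME" "$HF_HOME" "$TORCH_HOME" \
  || echo "Create the directories above before submitting jobs."
```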

💻 Interactive Development Session

For development, debugging, or manual operations, you can start an interactive session on the cluster.

The example below assumes a Run:ai-based environment. Replace <group-id> with your actual group ID.

runai submit swissaimmore \
  --image <image> \
  --node-pool h100 \
  --pvc light-scratch:/lightscratch \
  --gpu 1 \
  --run-as-gid <group-id> \
  --preemptible \
  --attach \
  --interactive \
  --tty \
  --command /bin/bash

This provides direct terminal access to the container.

⚙️ Production Pipeline Execution

The following examples show how to submit MMORE pipeline stages as cluster jobs in a Run:ai-based environment.
Adapt resource settings, storage mounts, paths, and scheduler flags to your infrastructure.

1. Document Processing

Process raw documents and extract multimodal content.
Replace <group-id> with your actual group ID in this and the following job submissions.

runai submit \
  --name swissaimmore-process \
  --image <image> \
  --backoff-limit 0 \
  --pvc light-scratch:/lightscratch \
  --run-as-gid <group-id> \
  --node-pool h100 \
  --gpu 1 \
  -e ROOT_IN_DIR=$ROOT_IN_DIR \
  -e ROOT_OUT_DIR=$ROOT_OUT_DIR \
  -e XDG_CACHE_HOME=$XDG_CACHE_HOME \
  -e HF_HOME=$HF_HOME \
  -e TORCH_HOME=$TORCH_HOME \
  --command "python3 -m mmore process --config-file production-config/process/config.yaml"

2. Post-processing

Clean and structure the extracted data.

runai submit \
  --name swissaimmore-postprocess \
  --image <image> \
  --backoff-limit 0 \
  --pvc light-scratch:/lightscratch \
  --run-as-gid <group-id> \
  --node-pool h100 \
  --gpu 1 \
  -e ROOT_IN_DIR=$ROOT_IN_DIR \
  -e ROOT_OUT_DIR=$ROOT_OUT_DIR \
  -e XDG_CACHE_HOME=$XDG_CACHE_HOME \
  -e HF_HOME=$HF_HOME \
  -e TORCH_HOME=$TORCH_HOME \
  --command "python3 -m mmore postprocess --config-file production-config/postprocessor/config.yaml --input-data $ROOT_OUT_DIR/process/outputs/merged/merged_results.jsonl"

3. Vector Indexing

Create searchable vector indexes.

runai submit \
  --name swissaimmore-index \
  --image <image> \
  --backoff-limit 0 \
  --pvc light-scratch:/lightscratch \
  --run-as-gid <group-id> \
  --node-pool h100 \
  --gpu 1 \
  -e ROOT_IN_DIR=$ROOT_IN_DIR \
  -e ROOT_OUT_DIR=$ROOT_OUT_DIR \
  -e XDG_CACHE_HOME=$XDG_CACHE_HOME \
  -e HF_HOME=$HF_HOME \
  -e TORCH_HOME=$TORCH_HOME \
  --command "python3 -m mmore index --config-file production-config/index/config.yaml --documents-path $ROOT_OUT_DIR/postprocessor/outputs/merged/final_pp.jsonl"

4. RAG Service Deployment

Deploy the retrieval API service.

runai submit \
  --name swissaimmore-rag \
  --image <image> \
  --backoff-limit 0 \
  --pvc light-scratch:/lightscratch \
  --run-as-gid <group-id> \
  --node-pool h100 \
  --gpu 1 \
  -e ROOT_IN_DIR=$ROOT_IN_DIR \
  -e ROOT_OUT_DIR=$ROOT_OUT_DIR \
  -e HF_TOKEN=$HF_TOKEN \
  -e XDG_CACHE_HOME=$XDG_CACHE_HOME \
  -e HF_HOME=$HF_HOME \
  -e TORCH_HOME=$TORCH_HOME \
  --command "python3 -m mmore live-retrieval --config-file production-config/retriever_api/config.yaml"

Port-forwarding

To access the service locally, use:

runai port-forward swissaimmore-rag 8080:8080
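With the port-forward active in another terminal, you can probe the forwarded port. The exact API routes depend on your retriever_api config, so this sketch only checks reachability, not a specific endpoint:

```shell
# Probe the forwarded port. curl reports status 000 when nothing is listening,
# so this is safe to run even before the port-forward is up.
status=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 http://localhost:8080/ || true)
echo "HTTP status: ${status:-000}"
```

A 000 status means the port is not reachable yet; any HTTP status code means the service is answering.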

See also