Developer Documentation
Welcome to the MMORE developer documentation!
This guide will help you set up your development environment and contribute to the project.
Development setup
System dependencies
Before installing MMORE for development, ensure you have the required system dependencies installed.
Linux (Ubuntu/Debian)
sudo apt update
sudo apt install -y ffmpeg libsm6 libxext6 chromium-browser libnss3 \
libgconf-2-4 libxi6 libxrandr2 libxcomposite1 libxcursor1 libxdamage1 \
libxfixes3 libxrender1 libasound2 libatk1.0-0 libgtk-3-0 libreoffice \
libpango-1.0-0 libpangoft2-1.0-0 weasyprint
Note
On Ubuntu 24.04, replace libasound2 with libasound2t64.
You may also need to add the Ubuntu 20.04 focal repository to access some packages, for example by creating /etc/apt/sources.list.d/mmore.list with:
deb http://cz.archive.ubuntu.com/ubuntu focal main universe
macOS
brew update
brew install ffmpeg chromium gtk+3 pango cairo \
gobject-introspection libffi pkg-config libx11 libxi \
libxrandr libxcomposite libxcursor libxdamage libxext \
libxrender atk libreoffice weasyprint
If weasyprint fails to find GTK or Cairo, also run:
brew install cairo pango gdk-pixbuf libffi
uv pip install weasyprint
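On either platform, a small script can confirm that the command-line tools are reachable after installation. The sketch below is not part of MMORE, and the executable names (`ffmpeg`, `soffice` for LibreOffice, `weasyprint`) are assumptions about what the packages above place on `PATH`; adjust them for your system.

```python
import shutil
from typing import Callable, Iterable, Optional

# Assumed executable names; they may differ between platforms and versions.
ASSUMED_TOOLS = ("ffmpeg", "soffice", "weasyprint")


def find_missing_tools(
    tools: Iterable[str] = ASSUMED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All assumed tools were found on PATH.")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is enough.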
Installing MMORE for development
1. Clone the repository
git clone https://github.com/swiss-ai/mmore.git
cd mmore
2. Create a virtual environment and install dependencies
uv venv .venv
source .venv/bin/activate
uv pip install -e ".[all,cpu,dev]"
Note
For GPU (CUDA 12.6), replace cpu with cu126, for example:
uv pip install -e ".[all,cu126,dev]"
Note
For a partial install, replace all with only the stages you need, for example:
uv pip install -e ".[rag,cpu,dev]"
Available stages are: process, index, rag, and api.
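The extras follow a simple pattern: one or more stages (or `all`), one compute backend, and optionally `dev`. Purely as an illustration, the hypothetical helper below (not part of MMORE) assembles the argument passed to `uv pip install -e`. The stage and backend names are the ones mentioned in the notes above; other backends may exist.

```python
STAGES = ("process", "index", "rag", "api")  # stages listed above
BACKENDS = ("cpu", "cu126")                  # backends mentioned above


def install_spec(stages=("all",), backend="cpu", dev=True):
    """Build the editable-install spec, for example '.[rag,cpu,dev]'."""
    for stage in stages:
        if stage != "all" and stage not in STAGES:
            raise ValueError(f"unknown stage: {stage}")
    if backend not in BACKENDS:
        raise ValueError(f"unknown backend: {backend}")
    extras = [*stages, backend]
    if dev:
        extras.append("dev")
    return ".[" + ",".join(extras) + "]"


print(install_spec())                               # .[all,cpu,dev]
print(install_spec(("rag",)))                       # .[rag,cpu,dev]
print(install_spec(("process", "index"), "cu126"))  # .[process,index,cu126,dev]
```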
Important
This package requires many large dependencies and a dependency override, so it should be installed with uv rather than plain pip.
See the uv guide for more information.
Code quality tools
MMORE uses several tools to maintain code quality and consistency.
Pre-commit hooks
We use pre-commit to automatically run code formatters and linters before each commit.
Setup
1. Install pre-commit
uv pip install pre-commit
2. Set up the git hook scripts
pre-commit install
3. Run the checks manually
Optional but recommended before your first commit.
pre-commit run --all-files
Configured Hooks
The pre-commit configuration runs ruff, a code formatter for consistent style.
Type Checking
We use pyright for static type checking.
Please ensure your pull requests are type-checked before submission.
To run type checking manually:
pyright
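As an illustration of what static type checking catches, consider the sketch below. The functions are hypothetical and not taken from the MMORE codebase. If the `None` check in `describe` were removed, pyright would flag the call to `ext.upper()` because `ext` may be `None`.

```python
from typing import Optional


def find_extension(filename: str) -> Optional[str]:
    """Return the text after the last dot, or None if there is none."""
    _, dot, ext = filename.rpartition(".")
    return ext if dot and ext else None


def describe(filename: str) -> str:
    """Return the upper-cased extension, or 'unknown' when it is missing."""
    ext = find_extension(filename)
    # Narrow Optional[str] to str before calling str methods.
    if ext is None:
        return "unknown"
    return ext.upper()
```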
Contributing Guidelines
We welcome contributions! Here's how you can help:
Reporting Issues
Bug reports: open an issue with a clear description, steps to reproduce, and expected vs. actual behavior
Feature requests: open an issue describing the feature, its use case, and potential implementation approach
Check the Issues page for ongoing work
Code Contributions
Fork the repository and create a new branch for your feature/fix
Write clear, documented code following the existing style
Add tests if applicable
Ensure all pre-commit hooks pass
Run type checking with pyright
Submit a Pull Request with a clear description
Project Structure
mmore/
├── mmore/
│   ├── process/         # Document processing pipeline
│   │   ├── processors/  # Individual file type processors
│   │   └── …
│   ├── postprocess/     # Post-processing utilities
│   ├── index/           # Indexing and vector DB
│   ├── rag/             # RAG implementation
│   └── type/            # Type definitions and data models
├── docs/                # Documentation
├── examples/            # Example configurations and data
├── tests/               # Test suite
├── .pre-commit-config.yaml
├── pyproject.toml
└── README.md
Key Modules
mmore.process: Handles extraction from various file formats
mmore.index: Manages hybrid dense+sparse indexing with Milvus
mmore.rag: RAG system with LangChain integration
mmore.type: Core data structures like MultimodalSample
Testing
Running tests in the terminal
pytest tests/
GPU tests
Tests requiring a CUDA GPU are marked @pytest.mark.gpu and skipped by
default. Pass --gpu to run them:
pytest --gpu # full suite, including GPU tests
pytest --gpu -m gpu # only the GPU-marked tests
To mark a new GPU-only test:
import pytest

@pytest.mark.gpu
def test_something_on_gpu():
    ...
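This page does not show how the `--gpu` flag is implemented. A common way to get this behaviour, following the option-controlled skipping pattern from the pytest documentation, is a `conftest.py` along the lines of the sketch below. Treat it as an assumption about the mechanism; MMORE's actual configuration may differ.

```python
import pytest


def pytest_addoption(parser):
    # Register the --gpu command-line flag (off by default).
    parser.addoption(
        "--gpu", action="store_true", default=False, help="run GPU tests"
    )


def pytest_configure(config):
    # Declare the marker so pytest does not warn about an unknown mark.
    config.addinivalue_line("markers", "gpu: test requires a CUDA GPU")


def pytest_collection_modifyitems(config, items):
    # With --gpu, leave every collected test untouched.
    if config.getoption("--gpu"):
        return
    # Otherwise attach a skip marker to each GPU-marked test.
    skip_gpu = pytest.mark.skip(reason="need --gpu option to run")
    for item in items:
        if "gpu" in item.keywords:
            item.add_marker(skip_gpu)
```

With hooks like these, `pytest -m gpu` without `--gpu` would still select the GPU tests but report them as skipped.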
Writing tests
Place tests in the tests/ directory
Use descriptive test names
Cover edge cases and error conditions
Mock external dependencies when appropriate
Mark GPU-only tests with @pytest.mark.gpu (see above)
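The guidelines above can be sketched with a small example. The helper `embed_texts` and its `client.embed` method are hypothetical and exist only for illustration; they are not MMORE APIs. The tests use `unittest.mock.Mock` in place of the external service and cover both the normal path and the empty-input edge case.

```python
from unittest.mock import Mock


def embed_texts(texts, client):
    """Hypothetical helper: send texts to an embedding client in one call."""
    if not texts:
        return []
    return client.embed(texts)


def test_embed_texts_calls_client_once():
    client = Mock()
    client.embed.return_value = [[0.1, 0.2], [0.3, 0.4]]

    result = embed_texts(["a", "b"], client)

    assert result == [[0.1, 0.2], [0.3, 0.4]]
    client.embed.assert_called_once_with(["a", "b"])


def test_embed_texts_skips_client_for_empty_input():
    client = Mock()

    assert embed_texts([], client) == []
    client.embed.assert_not_called()
```

Passing the client in as an argument keeps the function easy to test, because no patching of module-level imports is needed.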
Pull Request Process
Update documentation if you're adding new features
Add examples for new functionality
Ensure all tests pass and pre-commit hooks succeed
Update the changelog if applicable
Request review from maintainers
PR checklist
[ ] Code follows project style guidelines
[ ] Pre-commit hooks pass (pre-commit run --all-files)
[ ] Type checking passes (pyright)
[ ] Tests are added or updated as needed
[ ] Documentation is updated
[ ] Examples are provided for new features
[ ] Commit messages are clear and descriptive
Development tips
Working with uv
Use uv pip instead of pip for all package installations
The project uses dependency overrides that are handled automatically by uv
See the uv tutorial for more details
Questions
If you have questions about contributing, feel free to:
Open a discussion on GitHub
Reach out to the maintainers
Check existing issues for similar questions
Thank you for contributing to MMORE!