🚀 Quickstart

Overview

This page helps you get MMORE running quickly with a minimal workflow.

The goal is not to cover every configuration option, but to give you a first successful setup and a clear mental model of the main steps.

What this quickstart covers

In a typical MMORE workflow, you will:

  1. install the project and its dependencies

  2. prepare a small document collection

  3. process the collection

  4. build an index

  5. run retrieval or a simple RAG workflow

Before you start

Make sure you have already read Installation.

You should also confirm that:

  • your environment is activated

  • project dependencies are installed

  • you are starting with a small test collection
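The checklist above can be partly automated. The following is a minimal sketch, not an official MMORE utility: the helper names are hypothetical, and it assumes the importable package is named `mmore`. It only detects `venv`/`virtualenv` environments, so a conda environment will report `False` even when it is active.

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """Return True when running inside a venv/virtualenv-style environment."""
    # In a venv, sys.prefix points at the environment and differs from base_prefix.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "mmore" is an assumed package name; adjust it to match your installation.
    print("virtual environment active:", in_virtualenv())
    print("missing packages:", missing_packages(["mmore"]))
```

An empty "missing packages" list only shows that the package can be located, not that every optional dependency is installed.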

Minimal workflow

The exact commands depend on the entry points your repository exposes, but the overall workflow is as follows.

1. Prepare a small collection

Start with a small and simple document set before moving to large-scale or distributed workloads.

For example, create a folder containing a few representative documents:

sample_data/
├── doc1.pdf
├── doc2.pdf
├── doc3.html
└── doc4.md
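Before processing, it can help to confirm what the folder actually contains. This is a small sketch using only the Python standard library; `summarize_collection` is a hypothetical helper, and `sample_data` is the example folder name used above.

```python
from collections import Counter
from pathlib import Path


def summarize_collection(root):
    """Count the files under `root` (recursively) by lowercase extension."""
    root = Path(root)
    if not root.is_dir():
        raise FileNotFoundError(f"collection folder not found: {root}")
    return Counter(p.suffix.lower() for p in root.rglob("*") if p.is_file())


if __name__ == "__main__":
    for extension, count in sorted(summarize_collection("sample_data").items()):
        print(f"{extension or '(no extension)'}: {count}")
```

Unexpected extensions or a zero count are an early sign that the input path points somewhere other than the collection you intended.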

2. Run document processing

Processing transforms raw documents into a form that MMORE can index and retrieve from.

Depending on your setup, this step may include:

  • parsing files

  • extracting text and metadata

  • chunking content

  • preparing multimodal representations

See Processing pipeline for the detailed logic.
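To make the chunking step concrete, here is a minimal sketch of fixed-size character chunking with overlap. It is an illustration only: MMORE's actual chunking strategy is described in Processing pipeline, and the function name, parameters, and metadata fields below are assumptions.

```python
def chunk_text(text, source, chunk_size=200, overlap=50):
    """Split `text` into overlapping character windows with basic metadata."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(
            {"source": source, "start": start, "text": text[start:start + chunk_size]}
        )
        # Stop once a window reaches the end, to avoid a redundant tail chunk.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for chunk in chunk_text("abcdefghij", source="doc4.md", chunk_size=4, overlap=2):
        print(chunk)
```

Keeping the source name and start offset with each chunk makes it easier to trace a retrieved chunk back to its original document.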

3. Build an index

Once documents are processed, create an index so they can be searched efficiently.

This step usually includes:

  • selecting the indexing backend or strategy

  • generating representations for chunks or documents

  • storing the resulting index artifacts

See Indexing for the full indexing workflow.
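The three steps above can be illustrated with a toy index. This sketch swaps in plain term-frequency vectors stored as JSON; it is not MMORE's indexing backend, and every name in it is hypothetical.

```python
import json
import re
from collections import Counter
from pathlib import Path


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(chunks):
    """Map each chunk id to a term-frequency representation of its text."""
    return {chunk_id: dict(Counter(tokenize(text))) for chunk_id, text in chunks.items()}


def save_index(index, path):
    """Store the index artifact on disk as JSON."""
    Path(path).write_text(json.dumps(index), encoding="utf-8")


def load_index(path):
    """Load an index artifact previously written by save_index."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


if __name__ == "__main__":
    index = build_index({"doc4.md#0": "Indexing makes search fast. Search needs an index."})
    save_index(index, "toy_index.json")
    print(load_index("toy_index.json"))
```

A real backend would typically replace the term counts with dense or sparse embeddings, but the shape of the workflow (represent, then persist) stays the same.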

4. Run retrieval

After indexing, you can test retrieval on a few example queries.

At this stage, you want to verify simple things:

  • does the system return relevant documents?

  • are the retrieved chunks meaningful?

  • is the ranking roughly coherent?
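These checks are easy to script for a handful of queries. The sketch below ranks chunks by cosine similarity over term counts and tests whether an expected chunk appears in the top results. The scoring function and the helper names are assumptions made for illustration, not MMORE's retrieval API.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count mappings."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(query, chunks, top_k=3):
    """Return the top_k (chunk_id, score) pairs, best first."""
    query_vec = Counter(tokenize(query))
    scored = [(cosine(query_vec, Counter(tokenize(text))), cid) for cid, text in chunks.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # score descending, id as tie-break
    return [(cid, score) for score, cid in scored[:top_k]]


def hit_at_k(results, expected_id):
    """True when the expected chunk id appears among the returned results."""
    return expected_id in [cid for cid, _ in results]


if __name__ == "__main__":
    chunks = {"a": "cats chase mice", "b": "stock markets fell today", "c": "mice eat cheese"}
    results = search("cats and mice", chunks, top_k=2)
    print(results, hit_at_k(results, "a"))
```

Writing down two or three queries with the chunk you expect for each gives you a quick regression check to rerun whenever processing or indexing settings change.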

5. Move to RAG if needed

If your workflow includes generation, retrieval results can then be passed into a RAG pipeline.

See RAG for how retrieval and generation are combined.
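As a rough illustration of how retrieval output feeds generation, the sketch below assembles a prompt from retrieved chunks under a character budget. The prompt wording, the `build_prompt` helper, and the budget are all assumptions; the actual hand-off in MMORE is documented in RAG.

```python
def build_prompt(question, retrieved, max_chars=2000):
    """Assemble a grounded prompt from (chunk_id, text) pairs, best first."""
    context_parts = []
    used = 0
    for chunk_id, text in retrieved:
        entry = f"[{chunk_id}] {text}"
        # Stop adding context once the character budget would be exceeded.
        if used + len(entry) > max_chars:
            break
        context_parts.append(entry)
        used += len(entry)
    context = "\n".join(context_parts)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    retrieved = [("doc1.pdf#3", "MMORE processes documents."), ("doc4.md#0", "Indexing follows.")]
    print(build_prompt("What happens after processing?", retrieved))
```

Tagging each context entry with its chunk id makes it possible to check which sources a generated answer relied on.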

Example end-to-end flow

Conceptually, a first MMORE run looks like this:

Raw documents
    ↓
Processing
    ↓
Structured outputs / chunks / metadata
    ↓
Indexing
    ↓
Retrieval
    ↓
Optional RAG generation
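The same flow can be written as a chain of stages, each consuming the previous stage's output. This is a conceptual sketch with placeholder stages; `run_pipeline` and the lambdas are hypothetical stand-ins, not MMORE entry points.

```python
def run_pipeline(raw_documents, stages):
    """Run (name, function) stages in order, feeding each output to the next."""
    data = raw_documents
    trace = []
    for name, stage in stages:
        data = stage(data)
        # Record the size of each stage's output so empty stages are easy to spot.
        trace.append((name, len(data)))
    return data, trace


if __name__ == "__main__":
    stages = [
        # Placeholder processing: strip whitespace and drop empty documents.
        ("processing", lambda docs: [d.strip() for d in docs if d.strip()]),
        # Placeholder indexing: assign an integer id to each chunk.
        ("indexing", lambda chunks: {i: c for i, c in enumerate(chunks)}),
    ]
    index, trace = run_pipeline(["  first doc ", "", "second doc"], stages)
    print(index)
    print(trace)
```

A trace entry with a count of zero shows immediately which stage lost the data.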

Common mistakes

Warning

Do not start with a large or noisy collection.

When debugging a document-processing pipeline, a very small dataset is much easier to inspect and validate.

Typical first-run problems include:

  • wrong environment or missing dependencies

  • input paths that do not point to the expected collection

  • outputs written to a different directory than expected

  • indexing performed on incomplete processed data

  • retrieval tested before the index is fully built
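Several of these problems can be caught with a preflight check before each stage. The sketch below is one possible approach: the `preflight` helper and the directory layout in the usage example are assumptions, so adapt the paths to wherever your configuration writes its outputs.

```python
from pathlib import Path


def preflight(input_dir, processed_dir, index_path):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    input_dir, processed_dir, index_path = Path(input_dir), Path(processed_dir), Path(index_path)
    problems = []
    if not input_dir.is_dir() or not any(input_dir.iterdir()):
        problems.append(f"input collection missing or empty: {input_dir}")
    if not processed_dir.is_dir() or not any(processed_dir.iterdir()):
        problems.append(f"no processed outputs found in: {processed_dir}")
    if not (index_path.is_file() and index_path.stat().st_size > 0):
        problems.append(f"index artifact missing or empty: {index_path}")
    return problems


if __name__ == "__main__":
    # Hypothetical layout; replace with the paths from your own configuration.
    for problem in preflight("sample_data", "outputs/processed", "outputs/index.json"):
        print("WARNING:", problem)
```

Running the check before retrieval helps distinguish a genuinely poor ranking from a missing or half-built index.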

Where to go next

After this page, the best next steps are:

  1. Architecture to understand the big picture

  2. Processing pipeline for ingestion and transformations

  3. Indexing for indexing details

  4. RAG for retrieval-augmented generation