New modules: Llamacpp-python/run and huggingface/download to allow running simple text workloads with local LLMs #11053
toniher wants to merge 40 commits into nf-core:master from
Conversation
famosab
left a comment
Thank you for your contribution to nf-core! We really appreciate it. I added a few comments to your PR.
We usually recommend having one module per PR. That makes the review process easier, and it is more likely that someone will review your PR. You can keep that in mind for your next PRs.
Hi @famosab. Thanks for the feedback; I will go through your comments! I was told about the one-module-per-PR recommendation, but since the output of one of the processes is needed by the other, I thought keeping them together would help potential users once the modules are eventually accepted. But it is certainly more work for everyone. Sorry about this, and I will avoid it in future PRs.
Not so many assertions for stub test
Co-authored-by: Famke Bäuerle <45968370+famosab@users.noreply.github.com>

output to ${prefix}
Co-authored-by: Famke Bäuerle <45968370+famosab@users.noreply.github.com>
I think from my side it looks good, but now I have spent so much time going back and forth with you, and this module is a bit unusual :) so I would like to have another set of eyes on this PR! I will ask for another review :)
pinin4fjords
left a comment
I really think we should avoid using module binaries until we're satisfied they don't limit module portability. Currently Wave is required for Cloud scenarios without a shared file system.
Templates are the more portable, if less pretty, approach.
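For context, the template pattern being recommended looks roughly like this (a hypothetical sketch; the process and template names are illustrative, not taken from this PR):

```nextflow
// Hypothetical module using a template instead of a bundled binary.
// Nextflow resolves 'run_llm.py' from the module's templates/ directory,
// so no extra binary needs to be staged into the work directory.
process LLAMACPP_PYTHON_RUN {
    input:
    path model

    output:
    path 'response.txt'

    script:
    template 'run_llm.py'
}
```

Because the template file travels with the module source, this works on any executor, including cloud setups without a shared file system.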
pinin4fjords
left a comment
I think there are a few things to resolve here, since we're colouring outside the lines a bit.
```nextflow
label 'process_gpu'

conda "${moduleDir}/environment.yml"
container "${task.accelerator ? 'quay.io/nf-core/llama-cpp-python:0.1.9' : 'community.wave.seqera.io/library/llama-cpp-python:0.3.16--b351398cd0ea7fc5'}"
```
Are you sure you need this? A little bit of AI chat suggests that:
llama-cpp-python compiled with CUDA support does runtime GPU detection and falls back to CPU when no GPU is present
... which, if correct, would mean you could just supply the same container in either case and avoid the complexity.
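As an aside, the runtime fallback described there boils down to probing whether the CUDA driver library can be loaded; a minimal, hedged illustration of that idea (this is not llama-cpp-python's actual code):

```python
import ctypes

def has_cuda_driver() -> bool:
    """Return True if the CUDA driver library (libcuda.so.1) is loadable."""
    try:
        ctypes.CDLL("libcuda.so.1")
        return True
    except OSError:
        # No driver present: a CPU fallback would take this branch
        # instead of crashing at import time.
        return False

print(has_cuda_driver())
```

On a machine without an NVIDIA driver this simply reports `False`, which is exactly the behaviour a graceful CPU fallback needs.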
You might be able to do a multi-stage build like this to bring the container size down:

```dockerfile
FROM nvidia/cuda:12.4.1-devel-ubuntu22.04 AS builder
RUN apt-get update && apt-get install -y python3 python3-pip python3-dev
RUN pip3 install --prefix=/install llama-cpp-python==0.3.16 \
    --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu124

FROM nvidia/cuda:12.4.1-runtime-ubuntu22.04
RUN apt-get update && apt-get install -y --no-install-recommends python3 python3-pip \
    && rm -rf /var/lib/apt/lists/*
COPY --from=builder /install /usr/local
```

... so it's less overhead for the non-GPU case (untested).
I just tried again running Docker on my laptop (without a GPU) using quay.io/nf-core/llama-cpp-python:0.1.9 and I cannot make it work:

```
RuntimeError: Failed to load shared library '/usr/local/lib/python3.10/dist-packages/llama_cpp/lib/libllama.so': libcuda.so.1: cannot open shared object file: No such file or directory
```

So it looks like, after building the Docker image with GPU support, you cannot run it without a GPU device. It's particularly annoying because there are different GPU types and CUDA versions. :(
```nextflow
label 'process_gpu'

conda "${moduleDir}/environment.yml"
container "${task.accelerator ? 'quay.io/nf-core/llama-cpp-python:0.1.9' : 'community.wave.seqera.io/library/llama-cpp-python:0.3.16--b351398cd0ea7fc5'}"
```
@pinin4fjords Added the Singularity equivalent in 9069b44. What else would you say is missing so far, considering the present restrictions?
Hi, any feedback? So far, the only weak point I see is how to handle GPU-enabled container scenarios. Please let me know!
Since this tool is on conda-forge, you don't need to add a Dockerfile; just use Seqera Containers: https://nf-co.re/docs/developing/containers/seqera-containers
I am actually using Seqera Containers for non-GPU situations:
https://seqera.io/containers/?packages=conda-forge::llama-cpp-python=0.3.16
I understand I could approach it similarly to what is done here:
nf-core/modules: modules/nf-core/multiqc/meta.yml, line 110 in dd6396b
Until there is a better solution, I could enable quay.io/nf-core/llama-cpp-python:0.1.9 when containers, an accelerator (and amd64?) are all in play at the same time, and in all other situations do as the multiqc module does.
Otherwise, I could simply remove the GPU container until there is a better idea. It must be said that the speed gain with a GPU is huge...
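A hedged sketch of what that combined selection could look like (the two images come from this PR; the exact condition, including how amd64 would be checked, is an assumption, not a settled design):

```nextflow
// Sketch: use the GPU image only when an accelerator is requested and the
// engine is not Singularity; otherwise fall back to the CPU Wave container.
container "${ task.accelerator && workflow.containerEngine != 'singularity'
    ? 'quay.io/nf-core/llama-cpp-python:0.1.9'
    : 'community.wave.seqera.io/library/llama-cpp-python:0.3.16--b351398cd0ea7fc5' }"
```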
Any feedback is appreciated!
Any module example you could suggest as a model suitable for this case?
Just look for the
Note that these are not quite script files, and you have to escape
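To illustrate the escaping point with a generic, hypothetical template file (not from this PR): inside a Nextflow template, `${...}` is interpolated by Nextflow before the script runs, so anything meant for the shell itself must be escaped.

```bash
#!/usr/bin/env bash
# templates/run_llm.sh (hypothetical): ${prefix} is filled in by Nextflow,
# while \$HOSTNAME is escaped so it reaches the shell untouched.
echo "Run on \$HOSTNAME" > ${prefix}.log
```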
@pinin4fjords I tried to make it work, taking your modules as examples. It was relatively easy using: and a few more bits of code, but... I got another issue when running the test: This is caused by the recent convention in the way versions are generated... So, I guess I would need to generate
Yep, the eval won't work; just do exactly what the gtf2bed example I gave you does. You still output the versions.yml, but also send it to a topic. The code used by pipelines to assemble versions handles both files and strings.
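A hedged sketch of what that combination might look like (process name and version values are illustrative, not the module's actual code):

```nextflow
process EXAMPLE_RUN {
    output:
    path 'versions.yml', emit: versions
    // Also send a (process, tool, version) tuple to the 'versions' topic;
    // a static val() is used because eval() cannot run inside a template.
    tuple val("${task.process}"), val('llama-cpp-python'), val('0.3.16'), topic: versions

    script:
    template 'run_llm.py'
}
```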
Did a bit of AI-assisted iteration to make biocorecrg#1 - see what you think. We're still trying to figure out the standard patterns for this; see nf-core/website#4142.
Covers the pattern for GPU-compiled Python tools that ship only as pre-built wheels from custom pip indexes (e.g. llama-cpp-python), based on experience integrating nf-core/modules#11053:
- Pin the full wheel URL in `pip:`, not `--extra-index-url` (which leaks into Wave's image tag) or `--index-url` (which breaks transitive deps).
- Use `wave --config-env 'LD_LIBRARY_PATH=/opt/conda/lib'` so the pip binary can resolve conda-provided CUDA libs at dlopen time; conda's activate.d hooks don't fire under `docker run`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
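A hedged sketch of the environment file that first bullet implies (the exact wheel filename below is a guess for illustration and would need checking against the actual index):

```yaml
# environment.yml sketch: pin the full wheel URL under pip: so the index URL
# never leaks into Wave's image tag and conda-resolved deps stay intact.
# The wheel filename is illustrative, not verified.
channels:
  - conda-forge
dependencies:
  - python=3.11
  - pip
  - pip:
      - https://abetlen.github.io/llama-cpp-python/whl/cu124/llama_cpp_python-0.3.16-cp311-cp311-linux_x86_64.whl
```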
This pull request, contributed jointly with @lucacozzuto, provides a simple workload for running text inference tasks using llamacpp-python against local LLMs.
This effort was worked on during the nf-core Hackathon in March 2026.
PR checklist
- Closes #XXX
- `topic: versions` - see `version_topics` label
- `nf-core modules test <MODULE> --profile docker`
- `nf-core modules test <MODULE> --profile singularity`
- `nf-core modules test <MODULE> --profile conda`
- `nf-core subworkflows test <SUBWORKFLOW> --profile docker`
- `nf-core subworkflows test <SUBWORKFLOW> --profile singularity`
- `nf-core subworkflows test <SUBWORKFLOW> --profile conda`