
[DRAFT - do not review] llamacpppython/run GPU + huggingface/download preview#11255

Closed
pinin4fjords wants to merge 41 commits into nf-core:master from pinin4fjords:pinin4fjords/llamacpp-gpu-fix

Conversation

@pinin4fjords
Member

Not for review / will be closed

This draft is open only to preview the combined diff of an existing PR (#11053) plus some proposed follow-up changes on top. It will be closed without merging. Any actual changes will land via that existing PR and/or a PR back into its source branch.

No action needed on this one; leaving it as draft to avoid reviewer attention.

What's being previewed

On top of #11053, this branch:

  • Drops the hand-rolled Dockerfile in favour of an environment.gpu.yml (direct cu124 wheel URL, conda-forge::cuda-runtime pinned to 12.4, python=3.11). The image is built via a Wave freeze with LD_LIBRARY_PATH=/opt/conda/lib baked in through wave --config-env, so conda-installed CUDA libraries are on the loader path without a script-side export.
  • Rewrites the container directive as a four-URL dual-container ternary mirroring ribodetector's shape (singularity https blob URLs + Wave docker URLs, branched on task.accelerator).
  • Drops label 'process_gpu' — accelerator allocation is pipeline-controlled per the draft GPU module spec (docs: add GPU module guidelines website#4142).
  • Converts the CPU singularity URL from oras:// to https:// blob form so both CPU and GPU paths use the same URL scheme.
  • Adds tests/main.gpu.nf.test, tests/nextflow.gpu.config, and a pre-generated main.gpu.nf.test.snap so the existing GPU CI workflow (nf-test-gpu.yml) picks the tests up via the "gpu" tag.
  • Removes nextflow.enable.moduleBinaries from tests/nextflow.config (no longer needed once fully template-based).
  • Fixes a stray - prompt_file: fragment in both meta.yml input blocks.
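
The environment.gpu.yml described above might look roughly like the following. This is an assumed sketch reconstructed from the bullet points: the channel layout and the wheel URL path are placeholders, and only the cuda-runtime/python pins come from the PR text.

```yaml
# Hypothetical sketch of environment.gpu.yml -- not the file from the branch.
channels:
  - conda-forge
dependencies:
  - conda-forge::python=3.11
  - conda-forge::cuda-runtime=12.4
  - pip
  - pip:
      # direct cu124 wheel URL for llama-cpp-python (placeholder path)
      - https://github.com/abetlen/llama-cpp-python/releases/download/<release>/llama_cpp_python-<version>-cp311-cp311-linux_x86_64.whl
```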
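
The four-URL dual-container ternary could take a shape like this. The blob digests and Wave tags below are placeholders, not the real URLs; the branching on task.accelerator and the singularity/docker split follow ribodetector's pattern as described above.

```nextflow
// Sketch only: <sha256-*> and <tag-*> stand in for real blob digests / Wave tags.
container "${ workflow.containerEngine == 'singularity' ?
    ( task.accelerator ?
        'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/<sha256-gpu>/data' :
        'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/<sha256-cpu>/data' ) :
    ( task.accelerator ?
        'community.wave.seqera.io/library/llama-cpp-python_cuda-runtime:<tag-gpu>' :
        'community.wave.seqera.io/library/llama-cpp-python:<tag-cpu>' ) }"
```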
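
With the 'process_gpu' label dropped, tests/nextflow.gpu.config presumably requests the accelerator at the test level instead. This is an assumed sketch of what such a config could contain, not the file itself:

```nextflow
// Assumed contents of tests/nextflow.gpu.config: allocate one GPU to the
// process under test and expose it to the container engine.
process {
    accelerator = 1
}
docker.runOptions = '--gpus all'
```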

Validation

  • Wave: CPU container rebuild from environment.yml reproduces the existing hash byte-for-byte. GPU container built from environment.gpu.yml.
  • CPU nf-test passes locally under both -profile docker and -profile singularity.
  • GPU nf-test passes on a Tesla T4 (g4dn.xlarge) under -profile docker,gpu — end-to-end Gemma-3-1B inference ran at ~143 tokens/sec, and both real and stub tests produce stable snapshots.
  • nf-core modules lint matches ribodetector's warning set (a known Wave-tag / version-heuristic limitation on GPU containers).

toniher and others added 30 commits March 25, 2026 18:55
Not so many assertions for stub test

Co-authored-by: Famke Bäuerle <45968370+famosab@users.noreply.github.com>
output to ${prefix}

Co-authored-by: Famke Bäuerle <45968370+famosab@users.noreply.github.com>
output to ${prefix}

Co-authored-by: Famke Bäuerle <45968370+famosab@users.noreply.github.com>
output to ${prefix}

Co-authored-by: Famke Bäuerle <45968370+famosab@users.noreply.github.com>
toniher and others added 11 commits April 9, 2026 16:55
…ec compliant

Replace the hand-rolled Dockerfile with a Wave-built GPU container sourced
from environment.gpu.yml (pinned CUDA 12.4 runtime + abetlen cu124 wheel),
restructure the container directive to follow the dual-container pattern
used by ribodetector, drop 'process_gpu' label (accelerator is pipeline
controlled per the GPU module spec), and add a main.gpu.nf.test plus
nextflow.gpu.config so the GPU CI workflow picks up the tests via the
'gpu' tag.

Also:
- Convert CPU singularity URL from oras:// to https:// blob form for
  consistency with GPU URL (matches ribodetector convention).
- Drop nextflow.enable.moduleBinaries from tests/nextflow.config (now
  template-based, not binary-based).
- Fix stray '- prompt_file:' text in both meta.yml input blocks.

GPU container was validated end-to-end on a g4dn.xlarge (Tesla T4):
library loads, supports_gpu_offload=True, real gemma-3-1b inference
runs at ~143 tokens/sec. Snapshot file generated on the same host.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@pinin4fjords
Member Author

Preview closed. Follow-up work routes through the existing PR #11053 source branch.
