[DRAFT - do not review] llamacpppython/run GPU + huggingface/download preview #11255
Closed
pinin4fjords wants to merge 41 commits into nf-core:master from
Conversation
Not so many assertions for stub test
Co-authored-by: Famke Bäuerle <45968370+famosab@users.noreply.github.com>
output to ${prefix}
Co-authored-by: Famke Bäuerle <45968370+famosab@users.noreply.github.com>
…ec compliant

Replace the hand-rolled Dockerfile with a Wave-built GPU container sourced from environment.gpu.yml (pinned CUDA 12.4 runtime + abetlen cu124 wheel), restructure the container directive to follow the dual-container pattern used by ribodetector, drop the 'process_gpu' label (accelerator is pipeline-controlled per the GPU module spec), and add a main.gpu.nf.test plus nextflow.gpu.config so the GPU CI workflow picks up the tests via the 'gpu' tag.

Also:
- Convert the CPU singularity URL from oras:// to https:// blob form for consistency with the GPU URL (matches the ribodetector convention).
- Drop nextflow.enable.moduleBinaries from tests/nextflow.config (now template-based, not binary-based).
- Fix the stray '- prompt_file:' text in both meta.yml input blocks.

The GPU container was validated end-to-end on a g4dn.xlarge (Tesla T4): the library loads, supports_gpu_offload=True, and real gemma-3-1b inference runs at ~143 tokens/sec. The snapshot file was generated on the same host.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Preview closed. Follow-up work routes through the existing PR #11053 source branch.
Not for review / will be closed
This draft is open only to preview the combined diff of an existing PR (#11053) plus some proposed follow-up changes on top. It will be closed without merging. Any actual changes will land via that existing PR and/or a PR back into its source branch.
No action needed on this one; leaving it as draft to avoid reviewer attention.
What's being previewed
On top of #11053, this branch:
- Replaces the hand-rolled `Dockerfile` and adds an `environment.gpu.yml` (direct cu124 wheel URL + `conda-forge::cuda-runtime` pinned to 12.4 + `python=3.11`), built via Wave freeze with `LD_LIBRARY_PATH=/opt/conda/lib` baked in through `wave --config-env`, so conda-installed CUDA libs are on the loader path without a script-side export.
- Restructures the container directive to follow `ribodetector`'s shape (singularity https blob URLs + Wave docker URLs, branched on `task.accelerator`).
- Drops `label 'process_gpu'`; accelerator allocation is pipeline-controlled per the draft GPU module spec (docs: add GPU module guidelines website#4142).
- Converts the CPU singularity URL from `oras://` to `https://` blob form so both CPU and GPU paths use the same URL scheme.
- Adds `tests/main.gpu.nf.test`, `tests/nextflow.gpu.config`, and a pre-generated `main.gpu.nf.test.snap` so the existing GPU CI workflow (`nf-test-gpu.yml`) picks the tests up via the `"gpu"` tag.
- Drops `nextflow.enable.moduleBinaries` from `tests/nextflow.config` (no longer needed once fully template-based).
- Fixes the stray `- prompt_file:` fragment in both `meta.yml` input blocks.

Validation
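The dual-container shape described above can be sketched roughly as below. This is an illustrative sketch only: the process name follows the module's naming, but the image references are placeholders, not the real Wave/Seqera digests used by this branch.

```nextflow
// Sketch of a ribodetector-style dual-container directive, branched on
// task.accelerator. Image references are placeholders, not real digests.
process LLAMACPPPYTHON_RUN {
    container "${ task.accelerator
        ? ( workflow.containerEngine == 'singularity'
            ? 'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/<gpu-digest>/data'
            : 'community.wave.seqera.io/library/llama-cpp-python:<gpu-tag>' )
        : ( workflow.containerEngine == 'singularity'
            ? 'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/<cpu-digest>/data'
            : 'community.wave.seqera.io/library/llama-cpp-python:<cpu-tag>' ) }"

    // ... inputs, outputs and script as in the module's main.nf
}
```

Because the branch keys on `task.accelerator`, a pipeline that requests an accelerator for this process gets the GPU image, while all other runs fall back to the CPU image, with the singularity/docker choice handled inside each branch.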
- `environment.yml` reproduces the existing hash byte-for-byte; the GPU container is built from `environment.gpu.yml`.
- CPU `nf-test` passes locally under both `-profile docker` and `-profile singularity`.
- GPU `nf-test` passes on a Tesla T4 (g4dn.xlarge) under `-profile docker,gpu`: end-to-end Gemma-3-1B inference ran at ~143 tokens/sec, and both real and stub tests produce stable snapshots.
- `nf-core modules lint` matches `ribodetector`'s warning set (a known Wave-tag / version-heuristic limitation on GPU containers).
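The `"gpu"` tag wiring can be sketched as an nf-test file like the one below. Names, paths, and the test input are illustrative assumptions, not the exact contents of `tests/main.gpu.nf.test`:

```nextflow
// Hypothetical sketch: tagging a test "gpu" so a CI workflow that filters
// on that tag (e.g. nf-test-gpu.yml) selects it; config points at the
// GPU-specific nextflow.gpu.config.
nextflow_process {

    name "llamacpppython/run - GPU"
    script "../main.nf"
    process "LLAMACPPPYTHON_RUN"
    config "./nextflow.gpu.config"
    tag "gpu"

    test("gemma-3-1b - gpu") {
        when {
            process {
                """
                input[0] = [ [ id:'test' ], file(params.model_file) ] // placeholder input
                """
            }
        }
        then {
            assert process.success
            assert snapshot(process.out).match()
        }
    }
}
```

Keeping the tag at file level means every test in the file (real and stub) is picked up by the GPU workflow, matching the stable-snapshot claim above.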