Towards a full regression test #215

Merged
Radonirinaunimi merged 19 commits into main from ci-benchmarks on Mar 11, 2025

Member

@Radonirinaunimi Radonirinaunimi commented Feb 21, 2025

To wrap up #206 quickly, I realized I might as well address this now. This will not only check that #206 works properly but will also address #194.

Here are the various checks that need to be done:

  • clarify which theories/grids should be used and check that the template operator cards are the correct ones (currently using 40008005)
  • pineko is able to generate EKOs
  • pineko is able to produce FK tables by convolving the grids with the EKOs
  • check that the produced FK tables are compatible with the prior grids by convolving them with PDFs (differences are checked to be below 2 permille; ATLAS_SINGLETOP_8TEV_T-RAP-NORM is slightly above 1 permille)
  • check that the resulting FK tables are consistent with some references by convolving them with PDFs (differences are checked to be below 1 permille)
  • include a variety of NNPDF datasets; these should be representative of the processes included in the fit
    • HERA_CC_318GEV_EP-SIGMARED
    • ATLAS_Z0_7TEV_36PB_ETA
    • LHCB_WPWM_8TEV_MUON_Y
    • ATLAS_SINGLETOP_8TEV_T-RAP-NORM
    • NNPDF_POS_2P24GEV_F2D
  • check cases in which multiple EKOs, polarized⊗space-like⊗time-like, are needed for a single grid
    • STAR_WMWP_510GEV_WP-AL
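
The permille checks in the list above can be sketched as follows. This is a minimal, standalone illustration and not the code used in the workflow: the function names and the numbers are made up, and the predictions are assumed to be plain lists with one value per bin, obtained elsewhere by convolving the grids and the FK tables with the same PDF set.

```python
from typing import Sequence


def max_relative_difference(
    reference: Sequence[float], candidate: Sequence[float]
) -> float:
    """Return the largest |candidate / reference - 1| over all bins."""
    if len(reference) != len(candidate):
        raise ValueError("predictions must have the same number of bins")
    worst = 0.0
    for ref, cand in zip(reference, candidate):
        if ref == 0.0:
            # The relative difference is undefined here, so require
            # exact agreement instead.
            if cand != 0.0:
                return float("inf")
            continue
        worst = max(worst, abs(cand / ref - 1.0))
    return worst


def is_compatible(
    reference: Sequence[float],
    candidate: Sequence[float],
    tolerance_permille: float = 1.0,
) -> bool:
    """Check that all bins agree within the given tolerance in permille."""
    return max_relative_difference(reference, candidate) <= tolerance_permille * 1e-3


# Hypothetical predictions, for illustration only.
grid_predictions = [10.0, 20.0, 40.0]
fk_predictions = [10.005, 19.99, 40.06]

# The largest deviation is about 1.5 permille, so this passes the
# 2 permille check but would fail a 1 permille check.
print(is_compatible(grid_predictions, fk_predictions, tolerance_permille=2.0))
print(is_compatible(grid_predictions, fk_predictions, tolerance_permille=1.0))
```

Taking the maximum over bins rather than an average means that a single badly reproduced bin is enough to fail the check, which is the stricter reading of "differences are checked to be below" a threshold.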

I will at some point write some documentation (while also updating the deprecated information), but for the time being I am putting the steps here:

  • The information regarding the experimental data and the theory cards is fetched from nnpdf_data, which is installed with pineko.
  • The data (such as grids and template operator cards) are stored in https://data.nnpdf.science/pineko/theory_productions/. theory_productions also contains the correct folder structure to produce the theory predictions as described in pineko.ci.toml.
  • pineko.ci.toml contains the pineko configuration to run the regression test
  • The theory_productions data is downloaded once for a given version and then cached, so that it is not downloaded every time the workflow runs. If new data are added to the theory_productions server, the version in the following line needs to be incremented:
    key: theory_productions-v1
  • Whenever new data are added to the theory_productions server, they should be put in the correct folders according to the configurations defined in pineko.ci.toml.
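
To illustrate why the version in the cache key has to be incremented, here is a toy model of key-based caching. It is only a sketch of the general idea, not how the GitHub Actions cache is implemented: `ensure_cached` and `fake_fetch` are hypothetical names, and a temporary directory stands in for the cache storage.

```python
import tempfile
from pathlib import Path
from typing import Callable, List


def ensure_cached(cache_root: Path, key: str, fetch: Callable[[Path], None]) -> bool:
    """Populate cache_root/key via fetch() unless that entry already exists.

    Returns True if fetch() was called (cache miss), False on a cache hit.
    """
    target = cache_root / key
    if target.is_dir():
        # Same key as before: reuse the stored data, even if the server
        # content has changed in the meantime.
        return False
    target.mkdir(parents=True)
    fetch(target)
    return True


calls: List[str] = []


def fake_fetch(destination: Path) -> None:
    """Stand-in for downloading the theory_productions data."""
    calls.append(destination.name)
    (destination / "grids").mkdir()


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    first = ensure_cached(root, "theory_productions-v1", fake_fetch)
    second = ensure_cached(root, "theory_productions-v1", fake_fetch)
    # Bumping the version changes the key and forces a fresh download.
    third = ensure_cached(root, "theory_productions-v2", fake_fetch)

print(first, second, third)
```

In this model the second call with an unchanged key never reaches the server, so newly uploaded data would go unnoticed until the key changes.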

Note

Currently, this uses a runner, pineko-stbc3, hosted on the Nikhef Stoomboot cluster under my user space. The status of the runner can be tracked in settings > actions > runners.

@Radonirinaunimi Radonirinaunimi added the enhancement New feature or request label Feb 21, 2025
@Radonirinaunimi Radonirinaunimi linked an issue Feb 21, 2025 that may be closed by this pull request
6 tasks
@Radonirinaunimi Radonirinaunimi added the run-regression Trigger the regression test label Feb 21, 2025
@Radonirinaunimi Radonirinaunimi marked this pull request as draft February 21, 2025 22:41
Member Author

Radonirinaunimi commented Feb 22, 2025

So it looks like we need a self-hosted runner after all: the GitHub one cannot deal with the numba compilation to machine code. I will set one up on Stoomboot first and check that it runs properly, and we can decide afterwards where to host it permanently.

@Radonirinaunimi Radonirinaunimi marked this pull request as ready for review February 26, 2025 15:55
Member

@scarlehoff scarlehoff left a comment

Thank you so much for this. I guess in Amsterdam I'll ask you how to set up the runner in the cluster so we can have more than one.

Let me know if it is finished and I'll try running it myself and review it asap.

(thanks again ¡!)

Comment thread .github/workflows/regression.yml Outdated
Comment thread pineko.cli.toml Outdated
@Radonirinaunimi Radonirinaunimi added run-regression Trigger the regression test and removed run-regression Trigger the regression test labels Feb 27, 2025
@Radonirinaunimi
Member Author

With a hadronic dataset that contains ~30 $Q^2$ points, it takes about 20 min to compute everything. I believe we can just add one more DIS dataset with much less $Q^2$ points (still looking for the one).

> I guess in Amsterdam I'll ask you how to set up the runner in the cluster so we can have more than one.

Yes! The more computers/clusters we can use, the better.

@Radonirinaunimi Radonirinaunimi mentioned this pull request Feb 28, 2025
1 task
@felixhekhorn
Contributor

> one more DIS dataset with much less $Q^2$ points (still looking for the one)

actually, why don't you use an F2 positivity set? it is a DIS set but has a single Q2

@Radonirinaunimi
Member Author

> actually, why don't you use an F2 positivity set? it is a DIS set but has a single Q2

That is a good idea! I will add one.

@Radonirinaunimi Radonirinaunimi added run-regression Trigger the regression test and removed run-regression Trigger the regression test labels Mar 11, 2025
@Radonirinaunimi Radonirinaunimi merged commit 8d65739 into main Mar 11, 2025
7 checks passed
@felixhekhorn felixhekhorn mentioned this pull request Jul 14, 2025

Labels

enhancement New feature or request run-regression Trigger the regression test

Development

Successfully merging this pull request may close these issues.

Full regression test

3 participants