Full regression test #194

@scarlehoff

pineko is a piece of software central to any PDF fit, as it is used to create the fktables by evolving grids down to the fitting scale.

As such, it is very important that:

  1. Any changes keep results unchanged or, if there are changes, they are understood.
  2. Updates in nnpdf, eko and pineappl are kept track of (this is important since pineko is mostly feature complete, and we should try to make sure it doesn't lag too far behind).

The best way to accomplish both points is with regression tests that can be run on PRs or whenever there are changes to the dependencies, like in #190.

The tests should then:

  • Use the latest nnpdf release or repository (since we are still not putting the releases in pip... soon). This will ensure that pineko is able to read the data and theory from the NNPDF data files (of crucial importance for the fit!!)
  • Create the operator cards
  • Create the ekos (this ensures that a) pineko is still compatible with eko and b) the theory information from nnpdf is still compatible with pineko -> eko)
  • Create the fktables by convolving the ekos with the grids (ensures that the pineappl-eko interface used by pineko is still working as expected)
  • Check (by convolving with the right PDF) that the fktable and the grid are compatible
  • Check (by comparing to previous results) that the results haven't changed to a certain precision
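The last two checks boil down to comparing two lists of per-bin predictions within a tolerance. Below is a minimal sketch of that comparison using only the Python standard library. All names here (`compare_predictions`, `save_reference`, `check_against_reference`, the JSON reference files and the default tolerance) are hypothetical and not part of pineko, eko or pineappl; the predictions themselves are assumed to have been computed beforehand from the grid and the fktable.

```python
import json
import math
import tempfile
from pathlib import Path


def compare_predictions(current, reference, rtol=1e-6, atol=0.0):
    """Return (bin, current, reference) for every bin that disagrees."""
    if len(current) != len(reference):
        raise ValueError(f"bin count changed: {len(current)} vs {len(reference)}")
    return [
        (i, cur, ref)
        for i, (cur, ref) in enumerate(zip(current, reference))
        if not math.isclose(cur, ref, rel_tol=rtol, abs_tol=atol)
    ]


def save_reference(name, predictions, reference_dir):
    """Store predictions as the reference for later runs (hypothetical JSON layout)."""
    path = Path(reference_dir) / f"{name}.json"
    path.write_text(json.dumps(list(predictions)))
    return path


def check_against_reference(name, current, reference_dir, rtol=1e-6):
    """Compare current predictions with the stored reference for `name`."""
    path = Path(reference_dir) / f"{name}.json"
    reference = json.loads(path.read_text())
    return compare_predictions(current, reference, rtol=rtol)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        save_reference("dis_example", [1.0, 2.0, 3.0], tmp)
        # A tiny numerical drift is accepted; a 5% shift in one bin is reported.
        print(check_against_reference("dis_example", [1.0, 2.0, 3.0 * (1 + 1e-9)], tmp))
        print(check_against_reference("dis_example", [1.0, 2.1, 3.0], tmp))
```

The same `compare_predictions` helper could serve both checks: grid versus fktable (probably with a looser tolerance, since the evolution introduces interpolation differences) and current versus stored results.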

This should be done for a bunch of DIS grids and a bunch of double-hadronic grids.
It should also be done for a bunch of theories, including 41_000_000 and, hopefully, a polarized theory.
Take into account the two limitations of GitHub workflows, runtime and memory (so we cannot do it with the full dataset).
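One way to keep the selection small and explicit is to enumerate the (theory, process, grid) combinations up front. The sketch below assumes nothing about pineko itself: the grid names are hypothetical placeholders, and only the theory ID 41_000_000 comes from this issue.

```python
from itertools import product

# Theory IDs to test; a polarized theory ID would be appended here.
THEORY_IDS = [41_000_000]

# Hypothetical placeholder grid names, grouped by process type.
GRIDS = {
    "dis": ["EXAMPLE_DIS_GRID"],
    "double_hadronic": ["EXAMPLE_HADRONIC_GRID"],
}


def regression_cases(theory_ids, grids_by_process):
    """List every (theory_id, process, grid_name) combination to be tested."""
    cases = []
    for theory_id, (process, names) in product(
        theory_ids, sorted(grids_by_process.items())
    ):
        for name in names:
            cases.append((theory_id, process, name))
    return cases


if __name__ == "__main__":
    for case in regression_cases(THEORY_IDS, GRIDS):
        print(case)
```

Keeping the list this explicit makes it easy to see how many ekos and fktables a workflow run has to produce, which is what drives the runtime and memory usage.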

Some extra (bonus) points:

  • In order to start, add a few grids to the repository in your branch so you don't have to deal with authentication to download them or anything for the time being. We can work on that after the regression tests are ready.
  • You probably don't want to run this on every commit. Maybe only for PRs that are "ready for review", or using a specific label for it. You can leave this point for the end.
  • GitHub allows "self-hosted runners". We might want to explore that option in order to do a more comprehensive test.

You can look at these workflows for inspiration:

https://github.com/NNPDF/pineko/blob/main/.github/workflows/bench.yml (note that you don't need pylint, coverage or anything like that, it should focus on the regression and the regression only)
https://github.com/NNPDF/nnpdf/blob/master/.github/workflows/regression_tests.yml

This will teach you:

  1. How to generate fktables starting from scratch
  2. How to use all the moving parts that create a theory.
