
Support python in profiling cli #235

Open
GuillaumeLagrange wants to merge 4 commits into cod-1736-use-correct-post-processing-depending-on-the-language from cod-2035-python-in-profiling-cli

Conversation

@GuillaumeLagrange
Contributor

No description provided.

@codspeed-hq

codspeed-hq bot commented Feb 5, 2026

Merging this PR will not alter performance

✅ 4 untouched benchmarks


Comparing cod-2035-python-in-profiling-cli (cb89037) with cod-1736-use-correct-post-processing-depending-on-the-language (6279c9b)

Open in CodSpeed

@GuillaumeLagrange GuillaumeLagrange changed the title Cod 2035 python in profiling cli Support python in profiling cli Feb 5, 2026
Member

@not-matthias not-matthias left a comment


We really need to make sure the stack unwinding doesn't cause regressions. The rest looks good 👍

{
    let perf_maps = pids
        .iter()
        .into_iter()

Don't forget to dedup the PIDs here; that's why we used a HashSet.
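The reviewer's suggestion can be sketched as follows. This is a minimal illustration, not code from the PR: `dedup_pids` and the `Pid` alias are hypothetical names, and `Pid = i32` stands in for `libc::pid_t` (which is `i32` on Linux) to keep the sketch free of the external libc crate.

```rust
use std::collections::HashSet;

// Stand-in for libc::pid_t (i32 on Linux), so the sketch is self-contained.
type Pid = i32;

// Hypothetical helper: collect the incoming PIDs into a HashSet so
// duplicates are collapsed before any per-PID work (reading perf maps,
// writing per-PID files) is done.
fn dedup_pids<I>(pids: I) -> HashSet<Pid>
where
    I: IntoIterator<Item = Pid>,
{
    pids.into_iter().collect()
}

fn main() {
    // The same PID appearing several times is folded into one entry.
    let unique = dedup_pids(vec![42, 42, 7, 42]);
    assert_eq!(unique.len(), 2);
}
```

Iterating over the resulting `HashSet` guarantees each PID is processed exactly once, which is the property the review comment is after.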

where
    I: IntoIterator<Item = libc::pid_t>,
{
    for pid in pids {

Same comment here: make sure we iterate only over the deduplicated/unique PIDs. Otherwise we can easily duplicate work and/or overwrite files.

@@ -1,14 +1,13 @@
use crate::prelude::*;
use std::collections::HashSet;
use std::fs;

Since this function is async, and the BenchmarkData::save_to function is now also async, we could switch to tokio::fs instead of std::fs here.

cmd.args(&benchmark_cmd.command[1..]);
// Use LD_PRELOAD to inject instrumentation into the child process
cmd.env("LD_PRELOAD", preload_lib_path);
cmd.env("PYTHONPERFSUPPORT", "1");

nit: maybe add a quick comment that we need this so that Python generates perf maps (technically it's in the commit message, but a comment makes it a bit more obvious).
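The suggested comment could look something like this (a sketch only; `Command::new("true")` is a placeholder for the real benchmark command, and the wording is illustrative):

```rust
use std::process::Command;

fn main() {
    // Placeholder for the actual benchmark command being spawned.
    let mut cmd = Command::new("true");
    // PYTHONPERFSUPPORT=1 tells CPython (3.12+) to write perf map files
    // (/tmp/perf-<pid>.map) so perf can symbolize Python frames in the
    // recorded profile.
    cmd.env("PYTHONPERFSUPPORT", "1");
    let status = cmd.status().expect("failed to spawn command");
    assert!(status.success());
}
```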

Comment on lines +108 to +110
// Note that the higher the stack size, the larger the file, although it is mitigated
// by zstd compression
(UnwindingMode::Dwarf, Some(32 * 1024))

Can you test this on pytest-codspeed and check the final report sizes? Last time we tried it, we ran into some issues there.

It would also be useful to know when we could run into those limits, since we're saving 4x more data now.
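A back-of-envelope sketch of where the data volume lands. The 32 KiB dump size comes from the diff above; the sample rate and run duration are assumed numbers, and the "4x" reading assumes the previous dump size was perf's 8 KiB default:

```rust
// Raw (pre-zstd) bytes contributed by DWARF stack copies: one fixed-size
// stack dump per sample, for the whole run.
fn raw_stack_bytes(stack_dump_bytes: u64, samples_per_sec: u64, duration_secs: u64) -> u64 {
    stack_dump_bytes * samples_per_sec * duration_secs
}

fn main() {
    // 32 KiB per sample (from the diff), 1000 Hz sampling, 60 s run:
    let new = raw_stack_bytes(32 * 1024, 1000, 60);
    assert_eq!(new, 1_966_080_000); // ~1.8 GiB before compression

    // Same run with an assumed previous 8 KiB dump size: exactly 4x less.
    let old = raw_stack_bytes(8 * 1024, 1000, 60);
    assert_eq!(new / old, 4);
}
```

Even granting a strong zstd ratio on the highly repetitive stack data, numbers like these make it easy to see when report-size limits could be hit on long or high-frequency runs.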

Ok(())
}

fn bail_if_command_spawned_subprocesses_under_valgrind(pid: u32) -> Result<()> {

Can we have a comment here that explains a bit why and how this works (basically what we discussed in the DMs)? Don't spend too much time on it; I just want to make sure we have an explanation when we come back to this code.


Labels: None yet

Projects: None yet

2 participants