parallel operations for some binary caching providers#1392
parallel operations for some binary caching providers#1392Crzyrndm wants to merge 4 commits intomicrosoft:mainfrom
Conversation
This comment was marked as off-topic.
This comment was marked as off-topic.
|
In general, you're parallelizing the download here, but we might expect most of the time this to not be CPU limited and thus not improve as a result of parallelization. Do you have a real world use case where this did end up being meaningfully faster? |
|
For clarity - my experience here is with an AWS S3 cache specifically Taking a project I have with 10 small dependencies (download size is minimal)
The lib uses vcpkg in manifest mode via CMake toolchain integration. Build machine is Windows 11 with i7-12700 (8P + 4E) vcpkg/vcpkg --version
vcpkg package management program version 2024-03-14-7d353e869753e5609a1f1a057df3db8fd356e49d
# all measurements are from the cmake configure output
cmake --preset=<preset name>AWS S3 cache missAWS S3 cache hitSummarySo approx a 5x improvement for both query and download and that should be fairly proportional to the total number of dependencies (if this was done correctly and all deps are relatively equivalent). Note that typically I only use the cache for CI (GHA) and the time per package can be 1-3 seconds. If it was 5s to check the AWS cache I would probably consider using that locally as well, even better if microsoft/vcpkg#38684 was a thing |
|
I agree a real world 5x improvement is worth opening a can of worms for; OK |
this is pushing to the different caches(?) in parallel, not by package like the others
|
Thoughts on where to go with this? |
|
|
| } | ||
| else | ||
| { | ||
| out_sink.println_warning(res.error()); |
There was a problem hiding this comment.
Missed prints not being synchronised in previous assessment.
- Make individual print calls thread safe #1290
- Introduce DiagnosticContext, an 'error document' like type. #1323
This particular print could be extracted relatively easily and is the only print inside the parallel execution.
This is a simplified reimplementation of microsoft#1392.
This is a simplified reimplementation of microsoft#1392.
This is a simplified reimplementation of microsoft#1392.




microsoft/vcpkg#38404
This adds parallel operations using
parallel_for_each/parallel_transformtoObjectStorageProvider. There is currently no limitation placed on the parallelisation past that imposed by the threadpool.This impacts the following caching providers
GcsStorageToolAwsStorageToolCosStorageToolQuestions