Skip to content

Bump Cli-testing to 1.3.0, update kafka libraries#2341

Open
SylvainSenechal wants to merge 21 commits intodevelopment/2.14from
improvement/ZENKO-5203
Open

Bump Cli-testing to 1.3.0, update kafka libraries#2341
SylvainSenechal wants to merge 21 commits intodevelopment/2.14from
improvement/ZENKO-5203

Conversation

@SylvainSenechal
Copy link
Contributor

@SylvainSenechal SylvainSenechal commented Mar 2, 2026

Issue: ZENKO-5203

  • Bump cli testing to 1.3.0 => Breaking changes around the keycloak seeder, so updated a few things related to that
  • Updated node runner to 24
  • Updated cucumber version from 10.x to 12.x
  • Updated kubernetes client from 0.2 to 1.4...
  • Kafka libraries : align all kafka interactions on the new JS native kafka library : platformatic/kafka => drop kafkajs, drop node-rdkafka (and drop the annoying node-gyp that was needed for node-rdkafka, which was always causing weird python warnings when installing). => This new kafka library is compatible with Bunm unlike the previous ones, so later we can investigate if using Bun is worth it (I believe it is 🌚 ), before that we could'nt even try because of this lib

Ended up adding some extra stuff as I was testing my changes in Codespace and found a few stuff bothering me. I think the pr is still not too hard to review.

Can review commit by commit too

@bert-e
Copy link
Contributor

bert-e commented Mar 2, 2026

Hello sylvainsenechal,

My role is to assist you with the merge of this
pull request. Please type @bert-e help to get information
on this process, or consult the user documentation.

Available options
name description privileged authored
/after_pull_request Wait for the given pull request id to be merged before continuing with the current one.
/bypass_author_approval Bypass the pull request author's approval
/bypass_build_status Bypass the build and test status
/bypass_commit_size Bypass the check on the size of the changeset TBA
/bypass_incompatible_branch Bypass the check on the source branch prefix
/bypass_jira_check Bypass the Jira issue check
/bypass_peer_approval Bypass the pull request peers' approval
/bypass_leader_approval Bypass the pull request leaders' approval
/approve Instruct Bert-E that the author has approved the pull request. ✍️
/create_pull_requests Allow the creation of integration pull requests.
/create_integration_branches Allow the creation of integration branches.
/no_octopus Prevent Wall-E from doing any octopus merge and use multiple consecutive merge instead
/unanimity Change review acceptance criteria from one reviewer at least to all reviewers
/wait Instruct Bert-E not to run until further notice.
Available commands
name description privileged
/help Print Bert-E's manual in the pull request.
/status Print Bert-E's current status in the pull request TBA
/clear Remove all comments from Bert-E from the history TBA
/retry Re-start a fresh build TBA
/build Re-start a fresh build TBA
/force_reset Delete integration branches & pull requests, and restart merge process from the beginning.
/reset Try to remove integration branches unless there are commits on them which do not appear on the source branch.

Status report is not available.

@SylvainSenechal SylvainSenechal force-pushed the improvement/ZENKO-5203 branch 7 times, most recently from 50ff85e to b441c6e Compare March 4, 2026 09:49
@SylvainSenechal SylvainSenechal force-pushed the improvement/ZENKO-5203 branch 2 times, most recently from b0358e1 to e39e7a2 Compare March 11, 2026 15:10
"build": "tsc --build tsconfig.json",
"lint": "eslint ."
},
"resolutions": {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

related to issues newer azure libraries : https://scality.atlassian.net/browse/ZENKO-5213

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was an infinite loop, although catched by global test timeout, still nicer to properly control while loops

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are all small syntax changes because I did a huge bump of the kubernetes client from version 0.x to 1.4.0

* new `putBucketNotificationConfiguration` to the connector before the
* test proceeds to trigger events.
*/
export async function waitForBucketInConnectorPipeline(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Used in bucket notification : Before, we were doing

  • putBucketNotificationConfiguration
  • sleep for 10sec and pray the oplog has picked up the change and updated kafka connector.

It was risky, because triggering the notification event before kafka connector was setup would imply the notification would never be sent

Now we detect the kafka connector is really setup.

I'm kinda sad though : I thought this would fix bucket notif flaky, but it doesn't, so there are other things to check

assert.strictEqual(this.getResult().err?.includes('AccessDenied') ||
this.getResult().err?.includes('403'), true);
this.getResult().err?.includes('403')||
this.getResult().statusCode === 403, true);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason, I got issues with this part not catching the 403 anymore.
Not fully sure why, but with the bump of the cli testing, the version of the sdk has increased quite a lot, from ~3.900 to ~3.100, maybe something changed I don't know

Given that lifecycle is "resumed" for the "e2e-azure-archive" location
Then object "obj-1" should be "transitioning" and have the storage class "e2e-azure-archive"
And object "obj-2" should be "transitioning" and have the storage class "e2e-azure-archive"
# This test is flaky, and doesn't make much sense as it is :
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@francoisferrand Let's discuss this. I only commented for now, maybe we wanna keep it ?

But looking at the test it doesnt make much sense (cf the comment I added in the code).
But at the same time I feel like I'm missing something, there must be a reason why this test was created

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget to resolve this before merging 😉

cleanupAccount,
} from './utils';

import 'cli-testing/hooks/KeycloakSetup';
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have BeforeAll in Cli-testing, no import => they are not run
Imo should be in this repo but its a story for another day

POD_NAME="${ZENKO_NAME}-ctst-tests"
CTST_VERSION=$(sed 's/.*"cli-testing": ".*#\(.*\)".*/\1/;t;d' ../../../tests/ctst/package.json)

# Configure keycloak
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seed keycloak is now done in a before all from Cli-testing

import { safeJsonParse } from './utils';
import assert from 'assert';
import { Admin, Kafka } from 'kafkajs';
import { Admin } from '@platformatic/kafka';
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched the kafka library
Before we were using kafkajs, and node-rdkafka.

These internally relied on librdkafka, a C library. Because of that, we needed to setup quite a few extra stuff to make them work : node-gyp (which was throwing weird python errors when doing yarn install in the Codespace), and also need a bunch of things to be installed in the dockerfile (you'll see later in the pr, librd-XXX).
Now this library is pure js, it removes some dependency, and also it will be compatible with Bun if we wanna use it later

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

notes:

  • such comments would be better in the PR comment, so reviewers know before even reading the code
  • would be good to keep such change in a separate PR (even if it means stacking multiple PRs) - not saying to change it now (too late), but as a goal for next changes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the kafka change!

const tarName = await isObjectRehydrated(this, objectName);
assert(tarName);
await AzureHelper.sendBlobCreatedEventToQueue(
const sent = await AzureHelper.sendBlobCreatedEventToQueue(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lost crazy amount of time on this :
sendBlobCreatedEventToQueue was erroring, the error wasn't thrown but instead the function return a boolean that was never checked -_-

@SylvainSenechal SylvainSenechal force-pushed the improvement/ZENKO-5203 branch from e39e7a2 to dedf6c3 Compare March 12, 2026 13:05
@SylvainSenechal SylvainSenechal marked this pull request as ready for review March 12, 2026 13:08
@SylvainSenechal SylvainSenechal requested review from a team, benzekrimaha, francoisferrand and maeldonn and removed request for benzekrimaha March 12, 2026 13:08
Comment on lines +116 to +117
const low = earliestResult[0]?.partitions.find(p => p.partitionIndex === i)?.offset ?? BigInt(0);
const high = latestResult[0]?.partitions.find(p => p.partitionIndex === i)?.offset ?? BigInt(0);
Copy link
Contributor

@francoisferrand francoisferrand Mar 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

partitions are lists, right? So searching is not efficient...

  • are partitions "ordered" predictably?
  • do earliest and latest have the same order?
  • ...could we not just do a "zip" merge, something like:
for (let i = 0; i < partitionCount; i++) {
   partitions.push({ low: earliestResult[0]?.partitions[I], high: latestResult[0]?.partitions[I] });

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't have guarantee about the order, but the list should be small enough that in practice this find is not too impactfull

@SylvainSenechal SylvainSenechal force-pushed the improvement/ZENKO-5203 branch from dedf6c3 to 4a27e41 Compare March 13, 2026 10:15
@bert-e
Copy link
Contributor

bert-e commented Mar 13, 2026

Waiting for approval

The following approvals are needed before I can proceed with the merge:

  • the author

  • 2 peers

@scality scality deleted a comment from bert-e Mar 13, 2026
@SylvainSenechal SylvainSenechal force-pushed the improvement/ZENKO-5203 branch from 4a27e41 to 0c6fedd Compare March 13, 2026 13:24
Copy link
Contributor

@maeldonn maeldonn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

FROM ghcr.io/scality/zenko-drctl:$DRCTL_TAG AS drctl

FROM node:22.19.0-bookworm-slim
FROM node:24-bookworm-slim
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we pin the version like before ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah gonna use 24.14

Given that lifecycle is "resumed" for the "e2e-azure-archive" location
Then object "obj-1" should be "transitioning" and have the storage class "e2e-azure-archive"
And object "obj-2" should be "transitioning" and have the storage class "e2e-azure-archive"
# This test is flaky, and doesn't make much sense as it is :
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget to resolve this before merging 😉

// S3 is correct.
assert.strictEqual(this.getResult().err?.includes('AccessDenied') ||
this.getResult().err?.includes('403'), true);
this.getResult().err?.includes('403')||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need to keep this condition ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not 100% sure anymore tbh, might have been something that failed while i was working, but probably only need one of the two now.
But a/b testing by launching the ci again with and without i too long to just fix this

@SylvainSenechal SylvainSenechal force-pushed the improvement/ZENKO-5203 branch from 0c6fedd to 879ebee Compare March 13, 2026 15:46
@francoisferrand
Copy link
Contributor

/after_pull_request=2357

@bert-e
Copy link
Contributor

bert-e commented Mar 18, 2026

Waiting for other pull request(s)

The current pull request is locked by the after_pull_request option.

In order for me to merge this pull request, run the following actions first:

➡️ Merge the OPEN pull request:

Alternatively, delete all the after_pull_request comments from this pull request.

The following options are set: after_pull_request

…he DLQ was ~250 in the tests, stopping sorbetclt from restoring objects

Issue: ZENKO-5203
@SylvainSenechal SylvainSenechal force-pushed the improvement/ZENKO-5203 branch from 879ebee to 5ab91e4 Compare March 18, 2026 14:15
@SylvainSenechal SylvainSenechal force-pushed the improvement/ZENKO-5203 branch from 5ab91e4 to 930707a Compare March 18, 2026 14:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants