Add CRR cascaded tests #2447

Open
SylvainSenechal wants to merge 2 commits into development/2.15 from improvement/ZENKO-5263
Conversation

@SylvainSenechal SylvainSenechal commented Jun 22, 2026

Contributor

Issue: ZENKO-5263

CRR cascaded tests. I didn't add very complex scenarios; let me know if you can come up with other test ideas, but I'm really not sure there are many. I think there is only one happy path to test. For the potentially problematic situations, there is the loop, which I tested; for the rest, I'm not sure we can test them easily (like concurrency on write and stale micro version id).

Won't work in CI, as deps.yaml isn't updated with cloudserver/backbeat.
BUT: "It works on my computer codespace"

bert-e commented Jun 22, 2026

Contributor

Hello sylvainsenechal,

My role is to assist you with the merge of this
pull request. Please type @bert-e help to get information
on this process, or consult the user documentation.

Available options
| name | description | privileged | authored |
| --- | --- | --- | --- |
| /after_pull_request | Wait for the given pull request id to be merged before continuing with the current one. | | |
| /bypass_author_approval | Bypass the pull request author's approval | | |
| /bypass_build_status | Bypass the build and test status | | |
| /bypass_commit_size | Bypass the check on the size of the changeset TBA | | |
| /bypass_incompatible_branch | Bypass the check on the source branch prefix | | |
| /bypass_jira_check | Bypass the Jira issue check | | |
| /bypass_peer_approval | Bypass the pull request peers' approval | | |
| /bypass_leader_approval | Bypass the pull request leaders' approval | | |
| /approve | Instruct Bert-E that the author has approved the pull request. | | ✍️ |
| /create_pull_requests | Allow the creation of integration pull requests. | | |
| /create_integration_branches | Allow the creation of integration branches. | | |
| /no_octopus | Prevent Wall-E from doing any octopus merge and use multiple consecutive merge instead | | |
| /unanimity | Change review acceptance criteria from one reviewer at least to all reviewers | | |
| /wait | Instruct Bert-E not to run until further notice. | | |

Available commands

| name | description | privileged |
| --- | --- | --- |
| /help | Print Bert-E's manual in the pull request. | |
| /status | Print Bert-E's current status in the pull request TBA | |
| /clear | Remove all comments from Bert-E from the history TBA | |
| /retry | Re-start a fresh build TBA | |
| /build | Re-start a fresh build TBA | |
| /force_reset | Delete integration branches & pull requests, and restart merge process from the beginning. | |
| /reset | Try to remove integration branches unless there are commits on them which do not appear on the source branch. | |

Status report is not available.

bert-e commented Jun 22, 2026

Contributor

Incorrect fix version

The Fix Version/s in issue ZENKO-5263 contains:

  • None

Considering where you are trying to merge, I ignored possible hotfix versions and I expected to find:

  • 2.15.2

Please check the Fix Version/s of ZENKO-5263, or the target
branch of this pull request.

Comment thread solution/deps.yaml Outdated
# yq eval 'sortKeys(.)' -i deps.yaml
backbeat:
sourceRegistry: ghcr.io/scality
sourceRegistry: ghcr.io/scality/playground/sylvainsenechal

backbeat and cloudserver point to a personal playground registry (ghcr.io/scality/playground/sylvainsenechal) with dev tags (9.4.101, 9.3.101). Per review criteria, Scality-internal images must resolve to tags published on ghcr.io/scality. These need to be reverted before merge.

Comment thread .github/scripts/end2end/setup-e2e-env.sh Outdated
@SylvainSenechal SylvainSenechal force-pushed the improvement/ZENKO-5263 branch 3 times, most recently from ede39d5 to 56052d7 Compare June 23, 2026 16:07
Comment thread tests/functional/ctst/steps/crrCascade.ts Outdated
@SylvainSenechal SylvainSenechal force-pushed the improvement/ZENKO-5263 branch from 56052d7 to ee17d07 Compare June 23, 2026 19:08
@scality scality deleted a comment from claude Bot Jun 23, 2026
@SylvainSenechal SylvainSenechal force-pushed the improvement/ZENKO-5263 branch 2 times, most recently from 32f34cc to 664e381 Compare June 23, 2026 22:09
CRR_DESTINATION_LOCATION_NAME=${CRR_DESTINATION_LOCATION_NAME} \
CRR_SOURCE_ACCOUNT_NAME=${CRR_SOURCE_ACCOUNT_NAME} \
CRR_DESTINATION_ACCOUNT_NAME=${CRR_DESTINATION_ACCOUNT_NAME} \
CRR_CASCADE_LOCATION_A_NAME=${CRR_CASCADE_LOCATION_A_NAME} \

Contributor Author

I know we talked in the past about avoiding creating too many locations, but 🤔
I don't know, I think it's probably fine?

  • We still don't have that many locations
  • It's probably not a bad idea to create many locations in CI, to see whether it actually creates issues or not

Comment thread solution/deps.yaml Outdated
@SylvainSenechal SylvainSenechal force-pushed the improvement/ZENKO-5263 branch from 664e381 to c3043bf Compare June 23, 2026 22:15
while (Date.now() < deadline) {
Identity.useIdentity(IdentityEnum.ACCOUNT, location);
try {
await this.createS3Client().send(new HeadObjectCommand({ Bucket: bucket, Key: objectName }));

Contributor Author

Kept it light and very functional for this test: it only checks that the object is here, without checking further things like specific object metadata/status. I could add that, but I'm not sure it's super useful for functional tests, although I know we sometimes did it in the past.

Contributor

I think we should test replication-status, but I'm not sure if this can be checked here (there may be race conditions, especially when using cascade, so the status may have an intermediate value...)

Contributor Author

Yeah, there are 2 things to check:

Metadata: { 'crr-location-c-replication-status': 'PENDING' },
ReplicationStatus: 'REPLICA',

ReplicationStatus will always be REPLICA, because it's based on the new isReplica boolean.
But in the metadata, depending on how fast we check, it will be either PENDING or REPLICA, as a cascade transition turns a replica into an object that's pending replication.

I'm adding some checks here, and I will add a step at the end to check all the statuses.
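
A minimal sketch of the check described above, assuming a hypothetical `head` callback that stands in for the `HeadObjectCommand` call so the snippet runs without the AWS SDK. `waitForReplica`, `HeadResult`, and the polling interval are illustrative names, not the actual step code. It polls until the object exists, requires `ReplicationStatus` to be `REPLICA`, and accepts either `PENDING` or `REPLICA` for the per-location metadata key, since a cascade can flip that value between two checks.

```typescript
// Assumed subset of the fields returned by a HEAD request.
interface HeadResult {
    ReplicationStatus?: string;
    Metadata?: Record<string, string>;
}

// Hypothetical helper: `head` stands in for the real HeadObjectCommand call.
// Any failure (object missing, unexpected status) is retried until the deadline.
async function waitForReplica(
    head: () => Promise<HeadResult>,
    metadataKey: string,
    timeoutMs: number,
    intervalMs = 1000,
): Promise<HeadResult> {
    const deadline = Date.now() + timeoutMs;
    let lastError: unknown = new Error('object was never checked');
    while (Date.now() < deadline) {
        try {
            const res = await head();
            if (res.ReplicationStatus !== 'REPLICA') {
                throw new Error(`unexpected ReplicationStatus: ${res.ReplicationStatus}`);
            }
            // The metadata value may legitimately be PENDING or REPLICA.
            const meta = res.Metadata?.[metadataKey];
            if (meta !== undefined && meta !== 'PENDING' && meta !== 'REPLICA') {
                throw new Error(`unexpected ${metadataKey}: ${meta}`);
            }
            return res;
        } catch (err) {
            lastError = err;
            await new Promise<void>(resolve => setTimeout(resolve, intervalMs));
        }
    }
    throw lastError;
}
```

In the real step, `head` would wrap the existing `this.createS3Client().send(new HeadObjectCommand(...))` call.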

Comment thread tests/functional/ctst/steps/crrCascade.ts Outdated
@SylvainSenechal SylvainSenechal force-pushed the improvement/ZENKO-5263 branch 2 times, most recently from d7560c3 to c8df3be Compare June 23, 2026 22:26
@SylvainSenechal SylvainSenechal marked this pull request as ready for review June 23, 2026 22:28
@SylvainSenechal SylvainSenechal requested review from a team, delthas and maeldonn June 23, 2026 22:28
@SylvainSenechal SylvainSenechal force-pushed the improvement/ZENKO-5263 branch from c8df3be to c70930c Compare June 24, 2026 13:39
When an object "cascade-loop-obj" is put in location "crr-location-a"
Then the object should replicate to location "crr-location-b" within 300 seconds
And the object should replicate to location "crr-location-c" within 300 seconds
And the object at location "crr-location-a" should never have replication status PENDING within 30 seconds

Contributor

Should also check the "final" replication status on each location: SUCCESS / REPLICA / REPLICA?

When an object "cascade-loop-obj" is put in location "crr-location-a"
Then the object should replicate to location "crr-location-b" within 300 seconds
And the object should replicate to location "crr-location-c" within 300 seconds
And the object at location "crr-location-a" should never have replication status PENDING within 30 seconds

Contributor

Should we have a more complex scenario, where we write at multiple points in the loop?
We may not be able to control how the operations race to ensure a specific scenario, but it may be worth adding to ensure the system is "eventually" consistent and stable?

Contributor Author

I came up with a more complex scenario which I think is decent:

  • Loop config
  • Concurrent writes on all locations, each writing a specific random metadata marker
    => Assert that they all converge on the same metadataMarker

Tbh, it's still not the perfect test, but I think it's worth doing.
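
The convergence assertion could be sketched as follows, assuming hypothetical `MarkerReader` callbacks that return the metadata marker currently visible at each location (in the real test these would be HEAD requests). `waitForConvergence`, `newMarker`, and the marker format are illustrative. It polls until every location reports the same marker or the deadline passes, and it does not care which writer wins.

```typescript
// One reader per location; each returns the marker it currently sees.
type MarkerReader = () => Promise<string | undefined>;

// Random marker a writer would attach as metadata (format is an assumption).
function newMarker(): string {
    return `marker-${Math.random().toString(36).slice(2, 10)}`;
}

// Polls all locations until they agree on one marker, then returns it.
async function waitForConvergence(
    readers: Record<string, MarkerReader>,
    timeoutMs: number,
    intervalMs = 1000,
): Promise<string> {
    const deadline = Date.now() + timeoutMs;
    const seen: Record<string, string | undefined> = {};
    const locations = Object.keys(readers);
    while (Date.now() < deadline) {
        const markers = await Promise.all(locations.map(loc => readers[loc]()));
        locations.forEach((loc, i) => { seen[loc] = markers[i]; });
        const first = markers[0];
        if (first !== undefined && markers.every(m => m === first)) {
            return first;
        }
        await new Promise<void>(resolve => setTimeout(resolve, intervalMs));
    }
    // Report the last observed state to make a failure easier to debug.
    throw new Error(`locations did not converge: ${JSON.stringify(seen)}`);
}
```

Returning the winning marker lets the caller also assert that it is one of the markers that was actually written.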

And replication is configured from location "crr-location-a" to "crr-location-b"
When an object "cascade-obj" is put in location "crr-location-a"
Then the object should replicate to location "crr-location-b" within 300 seconds
And the object should replicate to location "crr-location-c" within 300 seconds

Contributor

Should also check the "final" replication status on each location: SUCCESS / REPLICA / REPLICA?

Contributor Author

Yes, I'm gonna add a step at the end to check all the statuses.
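
Such a final step might look like the sketch below. The expected values are passed in by the caller; `SUCCESS` on the source and `REPLICA` on each downstream location come from the reviewer's suggestion and are an assumption here, as are the names `checkFinalStatuses` and `StatusReader`. The real step would read each status through a HEAD request per location.

```typescript
// Hypothetical reader: returns the replication status seen at a location.
type StatusReader = (location: string) => Promise<string | undefined>;

// Compares the status at every location against the expected value.
// Returns a list of readable mismatches; an empty list means all matched.
async function checkFinalStatuses(
    expected: Record<string, string>,
    readStatus: StatusReader,
): Promise<string[]> {
    const mismatches: string[] = [];
    for (const location of Object.keys(expected)) {
        const actual = await readStatus(location);
        if (actual !== expected[location]) {
            mismatches.push(`${location}: expected ${expected[location]}, got ${actual}`);
        }
    }
    return mismatches;
}
```

Collecting all mismatches before failing reports every wrong location in a single run instead of stopping at the first one.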

bucketMatch: false
repoId: []
- name: "${CRR_LOCATION_A_NAME}"
locationType: "location-scality-crr-v1"

Contributor

nit: do we really need multiple locations to test the feature? could we not test with a single location and multiple buckets, yet get a similar result & good confidence?

Contributor Author

I think I discussed this with Maël too, and I was thinking the same thing. Technically it should be possible, since the destination bucket is not bound to the location, but in the end I think I prefer working with multiple locations, as it forces us to build more realistic test scenarios.


Then(
'the object at location {string} should never have replication status PENDING within {int} seconds',
{ timeout: 120_000 },

Contributor

Kind of redundant: the deadline/timeout is implemented in the loop itself?

Contributor Author

Yeah, but I prefer to keep it: it could avoid ending up with a 5h CI run in case of a bug.
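
To illustrate the two layers discussed here, this sketch keeps the deadline inside the loop and adds an outer cap in the spirit of the step-level `{ timeout: 120_000 }`. `assertNeverStatus` and `withTimeout` are hypothetical helpers, and `readStatus` stands in for the HEAD request. In the actual step the outer cap comes from the step's timeout option rather than from `Promise.race`.

```typescript
// Inner layer: fails as soon as the forbidden status is observed,
// and returns normally once the observation window has elapsed.
async function assertNeverStatus(
    readStatus: () => Promise<string | undefined>,
    forbidden: string,
    windowMs: number,
    intervalMs = 1000,
): Promise<void> {
    const deadline = Date.now() + windowMs;
    while (Date.now() < deadline) {
        const status = await readStatus();
        if (status === forbidden) {
            throw new Error(`observed forbidden replication status ${forbidden}`);
        }
        await new Promise<void>(resolve => setTimeout(resolve, intervalMs));
    }
}

// Outer layer: safety net that rejects if `work` has not settled after `capMs`,
// for example if a bug makes the inner loop hang on a request.
function withTimeout<T>(work: Promise<T>, capMs: number): Promise<T> {
    let timer: ReturnType<typeof setTimeout> | undefined;
    const cap = new Promise<never>((_, reject) => {
        timer = setTimeout(() => reject(new Error(`timed out after ${capMs} ms`)), capMs);
    });
    return Promise.race([work, cap]).then(
        value => { clearTimeout(timer); return value; },
        err => { clearTimeout(timer); throw err; },
    );
}
```

A call such as `withTimeout(assertNeverStatus(readStatus, 'PENDING', 30_000), 120_000)` mirrors the 30-second window and 120-second step timeout quoted above.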

@SylvainSenechal SylvainSenechal force-pushed the improvement/ZENKO-5263 branch 2 times, most recently from 0f6db0b to b3de8ff Compare June 26, 2026 14:03
Comment thread tests/functional/ctst/features/replication/crrCascade.feature Outdated
Comment thread tests/functional/ctst/steps/crrCascade.ts Outdated

Contributor Author

Realising here that I have 3 scenarios, and each new scenario pretty much covers the case of the previous one, so in theory we could just run the last one to test the feature. I think it's still good to have a few scenarios.

@SylvainSenechal SylvainSenechal force-pushed the improvement/ZENKO-5263 branch from b3de8ff to 900f7de Compare June 26, 2026 14:15
@PreMerge
@ReplicationTest
@CRRCascade
@haha

@haha is a leftover debug tag (renamed from @hihi but still present). Remove before merge.

Suggested change
@haha

@francoisferrand francoisferrand removed their request for review June 26, 2026 17:00
3 participants