test: add retry logic to TLS certificate validation in E2E test #5192

Draft
cssjr wants to merge 2 commits into main from ARO-26761/fix-tls-handshake-timeout

Conversation

cssjr (Collaborator) commented May 8, 2026

Wrap the API TLS certificate check in an Eventually block so the cluster API endpoint has time to become available and serve valid certificates after cluster creation completes. The check retries for up to 5 minutes, polling every 10 seconds.

ARO-26761

What

Introduces retry logic for TLS certificate validation in E2E tests and addresses code review feedback to improve efficiency and error reporting.

Why

The test was failing spuriously because the API endpoint wasn't immediately available with valid certificates after cluster creation completed. The cluster needs time to provision and configure TLS certificates.

Changes

Initial implementation

  • Wrapped TLS certificate validation in a Gomega Eventually block with a 5-minute timeout and 10-second polling
  • Converted hard Expect assertions to retryable error returns
  • Added contextual error messages for better debugging

Code review improvements (addressing Copilot feedback)

  • Moved cluster client creation outside the Eventually block to avoid recreating the client on every poll (performance optimization)
  • Enhanced error messages with API URL and certificate issuer details for better debugging when verification fails
  • Maintained diagnostic logging of certificate issuer for test troubleshooting

Testing

Tested in personal-dev-env. The retry logic works correctly: the test retries for the full 5-minute window. CI will validate against proper environments.

Notes

This is a test-only change with no impact on production code.

Address Copilot code review feedback:
- Move cluster client creation outside Eventually block to avoid recreating on every poll
- Remove per-attempt error logging to prevent CI log spam during retries
- Add contextual error wrapping with API URL and certificate issuer for better debugging

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

openshift-ci Bot commented May 8, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: cssjr
Once this PR has been reviewed and has the lgtm label, please assign miquelsi for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci Bot commented May 8, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

cssjr (Collaborator, Author) commented May 8, 2026

/test all

cssjr added and then removed the ai-assisted label (AI/LLM tool was used to help create this MR) May 8, 2026