dev-infrastructure: rewrite PKO cleanup script in Go (AROSLSRE-789)#5149
Skipping CI for Draft Pull Request.
/test all
The Namespace("").DeleteCollection(...) call for namespaced resources like Package and ObjectSet likely doesn't do what we want: cross-namespace List and Watch work at the cluster-scoped URL, but mutations like deleteCollection appear to require a specific namespace. The original shell script handled this with kubectl delete --all-namespaces --all, which lists first and then deletes per namespace, but that didn't get translated to the Go equivalent. Without this, the namespaced PKO CRs never get deleted.
The finalizer stripping step still works correctly (it uses cross-namespace List then Patch per namespace), so future HCP deletions won't get blocked. The main impact is that orphaned PKO CRs linger in HCP namespaces until those namespaces are deleted, and the PKO CRDs can't be removed because CRs still reference them.
The fix lists across all namespaces first, collects the distinct namespaces containing CRs, then calls DeleteCollection per namespace, matching what kubectl does internally. Also added an IsNotFound guard on CRD deletion to avoid spurious error logs if a CRD disappears between discovery and the delete call.
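The list-then-delete-per-namespace approach described in this comment can be sketched roughly as follows. This is only an illustration, not the code in this PR: object, collectionDeleter, and recordingDeleter are hypothetical stand-ins for the listed custom resources and for a namespaced client, and no client-go types are used so that the sketch stays self-contained.

```go
package main

import (
	"context"
	"fmt"
	"sort"
)

// object is a hypothetical stand-in for a listed custom resource; only the
// namespace matters for this sketch.
type object struct {
	Namespace string
	Name      string
}

// collectionDeleter is a hypothetical interface standing in for a namespaced
// client. It is NOT the client-go API; it only models "delete everything of
// this kind in one namespace".
type collectionDeleter interface {
	DeleteCollection(ctx context.Context, namespace string) error
}

// distinctNamespaces returns the sorted set of non-empty namespaces that
// contain at least one of the listed objects.
func distinctNamespaces(objs []object) []string {
	seen := make(map[string]struct{})
	for _, o := range objs {
		if o.Namespace != "" {
			seen[o.Namespace] = struct{}{}
		}
	}
	out := make([]string, 0, len(seen))
	for ns := range seen {
		out = append(out, ns)
	}
	sort.Strings(out)
	return out
}

// deletePerNamespace issues one DeleteCollection per namespace and returns
// the number of failed calls instead of stopping at the first error.
func deletePerNamespace(ctx context.Context, c collectionDeleter, objs []object) int {
	failures := 0
	for _, ns := range distinctNamespaces(objs) {
		if err := c.DeleteCollection(ctx, ns); err != nil {
			fmt.Printf("delete collection in %q failed: %v\n", ns, err)
			failures++
		}
	}
	return failures
}

// recordingDeleter is a fake that records which namespaces were cleaned up.
type recordingDeleter struct{ calls []string }

func (r *recordingDeleter) DeleteCollection(_ context.Context, ns string) error {
	r.calls = append(r.calls, ns)
	return nil
}

func main() {
	// Result of a (simulated) cross-namespace List.
	objs := []object{
		{Namespace: "hcp-a", Name: "pkg-1"},
		{Namespace: "hcp-b", Name: "pkg-2"},
		{Namespace: "hcp-a", Name: "pkg-3"},
	}
	rec := &recordingDeleter{}
	failures := deletePerNamespace(context.Background(), rec, objs)
	fmt.Println(rec.calls, failures)
}
```

With the fake deleter, each namespace is visited exactly once even though hcp-a holds two objects. In a real program the interface would presumably be backed by a namespaced client for each PKO resource type.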
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: raelga
/test all
06cfe9f to 4457e19
Replace the shell script with a Go program for proper error handling and control flow. The shell version had issues with error propagation that caused EV2 rollout failures.
Log errors but always exit 0 so the cleanup never blocks EV2 rollouts. Error count is tracked and reported in the summary line.
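A minimal sketch of the best-effort pattern this commit message describes: every step runs, failures are logged and counted, and the process still exits 0. The step type, the step names, and the summary wording are assumptions made for illustration, not taken from the PR.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// errSimulated is a placeholder failure used only by this sketch.
var errSimulated = errors.New("simulated failure")

// step is a hypothetical named cleanup action.
type step struct {
	name string
	run  func() error
}

// runBestEffort runs every step, logs each failure to stderr, and returns
// how many steps failed. It never aborts early.
func runBestEffort(steps []step) int {
	failed := 0
	for _, s := range steps {
		if err := s.run(); err != nil {
			fmt.Fprintf(os.Stderr, "step %q failed: %v\n", s.name, err)
			failed++
		}
	}
	return failed
}

func main() {
	steps := []step{
		{"strip finalizers", func() error { return nil }},
		{"delete CRs", func() error { return errSimulated }},
		{"delete CRDs", func() error { return nil }},
	}
	failed := runBestEffort(steps)
	// Summary line with the error count; returning normally from main
	// yields exit status 0 regardless of how many steps failed.
	fmt.Printf("cleanup finished: %d step(s), %d error(s)\n", len(steps), failed)
}
```

Keeping the count as a return value rather than an exit code means the caller can still surface failures in logs or metrics without the rollout treating the cleanup as fatal.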
4457e19 to 25b3441
/test all
@raelga: The following tests failed, say
AROSLSRE-789
What
Rewrite cleanup-pko-resources.sh as a Go program with proper error handling and control flow.
Why
The shell version had issues with error propagation that caused EV2 rollout failures (#5145 reverted the original script). Shell's error handling model (set -o errexit + || true patterns) makes it hard to reason about control flow in complex cleanup logic.
Per Steve's suggestion: "consider writing this in Go — we've surpassed what we should be doing in Shell."
Changes
- kubectl CLI calls
- set -o errexit
- mapfile + grep for CRD discovery
- jsonpath output parsing
- kubectl patch for finalizers
- kubectl delete with --timeout
Behavior
Best-effort: logs errors but always exits 0 so it never blocks EV2 rollouts. The error count is tracked and reported in the summary line. Identical cleanup logic:
- package-operator.run API group
- Idempotent: safe on clusters that never had PKO.
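As a rough illustration of discovering CRDs by API group (the Go counterpart of the shell's mapfile + grep step), here is a sketch that filters a list of CRDs down to the package-operator.run group. The crd type is a hypothetical trimmed-down view, the CRD names in main are examples, and matching subgroups by suffix is an assumption rather than something the PR states.

```go
package main

import (
	"fmt"
	"strings"
)

// pkoGroup is the API group named in the PR description.
const pkoGroup = "package-operator.run"

// crd is a hypothetical, trimmed-down view of a CustomResourceDefinition.
type crd struct {
	Name  string // conventionally "<plural>.<group>"
	Group string
}

// inPKOGroup reports whether group is package-operator.run itself or a
// subgroup of it. Treating subgroups as matches is an assumption.
func inPKOGroup(group string) bool {
	return group == pkoGroup || strings.HasSuffix(group, "."+pkoGroup)
}

// filterPKO returns only the CRDs that belong to the PKO API group.
// An empty or unrelated input yields an empty result, which is what makes
// the cleanup a no-op on clusters that never had PKO installed.
func filterPKO(crds []crd) []crd {
	var out []crd
	for _, c := range crds {
		if inPKOGroup(c.Group) {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	all := []crd{
		{Name: "packages.package-operator.run", Group: "package-operator.run"},
		{Name: "objectsets.package-operator.run", Group: "package-operator.run"},
		{Name: "certificates.cert-manager.io", Group: "cert-manager.io"},
	}
	for _, c := range filterPKO(all) {
		fmt.Println(c.Name)
	}
}
```

Matching on the group field instead of grepping names avoids false positives from unrelated CRDs whose names merely contain the string.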
Testing
Supersedes #5147