The networking to the cluster seems slightly flaky, so I noticed
a few failures when playing with it. A simple retry fixes it.
The function was taken from deploy.sh. I considered de-duping it,
but deploy.sh runs in a context where
/usr/libexec/tests/osbuild-composer/shared_lib.sh is not yet
established, so it's unfortunately not so simple. :(
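For illustration, a retry helper of this kind could look roughly like the sketch below. This is not the function copied from deploy.sh: the name `retry`, the `RETRY_ATTEMPTS` and `RETRY_DELAY` variables, and their defaults are assumptions made for the example.

```shell
# Hypothetical retry helper. The name, the attempt limit and the delay
# are illustrative and not taken from deploy.sh.
retry() {
    local max_attempts="${RETRY_ATTEMPTS:-5}"
    local delay="${RETRY_DELAY:-1}"
    local attempt=1

    # Re-run the given command until it succeeds or the limit is reached.
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "Command failed after ${attempt} attempts: $*" >&2
            return 1
        fi
        echo "Attempt ${attempt} failed, retrying in ${delay}s: $*" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}
```

A flaky call against the cluster would then be wrapped as, for example, `retry oc get nodes`.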
The old cluster is going to be decommissioned. I only changed:
- extracted the storage class to a variable
- adjusted the openshift yaml file to what I was given in the UI
- most importantly, we now use an instancetype to specify the
resource requirements instead of doing it manually
- on this cluster the network is called default instead of nic0
- we are downloading the oc and virtctl clients from the new cluster
so the versions match
In many files there was a secondary call to `trap` for the sole purpose
of killing journalctl (watching worker logs) so that GitLab CI doesn't
hang.
The issue with this is that it sometimes cleared the trap which invokes
the cleanup() function without reinstating it again (not everywhere).
Instead of doing this back-and-forth just make sure we don't leave any
journalctl processes dangling in the background!
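A minimal sketch of the intended pattern, assuming a script that already has a cleanup() function: the journal watcher is stopped inside cleanup() itself, so a single EXIT trap is registered once and never replaced. Only `WORKER_JOURNAL_PID` and `cleanup()` come from the scripts; the rest is illustrative.

```shell
# Set wherever the watcher is started, e.g. after something like:
#   journalctl -f -u <worker unit> &
#   WORKER_JOURNAL_PID=$!
WORKER_JOURNAL_PID=""

cleanup() {
    # Stop the background log watcher first so the CI job cannot hang on it.
    if [ -n "${WORKER_JOURNAL_PID:-}" ]; then
        kill "$WORKER_JOURNAL_PID" 2>/dev/null || true
        wait "$WORKER_JOURNAL_PID" 2>/dev/null || true
        WORKER_JOURNAL_PID=""
    fi
    # ... the script's remaining cleanup steps go here ...
}

# Registered once; no later `trap` call overrides it.
trap cleanup EXIT
```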
NOTES:
- For some scripts, mainly the ostree- ones, there was no cleanup trap
present, but instead `trap` was configured inside the build_image() function.
The trouble is that this function is executed multiple times and
$WORKER_JOURNAL_PID changes value between these multiple executions.
That's why these scripts introduce the cleanup_on_exit() function where
we make sure to kill any possible dangling journalctl processes.
- The name `cleanup_on_exit()` was chosen because these same scripts
often have a helper function named clean_up() which is sometimes used to remove
virtual machines and other artifacts between calls of build_image().
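One way to handle the changing $WORKER_JOURNAL_PID is sketched below: every watcher PID is remembered, and cleanup_on_exit() kills all of them. This is an assumption about how it could be done, not necessarily what the scripts do; `JOURNAL_PIDS` and `start_worker_journal` are hypothetical names, and `sleep` stands in for the real journalctl call.

```shell
# Space-separated list of every watcher PID started so far (hypothetical).
JOURNAL_PIDS=""

# Would be called from build_image(); runs once per image build.
start_worker_journal() {
    # Stand-in for the real journalctl process that follows the worker logs.
    sleep 300 >/dev/null 2>&1 &
    WORKER_JOURNAL_PID=$!
    JOURNAL_PIDS="$JOURNAL_PIDS $WORKER_JOURNAL_PID"
}

cleanup_on_exit() {
    local pid
    # Kill every watcher, not just the one from the latest build_image() run.
    for pid in $JOURNAL_PIDS; do
        kill "$pid" 2>/dev/null || true
        wait "$pid" 2>/dev/null || true
    done
    JOURNAL_PIDS=""
}

# Set once at the top level instead of inside build_image().
trap cleanup_on_exit EXIT
```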
Killing the worker journal via a trap on the EXIT signal prevents the
cleanup() function from executing!
NOTE: this is a problem in other scripts as well and needs to be
refactored there too!
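The underlying shell behaviour can be reproduced with a small sketch: a second `trap ... EXIT` replaces the first handler instead of adding to it. The function name `demo_trap_override` and the echoed messages are made up for the example.

```shell
cleanup() {
    echo "cleanup ran"
}

demo_trap_override() {
    # Run in a subshell so the traps do not leak into the calling script.
    (
        trap cleanup EXIT
        # This replaces the handler above rather than adding a second one.
        trap 'echo "journal watcher killed"' EXIT
    )
}

# Prints only "journal watcher killed"; cleanup() never runs.
demo_trap_override
```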