When the worker executor starts up, many error messages and warnings
appear in the system logs; worker-initialization.service should not run
there at all. The service crashes, which is functionally harmless, but
it clutters the log, raises questions, and can be avoided simply by not
running it.
The `cloud-init.target` in 9.6 has `After=multi-user.target` in its unit
config. The worker initialization service was set to run before
`multi-user.target`, but after `cloud-final.service`. This created an
impossible ordering cycle, and systemd reacted by simply disabling the
initialization service.
So this changes the ordering
`multi-user.target -> worker-*.service -> cloud-final.service -> multi-user.target`
to
`cloud-init.target -> worker-*.service -> cloud-final.service -> multi-user.target`,
resolving the loop.
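A minimal sketch of the corrected ordering as a systemd drop-in; the
drop-in path is illustrative, and the directives mirror the new chain
above:

```sh
# Order the initialization service after cloud-final.service and before
# cloud-init.target instead of multi-user.target, breaking the cycle.
mkdir -p /etc/systemd/system/worker-initialization.service.d
cat > /etc/systemd/system/worker-initialization.service.d/10-ordering.conf <<'EOF'
[Unit]
After=cloud-final.service
Before=cloud-init.target
EOF
systemctl daemon-reload
```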
Ansible on Fedora 40 seems broken: the default Python 3.12 interpreter
doesn't work, and while 3.10 does, the dnf module then breaks.
Use Python 3.10 and stop using the dnf module.
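A hedged sketch of the resulting invocation (the playbook path is
illustrative); inside the playbook, package installs then shell out to
dnf, e.g. via the command module, rather than using the broken dnf
module:

```sh
# Pin the interpreter that still works on Fedora 40.
ansible-playbook -e ansible_python_interpreter=/usr/bin/python3.10 playbook.yml
```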
With the rpmcopy or rpmrepo_osbuild tags, the `Install worker rpm` stage
got skipped on RHEL and in CI. Invert the tag logic and use `--tags`
instead of `--skip-tags`.
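Roughly, the stage selection becomes opt-in (tag names from above, the
playbook path is illustrative):

```sh
# Run only the tagged rpm-installation path instead of excluding tags.
ansible-playbook --tags rpmrepo_osbuild playbook.yml
```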
This reverts commit 484c82ce55.
The AWS SDK fails to get the instance identity document when the proxy
is configured. The proxy therefore needs to be configured explicitly for
the depsolve job and the osbuild (sources) job.
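One way this can be expressed, assuming the jobs pick up the standard
proxy environment variables (the proxy URL is illustrative); the
instance identity document comes from the instance metadata service,
which must stay reachable without the proxy:

```sh
# Proxy only the jobs that need it; keep IMDS traffic direct.
export HTTPS_PROXY=http://proxy.example.com:3128
export NO_PROXY=169.254.169.254
```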
If /tmp/cloud_init_vars contained the OSBUILD_EXECUTOR_CLOUDWATCH_GROUP
variable, the worker configuration file would end up with an escaped
newline character at the end of the value configuring
`cloudwatch_group` for the `osbuild_executor`. This made the worker
fail to start when loading the configuration.
Remove the newline from the value appended to the worker config by the
initialization script.
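A minimal sketch of the fix in the initialization script (the variable
name is from above; the exact stripping approach is illustrative):

```sh
# Drop any trailing newline before the value is appended to the config.
OSBUILD_EXECUTOR_CLOUDWATCH_GROUP="$(printf '%s' "$OSBUILD_EXECUTOR_CLOUDWATCH_GROUP" | tr -d '\n')"
```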
Fixes #4001
Signed-off-by: Tomáš Hozza <thozza@redhat.com>
Set the 'cloudwatch_group' value in the worker configuration if provided
in /tmp/cloud_init_vars, so that it is used by the worker when spinning
up an osbuild-executor instance.
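A hedged sketch of the conditional append, assuming the default worker
config path and the `osbuild_executor`/`cloudwatch_group` naming above:

```sh
# Only configure the CloudWatch group when cloud-init provided one.
if [ -n "${OSBUILD_EXECUTOR_CLOUDWATCH_GROUP:-}" ]; then
    cat >> /etc/osbuild-worker/osbuild-worker.toml <<EOF
[osbuild_executor]
cloudwatch_group = "$OSBUILD_EXECUTOR_CLOUDWATCH_GROUP"
EOF
fi
```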
Signed-off-by: Tomáš Hozza <thozza@redhat.com>
The /tmp/cloud_init_vars file is not created on the worker executor, so
sourcing it makes the script fail. Comment the line out until we change
the worker implementation to inject this file into the worker executor
using cloud-init.
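In the script this amounts to the following (the guarded variant is an
alternative sketch that would also tolerate the missing file):

```sh
# Disabled until cloud-init injects the file on the worker executor:
# . /tmp/cloud_init_vars

# Alternative: only source the file when it exists.
[ -f /tmp/cloud_init_vars ] && . /tmp/cloud_init_vars
```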
Signed-off-by: Tomáš Hozza <thozza@redhat.com>
We need to use a custom IAM policy name for the osbuild-executor on
Fedora workers (it differs between prod and stage). The same requirement
applies to the CloudWatch log group used by the osbuild-executor.
Modify the Ansible playbook used by Packer to take the values from
/tmp/cloud_init_vars when set, defaulting to the current values
otherwise.
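The commit implements the fallback in the Ansible playbook; expressed as
shell parameter expansion, the pattern is roughly this (the IAM variable
name and both defaults are hypothetical):

```sh
# Use cloud-init-provided values when present, otherwise keep defaults.
IAM_PROFILE="${WORKER_IAM_PROFILE:-osbuild-executor}"                       # hypothetical
CLOUDWATCH_GROUP="${OSBUILD_EXECUTOR_CLOUDWATCH_GROUP:-osbuild-executor}"   # hypothetical default
```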
Signed-off-by: Tomáš Hozza <thozza@redhat.com>
The builder uses `/run/osbuild` as the default path for this argument.
Yet this directory doesn't exist when the builder writes the manifest,
and osbuild should own this directory, not the builder.
Furthermore, `/run` is a tmpfs, so the executor might run into memory
issues if we used `/run` as the store and output directory (on the
"host" workers these live in `/var/cache`).
While `/tmp` might seem like a good candidate on RHEL, it is a tmpfs on
Fedora, so it should also be avoided.
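A quick way to verify the tmpfs claims on a given system:

```sh
findmnt -n -o FSTYPE /run   # tmpfs on both RHEL and Fedora
findmnt -n -o FSTYPE /tmp   # tmpfs on Fedora, a regular filesystem on RHEL
```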
Don't allow unbound variables, but default to empty for the variables
that are only used to decide whether a given part of the setup should
run.
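A minimal sketch of the pattern (the variable name is illustrative):

```sh
set -euo pipefail
# Gating variables get an explicit empty default so `set -u` doesn't
# abort; everything else still fails fast when unset.
SETUP_OPTIONAL_PART="${SETUP_OPTIONAL_PART:-}"
if [ -n "$SETUP_OPTIONAL_PART" ]; then
    echo "running optional setup part"
fi
```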
The `authselect-compat-1.2.5-2.el9_1` package is currently missing from
the AWS RHUI el9 AppStream repositories, which makes `dnf upgrade` fail
on RHEL-9.1. This is a RHUI-specific issue, since the package is
available in the CDN repos.
To work around the issue for now, `authselect-compat` needs to be
removed as part of the upgrade for it to succeed. Use `--allowerasing`
instead of just removing the package, because this ensures that
`authselect-compat` will be upgraded just fine once the issue is
resolved.
Fix the issue in the CI script that builds the image using Packer, as
well as in the Ansible playbook used by Packer to build the image.
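The workaround boils down to a single flag on the upgrade:

```sh
# Let dnf erase the conflicting authselect-compat if needed; once RHUI
# carries the package again, the same command upgrades it normally.
dnf upgrade -y --allowerasing
```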
Signed-off-by: Tomáš Hozza <thozza@redhat.com>