Commit graph

258 commits

Author SHA1 Message Date
Sanne Raymaekers
cedc351bbd templates/packer: fix installing rpms from copr
There are now 2 colons present, one separating the epoch and the
version, and one before the comment.
2025-06-20 21:57:04 +02:00
Tomáš Hozza
1fc5e2ad18 Packer: use latest RHEL-9 GA Cloud Access images for workers
Update the RHEL-9 Cloud Access images used for our workers from 9.0 to
9.6, which is the latest GA. We do upgrade all packages in our Ansible
playbook, but that is just a waste of resources if we can use the latest
GA images.

Signed-off-by: Tomáš Hozza <thozza@redhat.com>
2025-05-30 15:28:37 +02:00
Tomáš Hozza
73ceb94b51 Packer: update Fedora images to F42 and remove workarounds
Update Fedora workers from EOL F40 to F42.

Remove workarounds that should not be needed any more (i.e. the Packer
upstream issue has been closed).

Signed-off-by: Tomáš Hozza <thozza@redhat.com>
2025-05-30 15:28:37 +02:00
Sanne Raymaekers
c3cb3ba785 templates/packer: set wanted-by to cloud-init.target
The `cloud-init.target` in 9.6 has `After=multi-user.target` in its unit
config. The worker initialization service was set to run before
`multi-user.target`, but after `cloud-final.service`. This created an
impossible situation, and systemd just disabled the initialization
service.

So this changes:
`multi-user.target -> worker-*.service -> cloud-final.service -> multi-user.target`
to
`cloud-init.target -> worker-*.service -> cloud-final.service -> multi-user.target`.

Thus resolving the loop.
2025-05-14 21:01:39 +02:00
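The loop described in c3cb3ba785 can be illustrated with a small cycle check. This is only a sketch: it treats the two chains exactly as they are written in the commit message, as directed edges, and does not reproduce how systemd itself resolves ordering. The helper names `has_cycle` and `chain` are made up for this illustration.

```python
from collections import defaultdict


def has_cycle(edges):
    """Return True if the directed graph given as (src, dst) pairs contains a cycle."""
    graph = defaultdict(list)
    for src, dst in edges:
        graph[src].append(dst)

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    state = defaultdict(int)

    def visit(node):
        state[node] = GREY
        for nxt in graph[node]:
            if state[nxt] == GREY:  # back edge: nxt is already on the current path
                return True
            if state[nxt] == WHITE and visit(nxt):
                return True
        state[node] = BLACK
        return False

    return any(state[n] == WHITE and visit(n) for n in list(graph))


def chain(*units):
    """Turn a chain a -> b -> c into the edge list [(a, b), (b, c)]."""
    return list(zip(units, units[1:]))


# The two chains quoted in the commit message, taken as written.
before = chain("multi-user.target", "worker-*.service",
               "cloud-final.service", "multi-user.target")
after = chain("cloud-init.target", "worker-*.service",
              "cloud-final.service", "multi-user.target")

print(has_cycle(before))  # True
print(has_cycle(after))   # False
```

The first chain starts and ends at `multi-user.target`, so the check reports a cycle; the second chain starts at `cloud-init.target` and has none.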
Florian Schüller
30198922a5 templates/dashboards: increase timespan readability
Also introduces "50min." as we use this now, and
shortens some titles to show which charts are affected by
the `target_duration`.
2025-03-05 10:27:54 +01:00
Sanne Raymaekers
7f2766793d templates/composer: bump rhel-9 distro alias
RHEL 9.5 is now GA.
2024-11-26 11:25:33 +01:00
Florian Schüller
1f8da3bd83 templates/dashboards/grafana: enable shared crosshair
A shared crosshair makes it easier to see the affected time
in other charts/panels.
2024-11-06 11:55:22 +01:00
Florian Schüller
3ff8308389 templates/dashboards/grafana: introduce number of pending jobs 2024-11-06 11:55:22 +01:00
Achilleas Koutsou
5d6a4b762b templates: enable ignore_missing_repos in openshift 2024-11-05 08:21:42 +01:00
Sanne Raymaekers
f6e82ba403 templates/packer: fix fedora 40 aarch64 base image
The old one disappeared.
2024-10-29 17:42:10 +01:00
Sanne Raymaekers
0ef0ae7c97 templates/packer: allow setting executor type in worker config
Currently the worker images always have to use aws.ec2; this way we can
use the host executor for fedora.
2024-10-28 10:51:34 +01:00
Ondřej Budai
1b169a150c packer: don't deregister old AMIs
Imagine this scenario: the packer job is run, an AMI gets created.
We configure our deployment to use this AMI. Then, someone retries the
packer job. Since we have force_deregister=true, this will not only
create a new AMI, but also remove the old one (because it has the same
name). Thus, our deployment will get broken, because the source AMI
no longer exists. This means that the ASG cannot replace any broken
instances, and the secure instance feature gets absolutely broken
because it cannot spawn new secure instances (they "inherit" the AMI
ID from their parents).

Let's remove force_deregister=true, so the AMI never gets replaced.
This might cause some pipelines to start failing because they are
rerunning the packer job for the same commit (the GA pipeline currently).
Let's fix those then; rerunning the packer job is just confusing.

If this causes some unexpected issues, we can always resort to using
unique AMI names (by appending a timestamp to their name), but having
multiple AMIs with different names but the same tags will cause our
terraform configuration to be reapplied every time there's a rerun,
which is also not great.
2024-10-21 11:48:02 +02:00
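The fallback mentioned in 1b169a150c, unique AMI names built by appending a timestamp, could look roughly like the sketch below. The function `unique_ami_name`, the base name `worker-ami`, and the timestamp format are assumptions for illustration; none of them come from the repository.

```python
from datetime import datetime, timezone


def unique_ami_name(base_name, now=None):
    """Append a UTC timestamp to base_name so a rerun never reuses an AMI name."""
    now = now or datetime.now(timezone.utc)
    return f"{base_name}-{now.strftime('%Y%m%d%H%M%S')}"


# Two runs of the same (hypothetical) job produce two distinct names.
first = unique_ami_name("worker-ami", datetime(2024, 10, 21, 9, 48, 2, tzinfo=timezone.utc))
retry = unique_ami_name("worker-ami", datetime(2024, 10, 21, 10, 15, 0, tzinfo=timezone.utc))
print(first)  # worker-ami-20241021094802
print(retry)  # worker-ami-20241021101500
```

As the commit message notes, distinct names with identical tags would still make the terraform configuration reapply on every rerun, which is why this is only described as a last resort.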
Jakub Rusz
a54ac303a3 templates: fix apiVersion
There were errors using the latest oc 4.17 version:

error: failed to read input object (not a Template?): unable to decode
"templates/openshift/composer.yml": no kind "Template" is registered for
version "v1" in scheme "k8s.io/kubectl/pkg/scheme/scheme.go:28"
2024-10-03 16:27:21 +02:00
Florian Schüller
8d24dcfbde osbuild-worker: add CHANNEL to worker logs
aka "the deployment channel" like "staging" or "production"
2024-08-28 16:41:07 +02:00
Florian Schüller
9006836afc logrus: add deployment channel as field to the logs 2024-08-07 12:32:57 +02:00
Florian Schüller
54904d47da Change log_level for the service to json
This is to be in line with image-builder and to
enable decoding in splunk
2024-07-31 17:46:01 +02:00
Tomáš Hozza
c94b6ccde6 Templates: define 'rhel-10' distro alias
Define `rhel-10` distro alias in the OpenShift template. Even though the
same alias is defined in the default configuration, I think that it is
good to also include it in the template to not forget about it in the
future.

Signed-off-by: Tomáš Hozza <thozza@redhat.com>
2024-07-17 11:02:41 +02:00
Sanne Raymaekers
786f44e7e7 templates/dashboards: human readable job duration targets
Also makes the default 40m, which is the new slo target for osbuild
jobs.
2024-07-04 12:46:19 +02:00
Sanne Raymaekers
af73f2eccf templates/packer: make set_executor_hostname executable
Prevents `worker-executor.service: Failed at step EXEC spawning
/usr/local/libexec/worker-initialization-scripts/set_executor_hostname.sh:
Permission denied`.
2024-06-26 10:56:57 +02:00
Sanne Raymaekers
2a621521a8 osbuildexecutor/aws.ec2: set hostname of executor via cloud-init
This way much more of the journal will be captured under the new
hostname.
2024-06-25 10:58:10 +02:00
Sanne Raymaekers
55439fc6d3 templates/dashboards: remove active worker count
It's misleading since it counts the number of workers that have
registered to the current composer pods; it doesn't actually keep track
of the active workers.

Remove it and keep the worker-api stats as a proxy for active workers.
2024-06-12 17:20:01 +02:00
Sanne Raymaekers
7d7bce76c0 templates/packer: use osbuild-worker-executor 2024-06-12 11:36:30 +02:00
Sanne Raymaekers
7e89085808 templates/openshift/composer: remove maintenance cronjob
This is now deployed from a separate template.
2024-06-12 09:42:27 +02:00
Tomáš Hozza
607b65c67f Templates: update RHEL distro aliases
The latest GA releases are 8.10 and 9.4.

Signed-off-by: Tomáš Hozza <thozza@redhat.com>
2024-06-04 13:03:37 +02:00
Sanne Raymaekers
4629a31f22 templates/packer: use python3.10 on fedora
Ansible on fedora 40 seems broken, the default python 3.12 interpreter
doesn't work, 3.10 works but then the dnf module breaks.

Use 3.10 and stop using the dnf module.
2024-05-31 13:55:58 +02:00
Sanne Raymaekers
22e15da73c templates/packer: use import_tasks instead of include_tasks
The tags don't get inherited through the dynamic `include_tasks`
command. Use `import_tasks` to preserve the tags.
2024-05-31 13:55:58 +02:00
Sanne Raymaekers
a96f1b6d31 templates/packer: switch to fedora-40
Fedora 38 is EOL, and packit no longer builds rpms for it.

The current python3.12 + ansible 2.12 combination which is the default
on fedora 40 doesn't work, so switch to python3.9.
2024-05-29 19:36:31 +02:00
Sanne Raymaekers
13aae7d532 templates/packer: invert tag logic
With the rpmcopy or rpmrepo_osbuild tags, the `Install worker rpm` stage
got skipped on RHEL and CI. Invert the tag logic and use `--tags`
instead of `--skip-tags`.
2024-05-21 09:40:11 +02:00
Sanne Raymaekers
c886d6c1f5 templates/dashboards: fix community-stage tenant variable
A space is necessary before and after the colon separating the key and
the value.

[skip ci]
2024-05-08 12:59:34 +02:00
Sanne Raymaekers
592308f7af templates/packer/ansible: add task to install rpms from copr
Split the rpmrepo tasks in osbuild and composer. With copr we'll use
osbuild from rpmrepo, because the osbuild copr rpms disappear too
quickly.
2024-05-07 13:57:48 +02:00
Sanne Raymaekers
49566b7ce4 templates/packer: add failure script
In case the service failed, set the instance to unhealthy.
2024-05-02 13:34:47 +02:00
Sanne Raymaekers
a8148f9b34 templates/openshift/maintenance: fix service account 2024-04-30 16:58:00 +02:00
Sanne Raymaekers
7901889d87 templates/openshift/maintenance: PGSSLMODE is a parameter
Parameters need to be declared.
2024-04-30 16:58:00 +02:00
Sanne Raymaekers
a87e3069a1 templates/openshift: make the maintenance template generic
We could deploy this job for both composer and each tenant's workers
that are present in app-intf. Then we can remove the maintenance bits from
the composer template.
2024-04-29 15:04:52 +02:00
Sanne Raymaekers
5a776c5b79 templates/openshift: split worker from composer maintenance 2024-04-25 17:32:21 +02:00
Sanne Raymaekers
3827f710de templates/openshift: move openshift templates to separate folder
Keep a symlink to the old composer template so the current deployment
doesn't break.
2024-04-25 17:32:21 +02:00
Sanne Raymaekers
3df0c3a631 templates/packer: fix proxy config in ldap service account init
The proxy is set to "null" currently.
2024-04-23 22:13:17 +02:00
Sanne Raymaekers
e607f3b629 dashboards/worker-general: bump version 2024-04-22 13:05:39 +02:00
Sanne Raymaekers
f6acb31dd8 dashboards/worker-general: add community-stage tenant 2024-04-22 13:05:39 +02:00
Sanne Raymaekers
2eea99d008 dashboards/worker-general: min intervals and multi tooltip mode 2024-04-22 13:05:39 +02:00
Sanne Raymaekers
10d2e272a4 dashboards/worker-general: add active worker count 2024-04-22 13:05:39 +02:00
Sanne Raymaekers
95ae8ed917 dashboards/worker-general: fix tenant query 2024-04-22 13:05:39 +02:00
Sanne Raymaekers
ac9f4a2c81 dashboards/worker-general: update schema 2024-04-22 13:05:39 +02:00
Sanne Raymaekers
b8d97b7b68 templates/composer: worker heartbeat timeout of 5m
The default timeout of 1 hour is fine for on-prem, but in the service it
makes workers seemingly stick around for way too long.
2024-04-19 19:56:25 +02:00
Sanne Raymaekers
677e30cc68 templates/packer: add proxy 2024-04-17 16:17:57 +02:00
Sanne Raymaekers
18db445745 Revert "templates/packer: set http(s)_proxy environment variabl…"
This reverts commit 484c82ce55.

The AWS sdk fails to get the instance identity document when the proxy
is configured. The proxy will need to be configured explicitly for the
depsolve job and osbuild (sources) job.
2024-04-17 16:17:57 +02:00
Sanne Raymaekers
484c82ce55 templates/packer: set http(s)_proxy environment variable in unit 2024-04-10 10:03:43 +02:00
Sanne Raymaekers
c8130d0689 templates/packer: support ldap service account for repo mtls conf
The secret needs 3 fields, the cert, key and baseurl for the
repository. The CA is optional.
2024-03-29 20:45:05 +01:00
Sanne Raymaekers
cda94f4f62 templates/packer: don't subscribe executor
All the required sources will be proxied.
2024-03-19 17:07:30 +01:00
Ondřej Budai
e5853c9aa5 Remove rhel-10.0 alias from the openshift template
We now have a proper rhel-10.0 distribution, and this alias is clashing
with it, so we are seeing the following message in production:

failed to configure distro aliases: invalid aliases: ["alias 'rhel-10.0' masks an existing distro"]

Let's fix it by removing the alias; it's obviously not needed anymore.
2024-03-15 15:29:45 +01:00
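The clash reported in e5853c9aa5 comes down to an alias sharing its name with a real distribution. The sketch below shows that check in miniature. It is not the osbuild-composer implementation: `validate_aliases`, the distro set, and the alias targets are hypothetical, and only the error text is taken from the commit message.

```python
def validate_aliases(aliases, distros):
    """Return an error string for every alias whose name matches an existing distro."""
    return [
        f"alias '{alias}' masks an existing distro"
        for alias in aliases
        if alias in distros
    ]


# Hypothetical data: "rhel-10.0" is both a known distro and an alias name.
distros = {"rhel-9.4", "rhel-10.0"}
aliases = {"rhel-9": "rhel-9.4", "rhel-10.0": "rhel-10.0-beta"}

errors = validate_aliases(aliases, distros)
print(errors)  # ["alias 'rhel-10.0' masks an existing distro"]
```

Dropping the `rhel-10.0` entry from the alias map, as the commit does in the template, leaves the list of errors empty.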