Commit graph

187 commits

Author SHA1 Message Date
Jakub Rusz
db0e6c9643 Packer: change fedora-38 aarch64 ami
This ami is currently broken, switch to a slightly older one.
2024-01-31 10:11:50 +01:00
Sanne Raymaekers
e289b763e7 templates/packer: deal with unbound variables
Don't allow unbound variables, but for the variables that are used to
determine whether or not that part of the setup should continue, default
to empty/undefined.
2024-01-30 21:41:31 +01:00
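
The policy described in e289b763e7 (fail on unset variables, except for the variables that gate a setup step, which default to empty) can be sketched as follows. This is a Python analogy for illustration only: the templates themselves are shell scripts, where the same idea is commonly written with `set -u` plus `${VAR:-}`, and whether the commit uses exactly that form is an assumption. The function and parameter names are hypothetical.

```python
def configure_step(env, gate_var, required_vars):
    """Return the settings for one setup step, or None if the step is skipped.

    `env` is any mapping of variable names to values (for example os.environ).
    All names here are hypothetical and not taken from the repository.
    """
    # Gate variable: explicitly default to empty, so an unset gate means
    # "skip this part of the setup" rather than an error.
    if not env.get(gate_var, ""):
        return None
    # Every other variable must be set: a missing key raises KeyError
    # instead of silently turning into an empty string.
    return {name: env[name] for name in required_vars}
```

With this shape, a missing gate variable skips the step quietly, while a missing required variable stops the run with an error that names the variable.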
Diaa Sami
6cfa26399f composer: use logrus hook instead of k8s sidecar
for splunk log forwarding
Fixes COMPOSER-2051
2023-11-28 12:42:00 +01:00
Gianluca Zuccarelli
3fe36d0012 templates/packer: configure pulp creds on startup 2023-11-07 10:48:00 +01:00
Sanne Raymaekers
bf3ff40a65 dashboards: drop interval from composer dashboard and fix slo
The latency budget remaining used $__range instead of the 28d constant.
2023-10-03 11:48:37 +02:00
Sanne Raymaekers
f05a5b59f3 dashboards: drop API section from worker job stats dashboard
Renames the worker dashboard to worker job stats dashboard.

Drops the interval variable and relies solely on $__range and
$__rate_interval.
2023-10-03 11:48:37 +02:00
Sanne Raymaekers
1475e216d2 dashboards: add worker api dashboard
This one is also built without a separate interval variable, relying
instead on $__rate_interval and $__interval.
2023-10-03 11:48:37 +02:00
Sanne Raymaekers
33f9a6726e dashboards: fix composer dash request rate errors 2023-10-02 18:50:37 +02:00
Sanne Raymaekers
9d7159dab3 templates/packer: retry subscription 2023-09-25 11:56:42 +02:00
Sanne Raymaekers
0dc1a01077 templates/packer: configure oracle cloud credentials on startup 2023-09-22 09:55:48 +02:00
Sanne Raymaekers
3a9bcded32 templates/composer: fix cpu request/limits
The fluentd sidecar had the same request/limit as the service container,
and the migrate init-container had the fluentd request/limit. It should
be the other way round.
2023-09-21 12:41:06 +02:00
Sanne Raymaekers
5bb9d414a2 templates/compose: add startingDeadlineSeconds to maintenance job
The job won't run if it doesn't get scheduled within 30 minutes. This
prevents the job running multiple times in a row if it didn't get
scheduled, for instance due to resource limits.
2023-09-21 12:41:06 +02:00
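
The effect of the 30-minute deadline in 5bb9d414a2 can be sketched as a simple check. This is an illustrative model, not the Kubernetes implementation: the function name is hypothetical, and treating the boundary as inclusive is an assumption. Since `startingDeadlineSeconds` is expressed in seconds, 30 minutes corresponds to 1800.

```python
from datetime import datetime, timedelta

# 30 minutes, as stated in the commit message (1800 seconds).
STARTING_DEADLINE = timedelta(seconds=1800)


def may_still_start(scheduled_at, now, deadline=STARTING_DEADLINE):
    """Return True if a run scheduled at `scheduled_at` may still start at `now`.

    A run that could not be started within the deadline is skipped
    instead of being started late.
    """
    return now - scheduled_at <= deadline


scheduled = datetime(2023, 9, 21, 12, 0, 0)
on_time = may_still_start(scheduled, scheduled + timedelta(minutes=10))
too_late = may_still_start(scheduled, scheduled + timedelta(minutes=45))
```

Here `on_time` is true and `too_late` is false: a run delayed by 45 minutes, for instance due to resource limits, is dropped rather than queued behind the next one.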
Sanne Raymaekers
e0b2455acf templates/composer: parameterise maintenance job cpu req/limit 2023-09-21 11:11:35 +02:00
Sanne Raymaekers
715bdba1bf dashboards/worker: default to showing the past 6 hours
The worker dashboards contain slow queries; running these on 28 days of
data takes a very long time (and they often time out).
2023-08-24 17:01:23 +02:00
Ondřej Budai
ba417dbf3d packer: use gp3 volumes
GP3 is cheaper than GP2, let's switch to it for storing our images:
https://fedoraproject.org/wiki/Changes/CloudEC2gp3

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2023-07-21 12:20:47 +02:00
Ondřej Budai
b461e403ef packer: move Fedora to 38
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2023-07-21 12:20:47 +02:00
Sanne Raymaekers
a2c07ea83a templates/dashboards: rework composer dashboard
Splits the board into 3 sections:
- SLO
- API throughput
- API latency

It's also possible to filter by tenant. And some colours were adjusted
to improve readability.
2023-06-30 11:06:51 +02:00
Sanne Raymaekers
170feba87b templates/packer: use RH SSO for the default token endpoint
MAS SSO (identity.api.openshift.com) was deprecated, RH SSO should be
the default.
2023-06-29 11:32:44 +02:00
Sanne Raymaekers
38093100e3 templates/composer: No longer accept MAS SSO 2023-06-29 11:32:44 +02:00
Diaa Sami
8398f27742 internal/cloudapi: additional prometheus listener
Listens on another port while keeping the existing endpoint until the
transition is complete.
2023-06-07 17:05:32 +02:00
Sanne Raymaekers
53198bed6e templates/composer: fix fluentd requests/limits
No separate request for memory was defined in #3472, only cpu
request/limit.
2023-06-05 16:16:18 +02:00
Sanne Raymaekers
3faab2f102 templates/composer: add separate CPU request/limit for sidecar 2023-06-05 11:51:36 +02:00
Sanne Raymaekers
0ddbee11cd templates/composer: parametrise replicas 2023-06-05 11:51:36 +02:00
Ondřej Budai
dce2ced50b packer: bump the amazon plugin to 1.2.3
Since the previous commit removed the associate_public_ip_address, we should
not be hitting the new behaviour introduced in 1.2.3, thus everything will
hopefully work as before.
2023-05-05 11:07:05 +02:00
Ondřej Budai
a2a5618149 packer: remove associate_public_ip_address
The documentation for this option says the following:

> If using a non-default VPC, public IP addresses are not provided by default.
> If this is true, your new instance will get a Public IP. default: unset

We don't specify a VPC in the packer build, thus we are using the default
one. Therefore, I don't think we actually need this option as it's useful
only for non-default VPCs.

See
https://developer.hashicorp.com/packer/plugins/builders/amazon/ebs#run-configuration

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2023-05-05 11:07:05 +02:00
Ondřej Budai
edf4f7e879 packer: pin the version of the amazon plugin to 1.2.2
Version 1.2.3 made changes to how the plugin handles auto-selection of a
subnet when it's not specified, see

f1ec287c77

Sadly, the new algorithm selects us-east-1e for us, which doesn't support
the machine types we use (c6*.large), causing the build to fail.
I reported it here:
https://github.com/hashicorp/packer-plugin-amazon/issues/368

One workaround might be to pin a working subnet, but that's apparently also
broken in 1.2.3, see
https://github.com/hashicorp/packer-plugin-amazon/issues/367

Therefore, I decided to pin the plugin to 1.2.2 for now, and see what's
the recommended approach from terraform guys.

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2023-04-20 13:02:34 +02:00
Sanne Raymaekers
a2a3a2602c templates/dashboards/worker: add arch label to job wait duration
Display the wait duration of jobs per architecture.
2023-03-21 12:34:09 +01:00
Sanne Raymaekers
b13865d361 templates/dashboards/worker: edit thresholds
95th percentile duration is now a fixed colour, as it's tricky to get
dynamic thresholds based on the job type.

Budget remaining thresholds are now only green at infinity, turn yellow
below 4 weeks, and turn red when budget consumption would only last 3
weeks (out of 4).
2023-03-21 12:34:09 +01:00
Sanne Raymaekers
63d5132aa6 templates/dashboards/worker: change panel alignment
This aligns vertical dividers between panels across rows.
2023-03-21 12:34:09 +01:00
Sanne Raymaekers
865bb98034 templates/dashboards/worker: bump version 2023-03-21 12:34:09 +01:00
Sanne Raymaekers
5a9f8d3457 templates/dashboards/worker: show request throughput per path 2023-03-21 12:34:09 +01:00
Sanne Raymaekers
26a521f54d templates/dashboards/worker: use jobtype variable for job stats
This removes the rows of panels per job type, and uses the jobtype
variable.
2023-03-21 12:34:09 +01:00
Sanne Raymaekers
5d2f84cb9e templates/dashboards/worker: add target duration 2023-03-21 12:34:09 +01:00
Sanne Raymaekers
0b7e94b097 templates/dashboard/worker: add job type variable 2023-03-21 12:34:09 +01:00
Sanne Raymaekers
a08eb69b2e templates/packer/ansible: fix enabling cdn repos on aarch64 2023-03-03 17:58:49 +01:00
Sanne Raymaekers
c1032f31e4 templates/packer/ansible: fix unregister
The community redhat_subscription module calls `subscription-manager
unsubscribe`, which doesn't exist. Use shell for now.
2023-03-03 17:58:49 +01:00
Sanne Raymaekers
ca8a05bd3a templates/packer: subscribe packer machines
To avoid a mismatch between the RPMs (which are built using CDN content)
and the packer instances (RHUI, which might be older).
2023-03-03 13:00:05 +01:00
Sanne Raymaekers
0096ff3689 Revert "Packer: workaround missing authselect-compat-1.2.5-2.el9_1 in RHUI repos"
This reverts commit 0a4a75e19e.
2023-03-01 20:05:38 +01:00
Tomáš Hozza
0a4a75e19e Packer: workaround missing authselect-compat-1.2.5-2.el9_1 in RHUI repos
`authselect-compat-1.2.5-2.el9_1` package is currently missing in AWS
RHUI el9 AppStream repositories, which makes `dnf upgrade` fail on
RHEL-9.1. This is a RHUI-specific issue, since the package is available
in CDN repos.

To work around the issue for now, `authselect-compat` needs to be
removed as part of the upgrade in order for it to succeed. Use
`--allowerasing` instead of just removing the package, because this
ensures that `authselect-compat` will be upgraded just fine once the
issue is resolved.

Fix the issue in the CI script that builds the image using Packer, as
well as the Ansible playbook used by Packer to build the image.

Signed-off-by: Tomáš Hozza <thozza@redhat.com>
2023-01-24 15:40:02 +01:00
Gianluca Zuccarelli
5aae10c951 templates/dashboards: update worker queries
The workers now use a new metric to record all
http requests. This commit updates the worker dashboard
to use the new `image_builder_worker_request_count`
query.
2023-01-09 16:52:16 +01:00
Gianluca Zuccarelli
50237e3797 templates/dashboards: update composer queries
osbuild-composer now uses a new metric to record all
http requests. This commit updates the composer dashboard
to use the new `image_builder_composer_request_count`
query.
2023-01-09 16:52:16 +01:00
Sanne Raymaekers
81a5ff1bf6 templates/packer: triple aws polling attempts
AMIs can take a long time to get ready.
2022-12-14 17:10:13 +01:00
Sanne Raymaekers
86c3036fe3 templates/packer: increase polling delay
A packer build failed due to being rate limited by the aws api.
2022-12-13 13:55:53 +01:00
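
The two polling commits above (81a5ff1bf6 and 86c3036fe3) pull on the same pair of knobs: the delay between polls and the number of attempts. A rough sketch of how they combine, assuming a fixed delay between polls. The numbers below are hypothetical, since neither commit message states the actual values.

```python
def max_wait_seconds(delay_seconds, max_attempts):
    """Upper bound on how long polling waits before giving up."""
    return delay_seconds * max_attempts


def polls_per_minute(delay_seconds):
    """Approximate API request rate produced by one polling loop."""
    return 60 / delay_seconds


# Hypothetical starting values, for illustration only.
delay, attempts = 15, 40

before = max_wait_seconds(delay, attempts)
# Tripling the attempts triples the time an AMI is given to become ready.
after_more_attempts = max_wait_seconds(delay, attempts * 3)
# Doubling the delay halves the request rate and so eases API rate limiting.
rate_before = polls_per_minute(delay)
rate_after = polls_per_minute(delay * 2)
```

A longer delay lowers the pressure on the AWS API, and more attempts extend the total wait, so the two changes address different failure modes.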
Tomáš Hozza
6ae8904f5a templates/packer: add comment to get_aws_creds.sh
Add a comment explaining why it is important to set the AWS bucket in
the worker configuration, even if the `AWS_ACCOUNT_IMAGE_BUILDER_ARN` is
empty.
2022-10-11 13:23:18 +02:00
Tomáš Hozza
09daa75adf templates/packer: set the GCP bucket in the worker configuration
Similar to AWS, set the GCP bucket in the worker configuration.
2022-10-11 13:23:18 +02:00
Diaa Sami
5ffb9e693e tools/appsre: remove monit setup code & scripts
It hasn't worked since we moved workers to app-sre.
2022-10-04 16:26:08 +02:00
Ondřej Budai
f25dca793d packer: remove Fedora 35
Our workers already run on Fedora 36 so there's no need to build F35 anymore.

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2022-09-30 14:52:24 +02:00
Diaa Sami
98eda72499 templates/packer: update amazon plugin 2022-09-27 10:47:32 +02:00
Diaa Sami
06fbd926ae app-sre: Update AMIs to rhel-9.0 2022-09-27 10:47:32 +02:00
Sanne Raymaekers
5c12076b4f templates/packer: Allow token url to be set by cloud-init vars
Hardcoding the token url renders the image useless if it ever needs to
be changed.
2022-09-22 14:15:26 +02:00