63 commits

Author SHA1 Message Date
Sanne Raymaekers
3827f710de templates/openshift: move openshift templates to separate folder
Keep a symlink to the old composer template so the current deployment
doesn't break.
2024-04-25 17:32:21 +02:00
Sanne Raymaekers
b8d97b7b68 templates/composer: worker heartbeat timeout of 5m
The default timeout of 1 hour is fine for on-prem, but in the service it
makes workers seemingly stick around for way too long.
2024-04-19 19:56:25 +02:00
Ondřej Budai
e5853c9aa5 Remove rhel-10.0 alias from the openshift template
We now have a proper rhel-10.0 distribution, and this alias is clashing
with it, so we are seeing the following message in production:

failed to configure distro aliases: invalid aliases: ["alias 'rhel-10.0' masks an existing distro"]

Let's fix it by removing the alias; it's obviously not needed anymore.
2024-03-15 15:29:45 +01:00
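
The clash described in e5853c9aa5 can be illustrated with a minimal sketch. This is not composer's actual validation code: the function name `find_masking_aliases`, the distro set, and the alias targets are hypothetical, and only the error wording is taken from the message quoted above.

```python
def find_masking_aliases(aliases, known_distros):
    """Return one error string per alias whose name equals an existing distro name."""
    errors = []
    for alias in sorted(aliases):
        if alias in known_distros:
            errors.append(f"alias '{alias}' masks an existing distro")
    return errors


# Hypothetical data: a real 'rhel-10.0' distro exists while the alias is still configured.
known_distros = {"rhel-9.4", "rhel-10.0"}
aliases = {"rhel-9": "rhel-9.4", "rhel-10.0": "rhel-9.4"}

print(find_masking_aliases(aliases, known_distros))
```

Dropping the 'rhel-10.0' entry from `aliases` makes the returned list empty, which mirrors the fix in this commit.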
Tomáš Hozza
e561ba0854 templates/composer: set DISTRO_ALIASES for composer
Set the RHEL release names without the minor version to point to the
latest GA release. Set the 'rhel-10.0' to the latest RHEL-9 minor
release in development, so that one can start building RHEL-10 images
without referencing RHEL-9.

Signed-off-by: Tomáš Hozza <thozza@redhat.com>
2024-02-21 12:06:33 +01:00
Diaa Sami
c9c51613a4 composer: glitchtip integration 2024-02-13 14:57:57 +01:00
Diaa Sami
6cfa26399f composer: use logrus hook instead of k8s sidecar
for splunk log forwarding
Fixes COMPOSER-2051
2023-11-28 12:42:00 +01:00
Sanne Raymaekers
3a9bcded32 templates/composer: fix cpu request/limits
The fluentd sidecar had the same request/limit as the service container,
and the migrate init-container had the fluentd request/limit. It should
be the other way round.
2023-09-21 12:41:06 +02:00
Sanne Raymaekers
5bb9d414a2 templates/composer: add startingDeadlineSeconds to maintenance job
The job won't run if it doesn't get scheduled within 30 minutes. This
prevents the job running multiple times in a row if it didn't get
scheduled, for instance due to resource limits.
2023-09-21 12:41:06 +02:00
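
As a rough model of the deadline in 5bb9d414a2, the check below treats a run as startable only while it is no more than `starting_deadline_seconds` late. It is a simplified sketch of the behaviour the message describes, not the Kubernetes controller logic; the function name and timestamps are made up, and 1800 seconds stands in for the 30 minutes mentioned above.

```python
from datetime import datetime, timedelta


def may_still_start(scheduled_at, now, starting_deadline_seconds):
    """True if a run scheduled at `scheduled_at` may still be started at `now`."""
    return now - scheduled_at <= timedelta(seconds=starting_deadline_seconds)


# Hypothetical schedule time; 1800 s corresponds to the 30 minute deadline.
scheduled = datetime(2023, 9, 21, 12, 0, 0)
deadline = 1800

print(may_still_start(scheduled, scheduled + timedelta(minutes=10), deadline))
print(may_still_start(scheduled, scheduled + timedelta(minutes=45), deadline))
```

A run that could not be scheduled in time is skipped instead of being started late, so missed runs do not pile up and execute back to back.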
Sanne Raymaekers
e0b2455acf templates/composer: parameterise maintenance job cpu req/limit 2023-09-21 11:11:35 +02:00
Sanne Raymaekers
38093100e3 templates/composer: No longer accept MAS SSO 2023-06-29 11:32:44 +02:00
Diaa Sami
8398f27742 internal/cloudapi: additional prometheus listener
Listening on another port, while keeping the existing endpoint until
transition is complete
2023-06-07 17:05:32 +02:00
Sanne Raymaekers
53198bed6e templates/composer: fix fluentd requests/limits
No separate request for memory was defined in #3472, only cpu
request/limit.
2023-06-05 16:16:18 +02:00
Sanne Raymaekers
3faab2f102 templates/composer: add separate CPU request/limit for sidecar 2023-06-05 11:51:36 +02:00
Sanne Raymaekers
0ddbee11cd templates/composer: parametrise replicas 2023-06-05 11:51:36 +02:00
Diaa Sami
5dda08a20a templates/composer.yml: update splunk port for splunk cloud
using an openshift template variable
2022-09-22 10:40:22 +02:00
Sanne Raymaekers
a221de5db7 templates/composer: Remove non-existent secret
The secret not existing causes the deployment to fail during a
validation stage.

```
[ERROR] [openshift_base.py:_validate_resources_used_exist] - [Deployment/composer] Secret db does not exist
```
2022-07-28 11:24:25 +02:00
Sanne Raymaekers
968023f950 templates/composer: Map db secrets to maintenance container 2022-06-04 12:48:17 +02:00
Sanne Raymaekers
71c78991a6 cloudapi: Drop bucket from composer config
This value is set in the worker config. In future it might also be
passed through the api to upload into target accounts, but it should
never be set in composer.
2022-06-01 12:03:12 +02:00
Ondřej Budai
34fb2b6001 templates: add Fedora prod tenant to the ACL
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2022-05-27 17:19:19 +01:00
Sanne Raymaekers
973b209060 templates/composer: Add resources requests/limits to db migration 2022-05-27 15:09:42 +02:00
Sanne Raymaekers
b91400fd92 templates/composer: Add podAntiAffinity rule based on hostname
Linter output:
Specify anti-affinity in your pod specification to ensure that the
orchestrator attempts to schedule replicas on different nodes. Using
podAntiAffinity, specify a labelSelector that matches pods for the
deployment, and set the topologyKey to kubernetes.io/hostname.
2022-05-27 15:09:42 +02:00
Sanne Raymaekers
a8adb59995 templates/composer: Enable specific maintenance parts
Similar to DRY_RUN, these values should be overwritten in app-interface
per namespace. At some point the maintenance specific to the CRC tenant
(aws and gcp maintenance) should run in the workers namespace rather
than the composer namespace. Granularity is needed for this.
2022-05-14 16:21:21 +02:00
Diaa Sami
5a4488c829 templates/composer: fix access to private repos
update secret name to the correct one
2022-05-12 14:49:22 +02:00
Diaa Sami
941fe3513f templates/composer: add missing fluentd-config volume 2022-05-12 14:02:00 +02:00
Sanne Raymaekers
809afbd0ad templates/composer: Specify registry for fluentd-hec image 2022-05-12 11:03:17 +02:00
Diaa Sami
631133eabb templates/composer: give access to private quay repos 2022-05-12 10:30:54 +02:00
Diaa Sami
ca83eccc47 templates/composer: add fluentd sidecar
The sidecar receives logs from the service and forwards them to Splunk
HEC
2022-05-12 10:30:54 +02:00
Sanne Raymaekers
02debc0cda templates/composer: Parametrize tenants in acl
This will allow us to specify tenants in the acl per namespace.
2022-05-10 15:40:38 +02:00
Sanne Raymaekers
11890682b7 templates/composer: Drop unused variables 2022-03-28 12:02:37 +02:00
Sanne Raymaekers
eba355bb60 templates/composer: Remove unused acl claims
This leaves fedora and consoledot tenants.
2022-03-28 11:38:48 +02:00
Ondřej Budai
fc86ffd968 container: fix liveness probe
We don't have permissions to write to /run when running on OpenShift so let's
just use /tmp and change the filename to prevent any conflicts.

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2022-03-25 14:02:12 +01:00
Sanne Raymaekers
9368b60401 templates/composer: Add prod service accounts owner 2022-03-23 16:43:10 +01:00
Tom Gundersen
d3cd3197c0 container: make liveness probe independent of webserver
Currently liveness and readiness were treated the same. However, their
behaviour at shutdown is meant to be different. When a service is not ready
no new connections are made to it, and when a service is not live it can be
cleaned up.

By considering our service live if and only if it listens to HTTP requests we
don't have the opportunity to clean up after we stop listening to new requests.

Leave readiness probes as they are, and instead use a file in the filesystem to
indicate when the service is live. It is created before composer is spawned and
deleted once composer exits.
2022-03-22 14:17:37 +01:00
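
The file-based liveness idea in d3cd3197c0 can be sketched as follows. This is an illustration only and not the container's real entrypoint: the helper names and the way the service is launched are assumptions.

```python
import os
import subprocess
from contextlib import contextmanager


@contextmanager
def liveness_file(path):
    """Create `path` on entry and remove it on exit, even if the body raises."""
    with open(path, "w"):
        pass
    try:
        yield path
    finally:
        os.remove(path)


def is_live(path):
    """Liveness check: the service counts as live while the marker file exists."""
    return os.path.exists(path)


def run_service(command, path):
    """Hold the marker file for the whole lifetime of the spawned process."""
    with liveness_file(path):
        return subprocess.run(command).returncode
```

Because the marker outlives the HTTP listener, the service stays live while it finishes cleanup after it has stopped accepting new requests. The readiness probe is unaffected by this sketch.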
Sanne Raymaekers
f0a17d19f0 templates/composer: Add stage service accounts owner 2022-03-21 12:57:32 +01:00
Ondřej Budai
2ea2e9be09 templates/composer: give access to Fedora org
We will be using both offline tokens (account_id) and service accounts
(rh-org-id) for now.

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2022-03-08 13:06:35 +01:00
Ondřej Budai
37181eb995 templates/composer: add tenant_provider_fields
account_id is for https://sso.redhat.com/auth/realms/redhat-external/protocol/openid-connect/token

rh-org-id is for https://identity.api.openshift.com/auth/realms/rhoas/protocol/openid-connect/token

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2022-03-08 12:07:00 +01:00
Sanne Raymaekers
413a013b91 templates/composer: Parametrize bucket name 2022-03-02 09:56:32 +01:00
Sanne Raymaekers
e56248d3c8 templates: Add production worker account to acl 2022-02-25 16:57:13 +01:00
Sanne Raymaekers
b05723a37e templates/composer: Verify against MAS SSO and RH SSO 2022-02-24 09:48:12 +01:00
Sanne Raymaekers
4956e48a0b service-maintenance: Skip db cleanup
Let's enable the cloud cleanup first, and then move on to the db.
2022-02-07 20:42:45 +01:00
sanne
d08147864a osbuild-service-maintenance: Map AWS secrets 2022-01-11 12:57:02 +01:00
sanne
4797ac281a osbuild-service-maintenance: Rework GCP credentials mapping
Because of the way the gcp secrets are stored for the workers, and how
the mapping from vault to openshift works (unable to map a multiple key
secret into a single json file), there's a bit of juggling required to
get the gcp credentials in the right format.
2022-01-11 12:57:02 +01:00
sanne
60d4f5a751 composer: Disable artifacts for the service
When backed by a DB, composer has no need of a queue directory.

This also addresses "Error moving artifacts for job" logging noise.

Signed-off-by: sanne <sanne.raymaekers@gmail.com>
2021-12-16 17:04:08 +00:00
sanne
98abdf1902 templates: Max concurrent requests is required for the maintenance job 2021-12-08 10:31:33 +01:00
sanne
4224b2231b templates: CronJob is part of the batch/v1 api 2021-12-07 11:52:49 +01:00
sanne
0379cb5796 templates: Add maintenance cronjob 2021-12-06 22:51:24 +01:00
Ondřej Budai
8f0d685b70 template: bump postgres max conns to 20
We actually need 2 * 16 connections at minimum (one worker waits for two
jobs). Let's bump the maximum connection count even moar.

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2021-11-19 13:25:51 +01:00
Ondřej Budai
c3a8fc19a2 templates: bump max postgres connections to 10
By default, pgxpool.Pool has 4 connections (or number of cpus if higher).
Currently, we have 3 replicas, that means max 3*4=12 DB connections.

The dequeue operation is actually blocking - when a worker is waiting for
a job, one connection is blocked. My theory is that with 16 workers, we just
don't have enough connections, which causes all sorts of weird slowdowns.

This commit bumps the number of connections per replica to 10, therefore
we should be at 30 connections in total.

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2021-11-19 13:17:10 +01:00
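
The arithmetic in c3a8fc19a2 can be written out as a small sketch. The function names are hypothetical. The replica count, worker count, and pool sizes come from the commit messages, and the model (each waiting worker pins a fixed number of connections) is a simplification.

```python
def total_connections(replicas, max_conns_per_replica):
    """Upper bound on DB connections across all composer replicas."""
    return replicas * max_conns_per_replica


def blocked_connections(workers, waits_per_worker=1):
    """Connections pinned by workers blocked in dequeue."""
    return workers * waits_per_worker


REPLICAS = 3
WORKERS = 16

for per_replica in (4, 10):
    total = total_connections(REPLICAS, per_replica)
    spare = total - blocked_connections(WORKERS)
    print(f"{per_replica} per replica: {total} total, {spare} spare")
```

With the default of 4 connections per replica the spare count is negative, which matches the theory in the message. With `waits_per_worker=2`, as the follow-up commit 8f0d685b70 states, 32 connections are needed, so 30 is still too few and 20 per replica (60 in total) leaves headroom.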
Sanne Raymaekers
1fdc18856a Revert "templates: Add prometheus scrape annotations to composer-api"
This reverts commit 7f86dae69b.
2021-11-10 15:24:24 +01:00
sanne
7f86dae69b templates: Add prometheus scrape annotations to composer-api 2021-11-10 15:13:53 +01:00