debian-forge-composer

Author	SHA1	Message	Date
Tom Gundersen	367444635a	containers/composer: terminate composer first Composer may depend on dnf-json and the worker to shut down cleanly.	2022-03-22 14:17:37 +01:00
Tom Gundersen	c3d66b5a33	cmd/composer: gracefully shut down on SIG{INT,TERM} Call `Shutdown()` on all http servers. This means we will finish processing any pending requests (including depsolving), but we will not listen to new ones. In particular, we will not answer to the readiness probe, so no new traffic will be routed to this container. Once all pending requests have been handled composer will shut down gracefully and the liveness probe will return failure. Note that in order for this to work correctly no requests should ever take longer than the shutdown timeout (by default 30s).	2022-03-22 14:17:37 +01:00
Tom Gundersen	d3cd3197c0	container: make liveness probe independent of webserver Currently liveness and readiness were treated the same. However, their behaviour at shutdown is meant to be different. When a service is not ready no new connections are made to it, and when a service is not live it can be cleaned up. By considering our service live if and only if it listens to HTTP requests we don't have the opportunity to clean up after we stop listening to new requests. Leave readiness probes as they are, and instead use a file in the filesystem to indicate when the service is live. It is created before composer is spawned and deleted once composer exits.	2022-03-22 14:17:37 +01:00
Jakub Rusz	15c2044b3c	tests/upgrade: update gpg key We need to use a new gpg key after the SHA-1 deprecation. Also don't fail immediately on compose failure to be able to retrieve logs from the test VM.	2022-03-22 10:54:30 +01:00
Ondřej Budai	67e55eaea8	gitlab: run containerbuild on RHEL Otherwise, we're running into https://bugzilla.redhat.com/show_bug.cgi?id=2065292 and when I tried implementing a workaround, I ran into https://bugzilla.redhat.com/show_bug.cgi?id=1897579 Gah. Signed-off-by: Ondřej Budai <ondrej@budai.cz>	2022-03-21 16:45:49 +01:00
Ondřej Budai	99aad294dd	deploy: work around a podman bug in CS8 See the comment. Signed-off-by: Ondřej Budai <ondrej@budai.cz>	2022-03-21 16:45:49 +01:00
Sanne Raymaekers	f0a17d19f0	templates/composer: Add stage service accounts owner	2022-03-21 12:57:32 +01:00
Jakub Rusz	46a79a48da	workflows: Fix Gitlab CI trigger + revert debug Previous implementation added single quotes to the git command which made it not trigger the Gitlab CI at all. Changing it to a classic bash if condition.	2022-03-21 10:42:28 +01:00
Sanne Raymaekers	2023f7731d	worker: Support client_credentials grant type in client This will allow us to use the service accounts which work against identity.api.openshift.com. These are much easier to manage, especially with the new multi-tenancy, as there's a single page to create/expire them across an account. They also have the added benefit of not expiring automatically when they're not used like offline tokens, and immediate expiration when desired.	2022-03-21 09:43:43 +01:00
Sanne Raymaekers	8900bcec40	worker: Client lazy token refresh	2022-03-21 09:43:43 +01:00
Sanne Raymaekers	8a6d6ed6cf	worker: Clean up worker client config	2022-03-21 09:43:43 +01:00
Jakub Rusz	eb4c9be168	workflows: debug Gitlab CI trigger	2022-03-18 12:59:40 +01:00
Sanne Raymaekers	815d0ad65b	osbuild-worker: Log unexpected dnf-json errors These errors result in a 5xx status for the depsolve job, marked as internal failure, it's useful to log them.	2022-03-18 10:14:06 +01:00
Ondřej Budai	9ca74694a7	packer: use unique name tag for Fedora workers Signed-off-by: Ondřej Budai <ondrej@budai.cz>	2022-03-16 12:58:05 +01:00
Tomas Hozza	e5595667bc	test/api.sh: move the DB dump to the cleanup() function Previously, the DB was not dumped in case the compose failed. Ensure that the DB is dumped before the script exits in any case. Signed-off-by: Tomas Hozza <thozza@redhat.com>	2022-03-16 09:03:47 +00:00
Tomas Hozza	e8a347d1e8	test/api.sh: do not use `/tmp`, but `$WORKDIR` Do not create files directly in `/tmp`, but use `$WORKDIR`, which is a temporary directory for transient files, which gets cleaned up when the test case finishes. Without this change, running `api.sh` twice fails the second time. Signed-off-by: Tomas Hozza <thozza@redhat.com>	2022-03-16 09:03:47 +00:00
Antonio Murdaca	b2d18166de	test/data/manifests: regenerate Signed-off-by: Antonio Murdaca <runcom@linux.com>	2022-03-14 17:31:40 +01:00
Antonio Murdaca	5f2ad326a6	internal/distro/rhel{86,90}: drop console kargs from raw image deployment Using the simplified installer we were experiencing slow system boots. Turns out we're running into https://bugzilla.redhat.com/show_bug.cgi?id=1839923 This patch just drops the console kargs, to be aligned with the anaconda installer, which doesn't experience this slowdown. The slowdown doesn't happen on virtual machines as there's always a ttyS0 there. Signed-off-by: Antonio Murdaca <runcom@linux.com>	2022-03-14 17:31:40 +01:00
Gianluca Zuccarelli	19e2fb7fb5	template: composer dashboard queries Tidy up the queries for the composer dashboard and make them more readable in grafana. Additionally add some fallback values for when empty query results are returned from prometheus.	2022-03-14 16:11:05 +01:00
Gianluca Zuccarelli	1f2fd8cb76	templates: worker depsolve error display Fix the display of the depsolve error rate panel. The panel had an incorrect min value of 3 (or 300%).	2022-03-14 16:11:05 +01:00
Jakub Rusz	c91131ee0c	github workflows: modify Gitlab CI trigger In `5e639cba6f` the context of the Trigger Gitlab CI workflow changed and the context "github.event.pull_request.draft" is no longer available so the condition for SKIP_CI didn't work. This can be fixed by getting the variable in the previous workflow and passing it as an artifact. Docs: https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#using-data-from-the-triggering-workflow	2022-03-14 14:40:23 +02:00
Jakub Rusz	d8ea259f8b	ci: run ci_details.sh in before_script This is a nice script showing potentially useful details about the runner so let's execute it at the beginning of each job.	2022-03-14 14:24:59 +02:00
Ondřej Budai	418ae32cf8	packer: fix the secret ID variable in get_koji_creds.sh Oops, we should probably start testing this. Signed-off-by: Ondřej Budai <ondrej@budai.cz>	2022-03-14 10:27:28 +01:00
Ondřej Budai	424a741de6	packer: make subscribing optional We don't want to subscribe Fedora. Signed-off-by: Ondřej Budai <ondrej@budai.cz>	2022-03-13 22:31:40 +01:00
Ondřej Budai	c46376aea2	packer: add support for koji credentials Signed-off-by: Ondřej Budai <ondrej@budai.cz>	2022-03-13 09:08:11 +01:00
Ondřej Budai	2dd5ae7bca	packer: skip retrieving of creds if their ARN is not specified So we can have workers without public cloud creds. Signed-off-by: Ondřej Budai <ondrej@budai.cz>	2022-03-13 09:08:11 +01:00
Ondřej Budai	4c0ba50ea1	packer: remove config tinkering from worker_service.sh Let's set each cloud section of the config in the respective cloud script. Signed-off-by: Ondřej Budai <ondrej@budai.cz>	2022-03-13 09:08:11 +01:00
Ondřej Budai	2813507ac9	packer: split worker_external_creds.sh into one script per cloud Signed-off-by: Ondřej Budai <ondrej@budai.cz>	2022-03-13 09:08:11 +01:00
Ondřej Budai	2e7815bf53	packer: move worker-config creation to ansible I think it untangles the initialization a bit and allows me to do some more refactorings. Signed-off-by: Ondřej Budai <ondrej@budai.cz>	2022-03-13 09:08:11 +01:00
Tom Gundersen	2a4d4c4d49	dnf-json: use the default connection timeout By default `timeout` is 30 seconds, but we had it set to 5. Drop the override and use the default. This has two effects: it increases the time before we give up on connecting (as it says on the tin), and it also increases the time download has to be slow for before we give up. Internally, we were seeing failures in downloading metadata from ODCS and similar issues have occurred in CI too. The potential downside to this is in case of having several mirrors this means it takes longer before giving up on a bad one and trying a better one. But slow is better than broken, so for now revert to the default behavior. Signed-off-by: Tom Gundersen <teg@jklm.no>	2022-03-12 09:09:13 +01:00
Tomas Hozza	562225af4c	osbuild-pipeline: use repo name from the request if provided Almost all repo configurations used for generating image test cases using `osbuild-pipeline` have `name` defined. Make sure that the repo name provided in the compose request is used when depsolving. Signed-off-by: Tomas Hozza <thozza@redhat.com>	2022-03-12 08:36:40 +01:00
Tomas Hozza	13a9022fd8	rpmmd: rename `toDNFRepoConfig()` argument `i` -> `repoID` Rename the method argument name to make its purpose obvious. Signed-off-by: Tomas Hozza <thozza@redhat.com>	2022-03-12 08:36:40 +01:00
Tomas Hozza	180290d016	dnf-json: use repository `name` from the request if provided Signed-off-by: Tomas Hozza <thozza@redhat.com>	2022-03-12 08:36:40 +01:00
Tomas Hozza	43dafe87fb	rpmmd: pass repo name to `dnf-json` The repo name is already part of the `rpmmd.RepoConfig` structure. Do not ignore it when calling `dnf-json`, and pass it the value. Signed-off-by: Tomas Hozza <thozza@redhat.com>	2022-03-12 08:36:40 +01:00
Tomas Hozza	f9d0412316	dnf-json: do not use `reponame` as `repoid`. Repo name defaults to the repo ID if the name is not set. `dnf-json` should not rely on the `reponame` being set to the ID and instead return the actual `repoid`. Signed-off-by: Tomas Hozza <thozza@redhat.com>	2022-03-12 08:36:40 +01:00
Ondřej Budai	c4c7f44fcb	dbjobqueue: reimplement the jobqueue to use only one listening connection Previously, all dequeuers (goroutines waiting for a job to be dequeued) were listening for new messages on postgres channel jobs (LISTEN jobs). This didn't scale well as each dequeuer required its own DB connection and the number of DB connections is hard-limited in the pool's config. I changed the logic to work somewhat differently: dbjobqueue.New() now spawns a goroutine that listens on the postgres channel. If there's a new message, the goroutine just wakes up all dequeuers using a standard go channel. Go channels are cheap so this should scale much better. A test was added that confirms that 100 dequeuers are not a big deal now. This test failed when I tried to run it on the previous commit. I tried even 1000 locally and it was still fine. Signed-off-by: Ondřej Budai <ondrej@budai.cz>	2022-03-11 16:04:52 +01:00
Ondřej Budai	c8dbe0de74	dbjobqueue: remove unused variables from Dequeue Removing queued_at and started_at is pretty straightforward, it wasn't needed. Removing token might seem concerning but basically we were just pulling the same value from DB as we were pushing there. I think there's no value in doing that. Signed-off-by: Ondřej Budai <ondrej@budai.cz>	2022-03-11 16:04:52 +01:00
Sanne Raymaekers	318a4525c6	cmd/osbuild-worker: dnf-json returns MarkingErrors (plural)	2022-03-11 10:13:27 +01:00
Feng Huang	c64eb98011	use app-sre packer image Signed-off-by: Feng Huang <fehuang@redhat.com>	2022-03-11 09:24:26 +01:00
Ondřej Budai	72de1b3bbe	packer: don't save the AMIs on PRs This should save us a ton of resources as we don't use AMIs from PRs. Signed-off-by: Ondřej Budai <ondrej@budai.cz>	2022-03-11 09:06:43 +01:00
Ondřej Budai	ad15179faf	packer: build Fedora images The decision logic which jobs to run is quite confusing but that's how we roll for now: Jenkins builds RHEL images only on main; Schutzbot builds RHEL images only in PRs; Schutzbot builds Fedora images on both PRs and on main. To achieve this, the commit re-enables running Packer on main on Schutzbot. Signed-off-by: Ondřej Budai <ondrej@budai.cz>	2022-03-11 09:06:43 +01:00
Ondřej Budai	ec070612ff	packer: remove RHEL and x86_64-specific bits Arch was easy. For passing the repository distribution and osbuild_commit (it can be different for each distro), I decided to go in the way of ansible inventory directories. It adds a bit of structure but I think it's the most clean solution. Signed-off-by: Ondřej Budai <ondrej@budai.cz>	2022-03-11 09:06:43 +01:00
Ondřej Budai	cd394bf67d	packer: add default to aws auth variables So you don't have to pass these if packer is supposed to find them on its own (instance profile, local profile). Signed-off-by: Ondřej Budai <ondrej@budai.cz>	2022-03-11 09:06:43 +01:00
Ondřej Budai	4ae71d3f3d	packer: move all RHEL-specific options to a source block Signed-off-by: Ondřej Budai <ondrej@budai.cz>	2022-03-11 09:06:43 +01:00
Ondřej Budai	22ec89f956	packer: add more tags identifying the image Signed-off-by: Ondřej Budai <ondrej@budai.cz>	2022-03-11 09:06:43 +01:00
Ondřej Budai	7301ea6b9d	packer: use newer (=faster) instances Signed-off-by: Ondřej Budai <ondrej@budai.cz>	2022-03-11 09:06:43 +01:00
Ondřej Budai	8664c1449a	packer: reuse the build user for the ansible provisioner We want to build multiple images at once and some of them use a different user. Signed-off-by: Ondřej Budai <ondrej@budai.cz>	2022-03-11 09:06:43 +01:00
Ondřej Budai	e45578d3b0	packer: remove the ami_id variable We want to build multiple images at once so they have to be defined elsewhere. Signed-off-by: Ondřej Budai <ondrej@budai.cz>	2022-03-11 09:06:43 +01:00
Ondřej Budai	5ecbfbad9e	packer: rename composer.pkr.hcl to worker.pkr.hcl Signed-off-by: Ondřej Budai <ondrej@budai.cz>	2022-03-11 09:06:43 +01:00
Achilleas Koutsou	e5675efc4a	github: fix job names and IDs for the tests workflow Flip the incorrect flip that happened in `e4baddfad1`	2022-03-10 10:54:20 +01:00

3690 commits