Call `Shutdown()` on all http servers. This means we will finish processing
any pending requests (including depsolving), but we will not listen to new
ones.
In particular, we will not answer to the readiness probe, so no new traffic
will be routed to this container.
Once all pending requests have been handled composer will shut down
gracefully and the liveness probe will return failure.
Note that in order for this to work correctly no requests should ever take longer
than the shutdown timeout (by default 30s).
Currently liveness and readiness was treated the same. However, their
behaviour at shutdown is meant to be different. When a service is not read
no new connections are made to it, and when a service is not live it can be
cleaned up.
By considering our service live if and only if it listens to HTTP requests we
don't have the opportunity to clean up after we stop listening to new requests.
Leave readiness probes as they are, and instead use a file in the filesystem to
indicate when the service is live. It is created before composer is spawned and
deleted once composer exits.
Previous implementation added single quotes to the git command which
made it not trigger the Gitlab CI at all. Changing it to clasic bash if
condition.
This will allow us to use the service accounts which work against
identity.api.openshift.com. These are much easier to manage, especially
with the new multi-tenancy, as there's a single page to create/expire
them across an account.
They also have the added benefit of not expiring automatically when
they're not used like offline tokens, and immediate expiration when
desired.
Previously, the DB was not dumped in case the compose failed. Ensure
that the DB is dumped before the script exits in any case.
Signed-off-by: Tomas Hozza <thozza@redhat.com>
Do not create files directly in `/tmp`, but use `$WORKDIR`, which is a
temporary directory for transient files, which gets cleaned up when the
test case finishes. Without this change, running `api.sh` twice fails
the second time.
Signed-off-by: Tomas Hozza <thozza@redhat.com>
Using the simplified installer we were experiencing slow system boots.
Turns out we're incurring into https://bugzilla.redhat.com/show_bug.cgi?id=1839923
This patch just drops the console kargs - to be aligned with the
anaconda installer that doesn't experience this slow down.
The slow down doesn't happen on virtual machines as there's always a
ttyS0 there
Signed-off-by: Antonio Murdaca <runcom@linux.com>
Tidy up the queries for the composer dashboard
and making them more readable in grafana. Additionally
add some fallback values for when empty query results
are returned from prometheus.
By default `timeout` is 30 seconds, but we had it set to 5. Drop
the override and use the default.
This has two effects: it increases the time before we give up on
connecting (as it says on the tin), and it also increases the time
download has to be slow for before we give up.
Internally, we were seing failures in downlaoding metadata from ODCS
and similar issues have occurred in CI too.
The potential downside to this is in case of having several mirrors
this means it takes longer before giving up on a bad one and trying
a better one. But slow is better than broken, so for now rever to
the default behavior.
Signed-off-by: Tom Gundersen <teg@jklm.no>
Almost all repo configurations used for generating image test cases
using `osbuild-pipeline` have `name` defined. Make sure that the repo
name provided in the compose request is used when depsolving.
Signed-off-by: Tomas Hozza <thozza@redhat.com>
The repo name is already part of the `rpmmd.RepoConfig` structure. Do
not ignore when calling `dnf-json` and and pass it the value.
Signed-off-by: Tomas Hozza <thozza@redhat.com>
Repo name defaults to the repo ID if the name is not set. `dnf-json`
should not rely on the `reponame` being set to the ID and intsead
return the actual `repoid`.
Signed-off-by: Tomas Hozza <thozza@redhat.com>
Previously, all dequeuers (goroutines waiting for a job to be dequeued) were
listening for new messages on postgres channel jobs (LISTEN jobs). This didn't
scale well as each dequeuer required to have its own DB connection and the
number of DB connections is hard-limited in the pool's config.
I changed the logic to work somewhat differently: dbjobqueue.New() now spawns
a goroutine that listens on the postgres channel. If there's a new message,
the goroutine just wakes up all dequeuers using a standard go channel.
Go channels are cheap so this should scale much better.
A test was added that confirms that 100 dequeuers are not a big deal now. This
test failed when I tried to run on it on the previous commit. I tried even 1000
locally and it was still fine.
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
Removing queued_at and started_at is pretty straightforward, it wasn't needed.
Removing token might seem concerning but basically we were just pulling
the same value from DB as we were pushing there. I think there's no value in
doing that.
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
The decision logic which jobs to run is quite confusing but that's how we
roll for now:
Jenkins builds RHEL images only on main
Schutzbot builds RHEL images only in PRs
Schutzbot builds Fedora images on both PRs and on main
To achieve this, the commit re-enables running Packer on main on Schutzbot.
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
Arch was easy.
For passing the repository distribution and osbuild_commit (it can be
different for each distro), I decided to go in the way of ansible
inventory directories. It adds a bit of structure but I think it's
the most clean solution.
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
So you don't have to pass these if packer is supposed to find them
on its own (instance profile, local profile).
Signed-off-by: Ondřej Budai <ondrej@budai.cz>