Previously, all dequeuers (goroutines waiting for a job to be dequeued) were
listening for new messages on postgres channel jobs (LISTEN jobs). This didn't
scale well as each dequeuer required to have its own DB connection and the
number of DB connections is hard-limited in the pool's config.
I changed the logic to work somewhat differently: dbjobqueue.New() now spawns
a goroutine that listens on the postgres channel. If there's a new message,
the goroutine just wakes up all dequeuers using a standard go channel.
Go channels are cheap so this should scale much better.
A test was added that confirms that 100 dequeuers are not a big deal now. This
test failed when I tried to run on it on the previous commit. I tried even 1000
locally and it was still fine.
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
Removing queued_at and started_at is pretty straightforward, it wasn't needed.
Removing token might seem concerning but basically we were just pulling
the same value from DB as we were pushing there. I think there's no value in
doing that.
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
The decision logic which jobs to run is quite confusing but that's how we
roll for now:
Jenkins builds RHEL images only on main
Schutzbot builds RHEL images only in PRs
Schutzbot builds Fedora images on both PRs and on main
To achieve this, the commit re-enables running Packer on main on Schutzbot.
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
Arch was easy.
For passing the repository distribution and osbuild_commit (it can be
different for each distro), I decided to go in the way of ansible
inventory directories. It adds a bit of structure but I think it's
the most clean solution.
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
So you don't have to pass these if packer is supposed to find them
on its own (instance profile, local profile).
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
The `API.arch` member was (mostly) used to read the name of the
architecture.
The only non-name use was for the purposes of reading RPM repositories
from the configuration, in `reporegistry.ReposByArch()`, a thin wrapper
around `reporegistry.ReposByArchName()`.
Removing the `arch` member from the API and using the new `archName`
that is set up in the API constructor lets us control the arch name that
is set without relying on a valid `distro.Arch` object being available
(which would depend on having a valid `distro.Distro` object).
Replaced all calls to `ReposByArch()` with `ReposByArchName()` which
depends on the arch and distro name strings instead of a full
`distro.Arch`.
When the host distribution is not known or supported, instead of failing
with an error, print a warning to the log and initialise the API with
the architecture name and distro name.
This enables running the weldr API on unsupported distros for
cross-distro building.
Guards against a nil arch member when initialising the store.
Most test scripts don't have any documentation regarding it's purpose,
although it can be guessed by the code. There's value in adding this
small comment.
[skip-ci]
Add an error object to the ComposeStatus.ImageStatus.
The error object contains a human-readable error reason
and optional details in the case of an error.
Temporarily disable Installer test case in the CI on RHEL-9 and CentOS
Stream 9 until https://bugzilla.redhat.com/show_bug.cgi?id=2059565 is
resolved. This test case is now consistently failing due to the
mentioned bug and makes it impossible for the CI to pass cleanly.
Signed-off-by: Tomas Hozza <thozza@redhat.com>
This manifest is intended only for internal use and is currently failing
in nightly pipelines. This will be moved to a different test script in
the future, see COMPOSER-1397.
Whitespace around operators and after commas.
No whitespace after opening and before closing brackets.
Two blank lines between top-level functions and classes.
One blank line between class methods.
Indentation fixes.
CacheState.load_cache_state_from_disk() is long and redundant.
CacheState.store_on_disk() is fine (and load_from_disk() would also be
fine) but in the absence of any other store/load sources, the
from_disk() part is also unnecessary.
CacheState.store() and CacheState.load() should be enough.
This commit adds a very in-depth test for multi-tenancy. It queues several
composes and then runs all jobs belonging to them while checking that
they are run by the correct tenant.
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
Yeah, we have TestRoute. It has one issue though: It doesn't have support
for passing a custom context. One option is to extend the method with yet
argument but since it already has 9 (!!!), this seems like a huge mess.
Therefore, I decided to invent a new small library for writing API tests.
It uses structs heavily which means that adding features to it doesn't
mean changing 100 lines of code (like adding another arg to TestRoute does).
I hope that we can start using this library more in our tests as it was
designed to be very flexible and powerfule.
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
This commit implements multi-tenancy. A tenant is defined based on a value
from JWT claims. The key of this value must be specified in the configuration
file. This allows us to pick different values when using multiple SSOs.
Let me explain more in depth how this works:
Cloud API gets a new compose request. Firstly, it extracts a tenant name from
JWT claims. The considered claims are configured as an array in
cloud_api.jwt.tenant_provider_fields in composer's config file. The channel
name for all jobs belonging to this compose is created by `"org-" + tenant`.
Why is the channel prefixed by "org-"? To give us options in the future. I can
imagine the request having a channel override. This basically means that
multiple tenants can share a channel. A real use-case for this is multiple
Fedora projects sharing one pool of workers.
Why this commit adds a whole new cloud_api section to the config? Because the
current config is a mess and we should stop adding new stuff into the koji
section. As the Koji API is basically deprecated, we will need to remove it
soon nevertheless.
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
This is quite a hack. Basically, the mock provider copies the offline token
into rh-org-id JWT claim. This allows us to test multi-tenancy.
Signed-off-by: Ondřej Budai <ondrej@budai.cz>