API
---
Allow the user to pass the CA public certificate or skip SSL verification
AWSCloud
--------
Restore the old version of newAwsFromCreds for accessing AWS
Create a new method newAwsFromCredsWithEndpoint for Generic S3, which sets the endpoint and optionally overrides the CA bundle or skips SSL certificate verification
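For illustration, a minimal sketch of the session setup such a method might perform, assuming aws-sdk-go v1; the helper name genericS3Session, the package name, and the parameter list are illustrative, not the actual signature in the awscloud package.

```go
package awscloud

import (
	"crypto/tls"
	"net/http"
	"os"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/credentials"
	"github.com/aws/aws-sdk-go/aws/session"
)

// genericS3Session sketches the session setup a method like
// newAwsFromCredsWithEndpoint needs: a fixed endpoint, path-style addressing,
// and either a custom CA bundle or no certificate verification at all.
func genericS3Session(creds *credentials.Credentials, region, endpoint, caBundle string, skipSSLVerification bool) (*session.Session, error) {
	cfg := aws.Config{
		Credentials:      creds,
		Region:           aws.String(region),
		Endpoint:         aws.String(endpoint),
		S3ForcePathStyle: aws.Bool(true), // generic S3 services are usually path-addressed
	}

	if skipSSLVerification {
		// Accept any certificate the endpoint presents.
		cfg.HTTPClient = &http.Client{
			Transport: &http.Transport{
				TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
			},
		}
	}

	opts := session.Options{Config: cfg}
	if caBundle != "" {
		ca, err := os.Open(caBundle)
		if err != nil {
			return nil, err
		}
		defer ca.Close()
		opts.CustomCABundle = ca // trust this CA instead of the system bundle
	}

	return session.NewSessionWithOptions(opts)
}
```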
jobimpl-osbuild
---------------
Update the osbuild job implementation with the new CA bundle and SSL verification parameters
osbuild-upload-generic-s3
-------------------------
Add ca-bundle and skip-ssl-verification flags
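A rough sketch of how the two new flags could be wired up in the Go tool; the variable names are assumptions, and the real command defines further flags (endpoint, bucket, region, ...) alongside these.

```go
package main

import (
	"flag"
	"fmt"
)

var (
	// Path to a CA bundle used to verify the generic S3 endpoint's certificate.
	caBundle = flag.String("ca-bundle", "", "path to the CA bundle of the service")
	// Disable TLS certificate verification (useful for self-signed test setups).
	skipSSLVerification = flag.Bool("skip-ssl-verification", false, "skip verification of the server's SSL certificate")
)

func main() {
	flag.Parse()
	fmt.Println(*caBundle, *skipSSLVerification)
	// ... the real tool would pass these values on to the generic S3 uploader.
}
```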
tests
-----
Split the tests into HTTP, HTTPS with a certificate, and HTTPS with the certificate check skipped
Create a new base test for S3 over HTTPS, covering secure and insecure connections
Move the generic S3 test to tools so it can be reused for secure and insecure connections
All S3 tests now use the AWS CLI
Update the libvirt test to be able to download over HTTPS
Update the RPM spec
Kill container with sudo
When a test script fails in CI, it's often difficult to pinpoint the
exact line in the log where the script failed and where the cleanup()
function (trapped on EXIT) begins.
Adding a prominent line (with greenprint where available) at the start
of the cleanup function will make reading logs of failed jobs a lot
easier.
Something odd is happening with the package check and it keeps failing
mysteriously even though the package is clearly in the list.
Changing the verification method to extract `passwd` and `packages` from
the image info file into separate files and grepping those seems to
work.
Reasons for this change:
- Mixed versions of composer and worker aren't a realistic use-case for
the weldr API (on prem) but we do run mixed versions in hosted IB, so
this test is closer to real world scenarios.
- The cloud API runs depsolve jobs in the worker, whereas the weldr API
runs them in composer. By testing the cloud API we also test the
backwards compatibility of the depsolve job.
The change requires osbuild-worker v51 or newer to be able to handle
depsolve and manifest jobs on the worker as well as depsolve chains.
Add support for building images for the Azure marketplace: add a
new image type "azure-rhui" that can be used to build images
tailored to the marketplace.
This code is based on the corresponding image type in 8.6.
NB: does not have systemd-resolved (following RHEL 9 defaults)
We can really only have one. The one that was used for generating
the manifests is kept and the other one is removed (even though it has
newer repositories).
It seems the rhel-91 qcow2 customize images are out of sync because commit
2beb707 removed the core group from `format-request-map.json` and
some of these manifests were generated between that commit and the
one that added it back, 1ff36bce9.
In these graphs p99 isn't very important. If 1% of jobs are slow that's
fine. The p50 and p95 slices are the important ones, so reorder and
recolor the duration graphs to reflect this.
The interval dictates the granularity of the graphs. As the interval
decreases, spikes and dips become more pronounced. 28 days as an
interval doesn't actually show much, so reduce it to 6h by default,
which is a happy medium.
The scheduled cloud cleaner is skipping the default storage account of a
resource group, on the assumption that these images get removed anyway. There
can be situations where these images are not removed and end up forgotten
there. Remove this skip condition so the scheduled cloud cleaner also checks
this storage account.
The `api.sh` test case uses a random GCE zone from a random GCE
region whose name starts with the `GCP_REGION` CI environment variable.
Since the used region name is not known to the `cloud-cleaner`, it has
to iterate over all potential GCE regions and their zones. We cannot
simply filter a list of instances by the VM instance name, because any
`instances` API call requires a zone name to be provided.
Add a new internal `cloud/gcp` package method to list existing GCE
regions based on a provided filter.
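A minimal sketch of what such a region-listing helper could look like, assuming the google.golang.org/api/compute/v1 client; the function name and signature are illustrative, not the actual cloud/gcp API.

```go
package gcp

import (
	"context"

	compute "google.golang.org/api/compute/v1"
)

// listRegionNames returns the names of all GCE regions in the project that
// match the given filter expression on the region name.
func listRegionNames(ctx context.Context, project, filter string) ([]string, error) {
	svc, err := compute.NewService(ctx)
	if err != nil {
		return nil, err
	}

	call := svc.Regions.List(project)
	if filter != "" {
		call = call.Filter(filter)
	}

	var names []string
	err = call.Pages(ctx, func(page *compute.RegionList) error {
		for _, region := range page.Items {
			names = append(names, region.Name)
		}
		return nil
	})
	return names, err
}
```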
We currently use a single GCP Compute region when spinning up VMs using
the imported GCE image. As a result, we are often hitting the
'IN_USE_ADDRESSES' quota limit when there are multiple CI jobs running.
Google does not allow us to increase the quota limit any more.
Change the GCP test cases to use the CI `GCP_REGION` variable to list
all GCE regions with available quota and pick a random one from the
list. The `GCP_REGION` value is used as the region name prefix when
filtering available regions. This means that if you specify an exact GCE
region, such as `us-west1`, you'll always get the same region, but if a
GCP multi-region is used, such as `us`, then a random region prefixed
with 'us' will be used.
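The selection logic described above could be sketched roughly as follows, building on a region list like the one returned by the helper sketched earlier; the IN_USE_ADDRESSES metric name comes from the quota error above, everything else is illustrative.

```go
package gcp

import (
	"fmt"
	"math/rand"

	compute "google.golang.org/api/compute/v1"
)

// pickRegionWithQuota picks a random region, from those already filtered by
// the GCP_REGION prefix, that still has headroom in its IN_USE_ADDRESSES quota.
func pickRegionWithQuota(regions []*compute.Region) (string, error) {
	var candidates []string
	for _, r := range regions {
		for _, q := range r.Quotas {
			if q.Metric == "IN_USE_ADDRESSES" && q.Usage < q.Limit {
				candidates = append(candidates, r.Name)
				break
			}
		}
	}
	if len(candidates) == 0 {
		return "", fmt.Errorf("no GCE region with free IN_USE_ADDRESSES quota")
	}
	return candidates[rand.Intn(len(candidates))], nil
}
```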
We want to run all of the scripts in after_script even if some of them
fail. In AWS the images contain RHUI repos, and since we don't use them on
GA RHEL, ci_details.sh fails there and cloud_cleaner does not run.
The `org.osbuild.udev.rules` stage creates custom udev rules files.
This is a full implementation of the stage and includes information
about valid operators and keys.
A small test suite covering the basic functionality and validation is
included.
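As a rough illustration of the shape of the stage options (the actual Go types and JSON schema of the stage may differ), a rule is essentially a sequence of key/operator/value triples:

```go
package osbuild

// UdevRulesStageOptions is an illustrative sketch of the options for the
// org.osbuild.udev.rules stage: the target rules file and the rules to write.
type UdevRulesStageOptions struct {
	Filename string     `json:"filename"` // e.g. "/usr/lib/udev/rules.d/40-example.rules"
	Rules    []UdevRule `json:"rules"`
}

// UdevRule is one udev rule, serialised as, for example:
//   SUBSYSTEM=="net", ACTION=="add", ENV{NM_UNMANAGED}="1"
type UdevRule []UdevKV

// UdevKV is a single match or assignment within a rule.
type UdevKV struct {
	Key string `json:"key"` // e.g. SUBSYSTEM, KERNEL, ACTION, ENV{...}
	Op  string `json:"op"`  // match: "==", "!="; assignment: "=", "+=", "-=", ":="
	Val string `json:"val"`
}
```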
Validate incoming requests with openapi3 (see the validation sketch below).
Remove the unsupported uuid format from the openapi spec. Similarly, change
url to uri, as uri is a supported format and url is not.
Co-authored-by: Ondřej Budai <obudai@redhat.com>
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
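A minimal sketch of openapi3-based request validation using a recent version of the kin-openapi library; the actual wiring in composer (router choice, middleware, error handling) may differ.

```go
package v2

import (
	"net/http"

	"github.com/getkin/kin-openapi/openapi3"
	"github.com/getkin/kin-openapi/openapi3filter"
	"github.com/getkin/kin-openapi/routers/gorillamux"
)

// validateRequest checks an incoming request against the loaded OpenAPI spec:
// the route must exist and path, query and body must match their schemas.
func validateRequest(spec *openapi3.T, r *http.Request) error {
	router, err := gorillamux.NewRouter(spec)
	if err != nil {
		return err
	}

	route, pathParams, err := router.FindRoute(r)
	if err != nil {
		return err
	}

	return openapi3filter.ValidateRequest(r.Context(), &openapi3filter.RequestValidationInput{
		Request:    r,
		PathParams: pathParams,
		Route:      route,
	})
}
```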
oneOf means that the body is valid against exactly ONE schema. There's an
issue with the AWS EC2 upload options though: they require the region and
share_with_accounts fields. Such a request is also a valid AWS S3 upload
(which only requires region). This means that AWS EC2 upload options will
always be valid against two schemas, which violates the oneOf rule.
Let's switch to anyOf and explain this in the openAPI spec.
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
It was never required, never used. I honestly think that this was a copy-paste
error, I don't see any reason why a user would have an object reference.
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
We are interested in the time it takes from when a job could be dequeued
until it actually is, but if a job has dependencies that are not yet
finished, it cannot be dequeued.
Change the logic to measure the time since the last dependency was
dequeued rather than when the job was queued.
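Conceptually, with hypothetical field names (the real job queue schema differs), the measurement changes to something like this sketch:

```go
package metrics

import "time"

// job is a simplified, hypothetical view of a queued job.
type job struct {
	QueuedAt   time.Time
	DequeuedAt time.Time
}

// pendingDuration measures how long a job waited while it could actually have
// been picked up: from the moment its last dependency was dequeued, falling
// back to the job's own queue time when it has no dependencies.
func pendingDuration(j job, deps []job) time.Duration {
	start := j.QueuedAt
	for _, d := range deps {
		if d.DequeuedAt.After(start) {
			start = d.DequeuedAt
		}
	}
	return j.DequeuedAt.Sub(start)
}
```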
The purpose of this metric is to have an alert fire in case we have too
few workers processing jobs.
Similar to DRY_RUN, these values should be overwritten in app-interface
per namespace. At some point the maintenance specific to the CRC tenant
(aws and gcp maintenance) should run in the workers namespace rather
than the composer namespace. Granularity is needed for this.
Stage and production share the GCP account. To avoid trying to delete
each GCP image twice, the maintenance script needs the ability to
selectively disable certain parts based on the config.
Building a `tar` image for `s390x` on RHEL-84 ends with a panic:
"s390x image must have a partition table, this is a programming error"
A tar image should not need a partition table, so this error does not
make sense.