Attempt to clarify the structure of our tests. Each test case is now
encapsulated in a script in `test/cases`. Each of these scripts should
be runnable on a pristine machine and be independent of each other. It
is up to the test-orchestractor to decide if they should be run
consequtively instance, or in parallel on separate instances. Each
script can execute several tests and call whatever helper binaries
is desired. However, each case should be assumed to always run as one.
The issue comes from the fact that the PR introducing it was very old
and meanwhile the variable used for image name creation has changed.
This patch makes sure both functions are the same.
as @teg suggests in
https://github.com/osbuild/osbuild-composer/issues/888#issuecomment-662942314
this is ON by default so we can be alerted for missing cloud
credentials in CI!
If you want to disable it then -fail-local-boot=off
Note: special case the qemu image types which always need to be
booted locally.
CHANGE_ID is set for PRs, which is why it worked during the review
process, but in the master branch it is not set. BRANCH_NAME is set for
both PRs and master branch. In case of PRs it is in form `PR-<pull
request number>`.
When using random names for artifacts like AWS snapshots, or Azure
images, it becomes hard to clean them up in case of CI failure. See this
issue for more details:
https://github.com/osbuild/osbuild-composer/issues/942
This PR introduces predictable names so that we can easily determine
which artifact belongs to which PR and therefore we can decide to wipe
all resources that are not needed any more.
More specifically only those that are needed in
/cmd/osbuild-image/tests.
This patch can be merged with the previous one if we want to make sure
every commit can be built, but I'm going to keep it like this for now so
that we can easily see the changes.
explicitly specify the cluster and the default resource pool
when importing b/c the import process creates a temporary VM,
which requires a ResourcePool to provision. Same thing when
provisioning a VM.
The tests are no longer run on Travis, therefore we don't need the special
setup to run them there.
This change should also fix#929 that is probably caused due to osbuild
executed in a weird way.
Fixes#929
This is not shipped in RHEL, so use the library directly to query the IP
address. This is a massive hack, but let us revisit this after the next
release.
Signed-off-by: Tom Gundersen <teg@jklm.no>
We need the same RPMs to work equally well on a host running a beta
release (pulling beta content) as on a machine running GA (pulling GA
content). Detect this at run-time and point at the right repository.
Testing this is a bit hairy as we are building 8.3 images, but obviously
there is currently no 8.3 content at the GA URLs.
Signed-off-by: Tom Gundersen <teg@jklm.no>
`systemd-tmpfiles` will helpfully delete "old" files in /var/tmp at regular
intervals. The files installed from rpm has the timestamps from when they
were packaged, which causes some to be cleaned up when the timer triggers.
The first timer triggers 15 minutes after boot, so we were sometimes hit
by this when our CI was under load.
Fixes#839 and #862.
Signed-off-by: Tom Gundersen <teg@jklm.no>
Create a GetImageInfoCommand analogous to GetOsbuildCommand that
will adjust the PYTHONPATH for image-info on travis so that the
osbuild python module is accessible.
Previously, all resouces were created with a certain tag. When the cleanup
phase came, the Resources - List route[1] was used to get all resources with
the tag. Then, they were deleted in the right order.
Sadly, the Resources - List API has issues with listing disks. Sometimes,
it returns the virtual disks 15 minutes after they were created. As the
result, the disks have been left behind quite often and our bill was higher
than necessary.
This commit uses a different method - the Go code now knows all resource names
(see the previous commit), so it can delete all resources without listing them
using the "broken" API route.
[1]: https://docs.microsoft.com/en-us/rest/api/resources/resources/list
Prior this commit the resource names were generated in the deployment
template, so the Go code actually didn't know them. This commit generates
all names in the Go code, so they can be used in the future commits.
When osbuild fails, osbuild-composer's image tests print the output as
ugly JSON and that makes it difficult to determine why osbuild failed.
Instead, print the json in pretty format. 🦋Fixes#782.
Signed-off-by: Major Hayden <major@redhat.com>
On some environments (like RHEL gating) there's no virtualization available.
This commit adds -disable-local-boot argument to osbuild-image-tests. When
this argument is present, the local booting is skipped. This doesn't affect
the cloud booting, the test binary still tries to do that. If no credentials
are available, the fall back to local booting will be skipped if
-disable-local-boot is given.
This makes it easier to use the test binary with the `-run` argument.
Instead of the full path:
-test.run TestImages//usr/share/tests/osbuild-composer/cases/rhel_8.2-x86_64-openstack-boot.json
this only requires the actual name:
-test.run TestImages/rhel_8.2-x86_64-openstack-boot.json
When edd7b37ea added `--output-directory` to the invocation of osbuild,
it also removed `--store`.
This was a mistake: osbuild's default store is `.osbuild`, which is not
what we want. Restore the old behavior of passing a temporary directory,
but use the same for each test run.
Treating stdout and stderr separately makes it hard to match what
happened when. It's also easy to miss when `-v` is passed to the test
binary.
Print the output to stdout when osbuild fails, because the test
framework we're using does not print errors if they're too large.
Also, don't special-case exec.ExitError. Output might be useful in any
case.
Schutzbot currently runs all the same tests an all arches/distro
combinations. This will not work as we introduce image types only on
some distro/arches.
In the future we should make the image-test binary more clever, so
Schutzbot won't have to tell it which cases to run at all. For now,
simply don't fail if the specified test-case does not exist.
Signed-off-by: Tom Gundersen <teg@jklm.no>
RHEL 8 doesn't have qemu-system-x86_64 in PATH. Instead, it is in
/usr/libexec/qemu-kvm. This commit fixes that by detecting the host
distro and choosing the right binary.
-cpu host cannot be used with anything else than kvm. This commit removes
hvf and tcg because it doesn't make any sense with -cpu host.
If this causes some issues for anyone, we can revert back and remove -cpu
host.
Running qemu with -accel accel= results in the following error:
qemu-system-x86_64: -accel accel=kvm:hvf:tcg: Don't use ':' with -accel,
use -M accel=... for now instead
Qemu 4.2 deprecated the -accel accel= argument. When the arg is passed in,
qemu exists status code of 1.
This commit changes the qemu command to use the recommended way of specifying
the acceleration options.
See:
3d5e90a50b
Previously, vhd images were tested using QEMU. This commit changes that to
boot them in the actual Azure infrastructure.
Azure VMs have quite a lot of dependencies - a network interface, a virtual
network, a network security group, a public ip address and a disk. Azure CLI
and Azure Portal handle the creation of all these resources internally.
However, when using the API, the caller is responsible to create all these
resources before creating an actual VM.
To handle the creation of all the resources in the right order, a deployment
is used. A deployment is a set of resources defined in a JSON document.
It can optionally take parameters to customize each deployment. After the
deployment is finished, the VM is up and ready to be tested using SSH.
Sadly, the deployments are a bit hard to clean-up. One would expect that
deleting a deployment removes all the deployed resources. However, it doesn't
work this way and therefore it's needed to clean up all resources "manually".
For this reason, our deployment sets a unique tag on all the resources created
by the deployment. After this test is finished, the API is queried for all
the resources with the tag and then, they're deleted in the right order.
The cmd/osbuild-image-tests package is becoming bigger than I would like to.
It will be nice to split it to some smaller pieces at some point.
This commit does the first step - splits off the first subpackage containing
all the constants.
This commits enables the parallelism for the image tests. However, there's
a catch. Osbuild cannot be reliably run in parallel, so the code uses
a mutex to ensure there's always only one osbuild instance for now. Even
with this limitation, there's a significant speed-up of the tests:
Prior this commit, the image tests run in 40 minutes on Travis. After this
commit, the time is reduced to 32 minutes.
The speed-up will have an even bigger effect when more cloud-upload tests are
added to the test suite.
Speed up the boot tests by allowing qemu to use all of the available
CPUs on the system. Our CI VMs have at least 2 CPUs and this shortens
the time required for a boot test.
Signed-off-by: Major Hayden <major@redhat.com>
The osbuild-image-tests don't currently support boot test for any
alternative architecture because the qemu-system-x86_64 command is
hardcoded. This patch introduces a branch specific to aarch64, but
without a KVM support as I was unable to make it run in Beaker, which is
currently the only offering we have with ARM machines. As a workaround
the boot tests will be skipped if kvm kernel module is not found and only
image-info tests will run.
Prior this commit the ami images were tested locally using qemu. This does
not reflect at all how they're used in practice. This commit introduces
the support for running them in the actual AWS. Yay!
The structure of code reflects that we want to switch to osbuild-composer
to build the images soon.
It's not very clear that the constants are indeed constants. This commit moves
them to a new struct. This way it should be more clear that those values are
constants.
Soon, images will be run non-locally (AWS, Azure). For the remote ones it's
potentially dangerous to use the publicly available key-pair. This change
prepares the codebase for specifying different keys than the pre-generated
one.
Soon, images will be run non-locally (AWS, Azure). For those boot types
there's no need to have an unshared network namespace. This commit prepares
the code for that.