Commit graph

278 commits

Author SHA1 Message Date
Ondřej Budai
5edac4ee58 tools/tree-diff: strip NULL character from selinux xattr 2019-10-08 21:39:35 +02:00
Ondřej Budai
fd2a20d247 tools/tree-diff: Use hash for content diffs
We need to know the exact difference of modified files in both trees.
Outputting the whole files into a diff might make a huge diff file,
therefore only their hashes are written.
2019-10-08 21:39:35 +02:00
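The hash-instead-of-content idea in fd2a20d247 can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: SHA-256 and the helper names `content_hash` and `content_diff` are assumptions.

```python
import hashlib


def content_hash(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def content_diff(path_a, path_b):
    """Return [hash_a, hash_b] if the two files differ, else None.

    Only the hashes are reported, so the diff stays small even when
    the files themselves are large.
    """
    a, b = content_hash(path_a), content_hash(path_b)
    return None if a == b else [a, b]
```

Reading in chunks keeps memory use flat for large files such as kernel images.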
Ondřej Budai
9fd9270c53 tools/tree-diff: List all dirs and files inside added/deleted dirs
In my opinion it is better to dump everything and then filter out
unneeded entries elsewhere.
2019-10-08 21:39:35 +02:00
Martin Sehnoutka
0862722b03 Introduce cloud-base sample
It is similar to the official Fedora cloud base image except for a few
minor differences. The reason for this divergence is that we don't want
to include all hacks that are currently present in the official
kickstart file. You can see it here as a reference:
https://pagure.io/fedora-kickstarts/blob/master/f/fedora-cloud-base.ks#_149
2019-10-07 21:25:18 +02:00
Ondřej Budai
e12f55aa21 tests: print stdout from osbuild when it fails 2019-10-07 10:10:51 +02:00
Lars Karlitski
9fbe80722b assemblers: add org.osbuild.rawfs
This assembler outputs an image file which only contains the file
system.
2019-10-07 10:10:51 +02:00
Lars Karlitski
7375e4f5dd tree-diff: change shebang to respect $PATH 2019-10-07 10:10:51 +02:00
Lars Karlitski
cb2f383601 remoteloop: make LoopClient.device a context manager 2019-10-07 10:10:51 +02:00
Lars Karlitski
0dd60b3abf remoteloop: pass filename to create_device
This makes LoopClient simpler to use in the common case.
2019-10-07 10:10:51 +02:00
Lars Karlitski
356f62058f remoteloop: remove dir_fd argument in create_device
If dir_fd wasn't passed, create_device() opened it to `/dev` and forgot
about closing it. To fix this, it would have to gain logic to only close
the fd if it wasn't passed in.

Side-step the problem by removing dir_fd, since nothing is using it
right now. We can add it back if something needs it.
2019-10-07 10:10:51 +02:00
Lars Karlitski
3d3ffda5d8 remoteloop: don't close a socket it didn't open
Closing the socket is the responsibility of whoever opened it.

Fix this in the only user (qemu assembler) by using socket() in a `with`
block, which closes the socket on exit.
2019-10-07 10:10:51 +02:00
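The ownership rule from 3d3ffda5d8 (whoever opens the socket closes it) might look like this in a caller. This is a sketch only: the Unix datagram socket type and the function name `talk` are assumptions, not taken from the qemu assembler.

```python
import socket


def talk(path, payload):
    """Open a socket, send one message, and close the socket on exit."""
    # The caller opens the socket, so the caller is responsible for closing
    # it. The `with` block does that on exit, even if send() raises.
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.connect(path)
        sock.send(payload)
```

Any helper that receives `sock` inside the block can use it freely but should not call `close()` on it.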
Lars Karlitski
c1dca86505 samples: remove base-from-yum.json
build-from-yum.json is the one that's being used for testing on Ubuntu.
Remove base-from-yum.json, because it's confusing to have two similarly
named pipelines like this.
2019-10-07 00:17:43 +02:00
Lars Karlitski
ff56cb7f6a test: introduce OSBUILD_TEST_STORE
The osbuildtest.TestCase class creates a fresh store for each test,
because tests should run independent of each other.

This can lead to long waiting times while developing a new test case.

Allow overriding the store used with OSBUILD_TEST_STORE. This should
never be used where tests are actually run. It is a development-only
feature.
2019-10-07 00:06:23 +02:00
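The override described in ff56cb7f6a could be implemented along these lines. Only the `OSBUILD_TEST_STORE` variable name comes from the commit message; the helper `open_store` and its behaviour are a hypothetical sketch.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def open_store(environ=None):
    """Yield a store directory for one test.

    If OSBUILD_TEST_STORE is set, reuse that directory and leave it in
    place. Otherwise create a fresh temporary directory and delete it
    when the test is done.
    """
    environ = os.environ if environ is None else environ
    override = environ.get("OSBUILD_TEST_STORE")
    if override:
        os.makedirs(override, exist_ok=True)
        yield override
    else:
        with tempfile.TemporaryDirectory() as store:
            yield store
```

Because a reused store lets one test see objects built by another, the override belongs in local development only, as the commit message says.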
Martin Sehnoutka
23edc18bed sum up the procedure necessary for releasing new version 2019-10-04 22:27:06 +02:00
Lars Karlitski
434a01602b 3 2019-10-04 11:13:21 +02:00
Martin Sehnoutka
cd49e2407c replace _libdir with _prefix/lib
_libdir is platform dependent, but that is not what we want because we
would need additional runtime logic to handle platforms. This patch
overrides the default location.
2019-10-03 15:35:50 +02:00
Ondřej Budai
f9b2da9ad3 osbuild: print tree id and output id also in non-json mode 2019-10-03 14:50:29 +02:00
Lars Karlitski
3e57f13380 stages/dnf: exclude-packages → exclude_packages 2019-10-03 12:53:01 +02:00
Tom Gundersen
72c3157162 assemblers/qemu: replace grub2-install
Background:

grub2 works in three stages:
 - The first stage is found in the first 440 bytes of the master
   boot record, and its only purpose is to load and execute the
   second stage. This stage is static, and just copied from the rpm
   without modification.
 - The second stage is found in the gap between the MBR and the
   first partition, and may be up to 31kB in size. This stage is
   specific to the host and must contain the instructions for
   finding the right file system and subdirectory for the grub2
   config and modules on the host, as well as the modules needed
   to do this.
 - The third stage is found in the `normal` module, which loads
   grub2.conf, which in turn may load more modules and perform
   arbitrary instructions.

Problem:

grub2-install is responsible for installing all these stages on the
target image. This goes against our design, as modifications outside
the filesystem should happen in the assembler, but modifications to
the filesystem should happen in a stage. In particular, we don't
want the contents of the image to differ in any way from the output
tree that is stored in our content store (the output of our last
stage). This causes a practical problem at the moment, as our
selinux stage is run before the assembler, and as such the grub
modules do not get selinux labels applied.

It turns out that we could split grub2-install in two as we want,
by passing `--no-bootsector` to it to install only the modules,
and copy/generate the two first stages as files under /boot and
then run `grub2-bios-setup` to write the stages from /boot into
the image where they belong.

Regrettably, this does not work as both `grub2-install` and
`grub2-bios-setup` introspect the system and block devices they
are being run on to generate the right configuration. This is not
what we want, as we would like to specify the config explicitly
and run them independently of the target image. The specific bug
we get in both cases is that the canonical path containing our
object store cannot be found.

Before osbuild this was not a problem, as other installers would
install and assemble everything directly in the target image as a
loopback device. Something we explicitly do not want to do.

Solution:

This patch essentially reimplements grub2-install, or rather the
parts of it that we need. One change in behavior from the upstream
tool is that we no longer write the level one and level two boot
loaders to /boot before moving them into place, but just write them
directly where they belong (so they do not end up on the
filesystem).

The parts that copy files into /boot are now in the grub2 installer
and the parts that write the level one/two bootloaders are in the
qemu assembler.

This achieves a few principles I think we should always adhere to:
 - never run tools from the target image (no chroot)
 - don't read/copy files from the target image that were written
   by other stages. We already try to avoid sharing state, and
   by treating the image as write-only, we avoid accidentally
   sharing state through the target tree.

Based-on-suggestions-from: Javier Martinez Canillas <javierm@redhat.com>
With-god-like-debugging-and-fixes-by: Lars Karlitski <lubreni@redhat.com>
Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-10-02 15:10:37 +02:00
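The on-disk layout described in the background section of 72c3157162 can be illustrated with a short sketch. It uses the sizes quoted in the message (440 bytes, 31kB) and assumes 512-byte sectors with the second stage starting at sector 1. The function name is hypothetical, and the location patching that a real installer performs inside both stages is deliberately left out.

```python
SECTOR_SIZE = 512            # assumption: classic 512-byte sectors
BOOT_CODE_SIZE = 440         # first-stage size quoted in the commit message
MAX_CORE_SIZE = 31 * 1024    # second-stage limit quoted in the commit message


def write_bootloader(image, stage1, stage2):
    """Write both boot loader stages into `image`.

    `image` is a binary file object opened for reading and writing.
    Nothing is written to any file system inside the image.
    """
    if len(stage1) < BOOT_CODE_SIZE:
        raise ValueError("first stage is too short")
    if len(stage2) > MAX_CORE_SIZE:
        raise ValueError("second stage does not fit in the gap")

    # Stage one: only the first 440 bytes, so the rest of the MBR,
    # including the partition table, is left untouched.
    image.seek(0)
    image.write(stage1[:BOOT_CODE_SIZE])

    # Stage two: in the gap between the MBR and the first partition.
    image.seek(SECTOR_SIZE)
    image.write(stage2)
```

Writing straight into the image, rather than via files under /boot, is what keeps the stored tree and the image contents identical.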
Tom Gundersen
816d111779 assemblers/qemu/loop: open backing file O_DIRECT
This should improve performance and save memory as we don't need two
page caches.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-10-02 15:10:37 +02:00
Tom Gundersen
f470c3f3a3 assemblers/qemu: fix the partition UUID in the pipeline
Otherwise, sfdisk would pick one at random. We want our images to be
reproducible to the extent possible, so we must move all randomness
out of the assemblers when we can.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-10-02 15:10:37 +02:00
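One way to take the randomness out of partitioning, as f470c3f3a3 describes, is to pass the identifier to sfdisk through the `label-id` header of its script input. The sketch below assumes a DOS label with a single bootable Linux partition. The function names and the partition layout are illustrative, not the assembler's actual code.

```python
import subprocess


def sfdisk_script(label_id, start_sector=2048):
    """Build sfdisk input for one bootable Linux partition on a DOS label."""
    return (
        "label: dos\n"
        f"label-id: {label_id}\n"
        f"start={start_sector}, type=83, bootable\n"
    )


def partition(image_path, label_id):
    """Partition `image_path` using an identifier chosen by the caller."""
    subprocess.run(
        ["sfdisk", image_path],
        input=sfdisk_script(label_id),
        encoding="utf-8",
        check=True,
    )
```

With the identifier supplied by the pipeline, two runs of the same pipeline produce the same partition table.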
Lars Karlitski
0be34c8bcc test_boot: show stderr of qemu process
We're only interested in capturing stdout. It might be useful to see
what qemu prints on stderr.
2019-10-02 15:10:37 +02:00
Martin Sehnoutka
fa8de2f6d8 move files from /usr/libexec to /usr/lib
There is no real difference in these two directories. Composer already
uses /usr/lib, so OSBuild should use the same as well.
2019-10-02 15:01:01 +02:00
Tom Gundersen
8f9dd5ec7d stages/dnf: support --exclude
This allows given packages to be excluded from the transaction. This
is useful if you want to install a group with certain exceptions.

A common thing to do in kickstart files is:
```
rm -f /boot/*-rescue*
```

By instead excluding the dracut-rescue-config package we end up
with:
```
"deleted_files": [
  "/etc/kernel/postinst.d",
  "/usr/lib/dracut/dracut.conf.d/02-rescue.conf",
  "/usr/lib/kernel/install.d/51-dracut-rescue.install",
  "/boot/initramfs-0-rescue-ffffffffffffffffffffffffffffffff.img",
  "/boot/vmlinuz-0-rescue-ffffffffffffffffffffffffffffffff"
],
```

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-10-02 13:34:14 +02:00
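The exclusion support from 8f9dd5ec7d maps naturally onto dnf's `--exclude` option. A minimal sketch of building such a command line follows. The function name and argument order are assumptions, and options such as the install root are omitted.

```python
def dnf_install_args(packages, exclude_packages=()):
    """Build a dnf command line that installs `packages` with exclusions."""
    args = ["dnf", "--assumeyes"]
    # One --exclude option per package that must stay out of the transaction.
    args += [f"--exclude={name}" for name in exclude_packages]
    args += ["install"]
    args += list(packages)
    return args
```

For example, `dnf_install_args(["@Core"], ["dracut-rescue-config"])` installs the group while keeping the rescue configuration package out.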
Ondřej Budai
adf5989de2 osbuild/pipeline: Fix crashes when running multiple builds at once
Storytime! I tried to run multiple osbuilds at once. It failed when
unmounting the buildtree. Weird. It turned out the buildtree was not
there anymore when osbuild tried to unmount it. But who unmounted it?

We need to deep dive into mount-types.
Nowadays, the / directory is shared-mounted by systemd. See:
https://serverfault.com/questions/868682/implications-of-mount-make-private
This has interesting implications, see the following example:

we start osbuild1 with /var/tmp/os1 as its store
osbuild1 creates /var/tmp/os1/tmp
osbuild1 bind-mounts / onto /var/tmp/os1/tmp

we start osbuild2 with /var/tmp/os2 as its store
osbuild2 creates /var/tmp/os2/tmp
osbuild2 bind-mounts / onto /var/tmp/os2/tmp

Now, the shared-mounting goes into effect:
The second mount-event gets propagated into the first mount, where it
creates another mount, so we get something like this:
/var/tmp/os1/tmp/var/tmp/os2/tmp

But this is just a start! Imagine running three osbuilds at once.
The event would get propagated to those 3 mounts created by two
osbuilds, creating 3 extra mounts, 7 in total.

It turns out this mounting strategy creates an *exponential number* of
mounts. Crazy, right?

This commit mounts the root inside build root using private bind, which
doesn't propagate bind-events. This solves the problem with the
exponential growth.

But the original problem was different, mount points were disappearing.
So how does this fix solve the problem?

Honestly, I don't know. Something with mount-event propagation is
probably responsible, but I cannot imagine how it is actually affecting
the unbinding.
2019-10-02 06:20:05 +02:00
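The mount counts quoted in adf5989de2 (3 mounts for two builds, 7 for three) follow from a simple recurrence. The toy model below only reproduces that arithmetic and does not touch real mounts. It assumes every new bind mount of a shared / adds one mount of its own plus one copy inside each mount that already exists.

```python
def total_mounts(builds):
    """Count mounts after `builds` concurrent bind mounts of a shared /."""
    total = 0
    for _ in range(builds):
        # One new mount, plus one propagated copy per existing mount.
        total = total + 1 + total
    return total
```

Under this model the total is 2**n - 1 for n builds, which is the exponential growth the message describes. With a private bind mount nothing is propagated and the total is simply n.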
Tom Gundersen
6ed426773f stages/yum: don't name the repositories
See 840bfd580c.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-09-30 23:48:23 +02:00
Lars Karlitski
db6e933cb8 test: add docstring to osbuildtest.TestCase 2019-09-30 08:36:50 +02:00
Lars Karlitski
2205e972d3 test: move temporary store to osbuildtest.TestCase
All tests will need a store. There's no need for each to create a
temporary directory.
2019-09-30 08:36:50 +02:00
Tom Gundersen
34098bf6c6 assembler: rename qcow2 to qemu and add support for more formats
Opt in to supporting the most common ones; if we want to support more,
we can add support as the need arises.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-09-29 19:05:55 +02:00
Tom Gundersen
840bfd580c stages/dnf: don't name the repositories
The names carry no information, and do not affect the produced image.
Generate them instead.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-09-29 19:04:39 +02:00
Tom Gundersen
4ba125e393 pipeline: stop naming pipelines
This key carries no information and is never used anywhere. The json
files are not meant to be human readable, so simply drop this.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-09-29 18:59:45 +02:00
Lars Karlitski
d42b8113fa osbuild: allow reading pipeline from stdin 2019-09-27 18:22:46 +02:00
Ondřej Budai
3a0a480792 stages: Drop the remove-uniqueness stage
We don't need this stage anymore: random-seed has never been in
the tree and machine-id is now handled by the dnf stage.
2019-09-26 23:47:33 +02:00
Ondřej Budai
fd8eb9492f stages/dnf: Remove random seed after dnf run
Some dnf packages introduce a random seed file. If we left it in the tree,
all systems running from the created image would use the same random
seed. This is potentially dangerous, so we remove the generated random
seed from our images.
2019-09-26 23:47:33 +02:00
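The cleanup in fd8eb9492f amounts to deleting one file from the tree after dnf has run. The commit message does not name the file, so the path below (systemd's usual random seed location) and the helper name are assumptions.

```python
import contextlib
import os


def remove_random_seed(tree, relpath="var/lib/systemd/random-seed"):
    """Delete the random seed from `tree`, if a package created one."""
    with contextlib.suppress(FileNotFoundError):
        os.unlink(os.path.join(tree, relpath))
```

Ignoring a missing file keeps the step safe to run whether or not a package actually created a seed.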
Ondřej Budai
cc73fa5d10 stages/dnf: Improve dnf stage reproducibility
Normally, the machine ID is generated randomly when dnf installs the @Core
group. Unfortunately, this hurts the reproducibility of images.

This commit introduces the concept of a fake machine ID. Before the dnf
command is run in the dnf stage, we set the machine ID to a fake one.
This ensures all the scriptlets requiring a machine ID have predictable
outputs.

For example GRUB uses machine-id in file names inside
/boot/loader/entries. With fixed machine ID the file names are always
the same and totally predictable.
2019-09-26 23:47:33 +02:00
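A fixed machine ID as described in cc73fa5d10 can be put in place by writing the file before the package manager runs. This is a sketch under assumptions: the all-`f` value is inferred from the rescue file names quoted elsewhere in this log, and the helper name is hypothetical.

```python
import os

# Assumption: a fixed string of 32 hex characters, chosen once for all builds.
FAKE_MACHINE_ID = "f" * 32


def write_fake_machine_id(tree):
    """Write a fixed machine ID into `tree` before running dnf."""
    etc = os.path.join(tree, "etc")
    os.makedirs(etc, exist_ok=True)
    path = os.path.join(etc, "machine-id")
    with open(path, "w", encoding="ascii") as f:
        f.write(FAKE_MACHINE_ID + "\n")
    return path
```

Scriptlets that embed the ID in file names then produce the same names on every run.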
Lars Karlitski
2ab9ba4e33 test: refactor boot test
Use the unittest module from the standard library. Also, ensure that
separate runs of this test don't share an osbuild store and clean up
after themselves.

With contributions from Ondřej Budai and Tom Gundersen.
2019-09-26 19:20:47 +02:00
Lars Karlitski
83475cc9f4 osbuild: store outputs in objectstore
Treat outputs like we treat trees: store them in the object store. This
simplifies using osbuild and allows returning a cached version if one is
available.

This makes the `--output` parameter redundant. Remove it.
2019-09-25 23:50:50 +02:00
Lars Karlitski
cb173f7d3c objectstore: refer to objects, not trees
Also simplify method names with redundant words:

  has_tree → contains
  get_tree → get
  new_tree → new
2019-09-25 23:50:50 +02:00
Lars Karlitski
9edeb19ebb osbuild: add --json argument
`osbuild --json [ARGS]` will suppress the normal output and print its
result as JSON. For now, it only does this when it returns 0. Otherwise,
it prints the error from the latest stage.

This is useful for other tools to call it and get machine-readable
output.
2019-09-25 23:50:50 +02:00
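A calling tool might consume the `--json` mode from 9edeb19ebb along these lines. The wrapper is generic and hypothetical: it only relies on the behaviour stated in the message, namely JSON on stdout when the exit status is 0. The remaining osbuild arguments are left to the caller.

```python
import json
import subprocess


def run_json(argv):
    """Run a command and return its stdout parsed as JSON.

    Returns None when the command exits with a non-zero status, because
    the output is only expected to be JSON on success.
    """
    result = subprocess.run(argv, stdout=subprocess.PIPE, encoding="utf-8")
    if result.returncode != 0:
        return None
    return json.loads(result.stdout)
```

A caller would pass something like `["osbuild", "--json", ...]` with its own pipeline arguments in place of the ellipsis.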
Lars Karlitski
635b041d84 pipeline: simplify return value of Pipeline.run()
The current implementation was broken, because it didn't return results
from the cached stages. Simply return a boolean now, True for success.
2019-09-25 23:50:50 +02:00
Lars Karlitski
fd37a5d646 pipeline: introduce output id
Introduce an output id, which is the checksum over a full pipeline,
including all stages and the assembler. The id of a pipeline did not
include assemblers before. To be less confusing, rename the existing id
to "tree id".
2019-09-25 23:50:50 +02:00
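The distinction fd37a5d646 draws between the tree id and the output id can be sketched as two checksums over different parts of a pipeline description. The hashing scheme here (SHA-256 over canonical JSON) and the `stages` and `assembler` keys are assumptions for illustration, not the project's actual algorithm.

```python
import hashlib
import json


def _checksum(obj):
    """Hash a JSON-serialisable object in a key-order-independent way."""
    data = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def tree_id(pipeline):
    """Checksum over the stages only."""
    return _checksum(pipeline.get("stages", []))


def output_id(pipeline):
    """Checksum over the stages and the assembler."""
    return _checksum([pipeline.get("stages", []), pipeline.get("assembler")])
```

Two pipelines that differ only in their assembler share a tree id, so the tree can be reused from the store, but they get different output ids.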
Lars Karlitski
f1151a1719 objectstore: clarify ENOTEMPTY handling 2019-09-25 23:50:50 +02:00
Tom Gundersen
56a25adf7f travis: pin the pylint version
We only want to upgrade to a new version of pylint after explicitly
opting in. Otherwise, PRs may randomly start failing just because
pylint was upgraded, but unrelated to the code change.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-09-25 14:30:46 +02:00
Lars Karlitski
7d39b5766d travis: disable python's output buffering
Python buffers stdout and stderr if they're not connected to the
terminal. This messes with the ordering of osbuild and stage output,
because stages produce output more quickly.

Globally disable output buffering on travis.
2019-09-25 13:46:40 +02:00
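The buffering effect described in 7d39b5766d can be reproduced with a child process whose stdout and stderr share one pipe. The commit does not say how buffering was disabled on travis; the sketch below assumes the standard `PYTHONUNBUFFERED` environment variable, and the helper name is hypothetical.

```python
import os
import subprocess


def run_unbuffered(argv):
    """Run a Python child with output buffering disabled.

    stdout and stderr are merged into one pipe, so the captured text
    shows the order in which the child actually wrote its output.
    """
    env = dict(os.environ, PYTHONUNBUFFERED="1")
    return subprocess.run(
        argv,
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        encoding="utf-8",
    )
```

Without the variable, a child that prints to stdout and then to stderr can show the stderr line first, because stdout is held back in a buffer until exit.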
Lars Karlitski
3009b9255f stages: remove org.osbuild.anaconda
We haven't used or tested it.
2019-09-24 20:17:04 +02:00
Lars Karlitski
bbe4129f36 stages/dnf: remove dnf cache directory
The repository used to install the image might be unrelated to any
repository that's ever used on a system running that image.
2019-09-24 20:17:04 +02:00
Lars Karlitski
57c82a00d0 stages/dnf: verify repository checksum
Require "checksum" option for each repository, which contains the
checksum of the `repodata/repomd.xml` file. This file (indirectly)
contains checksums for all packages.

Verify that the metadata dnf downloaded to install packages matches that
checksum. This way, this stage will give an error when a repository
changed between putting together the pipeline and running it.
2019-09-24 20:17:04 +02:00
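The check added in 57c82a00d0 boils down to hashing `repodata/repomd.xml` and comparing the result with the value from the pipeline. A minimal sketch follows, assuming the checksum is written as `<algorithm>:<hex digest>` and that the metadata is available in a local directory. Both are assumptions, as is the function name.

```python
import hashlib
import os


def verify_repomd(repo_dir, expected):
    """Raise ValueError if repodata/repomd.xml does not match `expected`.

    `expected` uses the assumed form "<algorithm>:<hex digest>",
    for example "sha256:ab12...".
    """
    algorithm, _, want = expected.partition(":")
    digest = hashlib.new(algorithm)
    with open(os.path.join(repo_dir, "repodata", "repomd.xml"), "rb") as f:
        digest.update(f.read())
    if digest.hexdigest() != want:
        raise ValueError("repository metadata does not match the pipeline checksum")
```

Because repomd.xml lists checksums for the other metadata files, which in turn list package checksums, pinning this one file pins the whole repository state.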
Lars Karlitski
e23b5a32a2 stages/yum: only write known options to repo file
This is similar to the previous commit for the dnf stage.

Don't pass through arbitrary options. This means that pipeline repo
objects don't have the same options as yum repo files anymore:

1. Hard code repo name to repo id. The name has no influence on the
resulting image and should thus not appear in a pipeline.

2. Set gpgcheck=1 when gpgkey is given. It defaults to false, which
means that all sample and test pipelines didn't verify packages. It
would have failed anyway, because the container doesn't have the key
referenced in /etc. Change all gpgkeys to refer to the key id and import
them manually.

3. Don't allow lists for baseurl and gpgkey. We can add that if we need
it at some point.

Also be less verbose.
2019-09-24 20:17:04 +02:00
Lars Karlitski
0dd939b658 stages/dnf: only write known options to repo file
Don't pass through arbitrary options. This means that pipeline repo
objects don't have the same options as dnf repo files anymore:

1. Hard code repo name to repo id. The name has no influence on the
resulting image and should thus not appear in a pipeline.

2. Set gpgcheck=1 when gpgkey is given. It defaults to false, which
means that all sample and test pipelines didn't verify packages. It
would have failed anyway, because the container doesn't have the key
referenced in /etc. Change all gpgkeys to refer to the key id and import
them manually.

3. Don't allow lists for baseurl and gpgkey. We can add that if we need
it at some point.
2019-09-24 20:17:04 +02:00
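The three rules listed in 0dd939b658 can be sketched as a small renderer for one repository section. The function name is hypothetical, and the set of accepted URL options beyond `baseurl` (`metalink`, `mirrorlist`) is an assumption.

```python
# Assumption: URL-style options accepted from a pipeline repo object.
URL_OPTIONS = ("baseurl", "metalink", "mirrorlist")


def repo_file(repo_id, repo):
    """Render one repository section, writing only known options."""
    # Rule 1: the name is hard coded to the repo id.
    lines = [f"[{repo_id}]", f"name={repo_id}"]
    for key in URL_OPTIONS:
        if key in repo:
            # Rule 3: lists are not allowed, only single values.
            if not isinstance(repo[key], str):
                raise TypeError(f"{key} must be a string")
            lines.append(f"{key}={repo[key]}")
    if "gpgkey" in repo:
        if not isinstance(repo["gpgkey"], str):
            raise TypeError("gpgkey must be a string")
        # Rule 2: giving a key turns package verification on.
        lines += ["gpgcheck=1", f"gpgkey={repo['gpgkey']}"]
    return "\n".join(lines) + "\n"
```

Any other key in the repo object is ignored, so arbitrary options from the pipeline never reach the generated file.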
Lars Karlitski
93da5caa69 stages/dnf: add mandatory basearch argument
We've been effectively using the basearch of the host, making the stage
non-reproducible: if the same pipeline was run on machines with
different architectures, it would produce different results. However,
pipelines producing different outputs must be different. Thus, this
patch includes the basearch in the pipeline.

In principle, this allows cross-arch builds. dnf should be the only
stage running binaries from the target tree. This is not yet tested.
2019-09-24 20:17:04 +02:00