It's sometimes useful to set up a loop device for an already formatted
disk/filesystem image to derive new artifacts from it. In that case, we
want to make sure it's impossible to modify its contents in any way in
that process, both for our own purposes and for other stages operating
on it.
Notably, mounting some filesystems read-only still seems to touch the
disk (XFS, for example).
So far, the device node we create inside the buildroot is very
minimal: just `/dev/{vg}-{lv}` with the appropriate major/minor.
However, when mount runs it will create a mapper device with the
same major/minor under `/dev/mapper/{escaped(vg)}-{escaped(lv)}`
and use that to mount the actual filesystem. Without this additional
device findmnt will not be able to detect the udev attributes of
the source (as the source is just missing from /dev).
This commit creates the right mapper device in the same way that we
create the non-mapper device node.
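As a rough sketch of how such a mapper path could be derived: this assumes
that `escaped()` refers to the usual device-mapper convention of doubling
every hyphen in the VG and LV names. The helper names `escape_dm_name` and
`mapper_path` are hypothetical and not taken from the commit.

```python
def escape_dm_name(name: str) -> str:
    """Double every hyphen (assumed device-mapper naming convention)."""
    return name.replace("-", "--")


def mapper_path(vg: str, lv: str) -> str:
    """Build the /dev/mapper path for a logical volume (hypothetical helper)."""
    return f"/dev/mapper/{escape_dm_name(vg)}-{escape_dm_name(lv)}"
```

With this convention a volume group named `my-vg` with a logical volume
`root` would end up under `/dev/mapper/my--vg-root`.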
This commit extracts a helper `get_parent_path()` that is unit
tested, and also uses the generated parent_path for the call
to manage_devices_file to be consistent with the existing behavior
of only including the device that actually contains the VG.
This is needed for bootc where all mounts need to be from the same
physical disk/loop so that bootupd works. The idea is that in the
manifest the new option `vg_partnum` is added and the parent VG
is found via the partition number of the full image similar to
the `partnum` from https://github.com/osbuild/osbuild/pull/1501
A manifest using this feature looks like this:
```json
"devices": {
  "disk": {
    "type": "org.osbuild.loopback",
    "options": {
      "filename": "disk.raw",
      "partscan": true
    }
  },
  "rootlv": {
    "type": "org.osbuild.lvm2.lv",
    "parent": "disk",
    "options": {
      "volume": "rootlv",
      "vg_partnum": 4
    }
  }
}
```
Co-authored-by: Michael Vogt <mvogt@redhat.com>
We can now add an entire device and then get its partitions added
to our environment for use, rather than having to map each partition
to a separate loopback device.
This is a prep patch for https://github.com/osbuild/osbuild/issues/1495
For the org.osbuild.loopback device the user can set the sector size,
but it had no effect on the underlying loopback device. Let's make it
meaningful by passing along the given value to the underlying code.
Apply the isort modifications to the entire source tree, not just the
selected python files of test-src.
Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
This fixes pylint warnings on our modules that are currently not part of
CI-pylint. The fixes should all be straightforward.
Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
LVM2 introduced system.devices as an alternative way to filter
devices. Since we create devices in a stage the devices won't be
added to the /etc/lvm/devices/system.devices file since /etc/ is
inside the container. As a result we can't see these devices
and will fail with "Could not find parent device".
Therefore we add support for managing our own per-service devices
file, iff a `system.devices` is present.
In all invocations of `subprocess.run` stderr and stdout were both
combined in a shared pipe, but lvm sometimes spits out notices and
informational messages on stderr, which can interfere with the
data we are interested in on stdout. Separate the two.
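A minimal sketch of the idea, not the actual code of this commit: capture
stdout and stderr in separate pipes so that notices on stderr cannot end up
in the data that gets parsed. The helper name `run_capture` is hypothetical.

```python
import subprocess
import sys


def run_capture(argv):
    """Run a command and return (stdout, stderr) as separate strings."""
    res = subprocess.run(argv,
                         check=True,
                         stdout=subprocess.PIPE,   # data we want to parse
                         stderr=subprocess.PIPE,   # notices, kept apart
                         encoding="utf-8")
    return res.stdout, res.stderr


if __name__ == "__main__":
    # Stand-in for an lvm call: one line of data, one notice on stderr.
    code = "import sys; print('data'); print('notice', file=sys.stderr)"
    out, err = run_capture([sys.executable, "-c", code])
    print(repr(out), repr(err))
```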
This is the corresponding device for LUKS containers created via the
new `org.osbuild.luks2.format` stage. The required information is the
parent device and the passphrase used to create the container.
NB: this device always uses the new custom block device udev rule
inhibitor facility.
This gets set in `open` and used in `close` but if the former
fails the latter will explode if we do not properly initialize
it. Also, we should always properly initialize things.
That ironically fixes the underlying bug that flush_buf is trying to
fix too, since now an exception is thrown and we are back to auto
clear. The file fd is then closed when the process is terminated.
Anyway, the right fix is to call the correct function.
Manually clear the buffer cache of the loop device, which seems to
be required in order to make sure that data written via the loop
device is actually landing in the file:
Since commit c1379f6 the file descriptor of the loop device is
explicitly cleared. This broke manifests that involved creating a
FAT filesystem: said filesystem could later not be mounted. The
breaking change was confirmed to be commit c1379f6. Using
`biosnoop` we saw that some write operations were missing when
clearing the file descriptor that were present when using the
auto-clearing feature of the loop device (see below). Reading the
corresponding kernel source (v5.13.8), the current theory is that
when using the auto clear feature, once the last handle on the
loop device is closed, the code path in the kernel is:
  blkdev_close (fs/block_dev.c)
    blkdev_put (fs/block_dev.c)
      __blkdev_put (fs/block_dev.c)
        sync_blockdev (fs/block_dev.c)
On the other hand when manually clearing the file descriptor, the
code path seems to be:
  loop_clr_fd (drivers/block/loop.c)
    __loop_clr_fd (drivers/block/loop.c)
The latter first removes the backing file and then calls `bdput`,
and thus no call to sync_blockdev is made.
Luckily, sync_blockdev can be called via an ioctl, `BLKFLSBUF`,
which we now do, via the new helper function `lo.flush_buf`. This
fixes the observed issue and leads to the same biosnoop trace as
observed when using the auto clear feature without explicitly
clearing the fd.
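For illustration, issuing that ioctl from Python could look roughly like
the sketch below. This is not the actual `lo.flush_buf` implementation;
`flush_device` is a hypothetical wrapper, and running it requires a real
block device and sufficient privileges.

```python
import fcntl
import os

# BLKFLSBUF is defined as _IO(0x12, 97) in <linux/fs.h>.
BLKFLSBUF = (0x12 << 8) | 97


def flush_buf(fd: int) -> None:
    """Flush the buffer cache of the block device behind `fd`."""
    fcntl.ioctl(fd, BLKFLSBUF, 0)


def flush_device(path: str) -> None:
    """Open a block device such as /dev/loop1 and flush it."""
    fd = os.open(path, os.O_RDWR | os.O_CLOEXEC)
    try:
        flush_buf(fd)
    finally:
        os.close(fd)
```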
NB: we considered reverting commit c1379f6, but we want to make
sure that we control the point at which the backing file is cleared
from the fd, since subsequent osbuild stages will re-use the file and
we want to ensure no loop device still has the file open and that
all the data is in the file.
-- biosnoop trace --
4.115946 mkfs.fat 731297 loop1 R 0 4096 0.08
4.116096 mkfs.fat 731297 loop1 R 8 4096 0.02
4.116176 mkfs.fat 731297 loop1 R 16 4096 0.02
[...]
4.120632 mkfs.fat 731297 loop1 R 400 4096 0.02
4.200354 org.osbuild.lo 731281 vda W 4182432 32768 0.64
4.200429 org.osbuild.lo 731281 vda W 6279584 32768 0.70
4.200657 ? 0 R 0 0 0.19
4.200946 org.osbuild.lo 731281 vda W 3328128 4096 0.20
4.201109 ? 0 R 0 0 0.13
[the following entries were missing with manual clearing:]
4.201601 org.osbuild.lo 731281 loop1 W 0 4096 0.24
4.201634 org.osbuild.lo 731281 loop1 W 8 4096 0.26
4.201645 org.osbuild.lo 731281 loop1 W 16 4096 0.27
[...]
4.203118 org.osbuild.lo 731281 loop1 W 432 4096 0.25
Reported-by: Achilleas Koutsou <achilleas@koutsou.net>
Reported-by: Tomas Hozza <thozza@redhat.com>
Implement a new `lock` option (default: False), which will lock
the device by passing `lock=True` to `LoopControl.loop_for_fd`.
The main purpose for this is to block udev from probing the device
while the stage is run.
NB: some tools might also try to lock the device and fail.
Explicitly clear the fd. There might be a race with udev or some
other process that still has a reference to the loop device so we
might not be able to immediately clear it. Thus wait for it with
a timeout of 30 seconds which should hopefully be enough. If the
device does not clear then any consecutive action that involves
it might not be safe to execute so we let the timeout exception
be reported to osbuild.
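A sketch of such a wait loop, under the assumption that a device which is
still referenced fails with `EBUSY` when cleared. `clear_fd_wait` and its
`clear_fd` callable are hypothetical names used for illustration only.

```python
import errno
import time


def clear_fd_wait(clear_fd, timeout=30.0, wait=0.1):
    """Call `clear_fd()` until it stops failing with EBUSY.

    Raises TimeoutError once `timeout` seconds have elapsed.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            clear_fd()
            return
        except OSError as err:
            if err.errno != errno.EBUSY:
                raise  # anything but "busy" is not recoverable here
        if time.monotonic() >= deadline:
            raise TimeoutError("loop device did not clear in time")
        time.sleep(wait)
```

Letting the `TimeoutError` propagate mirrors the reasoning above: if the
device cannot be cleared, later actions on it are not safe.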
Getting unbound loop devices is racy, so we do it in a retry loop:
in case of a recoverable error, like when the loop device signals
it is busy, we close it and try another one. However, the code
closed the loop device but did not in fact open a new one. We
would therefore see `file descriptor cannot be a negative integer`
errors in `lo.set_fd(fd)`, since `lo` was already closed at that
point and its file descriptor was thus '-1'.
Open a new loop device at the beginning of the retry-loop to fix
this issue.
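The shape of the fixed loop can be sketched as follows. `open_loop` is a
hypothetical factory standing in for whatever opens the next free loop
device, and treating `EBUSY` as the recoverable error is an assumption.

```python
import errno


def bind_with_retry(open_loop, fd, attempts=10):
    """Bind `fd` to a fresh loop device, retrying when one is busy.

    `open_loop` is a hypothetical factory returning an object with
    `set_fd(fd)` and `close()` methods.
    """
    for _ in range(attempts):
        lo = open_loop()      # always start with a newly opened device
        try:
            lo.set_fd(fd)
            return lo
        except OSError as err:
            lo.close()        # this device is unusable, drop it
            if err.errno != errno.EBUSY:
                raise
    raise RuntimeError("no free loop device found")
```

The important detail is that the device is opened inside the loop, so a
closed device is never reused on the next attempt.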
Show the file descriptor that was opened for the file passed to
the device. Recently, in CI, we have seen errors opening the
loop device with `fd` being `-1` and this ensures that at least
the file itself could be opened.
On rare occasions, when trying to mount a file system via the
loopback device that was created in the previous stage, the mount
fails with `wrong fs type, bad option, bad superblock on ...`.
It is not yet fully clear why this appears since in theory the
kernel should handle the sequence of operations we do, but it
might be down to caching or bugs in the underlying filesystem.
To mitigate potential caching issues we therefore now sync
the file that is backing the loopback device after the device
has been closed. In order to not have to re-open the file we
keep the file descriptor around as long as the loopback device
is open.
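A minimal sketch of that pattern, assuming the sync is done with
`os.fsync` on the retained file descriptor. The class `BackedDevice` and
its `lo` attribute are hypothetical placeholders, not the service's code.

```python
import os


class BackedDevice:
    """Keeps the backing file open while the loop device is in use."""

    def __init__(self, path):
        self.fd = os.open(path, os.O_RDWR | os.O_CLOEXEC)
        self.lo = None  # placeholder for the loop device handle

    def close(self):
        # Close the loop device first, then sync the backing file.
        if self.lo is not None:
            self.lo.close()
            self.lo = None
        if self.fd is not None:
            os.fsync(self.fd)  # flush the backing file to disk
            os.close(self.fd)
            self.fd = None
```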
Device service that provides support for binding files within the tree
to loopback devices. Valid parameters are `filename`, `offset` and
`size`, which control what part of the file is bound to the loop
device. The unit for `size` and `offset` is sectors, and the sector
size can be configured via the `sector-size` parameter. Sectors are
used as the unit so that the numbers can easily be compared with
those specified in the partition table.
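As a small worked example of the unit conversion: the helper `to_bytes`
is hypothetical, and the default sector size of 512 bytes is an assumption
made for illustration.

```python
def to_bytes(offset, size, sector_size=512):
    """Convert `offset` and `size` from sectors to bytes."""
    return offset * sector_size, size * sector_size


# A partition starting at sector 2048 and spanning 204800 sectors:
start, length = to_bytes(2048, 204800)
```

Here `start` is 1048576 bytes (1 MiB) and `length` is 104857600 bytes
(100 MiB), matching what a partition table with 512-byte sectors reports.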