cinder.image.format_inspector module

This is a Python implementation of virtual disk format inspection routines gathered from various public specification documents, as well as QEMU disk driver code. It attempts to store and parse the minimum amount of data required, and to do so in a streaming-friendly manner, to collect metadata about complex-format images.

class CaptureRegion(offset, length)

Bases: object

Represents a region of a file we want to capture.

A region of a file we want to capture requires a byte offset into the file and a length. This is expected to be used by a data processing loop, calling capture() with the most recently-read chunk. This class handles the task of grabbing the desired region of data across potentially multiple fractional and unaligned reads.

Parameters:
  • offset – Byte offset into the file starting the region

  • length – The length of the region

capture(chunk, current_position)

Process a chunk of data.

This should be called for each chunk in the read loop, at least until complete returns True.

Parameters:
  • chunk – A chunk of bytes in the file

  • current_position – The position in the file reached by the read loop so far. Note that this is the position after the chunk being presented, not before it.

property complete

Returns True when we have captured the desired data.
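
The capture behaviour described above can be illustrated with a minimal sketch. This is not the module's own code: SketchRegion and feed are hypothetical names, and the sketch assumes chunks arrive in order with no gaps. It computes the overlap between each chunk and the wanted region, then keeps only the overlapping bytes.

```python
class SketchRegion:
    """Illustrative stand-in for CaptureRegion (hypothetical, simplified)."""

    def __init__(self, offset, length):
        self.offset = offset
        self.length = length
        self.data = b""

    @property
    def complete(self):
        return len(self.data) == self.length

    def capture(self, chunk, current_position):
        # current_position is the file position *after* this chunk.
        chunk_start = current_position - len(chunk)
        region_end = self.offset + self.length
        # Overlap of [chunk_start, current_position) with [offset, region_end).
        lo = max(chunk_start, self.offset)
        hi = min(current_position, region_end)
        if lo < hi:
            self.data += chunk[lo - chunk_start:hi - chunk_start]


def feed(region, payload, chunk_size):
    """Hypothetical read loop: present payload in fixed-size chunks."""
    position = 0
    for start in range(0, len(payload), chunk_size):
        chunk = payload[start:start + chunk_size]
        position += len(chunk)
        region.capture(chunk, position)
        if region.complete:
            break
    return region.data


# The region spans bytes 4..8, which straddles two unaligned 3-byte reads.
captured = feed(SketchRegion(4, 5), b"0123456789abcdef", 3)
```

Here the wanted bytes arrive split across the second and third chunks, and the region reassembles them.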

class FileInspector(tracing=False)

Bases: object

A stream-based disk image inspector.

This base class works on raw images and is subclassed for more complex types. It is presented with the file to be examined one chunk at a time during read processing, and stores only as much data as necessary to determine the required attributes of the file.

property actual_size

Returns the total size of the file.

This is usually smaller than virtual_size. NOTE: this will only be accurate if the entire file is read and processed.

property complete

Returns True if we have all the information needed.

property context_info

Return info on amount of data held in memory for auditing.

This is a dict mapping each region name to its size in bytes, covering the regions that the inspector uses to examine the file.

eat_chunk(chunk)

Call this to present chunks of the file to the inspector.
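
A read loop that drives an inspector might look like the following minimal sketch. Only eat_chunk and complete come from the interface described here; feed_inspector and HeaderOnlyStub are hypothetical names used so the example runs on its own.

```python
import io


class HeaderOnlyStub:
    """Hypothetical stand-in for a FileInspector; satisfied after 8 bytes."""

    def __init__(self):
        self.seen = b""

    @property
    def complete(self):
        return len(self.seen) >= 8

    def eat_chunk(self, chunk):
        self.seen += chunk


def feed_inspector(inspector, fileobj, chunk_size=4):
    """Present fileobj to the inspector chunk by chunk; return bytes read."""
    total = 0
    while not inspector.complete:
        chunk = fileobj.read(chunk_size)
        if not chunk:  # end of file before the inspector was satisfied
            break
        inspector.eat_chunk(chunk)
        total += len(chunk)
    return total


# Only the first 8 of 100 bytes are read before the stub is satisfied.
bytes_read = feed_inspector(HeaderOnlyStub(), io.BytesIO(b"\0" * 100))
```

Stopping as soon as complete is True is what keeps the amount of data read small for formats whose metadata sits near the start of the file.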

property format_match

Returns True if the file appears to be the expected format.

classmethod from_file(filename)

Read as much of a file as necessary to complete inspection.

NOTE: Because we only read as much of the file as necessary, the actual_size property will not reflect the size of the file, but the amount of data we read before we satisfied the inspector.

Raises ImageFormatError if we cannot parse the file.

has_region(name)

Returns True if named region has been defined.

new_region(name, region)

Add a new CaptureRegion by name.

post_process()

Post-read hook to process what has been read so far.

This will be called after each chunk is read and potentially captured by the defined regions. If any regions are defined by this call, those regions will be presented with the current chunk in case it is within one of the new regions.
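
The idea of defining new regions from post_process() can be sketched with a toy format in which an 8-byte header holds the offset of a 4-byte payload. Everything in this block (Region, PointerSketch, the toy format itself) is hypothetical and only mirrors the behaviour described above, including re-presenting the current chunk to a newly defined region.

```python
import struct


class Region:
    """Hypothetical minimal capture region."""

    def __init__(self, offset, length):
        self.offset, self.length, self.data = offset, length, b""

    @property
    def complete(self):
        return len(self.data) == self.length

    def capture(self, chunk, position):
        start = position - len(chunk)
        lo = max(start, self.offset)
        hi = min(position, self.offset + self.length)
        if lo < hi:
            self.data += chunk[lo - start:hi - start]


class PointerSketch:
    """Toy inspector: the header tells us where the payload region lives."""

    def __init__(self):
        self.regions = {"header": Region(0, 8)}
        self.position = 0

    def eat_chunk(self, chunk):
        self.position += len(chunk)
        for region in self.regions.values():
            if not region.complete:
                region.capture(chunk, self.position)
        known = set(self.regions)
        self.post_process()
        # Regions added by post_process still need to see this chunk.
        for name in set(self.regions) - known:
            self.regions[name].capture(chunk, self.position)

    def post_process(self):
        header = self.regions["header"]
        if header.complete and "payload" not in self.regions:
            (offset,) = struct.unpack(">Q", header.data)
            self.regions["payload"] = Region(offset, 4)


image = struct.pack(">Q", 10) + b"..ABCD...."
inspector = PointerSketch()
inspector.eat_chunk(image)  # header and payload arrive in the same chunk
```

Without the final loop in eat_chunk, a payload that happens to sit in the same chunk as the header would be missed.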

region(name)

Get a CaptureRegion by name.

safety_check()

Perform some checks to determine if this file is safe.

Returns True if safe, False otherwise. It may raise ImageFormatError if safety cannot be guaranteed because of parsing or other errors.
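
Because safety_check() can either return False or raise, a caller has to decide how to treat both outcomes. The sketch below shows one possible policy, which is an assumption rather than anything this module prescribes: reject the image in both cases. ImageFormatErrorStub and is_acceptable are hypothetical names; real code would catch the module's ImageFormatError instead.

```python
class ImageFormatErrorStub(Exception):
    """Hypothetical stand-in for ImageFormatError."""


def is_acceptable(inspector, error_cls=ImageFormatErrorStub):
    """Return True only if the image matched its format and passed checks."""
    try:
        return bool(inspector.format_match and inspector.safety_check())
    except error_cls:
        # Parsing failed, so safety cannot be guaranteed: reject.
        return False
```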

property virtual_size

Returns the virtual size of the disk image, or zero if unknown.

exception ImageFormatError

Bases: Exception

An unrecoverable image format error that aborts the process.

class InfoWrapper(source, fmt)

Bases: object

A file-like object that wraps another and updates a format inspector.

This passes chunks to the format inspector while reading. If the inspector fails, it logs the error and stops calling it, but continues proxying data from the source to its user.
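
The wrapping behaviour described above can be sketched as follows. TeeSketch is a hypothetical, simplified stand-in and not the InfoWrapper implementation; it only illustrates the pattern of mirroring each read into an inspector and carrying on if the inspector fails.

```python
import io
import logging

LOG = logging.getLogger(__name__)


class TeeSketch:
    """Hypothetical file-like wrapper that mirrors reads into an inspector."""

    def __init__(self, source, inspector):
        self._source = source
        self._inspector = inspector
        self._failed = False

    def read(self, size):
        chunk = self._source.read(size)
        if chunk and not self._failed:
            try:
                self._inspector.eat_chunk(chunk)
            except Exception:
                # Log once, stop inspecting, keep serving data.
                LOG.exception("inspector failed; continuing without it")
                self._failed = True
        return chunk

    def close(self):
        self._source.close()
```

The consumer of the wrapper still receives every byte from the source, whether or not inspection succeeded.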

close()

read(size)

class QEDInspector(tracing=False)

Bases: FileInspector

property format_match

Returns True if the file appears to be the expected format.

safety_check()

Perform some checks to determine if this file is safe.

Returns True if safe, False otherwise. It may raise ImageFormatError if safety cannot be guaranteed because of parsing or other errors.

class QcowInspector(*a, **k)

Bases: FileInspector

QEMU QCOW2 Format

This should only require about 32 bytes of the beginning of the file to determine the virtual size, and 104 bytes to perform the safety check.

BF_OFFSET = 8
BF_OFFSET_LEN = 8
I_FEATURES = 72
I_FEATURES_DATAFILE_BIT = 3
I_FEATURES_LEN = 8
I_FEATURES_MAX_BIT = 4

property format_match

Returns True if the file appears to be the expected format.

property has_backing_file

property has_data_file

property has_header

property has_unknown_features

safety_check()

Perform some checks to determine if this file is safe.

Returns True if safe, False otherwise. It may raise ImageFormatError if safety cannot be guaranteed because of parsing or other errors.

safety_check_allow_backing_file()

property virtual_size

Returns the virtual size of the disk image, or zero if unknown.
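
To show why about 32 bytes are enough for the virtual size, here is a minimal sketch that reads two fields from a QCOW2 header. The field positions follow the published QCOW2 header layout (magic at bytes 0..3, backing file offset at bytes 8..15, virtual size at bytes 24..31, all big-endian). read_qcow2_basics is a hypothetical helper and not part of QcowInspector, and it does not attempt to interpret the feature bits.

```python
import struct

QCOW_MAGIC = b"QFI\xfb"


def read_qcow2_basics(header):
    """Return (virtual_size, has_backing_file), or None if not QCOW2."""
    if len(header) < 32 or header[:4] != QCOW_MAGIC:
        return None
    # Big-endian uint64 backing-file offset at byte 8
    # (compare BF_OFFSET and BF_OFFSET_LEN).
    (backing_offset,) = struct.unpack(">Q", header[8:16])
    # Big-endian uint64 virtual size at bytes 24..31.
    (virtual_size,) = struct.unpack(">Q", header[24:32])
    return virtual_size, backing_offset != 0


# magic, version, backing offset, backing size, cluster bits, virtual size
sample = struct.pack(">4sIQIIQ", QCOW_MAGIC, 3, 0, 0, 16, 1 << 30)
basics = read_qcow2_basics(sample)
```

A non-zero backing file offset means the header names a backing file, which is the kind of condition a safety check would look at.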

class TraceDisabled

Bases: object

A logger-like thing that swallows tracing when we do not want it.

debug(*a, **k)

error(*a, **k)

info(*a, **k)

warning(*a, **k)

class VDIInspector(*a, **k)

Bases: FileInspector

VirtualBox VDI format

This only needs to store the first 512 bytes of the image.

property format_match

Returns True if the file appears to be the expected format.

property virtual_size

Returns the virtual size of the disk image, or zero if unknown.
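
As a rough illustration of what can be read from the first 512 bytes, the sketch below checks a VDI signature and reads the disk size. The offsets used here (signature 0xBEDA107F at 0x40, disk size at 0x170, both little-endian) are assumptions taken from the VirtualBox VDI 1.1 header layout, not from this module, and read_vdi_size is a hypothetical helper.

```python
import struct

VDI_SIGNATURE = 0xBEDA107F  # assumed image signature value


def read_vdi_size(header):
    """Return the virtual size in bytes, or None if this is not a VDI."""
    if len(header) < 512:
        return None
    # Assumed layout: a 64-byte text preamble, then a uint32 signature.
    (signature,) = struct.unpack("<I", header[0x40:0x44])
    if signature != VDI_SIGNATURE:
        return None
    # Assumed layout: uint64 disk size at offset 0x170.
    (disk_size,) = struct.unpack("<Q", header[0x170:0x178])
    return disk_size
```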

class VHDInspector(*a, **k)

Bases: FileInspector

Connectix/MS VPC VHD Format

This should only require about 512 bytes of the beginning of the file to determine the virtual size.

property format_match

Returns True if the file appears to be the expected format.

property virtual_size

Returns the virtual size of the disk image, or zero if unknown.

class VHDXInspector(*a, **k)

Bases: FileInspector

MS VHDX Format

This requires some complex parsing of the stream. The first 256 KiB of the image is stored to get the header and region information, and then we capture the first metadata region to read those records, find the location of the virtual size data and parse it. This needs to store the metadata table entries up until the VDS (virtual disk size) record, which may consist of up to 2047 32-byte entries. Finally, it must store a chunk of data at the offset of the actual VDS uint64.

METAREGION = '8B7CA206-4790-4B9A-B8FE-575F050F886E'
VHDX_METADATA_TABLE_MAX_SIZE = 65536
VIRTUAL_DISK_SIZE = '2FA54224-CD1B-4876-B211-5DBED83BF4B8'

property format_match

Returns True if the file appears to be the expected format.

post_process()

Post-read hook to process what has been read so far.

This will be called after each chunk is read and potentially captured by the defined regions. If any regions are defined by this call, those regions will be presented with the current chunk in case it is within one of the new regions.

property virtual_size

Returns the virtual size of the disk image, or zero if unknown.
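
The metadata table scan described for VHDXInspector can be sketched as below. The layout is an assumption based on the published VHDX format specification: a 32-byte table header ("metadata" signature, 2 reserved bytes, a uint16 entry count) followed by 32-byte entries that each start with a 16-byte GUID, then a uint32 offset and a uint32 length. find_metadata_item is a hypothetical helper, not a method of the class. Note that one header plus 2047 entries comes to 2048 × 32 = 65536 bytes, which matches VHDX_METADATA_TABLE_MAX_SIZE.

```python
import struct
import uuid

VIRTUAL_DISK_SIZE = "2FA54224-CD1B-4876-B211-5DBED83BF4B8"
ENTRY_SIZE = 32
MAX_ENTRIES = 2047


def find_metadata_item(table, item_guid):
    """Return (offset, length) of the entry matching item_guid, else None."""
    if len(table) < ENTRY_SIZE:
        return None
    signature, _reserved, count = struct.unpack("<8sHH", table[:12])
    if signature != b"metadata" or count > MAX_ENTRIES:
        return None
    for index in range(count):
        start = ENTRY_SIZE * (index + 1)  # entries follow the 32-byte header
        entry = table[start:start + ENTRY_SIZE]
        if len(entry) < ENTRY_SIZE:
            return None  # table was truncated
        # GUIDs are stored in the mixed-endian layout that bytes_le decodes.
        guid = uuid.UUID(bytes_le=bytes(entry[:16]))
        if str(guid).upper() == item_guid.upper():
            offset, length = struct.unpack("<II", entry[16:24])
            return offset, length
    return None
```

The returned offset is where a further capture region would be needed to read the uint64 size value itself.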

class VMDKInspector(*a, **k)

Bases: FileInspector

VMware VMDK format (monolithicSparse and streamOptimized variants only)

This needs to store the 512-byte header and the descriptor region, which should be just after that. The descriptor region is a variable number of 512-byte sectors, but is just text defining the layout of the disk.

DESC_MAX_SIZE = 1048575
DESC_OFFSET = 512
GD_AT_END = 18446744073709551615

property format_match

Returns True if the file appears to be the expected format.

post_process()

Post-read hook to process what has been read so far.

This will be called after each chunk is read and potentially captured by the defined regions. If any regions are defined by this call, those regions will be presented with the current chunk in case it is within one of the new regions.

safety_check()

Perform some checks to determine if this file is safe.

Returns True if safe, False otherwise. It may raise ImageFormatError if safety cannot be guaranteed because of parsing or other errors.

property virtual_size

Returns the virtual size of the disk image, or zero if unknown.
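
To make the header and descriptor relationship concrete, the sketch below unpacks the first 64 bytes of a sparse VMDK header and converts the descriptor location from sectors to bytes. The field order is an assumption based on the published VMDK sparse extent header layout, and read_vmdk_header is a hypothetical helper rather than part of VMDKInspector.

```python
import struct

SECTOR = 512
GD_AT_END = 0xFFFFFFFFFFFFFFFF  # same value as the GD_AT_END constant


def read_vmdk_header(header):
    """Return descriptor location info, or None if this is not a sparse VMDK."""
    if len(header) < 64 or header[:4] != b"KDMV":
        return None
    (_magic, _version, _flags, _capacity, _grain_size,
     desc_offset, desc_size, _gtes_per_gt, _rgd_offset,
     gd_offset) = struct.unpack("<4sIIQQQQIQQ", header[:64])
    return {
        # Offsets and sizes in the header are counted in 512-byte sectors.
        "descriptor_start": desc_offset * SECTOR,
        "descriptor_length": desc_size * SECTOR,
        "gd_at_end": gd_offset == GD_AT_END,
    }


sample = struct.pack("<4sIIQQQQIQQ", b"KDMV", 1, 3, 2048, 128,
                     1, 20, 512, 0, GD_AT_END).ljust(512, b"\0")
info = read_vmdk_header(sample)
```

With a descriptor offset of one sector, the descriptor starts at byte 512, which is consistent with DESC_OFFSET.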

chunked_reader(fileobj, chunk_size=512)

detect_file_format(filename)

Attempts to detect the format of a file.

This runs through a file one time, running all the known inspectors in parallel. It stops reading the file once one of them matches or all of them are sure they don’t match.

Returns the FileInspector that matched, if any, or None if the file appears to be ‘raw’.
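
The single-pass, all-inspectors-at-once approach can be sketched as follows. This is an illustration under simplifying assumptions, not the function's actual code: MagicSketch is a hypothetical inspector that matches only on a leading magic string, and detect is a hypothetical helper that takes an open file object rather than a filename.

```python
import io


class MagicSketch:
    """Hypothetical inspector that matches on a leading magic string."""

    def __init__(self, name, magic):
        self.name = name
        self._magic = magic
        self._seen = b""

    def eat_chunk(self, chunk):
        if not self.complete:
            self._seen += chunk

    @property
    def complete(self):
        return len(self._seen) >= len(self._magic)

    @property
    def format_match(self):
        return self.complete and self._seen.startswith(self._magic)


def detect(fileobj, inspectors, chunk_size=4):
    """Feed every inspector each chunk; return the first match or None."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            return None  # ran out of data without a match
        for inspector in inspectors:
            inspector.eat_chunk(chunk)
        for inspector in inspectors:
            if inspector.format_match:
                return inspector
        if all(i.complete for i in inspectors):
            return None  # every inspector has decided and none matched


def candidates():
    return [MagicSketch("qcow2", b"QFI\xfb"), MagicSketch("vmdk", b"KDMV")]


match = detect(io.BytesIO(b"KDMV" + b"\0" * 60), candidates())
```

Reading stops after the first chunk in this example, because one inspector matches and nothing more needs to be read.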

get_inspector(format_name)

Returns a FileInspector class based on the given name.

Parameters:

format_name – The name of the disk_format (raw, qcow2, etc).

Returns:

A FileInspector class, or None if the format is unsupported.