Object

Object Auditor

class swift.obj.auditor.AuditorWorker(conf, logger, rcache, devices, zero_byte_only_at_fps=0)

Bases: object

Walk through file system to audit objects

audit_all_objects(mode='once', device_dirs=None)
create_recon_nested_dict(top_level_key, device_list, item)
failsafe_object_audit(location)

Entrypoint to object_audit, with a failsafe generic exception handler.

object_audit(location)

Audits the given object location.

Parameters:location – an audit location (from diskfile.object_audit_location_generator)
record_stats(obj_size)

Based on config’s object_size_stats will keep track of how many objects fall into the specified ranges. For example with the following:

object_size_stats = 10, 100, 1024

and your system has 3 objects of sizes: 5, 20, and 10000 bytes the log will look like: {“10”: 1, “100”: 1, “1024”: 0, “OVER”: 1}

class swift.obj.auditor.ObjectAuditor(conf, **options)

Bases: swift.common.daemon.Daemon

Audit objects.

audit_loop(parent, zbo_fps, override_devices=None, **kwargs)

Parallel audit loop

clear_recon_cache(auditor_type)

Clear recon cache entries

fork_child(zero_byte_fps=False, **kwargs)

Child execution

run_audit(**kwargs)

Run the object audit

run_forever(*args, **kwargs)

Run the object audit until stopped.

run_once(*args, **kwargs)

Run the object audit once

Object Backend

Disk File Interface for the Swift Object Server

The DiskFile, DiskFileWriter and DiskFileReader classes combined define the on-disk abstraction layer for supporting the object server REST API interfaces (excluding REPLICATE). Other implementations wishing to provide an alternative backend for the object server must implement the three classes. An example alternative implementation can be found in the mem_server.py and mem_diskfile.py modules along size this one.

The DiskFileManager is a reference implemenation specific class and is not part of the backend API.

The remaining methods in this module are considered implementation specific and are also not considered part of the backend API.

class swift.obj.diskfile.AuditLocation(path, device, partition, policy)

Bases: object

Represents an object location to be audited.

Other than being a bucket of data, the only useful thing this does is stringify to a filesystem path so the auditor’s logs look okay.

class swift.obj.diskfile.BaseDiskFile(mgr, device_path, partition, account=None, container=None, obj=None, _datadir=None, policy=None, use_splice=False, pipe_size=None, use_linkat=False, **kwargs)

Bases: object

Manage object files.

This specific implementation manages object files on a disk formatted with a POSIX-compliant file system that supports extended attributes as metadata on a file or directory.

Note

The arguments to the constructor are considered implementation specific. The API does not define the constructor arguments.

The following path format is used for data file locations: <devices_path/<device_dir>/<datadir>/<partdir>/<suffixdir>/<hashdir>/ <datafile>.<ext>

Parameters:
  • mgr – associated DiskFileManager instance
  • device_path – path to the target device or drive
  • partition – partition on the device in which the object lives
  • account – account name for the object
  • container – container name for the object
  • obj – object name for the object
  • _datadir – override the full datadir otherwise constructed here
  • policy – the StoragePolicy instance
  • use_splice – if true, use zero-copy splice() to send data
  • pipe_size – size of pipe buffer used in zero-copy operations
  • use_linkat – if True, use open() with linkat() to create obj file
account
container
content_length
content_type
content_type_timestamp
create(*args, **kwds)

Context manager to create a file. We create a temporary file first, and then return a DiskFileWriter object to encapsulate the state.

Note

An implementation is not required to perform on-disk preallocations even if the parameter is specified. But if it does and it fails, it must raise a DiskFileNoSpace exception.

Parameters:size – optional initial size of file to explicitly allocate on disk
Raises DiskFileNoSpace:
 if a size is specified and allocation fails
data_timestamp
delete(timestamp)

Delete the object.

This implementation creates a tombstone file using the given timestamp, and removes any older versions of the object file. Any file that has an older timestamp than timestamp will be deleted.

Note

An implementation is free to use or ignore the timestamp parameter.

Parameters:timestamp – timestamp to compare with each file
Raises DiskFileError:
 this implementation will raise the same errors as the create() method.
durable_timestamp

Provides the timestamp of the newest data file found in the object directory.

Returns:A Timestamp instance, or None if no data file was found.
Raises DiskFileNotOpen:
 if the open() method has not been previously called on this instance.
fragments
classmethod from_hash_dir(mgr, hash_dir_path, device_path, partition, policy)
get_datafile_metadata()

Provide the datafile metadata for a previously opened object as a dictionary. This is metadata that was included when the object was first PUT, and does not include metadata set by any subsequent POST.

Returns:object’s datafile metadata dictionary
Raises DiskFileNotOpen:
 if the swift.obj.diskfile.DiskFile.open() method was not previously invoked
get_metadata()

Provide the metadata for a previously opened object as a dictionary.

Returns:object’s metadata dictionary
Raises DiskFileNotOpen:
 if the swift.obj.diskfile.DiskFile.open() method was not previously invoked
get_metafile_metadata()

Provide the metafile metadata for a previously opened object as a dictionary. This is metadata that was written by a POST and does not include any persistent metadata that was set by the original PUT.

Returns:object’s .meta file metadata dictionary, or None if there is no .meta file
Raises DiskFileNotOpen:
 if the swift.obj.diskfile.DiskFile.open() method was not previously invoked
manager
obj
open()

Open the object.

This implementation opens the data file representing the object, reads the associated metadata in the extended attributes, additionally combining metadata from fast-POST .meta files.

Note

An implementation is allowed to raise any of the following exceptions, but is only required to raise DiskFileNotExist when the object representation does not exist.

Raises:
  • DiskFileCollision – on name mis-match with metadata
  • DiskFileNotExist – if the object does not exist
  • DiskFileDeleted – if the object was previously deleted
  • DiskFileQuarantined – if while reading metadata of the file some data did pass cross checks
Returns:

itself for use as a context manager

read_metadata()

Return the metadata for an object without requiring the caller to open the object first.

Returns:metadata dictionary for an object
Raises DiskFileError:
 this implementation will raise the same errors as the open() method.
reader(keep_cache=False, _quarantine_hook=<function <lambda> at 0x7ffadef4c848>)

Return a swift.common.swob.Response class compatible “app_iter” object as defined by swift.obj.diskfile.DiskFileReader.

For this implementation, the responsibility of closing the open file is passed to the swift.obj.diskfile.DiskFileReader object.

Parameters:
  • keep_cache – caller’s preference for keeping data read in the OS buffer cache
  • _quarantine_hook – 1-arg callable called when obj quarantined; the arg is the reason for quarantine. Default is to ignore it. Not needed by the REST layer.
Returns:

a swift.obj.diskfile.DiskFileReader object

reader_cls = None
timestamp
write_metadata(metadata)

Write a block of metadata to an object without requiring the caller to create the object first. Supports fast-POST behavior semantics.

Parameters:metadata – dictionary of metadata to be associated with the object
Raises DiskFileError:
 this implementation will raise the same errors as the create() method.
writer_cls = None
class swift.obj.diskfile.BaseDiskFileManager(conf, logger)

Bases: object

Management class for devices, providing common place for shared parameters and methods not provided by the DiskFile class (which primarily services the object server REST API layer).

The get_diskfile() method is how this implementation creates a DiskFile object.

Note

This class is reference implementation specific and not part of the pluggable on-disk backend API.

Note

TODO(portante): Not sure what the right name to recommend here, as “manager” seemed generic enough, though suggestions are welcome.

Parameters:
  • conf – caller provided configuration object
  • logger – caller provided logger
cleanup_ondisk_files(hsh_path, reclaim_age=604800, **kwargs)

Clean up on-disk files that are obsolete and gather the set of valid on-disk files for an object.

Parameters:
  • hsh_path – object hash path
  • reclaim_age – age in seconds at which to remove tombstones
  • frag_index – if set, search for a specific fragment index .data file, otherwise accept the first valid .data file
Returns:

a dict that may contain: valid on disk files keyed by their filename extension; a list of obsolete files stored under the key ‘obsolete’; a list of files remaining in the directory, reverse sorted, stored under the key ‘files’.

consolidate_hashes(*args, **kwargs)
construct_dev_path(device)

Construct the path to a device without checking if it is mounted.

Parameters:device – name of target device
Returns:full path to the device
diskfile_cls = None
get_dev_path(device, mount_check=None)

Return the path to a device, first checking to see if either it is a proper mount point, or at least a directory depending on the mount_check configuration option.

Parameters:
  • device – name of target device
  • mount_check – whether or not to check mountedness of device. Defaults to bool(self.mount_check).
Returns:

full path to the device, None if the path to the device is not a proper mount point or directory.

get_diskfile(device, partition, account, container, obj, policy, **kwargs)

Returns a BaseDiskFile instance for an object based on the object’s partition, path parts and policy.

Parameters:
  • device – name of target device
  • partition – partition on device in which the object lives
  • account – account name for the object
  • container – container name for the object
  • obj – object name for the object
  • policy – the StoragePolicy instance
get_diskfile_from_audit_location(audit_location)

Returns a BaseDiskFile instance for an object at the given AuditLocation.

Parameters:audit_location – object location to be audited
get_diskfile_from_hash(device, partition, object_hash, policy, **kwargs)

Returns a DiskFile instance for an object at the given object_hash. Just in case someone thinks of refactoring, be sure DiskFileDeleted is not raised, but the DiskFile instance representing the tombstoned object is returned instead.

Parameters:
  • device – name of target device
  • partition – partition on the device in which the object lives
  • object_hash – the hash of an object path
  • policy – the StoragePolicy instance
Raises DiskFileNotExist:
 

if the object does not exist

Returns:

an instance of BaseDiskFile

get_hashes(device, partition, suffixes, policy)
Parameters:
  • device – name of target device
  • partition – partition name
  • suffixes – a list of suffix directories to be recalculated
  • policy – the StoragePolicy instance
Returns:

a dictionary that maps suffix directories

get_ondisk_files(files, datadir, verify=True, **kwargs)

Given a simple list of files names, determine the files that constitute a valid fileset i.e. a set of files that defines the state of an object, and determine the files that are obsolete and could be deleted. Note that some files may fall into neither category.

If a file is considered part of a valid fileset then its info dict will be added to the results dict, keyed by <extension>_info. Any files that are no longer required will have their info dicts added to a list stored under the key ‘obsolete’.

The results dict will always contain entries with keys ‘ts_file’, ‘data_file’ and ‘meta_file’. Their values will be the fully qualified path to a file of the corresponding type if there is such a file in the valid fileset, or None.

Parameters:
  • files – a list of file names.
  • datadir – directory name files are from.
  • verify – if True verify that the ondisk file contract has not been violated, otherwise do not verify.
Returns:

a dict that will contain keys:

ts_file -> path to a .ts file or None data_file -> path to a .data file or None meta_file -> path to a .meta file or None

and may contain keys:

ts_info -> a file info dict for a .ts file data_info -> a file info dict for a .data file meta_info -> a file info dict for a .meta file obsolete -> a list of file info dicts for obsolete files

invalidate_hash(*args, **kwargs)
make_on_disk_filename(timestamp, ext=None, ctype_timestamp=None, *a, **kw)

Returns filename for given timestamp.

Parameters:
  • timestamp – the object timestamp, an instance of Timestamp
  • ext – an optional string representing a file extension to be appended to the returned file name
  • ctype_timestamp – an optional content-type timestamp, an instance of Timestamp
Returns:

a file name

object_audit_location_generator(device_dirs=None, auditor_type='ALL')

Yield an AuditLocation for all objects stored under device_dirs.

Parameters:
  • device_dirs – directory of target device
  • auditor_type – either ALL or ZBF
parse_on_disk_filename(filename)

Parse an on disk file name.

Parameters:filename – the file name including extension
Returns:a dict, with keys for timestamp, ext and ctype_timestamp:
  • timestamp is a Timestamp
  • ctype_timestamp is a Timestamp or None for .meta files, otherwise None
  • ext is a string, the file extension including the leading dot or the empty string if the filename has no extension.

Subclasses may override this method to add further keys to the returned dict.

Raises DiskFileError:
 if any part of the filename is not able to be validated.
pickle_async_update(device, account, container, obj, data, timestamp, policy)

Write data describing a container update notification to a pickle file in the async_pending directory.

Parameters:
  • device – name of target device
  • account – account name for the object
  • container – container name for the object
  • obj – object name for the object
  • data – update data to be written to pickle file
  • timestamp – a Timestamp
  • policy – the StoragePolicy instance
quarantine_renamer(*args, **kwargs)
replication_lock(*args, **kwds)

A context manager that will lock on the device given, if configured to do so.

Parameters:device – name of target device
Raises ReplicationLockTimeout:
 If the lock on the device cannot be granted within the configured timeout.
yield_hashes(device, partition, policy, suffixes=None, **kwargs)

Yields tuples of (full_path, hash_only, timestamps) for object information stored for the given device, partition, and (optionally) suffixes. If suffixes is None, all stored suffixes will be searched for object hashes. Note that if suffixes is not None but empty, such as [], then nothing will be yielded.

timestamps is a dict which may contain items mapping:

  • ts_data -> timestamp of data or tombstone file,
  • ts_meta -> timestamp of meta file, if one exists
  • ts_ctype -> timestamp of meta file containing most recent
    content-type value, if one exists

where timestamps are instances of Timestamp

Parameters:
  • device – name of target device
  • partition – partition name
  • policy – the StoragePolicy instance
  • suffixes – optional list of suffix directories to be searched
yield_suffixes(device, partition, policy)

Yields tuples of (full_path, suffix_only) for suffixes stored on the given device and partition.

Parameters:
  • device – name of target device
  • partition – partition name
  • policy – the StoragePolicy instance
class swift.obj.diskfile.BaseDiskFileReader(fp, data_file, obj_size, etag, disk_chunk_size, keep_cache_size, device_path, logger, quarantine_hook, use_splice, pipe_size, diskfile, keep_cache=False)

Bases: object

Encapsulation of the WSGI read context for servicing GET REST API requests. Serves as the context manager object for the swift.obj.diskfile.DiskFile class’s swift.obj.diskfile.DiskFile.reader() method.

Note

The quarantining behavior of this method is considered implementation specific, and is not required of the API.

Note

The arguments to the constructor are considered implementation specific. The API does not define the constructor arguments.

Parameters:
  • fp – open file object pointer reference
  • data_file – on-disk data file name for the object
  • obj_size – verified on-disk size of the object
  • etag – expected metadata etag value for entire file
  • disk_chunk_size – size of reads from disk in bytes
  • keep_cache_size – maximum object size that will be kept in cache
  • device_path – on-disk device path, used when quarantining an obj
  • logger – logger caller wants this object to use
  • quarantine_hook – 1-arg callable called w/reason when quarantined
  • use_splice – if true, use zero-copy splice() to send data
  • pipe_size – size of pipe buffer used in zero-copy operations
  • diskfile – the diskfile creating this DiskFileReader instance
  • keep_cache – should resulting reads be kept in the buffer cache
app_iter_range(start, stop)

Returns an iterator over the data file for range (start, stop)

app_iter_ranges(ranges, content_type, boundary, size)

Returns an iterator over the data file for a set of ranges

can_zero_copy_send()
close()

Close the open file handle if present.

For this specific implementation, this method will handle quarantining the file if necessary.

manager
zero_copy_send(wsockfd)

Does some magic with splice() and tee() to move stuff from disk to network without ever touching userspace.

Parameters:wsockfd – file descriptor (integer) of the socket out which to send data
class swift.obj.diskfile.BaseDiskFileWriter(name, datadir, fd, tmppath, bytes_per_sync, diskfile)

Bases: object

Encapsulation of the write context for servicing PUT REST API requests. Serves as the context manager object for the swift.obj.diskfile.DiskFile class’s swift.obj.diskfile.DiskFile.create() method.

Note

It is the responsibility of the swift.obj.diskfile.DiskFile.create() method context manager to close the open file descriptor.

Note

The arguments to the constructor are considered implementation specific. The API does not define the constructor arguments.

Parameters:
  • name – name of object from REST API
  • datadir – on-disk directory object will end up in on swift.obj.diskfile.DiskFileWriter.put()
  • fd – open file descriptor of temporary file to receive data
  • tmppath – full path name of the opened file descriptor
  • bytes_per_sync – number bytes written between sync calls
  • diskfile – the diskfile creating this DiskFileWriter instance
commit(timestamp)

Perform any operations necessary to mark the object as durable. For replication policy type this is a no-op.

Parameters:timestamp – object put timestamp, an instance of Timestamp
manager
put(metadata)

Finalize writing the file on disk.

Parameters:metadata – dictionary of metadata to be associated with the object
put_succeeded
write(chunk)

Write a chunk of data to disk. All invocations of this method must come before invoking the :func:

For this implementation, the data is written into a temporary file.

Parameters:chunk – the chunk of data to write as a string object
Returns:the total number of bytes written to an object
class swift.obj.diskfile.DiskFile(mgr, device_path, partition, account=None, container=None, obj=None, _datadir=None, policy=None, use_splice=False, pipe_size=None, use_linkat=False, **kwargs)

Bases: swift.obj.diskfile.BaseDiskFile

reader_cls

alias of DiskFileReader

writer_cls

alias of DiskFileWriter

class swift.obj.diskfile.DiskFileManager(conf, logger)

Bases: swift.obj.diskfile.BaseDiskFileManager

diskfile_cls

alias of DiskFile

class swift.obj.diskfile.DiskFileReader(fp, data_file, obj_size, etag, disk_chunk_size, keep_cache_size, device_path, logger, quarantine_hook, use_splice, pipe_size, diskfile, keep_cache=False)

Bases: swift.obj.diskfile.BaseDiskFileReader

class swift.obj.diskfile.DiskFileRouter(*args, **kwargs)

Bases: object

policy_type_to_manager_cls = {'erasure_coding': <class 'swift.obj.diskfile.ECDiskFileManager'>, 'replication': <class 'swift.obj.diskfile.DiskFileManager'>}
classmethod register(policy_type)

Decorator for Storage Policy implementations to register their DiskFile implementation.

class swift.obj.diskfile.DiskFileWriter(name, datadir, fd, tmppath, bytes_per_sync, diskfile)

Bases: swift.obj.diskfile.BaseDiskFileWriter

put(metadata)

Finalize writing the file on disk.

Parameters:metadata – dictionary of metadata to be associated with the object
class swift.obj.diskfile.ECDiskFile(*args, **kwargs)

Bases: swift.obj.diskfile.BaseDiskFile

durable_timestamp

Provides the timestamp of the newest durable file found in the object directory.

Returns:A Timestamp instance, or None if no durable file was found.
Raises DiskFileNotOpen:
 if the open() method has not been previously called on this instance.
fragments

Provides information about all fragments that were found in the object directory, including fragments without a matching durable file, and including any fragment chosen to construct the opened diskfile.

Returns:A dict mapping <Timestamp instance> -> <list of frag indexes>, or None if the diskfile has not been opened or no fragments were found.
purge(timestamp, frag_index)

Remove a tombstone file matching the specified timestamp or datafile matching the specified timestamp and fragment index from the object directory.

This provides the EC reconstructor/ssync process with a way to remove a tombstone or fragment from a handoff node after reverting it to its primary node.

The hash will be invalidated, and if empty or invalid the hsh_path will be removed on next cleanup_ondisk_files.

Parameters:
  • timestamp – the object timestamp, an instance of Timestamp
  • frag_index – fragment archive index, must be a whole number or None.
reader_cls

alias of ECDiskFileReader

writer_cls

alias of ECDiskFileWriter

class swift.obj.diskfile.ECDiskFileManager(conf, logger)

Bases: swift.obj.diskfile.BaseDiskFileManager

diskfile_cls

alias of ECDiskFile

make_on_disk_filename(timestamp, ext=None, frag_index=None, ctype_timestamp=None, *a, **kw)

Returns the EC specific filename for given timestamp.

Parameters:
  • timestamp – the object timestamp, an instance of Timestamp
  • ext – an optional string representing a file extension to be appended to the returned file name
  • frag_index – a fragment archive index, used with .data extension only, must be a whole number.
  • ctype_timestamp – an optional content-type timestamp, an instance of Timestamp
Returns:

a file name

Raises DiskFileError:
 

if ext==’.data’ and the kwarg frag_index is not a whole number

parse_on_disk_filename(filename)

Returns timestamp(s) and other info extracted from a policy specific file name. For EC policy the data file name includes a fragment index which must be stripped off to retrieve the timestamp.

Parameters:filename – the file name including extension
Returns:
a dict, with keys for timestamp, frag_index, ext and
ctype_timestamp:
  • timestamp is a Timestamp
  • frag_index is an int or None
  • ctype_timestamp is a Timestamp or None for .meta files, otherwise None
  • ext is a string, the file extension including the leading dot or the empty string if the filename has no extension.
Raises DiskFileError:
 if any part of the filename is not able to be validated.
validate_fragment_index(frag_index)

Return int representation of frag_index, or raise a DiskFileError if frag_index is not a whole number.

Parameters:frag_index – a fragment archive index
class swift.obj.diskfile.ECDiskFileReader(fp, data_file, obj_size, etag, disk_chunk_size, keep_cache_size, device_path, logger, quarantine_hook, use_splice, pipe_size, diskfile, keep_cache=False)

Bases: swift.obj.diskfile.BaseDiskFileReader

class swift.obj.diskfile.ECDiskFileWriter(name, datadir, fd, tmppath, bytes_per_sync, diskfile)

Bases: swift.obj.diskfile.BaseDiskFileWriter

commit(timestamp)

Finalize put by writing a timestamp.durable file for the object. We do this for EC policy because it requires a 2-phase put commit confirmation.

Parameters:timestamp – object put timestamp, an instance of Timestamp
put(metadata)

The only difference between this method and the replication policy DiskFileWriter method is adding the frag index to the metadata.

Parameters:metadata – dictionary of metadata to be associated with object
swift.obj.diskfile.clear_auditor_status(devices, auditor_type='ALL')
swift.obj.diskfile.consolidate_hashes(partition_dir)

Take what’s in hashes.pkl and hashes.invalid, combine them, write the result back to hashes.pkl, and clear out hashes.invalid.

Parameters:suffix_dir – absolute path to partition dir containing hashes.pkl and hashes.invalid
Returns:a dict, the suffix hashes (if any), the key ‘valid’ will be False if hashes.pkl is corrupt, cannot be read or does not exist
swift.obj.diskfile.extract_policy(obj_path)

Extracts the policy for an object (based on the name of the objects directory) given the device-relative path to the object. Returns None in the event that the path is malformed in some way.

The device-relative path is everything after the mount point; for example:

/srv/node/d42/objects-5/30/179/
485dc017205a81df3af616d917c90179/1401811134.873649.data

would have device-relative path:

objects-5/30/179/485dc017205a81df3af616d917c90179/1401811134.873649.data

Parameters:obj_path – device-relative path of an object, or the full path
Returns:a BaseStoragePolicy or None
swift.obj.diskfile.get_auditor_status(datadir_path, logger, auditor_type)
swift.obj.diskfile.invalidate_hash(suffix_dir)

Invalidates the hash for a suffix_dir in the partition’s hashes file.

Parameters:suffix_dir – absolute path to suffix dir whose hash needs invalidating
swift.obj.diskfile.object_audit_location_generator(devices, mount_check=True, logger=None, device_dirs=None, auditor_type='ALL')

Given a devices path (e.g. “/srv/node”), yield an AuditLocation for all objects stored under that directory if device_dirs isn’t set. If device_dirs is set, only yield AuditLocation for the objects under the entries in device_dirs. The AuditLocation only knows the path to the hash directory, not to the .data file therein (if any). This is to avoid a double listdir(hash_dir); the DiskFile object will always do one, so we don’t.

Parameters:
  • devices – parent directory of the devices to be audited
  • mount_check – flag to check if a mount check should be performed on devices
  • logger – a logger object
  • device_dirs – a list of directories under devices to traverse
  • auditor_type – either ALL or ZBF
swift.obj.diskfile.quarantine_renamer(device_path, corrupted_file_path)

In the case that a file is corrupted, move it to a quarantined area to allow replication to fix it.

Params device_path:
 The path to the device the corrupted file is on.
Params corrupted_file_path:
 The path to the file you want quarantined.
Returns:path (str) of directory the file was moved to
Raises OSError:re-raises non errno.EEXIST / errno.ENOTEMPTY exceptions from rename
swift.obj.diskfile.read_hashes(partition_dir)

Read the existing hashes.pkl

Returns:a dict, the suffix hashes (if any), the key ‘valid’ will be False if hashes.pkl is corrupt, cannot be read or does not exist
swift.obj.diskfile.read_metadata(fd)

Helper function to read the pickled metadata from an object file.

Parameters:fd – file descriptor or filename to load the metadata from
Returns:dictionary of metadata
swift.obj.diskfile.strip_self(f)

Wrapper to attach module level functions to base class.

swift.obj.diskfile.update_auditor_status(datadir_path, logger, partitions, auditor_type)
swift.obj.diskfile.write_hashes(partition_dir, hashes)

Write hashes to hashes.pkl

The updated key is added to hashes before it is written.

swift.obj.diskfile.write_metadata(fd, metadata, xattr_size=65536)

Helper function to write pickled metadata for an object file.

Parameters:
  • fd – file descriptor or filename to write the metadata
  • metadata – metadata to write

Object Replicator

class swift.obj.replicator.ObjectReplicator(conf, logger=None)

Bases: swift.common.daemon.Daemon

Replicate objects.

Encapsulates most logic and data needed by the object replication process. Each call to .replicate() performs one replication pass. It’s up to the caller to do this in a loop.

build_replication_jobs(policy, ips, override_devices=None, override_partitions=None)

Helper function for collect_jobs to build jobs for replication using replication style storage policy

check_ring(object_ring)

Check to see if the ring has been updated :param object_ring: the ring to check

Returns:boolean indicating whether or not the ring has changed
collect_jobs(override_devices=None, override_partitions=None, override_policies=None)

Returns a sorted list of jobs (dictionaries) that specify the partitions, nodes, etc to be rsynced.

Parameters:
  • override_devices – if set, only jobs on these devices will be returned
  • override_partitions – if set, only jobs on these partitions will be returned
  • override_policies – if set, only jobs in these storage policies will be returned
delete_handoff_objs(job, delete_objs)
delete_partition(path)
detect_lockups()

In testing, the pool.waitall() call very occasionally failed to return. This is an attempt to make sure the replicator finishes its replication pass in some eventuality.

heartbeat()

Loop that runs in the background during replication. It periodically logs progress.

kill_coros()

Utility function that kills all coroutines currently running.

load_object_ring(policy)

Make sure the policy’s rings are loaded.

Parameters:policy – the StoragePolicy instance
Returns:appropriate ring object
replicate(override_devices=None, override_partitions=None, override_policies=None)

Run a replication pass

rsync(node, job, suffixes)

Uses rsync to implement the sync method. This was the first sync method in Swift.

run_forever(*args, **kwargs)
run_once(*args, **kwargs)
ssync(node, job, suffixes, remote_check_objs=None)
stats_line()

Logs various stats for the currently running replication pass.

sync(node, job, suffixes, *args, **kwargs)

Synchronize local suffix directories from a partition with a remote node.

Parameters:
  • node – the “dev” entry for the remote node to sync with
  • job – information about the partition being synced
  • suffixes – a list of suffixes which need to be pushed
Returns:

boolean and dictionary, boolean indicating success or failure

update(job)

High-level method that replicates a single partition.

Parameters:job – a dict containing info about the partition to be replicated
update_deleted(job)

High-level method that replicates a single partition that doesn’t belong on this node.

Parameters:job – a dict containing info about the partition to be replicated
class swift.obj.ssync_sender.Sender(daemon, node, job, suffixes, remote_check_objs=None)

Bases: object

Sends SSYNC requests to the object server.

These requests are eventually handled by ssync_receiver and full documentation about the process is there.

connect()

Establishes a connection and starts an SSYNC request with the object server.

disconnect()

Closes down the connection to the object server once done with the SSYNC request.

missing_check()

Handles the sender-side of the MISSING_CHECK step of a SSYNC request.

Full documentation of this can be found at Receiver.missing_check().

readline()

Reads a line from the SSYNC response body.

httplib has no readline and will block on read(x) until x is read, so we have to do the work ourselves. A bit of this is taken from Python’s httplib itself.

send_delete(url_path, timestamp)

Sends a DELETE subrequest with the given information.

send_post(url_path, df)
send_put(url_path, df)

Sends a PUT subrequest for the url_path using the source df (DiskFile) and content_length.

updates()

Handles the sender-side of the UPDATES step of an SSYNC request.

Full documentation of this can be found at Receiver.updates().

swift.obj.ssync_sender.decode_wanted(parts)

Parse missing_check line parts to determine which parts of local diskfile were wanted by the receiver.

The encoder for parts is encode_wanted()

swift.obj.ssync_sender.encode_missing(object_hash, ts_data, ts_meta=None, ts_ctype=None)

Returns a string representing the object hash, its data file timestamp and the delta forwards to its metafile and content-type timestamps, if non-zero, in the form: <hash> <ts_data> [m:<hex delta to ts_meta>[,t:<hex delta to ts_ctype>]]

The decoder for this line is decode_missing()

class swift.obj.ssync_receiver.Receiver(app, request)

Bases: object

Handles incoming SSYNC requests to the object server.

These requests come from the object-replicator daemon that uses ssync_sender.

The number of concurrent SSYNC requests is restricted by use of a replication_semaphore and can be configured with the object-server.conf [object-server] replication_concurrency setting.

An SSYNC request is really just an HTTP conduit for sender/receiver replication communication. The overall SSYNC request should always succeed, but it will contain multiple requests within its request and response bodies. This “hack” is done so that replication concurrency can be managed.

The general process inside an SSYNC request is:

  1. Initialize the request: Basic request validation, mount check, acquire semaphore lock, etc..
  2. Missing check: Sender sends the hashes and timestamps of the object information it can send, receiver sends back the hashes it wants (doesn’t have or has an older timestamp).
  3. Updates: Sender sends the object information requested.
  4. Close down: Release semaphore lock, etc.
initialize_request()

Basic validation of request and mount check.

This function will be called before attempting to acquire a replication semaphore lock, so contains only quick checks.

missing_check()

Handles the receiver-side of the MISSING_CHECK step of a SSYNC request.

Receives a list of hashes and timestamps of object information the sender can provide and responds with a list of hashes desired, either because they’re missing or have an older timestamp locally.

The process is generally:

  1. Sender sends :MISSING_CHECK: START and begins sending hash timestamp lines.

  2. Receiver gets :MISSING_CHECK: START and begins reading the hash timestamp lines, collecting the hashes of those it desires.

  3. Sender sends :MISSING_CHECK: END.

  4. Receiver gets :MISSING_CHECK: END, responds with :MISSING_CHECK: START, followed by the list of <wanted_hash> specifiers it collected as being wanted (one per line), :MISSING_CHECK: END, and flushes any buffers.

    Each <wanted_hash> specifier has the form <hash>[ <parts>] where <parts> is a string containing characters ‘d’ and/or ‘m’ indicating that only data or meta part of object respectively is required to be sync’d.

  5. Sender gets :MISSING_CHECK: START and reads the list of hashes desired by the receiver until reading :MISSING_CHECK: END.

The collection and then response is so the sender doesn’t have to read while it writes to ensure network buffers don’t fill up and block everything.

updates()

Handles the UPDATES step of an SSYNC request.

Receives a set of PUT and DELETE subrequests that will be routed to the object server itself for processing. These contain the information requested by the MISSING_CHECK step.

The PUT and DELETE subrequests are formatted pretty much exactly like regular HTTP requests, excepting the HTTP version on the first request line.

The process is generally:

  1. Sender sends :UPDATES: START and begins sending the PUT and DELETE subrequests.
  2. Receiver gets :UPDATES: START and begins routing the subrequests to the object server.
  3. Sender sends :UPDATES: END.
  4. Receiver gets :UPDATES: END and sends :UPDATES: START and :UPDATES: END (assuming no errors).
  5. Sender gets :UPDATES: START and :UPDATES: END.

If too many subrequests fail, as configured by replication_failure_threshold and replication_failure_ratio, the receiver will hang up the request early so as to not waste any more time.

At step 4, the receiver will send back an error if there were any failures (that didn’t cause a hangup due to the above thresholds) so the sender knows the whole was not entirely a success. This is so the sender knows if it can remove an out of place partition, for example.

swift.obj.ssync_receiver.decode_missing(line)

Parse a string of the form generated by encode_missing() and return a dict with keys object_hash, ts_data, ts_meta, ts_ctype.

The encoder for this line is encode_missing()

swift.obj.ssync_receiver.encode_wanted(remote, local)

Compare a remote and local results and generate a wanted line.

Parameters:
  • remote – a dict, with ts_data and ts_meta keys in the form returned by decode_missing()
  • local – a dict, possibly empty, with ts_data and ts_meta keys in the form returned Receiver._check_local()

The decoder for this line is decode_wanted()

Object Server

Object Server for Swift

class swift.obj.server.EventletPlungerString

Bases: str

Eventlet won’t send headers until it’s accumulated at least eventlet.wsgi.MINIMUM_CHUNK_SIZE bytes or the app iter is exhausted. If we want to send the response body behind Eventlet’s back, perhaps with some zero-copy wizardry, then we have to unclog the plumbing in eventlet.wsgi to force the headers out, so we use an EventletPlungerString to empty out all of Eventlet’s buffers.

class swift.obj.server.ObjectController(conf, logger=None)

Bases: swift.common.base_storage_server.BaseStorageServer

Implements the WSGI application for the Swift Object Server.

DELETE(ctrl, *args, **kwargs)

Handle HTTP DELETE requests for the Swift Object Server.

GET(ctrl, *args, **kwargs)

Handle HTTP GET requests for the Swift Object Server.

HEAD(ctrl, *args, **kwargs)

Handle HTTP HEAD requests for the Swift Object Server.

POST(ctrl, *args, **kwargs)

Handle HTTP POST requests for the Swift Object Server.

PUT(ctrl, *args, **kwargs)

Handle HTTP PUT requests for the Swift Object Server.

REPLICATE(ctrl, *args, **kwargs)

Handle REPLICATE requests for the Swift Object Server. This is used by the object replicator to get hashes for directories.

Note that the name REPLICATE is preserved for historical reasons as this verb really just returns the hashes information for the specified parameters and is used, for example, by both replication and EC.

SSYNC(ctrl, *args, **kwargs)
async_update(op, account, container, obj, host, partition, contdevice, headers_out, objdevice, policy, logger_thread_locals=None)

Sends or saves an async update.

Parameters:
  • op – operation performed (ex: ‘PUT’, or ‘DELETE’)
  • account – account name for the object
  • container – container name for the object
  • obj – object name
  • host – host that the container is on
  • partition – partition that the container is on
  • contdevice – device name that the container is on
  • headers_out – dictionary of headers to send in the container request
  • objdevice – device name that the object is in
  • policy – the associated BaseStoragePolicy instance
  • logger_thread_locals – The thread local values to be set on the self.logger to retain transaction logging information.
container_update(op, account, container, obj, request, headers_out, objdevice, policy)

Update the container when objects are updated.

Parameters:
  • op – operation performed (ex: ‘PUT’, or ‘DELETE’)
  • account – account name for the object
  • container – container name for the object
  • obj – object name
  • request – the original request object driving the update
  • headers_out – dictionary of headers to send in the container request(s)
  • objdevice – device name that the object is in
  • policy – the BaseStoragePolicy instance
delete_at_update(op, delete_at, account, container, obj, request, objdevice, policy)

Update the expiring objects container when objects are updated.

Parameters:
  • op – operation performed (ex: ‘PUT’, or ‘DELETE’)
  • delete_at – scheduled delete in UNIX seconds, int
  • account – account name for the object
  • container – container name for the object
  • obj – object name
  • request – the original request driving the update
  • objdevice – device name that the object is in
  • policy – the BaseStoragePolicy instance (used for tmp dir)
get_diskfile(device, partition, account, container, obj, policy, **kwargs)

Utility method for instantiating a DiskFile object supporting a given REST API.

An implementation of the object server that wants to use a different DiskFile class would simply over-ride this method to provide that behavior.

server_type = 'object-server'
setup(conf)

Implementation specific setup. This method is called at the very end by the constructor to allow a specific implementation to modify existing attributes or add its own attributes.

Parameters:conf – WSGI configuration parameter
swift.obj.server.app_factory(global_conf, **local_conf)

paste.deploy app factory for creating WSGI object server apps

swift.obj.server.drain(file_like, read_size, timeout)

Read and discard any bytes from file_like.

Parameters:
  • file_like – file-like object to read from
  • read_size – how big a chunk to read at a time
  • timeout – how long to wait for a read (use None for no timeout)
Raises ChunkReadTimeout:
 

if no chunk was read in time

swift.obj.server.global_conf_callback(preloaded_app_conf, global_conf)

Callback for swift.common.wsgi.run_wsgi during the global_conf creation so that we can add our replication_semaphore, used to limit the number of concurrent SSYNC_REQUESTS across all workers.

Parameters:
  • preloaded_app_conf – The preloaded conf for the WSGI app. This conf instance will go away, so just read from it, don’t write.
  • global_conf – The global conf that will eventually be passed to the app_factory function later. This conf is created before the worker subprocesses are forked, so can be useful to set up semaphores, shared memory, etc.
swift.obj.server.iter_mime_headers_and_bodies(wsgi_input, mime_boundary, read_chunk_size)

Object Updater

class swift.obj.updater.ObjectUpdater(conf, logger=None)

Bases: swift.common.daemon.Daemon

Update object information in container listings.

get_container_ring()

Get the container ring. Load it, if it hasn’t been yet.

object_sweep(device)

If there are async pendings on the device, walk each one and update.

Parameters:device – path to device
object_update(node, part, op, obj, headers_out)

Perform the object update to the container

Parameters:
  • node – node dictionary from the container ring
  • part – partition that holds the container
  • op – operation performed (ex: ‘PUT’ or ‘DELETE’)
  • obj – object name being updated
  • headers_out – headers to send with the update
process_object_update(update_path, device, policy)

Process the object information to be updated and update.

Parameters:
  • update_path – path to pickled object update file
  • device – path to device
  • policy – storage policy of object update
run_forever(*args, **kwargs)

Run the updater continuously.

run_once(*args, **kwargs)

Run the updater once.

swift.obj.updater.random() → x in the interval [0, 1).

Table Of Contents

Previous topic

Account DB and Container DB

Next topic

Misc

Project Source

This Page