Archiving

Archiving is the process of packaging multiple files into a single one. Common utilities include tar and cpio. tar creates archives and can optionally compress them, while cpio is more customizable and is generally used together with other tools.

Some common tar flags:

Flag  Description
-c    Create an archive
-x    Extract files
-t    List contents
-r    Append to an existing archive
-f    Specify the archive file name
-z    Compress with gzip
-j    Compress with bzip2
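
As a rough sketch of how these flags combine, the following creates a gzip-compressed archive, lists its contents, and extracts it into a separate directory. The names project, notes.txt, and restore are hypothetical placeholders, and -C (change to a directory before extracting) is an extra flag not listed in the table above.

```shell
set -e

# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample data.
mkdir -p project
echo "hello" > project/notes.txt

# -c create, -z compress with gzip, -f archive file name
tar -czf project.tar.gz project

# -t lists the contents without extracting anything
listing=$(tar -tzf project.tar.gz)
echo "$listing"

# -x extracts; -C switches to the target directory first
mkdir restore
tar -xzf project.tar.gz -C restore
cat restore/project/notes.txt
```

Swapping -z for -j in both commands would use bzip2 instead of gzip.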
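
Unlike tar, cpio reads the list of files to archive from its standard input, typically piped from another command such as find. The tool has two main modes: -o (copy-out, create an archive) and -i (copy-in, extract an archive). When extracting, -u overwrites existing files unconditionally, even if they are newer than the archived copies, and -v lists each file as it is processed. For example: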
find /etc -name "*.conf" | cpio -ov > etc-configs.cpio
cpio -iuv < etc-configs.cpio

Compression / Decompression

Several compression tools are in common use:

Tool   Decompression tool  Description
gzip   gunzip              Commonly used, fast and simple
bzip2  bunzip2             Better compression than gzip, but slower
xz     unxz                Even higher compression, useful for long-term storage
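
To compare the tools on the same input, a minimal sketch along these lines may help. The file sample.txt and its contents are hypothetical, output files are named after the tool rather than the conventional .gz, .bz2, and .xz extensions, and any tool that is not installed is skipped. The resulting sizes depend on the input and on the tool versions.

```shell
set -e

workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical, highly repetitive sample input.
i=0
while [ "$i" -lt 2000 ]; do
    echo "the same line of text, repeated"
    i=$((i + 1))
done > sample.txt

for tool in gzip bzip2 xz; do
    # Skip any tool that is not installed on this system.
    command -v "$tool" >/dev/null 2>&1 || continue

    # -c writes to standard output, leaving the original file in place.
    "$tool" -c sample.txt > "sample.txt.$tool"

    # -d decompresses; the round trip must match the original byte for byte.
    "$tool" -dc "sample.txt.$tool" | cmp -s - sample.txt

    echo "$tool: $(wc -c < "sample.txt.$tool") bytes (original: $(wc -c < sample.txt))"
done
```

gunzip, bunzip2, and unxz behave like their compression counterparts invoked with -d.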

Data copy and recovery

dd is used to create exact byte-for-byte copies of disks, partitions, or files. Common inputs:

Input    Description
if=      Input file/device
of=      Output file/device
bs=      Block size
count=   How many blocks to copy
status=  Control progress reporting

Example

dd if=/dev/sda of=/mnt/backup/disk.img bs=4M status=progress

This command makes a block-level copy of /dev/sda with a block size of 4M and saves it to /mnt/backup/disk.img. The same form of command can be used to flash a USB drive with an ISO, where the .iso file is the input file and the drive itself, e.g. /dev/sdX, is the output file.
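
Because a mistyped of= can overwrite a real disk, it may be worth trying dd on ordinary files first. The sketch below does that with hypothetical file names (source.img, copy.img, head.img) in a temporary directory, using plain byte counts for bs= so it does not depend on size suffixes.

```shell
set -e

workdir=$(mktemp -d)
cd "$workdir"

# Build a 2 MiB stand-in for a disk: 2048 blocks of 1024 bytes.
dd if=/dev/zero of=source.img bs=1024 count=2048 2>/dev/null

# Copy it block by block, as one would with a real device.
dd if=source.img of=copy.img bs=4096 2>/dev/null

# A byte-for-byte copy compares equal to its source.
cmp source.img copy.img && echo "copy is identical"

# count= limits the copy: only the first 4 blocks of 512 bytes.
dd if=source.img of=head.img bs=512 count=4 2>/dev/null
wc -c < head.img
```

dd prints its transfer summary on standard error, which is why the sketch redirects it to /dev/null.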

ddrescue is used to recover data from damaged disks. For example, if a drive is failing, one could run ddrescue /dev/sdX corrupted.img rescue.log. This attempts to rescue the data from sdX, saves it in corrupted.img, and records its progress in the map file rescue.log, which lets an interrupted run resume where it left off.

rsync works like a more capable cp: it copies incrementally, transferring only what has changed since the last run, and can optionally preserve metadata such as permissions, ownership, and timestamps.
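
The incremental behavior can be seen with a small local experiment. This is only a sketch: the directories src and dest and their files are hypothetical, and it exits early if rsync is not installed.

```shell
set -e

# This sketch needs rsync; stop quietly if it is not installed.
command -v rsync >/dev/null 2>&1 || { echo "rsync not found"; exit 0; }

workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical source tree.
mkdir src
echo "one" > src/a.txt
echo "two" > src/b.txt
chmod 600 src/a.txt

# -a (archive mode) recurses and preserves permissions, timestamps, and more.
# The trailing slash on src/ copies its contents rather than the directory itself.
rsync -a src/ dest/

# Change one file, then sync again; -i itemizes what is transferred.
echo "two, edited" > src/b.txt
rsync -a -i src/ dest/

cat dest/b.txt
```

On the second run only b.txt should appear in the itemized output, since a.txt is unchanged.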