tar
tar (tape archive) packages multiple files into a single archive; the file format is portable.
Packaging (or archiving, because it doesn't compress):
# -c creates an archive, -f specifies the archive filename
tar -cf bundle.tar file1 file2...
# Wildcards work (the shell expands them, so leave them unquoted)
tar -cf bash_bundle.tar *.sh
Appending:
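# -r appends a file to an existing archive
tar -rf bash_bundle.tar new.sh
# -u compares timestamps and appends only if the file is newer than the archived copy of the same name
# -vv prints a detailed log; no output means the file was not newer and was not added
tar -uvvf bash_bundle.tar new.sh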
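The append step can be tried in isolation. This is a minimal sketch, assuming a scratch directory from mktemp; the file names a.sh and new.sh are made up for illustration:

```shell
set -eu
work=$(mktemp -d)                   # scratch directory (illustrative)
cd "$work"
echo 'echo a' > a.sh
echo 'echo new' > new.sh

tar -cf bash_bundle.tar a.sh        # create the archive with one member
tar -rf bash_bundle.tar new.sh      # append a second member
members=$(tar -tf bash_bundle.tar)  # list member names, one per line
echo "$members"
```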
Deletion:
# --delete removes a file from the archive
tar -f sh.tar --delete test.sh
P.S. The tar that ships with Mac does not have the --delete option.
Viewing:
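# -t lists the archive contents
tar -tf bash_bundle.tar
# -v shows more detail (file permissions, modification date, similar to ls -l)
tar -tvf bash_bundle.tar
# -vv shows even more detail (one extra line of archive format information)
tar -tvvf bash_bundle.tar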
P.S. -v and -vv can be used with other options to output logs.
Extraction (Decompression):
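# -x extracts into the current directory
tar -xf bash_bundle.tar
# -C extracts into the given directory (it must already exist, otherwise tar reports an error)
tar -xf bash_bundle.tar -C ./tmp
# Extract only the named file
tar -C ./tmp -xf bash_bundle.tar ab.diff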
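Putting creation, full extraction and single-file extraction together, a small round trip might look like the following. This is only a sketch: the directory names all and only_one, and the sample files, are assumptions for the demo.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
echo 'echo one' > one.sh
echo 'echo two' > two.sh
tar -cf bash_bundle.tar one.sh two.sh

mkdir all only_one                            # -C targets must exist beforehand
tar -xf bash_bundle.tar -C all                # extract every member
tar -xf bash_bundle.tar -C only_one one.sh    # extract a single member
ls all only_one
```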
Handy tricks:
# stdin/stdout
# Write the archive to stdout
tar -cf - test.sh
# Read the archive from stdin
tar -xf - -C ./tmp test.sh
Paired with ssh, you can pipe directly into the remote machine for batch file transfers:
# Pack locally, extract remotely (useful for syncing a directory)
tar -cf - test.sh | ssh <user>@<IP> "mkdir -p ~/tmp/sh; tar -xf - -C ~/tmp/sh"
# Pack locally, save the archive remotely (useful for uploading files in bulk)
tar -cf - test.sh | ssh <user>@<IP> "mkdir -p ~/tmp; cat > ~/tmp/sh.tar"
# Extract a remote archive locally (useful for downloading files in bulk)
ssh <user>@<IP> "cat ~/tmp/sh.tar" | tar -xf - -C ./tmp
Skipping the intermediate file saves disk reads and writes, which makes the transfer more efficient.
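The same pipe can be tried without a remote machine. In this minimal sketch, two local scratch directories (src and dst, names chosen for illustration) stand in for the local and remote sides:

```shell
set -eu
work=$(mktemp -d)
mkdir -p "$work/src/sub" "$work/dst"
echo 'echo hi' > "$work/src/test.sh"
echo 'nested'  > "$work/src/sub/note.txt"

# Pack src to stdout and unpack from stdin into dst; no .tar file is written to disk
tar -cf - -C "$work/src" . | tar -xf - -C "$work/dst"

diff -r "$work/src" "$work/dst" && echo "trees match"
```

Swapping the second tar for an ssh command that runs tar -xf - on the other host gives the remote form.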
By default tar only archives, without compressing, but it provides compression options:
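# -z compresses with gzip; here -a picks gzip from the .gz suffix
tar -a -cf bash.tar.gz *.sh
# -j compresses with bzip2; here -a picks bzip2 from the .bz2 suffix
tar -a -cf bash.tar.bz2 *.sh
# --lzma compresses with lzma (option not available on Mac); here -a picks lzma from the suffix
tar -a -cf bash.tar.lzma *.sh
# The .lzo suffix selects lzop
tar -a -cf filename.tar.lzo *.sh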
The -a/--auto-compress option selects the compression format from the archive filename when creating, as in the examples above. When extracting, the format is traditionally given explicitly (recent GNU tar and bsdtar also detect it on their own), as in the common compile-and-install routine:
# Download the source
wget http://path/to/source.tar.gz
# Extract
tar -zxvf source.tar.gz
# Or omit -z and let tar detect the compression format
tar -xvf source.tar.gz
# The usual three steps
cd source
./configure
make
make install
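To see what the compression options buy, one can archive the same file with and without -z and compare sizes. This is a rough sketch; numbers.txt is generated sample data, not a file from this article:

```shell
set -eu
work=$(mktemp -d)
cd "$work"
seq 1 20000 > numbers.txt                  # highly compressible sample data

tar -cf  plain.tar     numbers.txt         # archive only
tar -zcf packed.tar.gz numbers.txt         # archive and gzip in one step

plain_size=$(wc -c < plain.tar)
packed_size=$(wc -c < packed.tar.gz)
echo "plain.tar: $plain_size bytes, packed.tar.gz: $packed_size bytes"

mkdir out
tar -zxf packed.tar.gz -C out              # -z given explicitly when extracting
cmp numbers.txt out/numbers.txt && echo "contents identical"
```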
Other options and usage:
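# -A concatenates archives (appends bundle2.tar to bundle1.tar)
tar -Af bundle1.tar bundle2.tar
# -d compares archive members with the files on disk
tar -df sh1.tar test.sh
# --exclude skips matching files (here, .md files); quote the pattern so tar sees it, not the shell
tar -cf bundle.tar --exclude "*.md" *
# Or write the patterns to a file and pass it with -X
echo "*.md" > tar.ignore
tar -cf bundle.tar -X tar.ignore *
# Exclude version control directories (.git, .svn and the like)
tar --exclude-vcs -zcvf proj.tar.gz ./proj
# --totals prints the total number of bytes written
tar -zcvf dir.tar.gz --totals *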
P.S. Mac does not have -d or --totals options, and older versions of tar do not support --exclude-vcs.
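A quick way to check that an exclude pattern does what is intended is to list the archive afterwards. A minimal sketch, assuming a throwaway proj directory with one script and one Markdown file:

```shell
set -eu
work=$(mktemp -d)
mkdir "$work/proj"
echo 'echo run' > "$work/proj/run.sh"
echo '# notes'  > "$work/proj/README.md"

# The pattern is quoted so that tar, not the shell, does the matching
tar -cf "$work/bundle.tar" --exclude '*.md' -C "$work/proj" .
listing=$(tar -tf "$work/bundle.tar")
echo "$listing"
```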
cpio
Similar to tar, but it reads the filenames to archive from stdin and writes the archive to stdout. It is used in rpm software packages but is otherwise uncommon.
Its characteristic is support for absolute paths: tar converts absolute paths to relative paths when packaging, while cpio does not. If an absolute path is input during packaging, it will be restored according to the absolute path during extraction; otherwise, it behaves like tar and extracts to the current directory:
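# Filenames can only come from stdin
# Pack: -o creates an archive on stdout, -v prints the file list
find . -name "*.sh" -print | cpio -ov > bash.cpio
# List: -i reads an archive from stdin, -t lists its contents
cpio -vit < bundle.cpio
# Extract: -i extracts, -d creates directories as needed
cpio -vid < bundle.cpio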
Note: cpio has no prompts for overwriting files. If the file at the absolute path already exists and is older, it will be silently replaced. During extraction, it automatically compares timestamps; if the file in the package is newer, it replaces it; otherwise, it skips extracting the file.
P.S. To decompress an rpm package with cpio, you first need to convert the rpm package to a cpio package, which requires the rpm2cpio tool.
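The relative-path case can be tried with a short round trip. This is a sketch under assumptions: the src and out directories are made up, and the script skips itself when cpio is not installed.

```shell
set -eu
work=$(mktemp -d)
mkdir "$work/src" "$work/out"
echo 'echo a' > "$work/src/a.sh"
echo 'echo b' > "$work/src/b.sh"

if command -v cpio >/dev/null 2>&1; then
  cd "$work/src"
  find . -name '*.sh' -print | cpio -o > "$work/bash.cpio"   # relative names in, archive out
  cpio -it < "$work/bash.cpio"                               # list members
  cd "$work/out"
  cpio -id < "$work/bash.cpio"                               # relative names land under the current directory
  ls "$work/out"
else
  echo "cpio is not installed; skipping"
fi
```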
gzip/gunzip, zcat
These 3 commands can all handle gzip compressed files. The gzip command can only compress single files and cannot directly handle directories or multiple files. Therefore, it is generally recommended to use the tar command to package files first, then use gzip to compress them.
gzip/gunzip
Compression:
# Removes test.sh and creates test.sh.gz
gzip test.sh
Decompression:
# Removes test.sh.gz and creates test.sh
gunzip test.sh.gz
Viewing:
# -l lists the stored filename, the compressed and uncompressed sizes, and the compression ratio
gzip -l test.sh.gz
Also used in conjunction with stdin/stdout:
# -c writes to stdout
cat sub.sh | gzip -c > sub.sh.gz
This preserves the original file sub.sh.
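The stdout form works in both directions, so a full round trip that leaves every input in place could look like this. It is a minimal sketch; restored.sh is an illustrative name:

```shell
set -eu
work=$(mktemp -d)
cd "$work"
printf 'echo sub\n' > sub.sh

gzip -c sub.sh > sub.sh.gz          # compress to stdout; sub.sh stays in place
gunzip -c sub.sh.gz > restored.sh   # decompress to stdout; sub.sh.gz stays in place
ls
cmp sub.sh restored.sh && echo "round trip ok"
```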
Other options and usage:
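# --fast/--best set the compression level: fastest with the lowest ratio, or slowest with the highest ratio
# There are 9 levels; --fast is 1 and --best is 9
gzip test.sh --fast
# Equivalent to
gzip test.sh -1
# tar's -z option compresses with gzip
tar -zcvf bash.tar.gz *.sh
# Or let -a choose the format from the filename
tar -acvf bash.tar.gz *.sh
# Or archive first, then compress
tar -cvf bash.tar *.sh; gzip bash.tar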
zcat
Decompresses a gzip file to stdout without creating the decompressed file on disk:
# Print the contents of a .gz file
zcat test.sh.gz
P.S. On Mac, zcat forcibly appends a .Z suffix to the input filename, causing an error:
zcat: can't stat: sub.sh.gz (sub.sh.gz.Z): No such file or directory
Therefore, to ensure portability, it is not recommended to use zcat; you can use gunzip -c instead. For more information, please see zcat on OS X always appends a .Z to the filename (better use gunzip -c)
bzip2/bunzip2
Generally, it has a higher compression ratio than gzip, and its usage is exactly the same as gzip:
# Compress
# Removes test.sh and creates test.sh.bz2
bzip2 test.sh
# Decompress
bunzip2 test.sh.bz2
In a quick test on the small text file test.sh, however, at the same maximum level (-9), bzip2 produced a slightly larger file than gzip:
-rwxr-xr-x 1 ayqy staff 1064 4 9 16:31 test.sh
-rwxr-xr-x 1 ayqy staff 682 4 9 16:31 test.sh.bz2
-rwxr-xr-x 1 ayqy staff 632 4 9 16:31 test.sh.gz
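A comparison like the one above can be repeated on any file. The sketch below uses generated data as a stand-in (an assumption, not the author's test.sh) and skips the bzip2 half when bzip2 is not installed; which tool wins depends on the input, so no result is asserted here.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
seq 1 50000 > data.txt            # stand-in sample; substitute your own file

gzip -9 -c data.txt > data.txt.gz
gz_size=$(wc -c < data.txt.gz)
echo "gzip -9:  $gz_size bytes"
if command -v bzip2 >/dev/null 2>&1; then
  bzip2 -9 -c data.txt > data.txt.bz2
  bz_size=$(wc -c < data.txt.bz2)
  echo "bzip2 -9: $bz_size bytes"
else
  echo "bzip2 is not installed; skipping"
fi
```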
Similarly, almost everything gzip has, bzip2 also has:
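# Set the compression level
bzip2 -1 test.sh
# tar's -j option compresses to bz2
tar -jcvf bash.tar.bz2 *.sh
# ...the rest is the same as gzip
Additionally, there are some unique features (bzip2 has them, while older versions of gzip do not; gzip 1.6 and later also accept -k):
# -k keeps the input file
bzip2 -k test.sh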
P.S. There is also a newer compression tool called lzma/unlzma, which is said to have a higher compression ratio. It is generally not pre-installed and needs to be installed manually. Its usage is the same as gzip/bzip2, and it supports all the options of both.
zip
A very common compression format. The compression ratio is not very high, but many network resources use this format.
Compression:
# Creates test.sh.zip without removing test.sh
zip test.sh.zip test.sh
# -r processes directories recursively
zip -r bundle.zip .
Decompression:
# Extracts into the current directory without removing test.sh.zip
unzip test.sh.zip
If the target file is found to already exist, it will prompt for options to replace/rename/cancel.
Updating:
# -u replaces the archived copy with a newer file
zip test.sh.zip -u test.sh
Deletion:
# -d deletes the named file from the archive
zip -d test.sh.zip test.sh
Viewing:
# -l lists the archive contents
unzip -l test.sh.zip
Encryption/Encoding
Linux provides many encryption/encoding tools: crypt, gpg, base64, etc.
crypt
Receives file input and password from stdin and outputs the encrypted result to stdout.
Encryption:
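# Prompts interactively for the passphrase
crypt < test.sh
# Redirect the encrypted output to a file
crypt < test.sh > test.lock.sh
Decryption:
# Likewise, it only reads from stdin and only writes to stdout
crypt PASSPHRASE < test.lock.sh > test.sh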
P.S. This command is not available on Mac.
gpg
GNU Privacy Guard, which encrypts and signs data with keys. Simple usage is as follows:
# Encrypt: prompts interactively for a passphrase and creates test.sh.gpg
gpg -c test.sh
# Decrypt: prompts interactively for the passphrase
gpg test.sh.gpg
P.S. This command is not available on Mac.
base64
Different from the two commands above, it is easily decoded and has little difference from plain text, so it can only be considered an encoding method:
# Encode
base64 test.sh > test.sh.base64
# Decode (-D on Mac; GNU coreutils uses -d)
base64 -D test.sh.base64 > test.sh
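Because the short decode flag differs between platforms, a round trip that reads from stdin and uses the long --decode option may be more portable. This is a sketch under that assumption (to my knowledge both GNU and Mac base64 accept --decode); decoded.sh is an illustrative name:

```shell
set -eu
work=$(mktemp -d)
cd "$work"
printf 'echo secret\n' > test.sh

base64 < test.sh > test.sh.base64               # encode; reading stdin avoids input-flag differences
base64 --decode < test.sh.base64 > decoded.sh   # decode with the long option
cat test.sh.base64
cmp test.sh decoded.sh && echo "round trip ok"
```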
rsync
rsync is used for system snapshot backups. It has built-in diff and compression mechanisms, which make it more efficient than commands like scp. It also works over the network: it compares source and target files, copies only the files that changed, and supports encryption options.
Backup:
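# Back up locally
# Creates bash.bak/bash in the current directory and copies everything under bash into it
# -a archive mode, -v prints a log
rsync -av bash bash.bak
# Back up to a remote host
rsync -av bash ayqy@<IP>:~/bak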
Note: Path format matters. If the source path ends with /, rsync copies only the files and subdirectories under it into the target path; otherwise, it first creates a folder with the source's name inside the target path and then copies everything into that folder. In short, with a trailing / on the source no folder is created. A trailing / on a target directory does not change this.
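The trailing-slash rule is easiest to see side by side. A minimal sketch, assuming two made-up target directories (with_folder and contents_only); it skips itself when rsync is not installed:

```shell
set -eu
work=$(mktemp -d)
cd "$work"
mkdir bash
echo 'echo hi' > bash/test.sh

if command -v rsync >/dev/null 2>&1; then
  rsync -a bash  with_folder      # no trailing /: creates with_folder/bash/test.sh
  rsync -a bash/ contents_only    # trailing /: creates contents_only/test.sh
  find with_folder contents_only -type f
else
  echo "rsync is not installed; skipping"
fi
```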
Regular backup only requires executing the same command periodically; it automatically checks for differences and updates, and then performs the backup.
Restore:
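# Restore from a local backup
rsync -av bash.bak/bash/ bash
# Restore from a remote backup
rsync -av ayqy@<IP>:~/bak/bash/ bash
Swap the parameter positions, pointing the source at the backed-up folder with a trailing / so that its contents land directly in bash.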
Other options and features:
# -z compresses data during transfer
rsync -zav bash bash.bak
# --exclude skips matching files
rsync -av bash bash.bak --exclude "*.md"
# --delete removes files from the backup that no longer exist in the source; by default they are kept
rsync -av bash bash.bak --delete
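The difference --delete makes shows up once a file disappears from the source. A small sketch with made-up file names (keep.sh, old.sh), skipped when rsync is not installed:

```shell
set -eu
work=$(mktemp -d)
cd "$work"
mkdir bash
echo 'keep' > bash/keep.sh
echo 'old'  > bash/old.sh

if command -v rsync >/dev/null 2>&1; then
  rsync -a bash bash.bak            # first backup: bash.bak/bash holds both files
  rm bash/old.sh                    # the file disappears from the source
  rsync -a bash bash.bak            # without --delete the backup still holds old.sh
  ls bash.bak/bash
  rsync -a --delete bash bash.bak   # with --delete the backup mirrors the source
  ls bash.bak/bash
else
  echo "rsync is not installed; skipping"
fi
```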