0. Preface
The filtering rules of the rsync command are very powerful, but there is almost no detailed material on them on the Internet, which makes them hard to learn and understand. The official website documentation and the man rsync manual are in English and lack examples, so they are difficult to follow; even a translated manual remains hard to understand.
This tutorial therefore explains the concepts, logic, and usage techniques of filtering rules more thoroughly, based on actual commands. I hope it helps interested readers get started quickly and save time. Why is it called entry-level? Because rsync's filtering rules also include some advanced features that, in my experience, are rarely needed in real projects, so they are not covered here.
This tutorial is divided into the following four parts:
- 1. Overview: Explain the relevant concepts and internal operation logic of rsync filtering rules
- 2. Operations and modifiers: Explain operations and modifiers in filtering rules
- 3. Matching pattern: explain the matching pattern in filtering rules
- 4. Examples of usage scenarios: Use specific commands to explain the techniques of using filtering rules
1. Overview
1.1 What are rsync filtering rules?
rsync filtering rules define which files are transferred (included) and which are not (excluded). The rules can be written directly as command parameters, or placed in a rule file that the command references. For example:
# Command 1.1-1: rules written directly in the command parameters
# --include="*.php" and --exclude="*" configure two filtering rules
# This command only synchronizes the php files in the src_dir directory (not including subdirectories)
rsync -av --include="*.php" --exclude="*" src_dir/ dst_dir/

# Command 1.1-2: rules configured through a rule file
# The rule file can be given as a relative or absolute path
rsync -av -f ". ./" src_dir/ dst_dir/
rsync -av -f ". /www/" src_dir/ dst_dir/
The contents of the rule file are as follows (specific syntax description later):
# Synchronize only php files
+ *.php
- *
1.2 What are the ways to configure filtering rules?
The options related to filtering rules include:
- --include=PATTERN: set an include rule, such as: --include="*.php"
- --exclude=PATTERN: set an exclude rule, such as: --exclude="*"
- --include-from=FILE: specify a file containing one include rule per line; lines beginning with ; or # are comments, and blank lines are ignored
- --exclude-from=FILE: specify a file containing one exclude rule per line; lines beginning with ; or # are comments, and blank lines are ignored
- --filter=RULE, -f: set a filtering rule, which may be an exclude or include rule (such as: -f "- *.php") or another type of rule (such as one that includes a rule file: -f ". ./")
The rules configured by these options are essentially the same; only the way they are written differs. The --filter=RULE (-f) option supports the complete rule syntax, and the other options can all be expressed with it. For example:
# The following commands are completely equivalent. The -f form is the shortest, so the rest of this tutorial mostly uses it
# --include="xxx" is equivalent to -f "+ xxx"
# --exclude="xxx" is equivalent to -f "- xxx"
rsync -av --include="*.php" --exclude="*" src_dir/ dst_dir/
rsync -av --filter "+ *.php" --filter "- *" src_dir/ dst_dir/
rsync -av -f "+ *.php" -f "- *" src_dir/ dst_dir/
A rule file loaded with the --include-from or --exclude-from option behaves as if + or - were automatically added in front of each rule. Such a rule file looks like this:
# Rule file referenced by --include-from or --exclude-from, written without + or -
*.php
*
1.3 How filtering rules work
All filtering rule configurations eventually form one ordered list of rules inside rsync: rules from options that appear earlier on the command line come earlier in the list. As the list of file and directory paths to transfer is built, rsync checks each path against the rules in order. The first rule that matches decides the outcome (include or exclude) immediately, and the remaining rules are not checked; if no rule matches, the path is included by default. With a recursive option (such as -r or -a), when a subdirectory is excluded by a rule, rsync does not descend into it: the files and directories under it are neither checked against the rules nor transferred, so the entire subdirectory is excluded. A brief summary:
- Rules are ordered, in the same order as the parameters in the command (from left to right)
- While scanning the file system, rsync checks each path (directory or file) against the rules as soon as it is read, to decide whether to include or exclude it
- Rules are checked in order. The first hit includes or excludes the path immediately and stops further checks; if nothing hits, the path is included by default
- When a directory is excluded, all subdirectories and files below it are excluded as well
As an example:
# This command synchronizes all php files in the src_dir directory (not including subdirectories)
# Every file and directory path in src_dir is checked against the -f "+ *.php" rule first. A php file matches this rule and is included; the later -f "- *" rule is not checked for it
# Other file types and subdirectories do not match the first rule, go on to the second rule, and are all excluded
# Once a subdirectory is excluded, php files inside it are never checked, so this command does not synchronize php files in subdirectories of src_dir
rsync -av -f "+ *.php" -f "- *" src_dir/ dst_dir/

# Swapping the order of the two parameters gives a completely different result
# This command synchronizes no files or directories
# The first rule excludes all files and directories, so the second rule never gets a chance to take effect
rsync -av -f "- *" -f "+ *.php" src_dir/ dst_dir/

# This command synchronizes all php files in the src_dir directory (including subdirectories)
# The rule -f "+ */" matches and includes all subdirectories, so rsync scans them and the php files inside are included
rsync -av -f "+ *.php" -f "+ */" -f "- *" src_dir/ dst_dir/
1.4 Syntax for configuring filtering rules
The syntax of the filtering rule is as follows:
operation [match pattern]
operation,modifier [match pattern]
Operation: for example + (include) and - (exclude). There is also the . used earlier to reference a rule file (-f ". ./"); the . sign is the operation that includes a rule file. The other operations are described later.
Modifier: changes certain behaviors of a rule, as detailed later.
Match pattern: a pattern used to match a string, similar to a regular expression; it checks whether a string conforms to the pattern. When it does, the pattern is said to match, or hit. The brackets indicate that the match pattern is optional, because some special operations take no match pattern.
The separator between the operation and the match pattern must be an ASCII space. It can also be replaced by the _ character, in which case the quotation marks can be omitted when writing rules directly on the command line. For example:
# The following two commands are completely equivalent
# When the quotation marks are omitted, beware that the shell may unexpectedly expand * into paths
# Using a space together with quotes is recommended: it is clearer and safer
rsync -av -f "+ *.php" -f "+ */" -f "- *" src_dir/ dst_dir/
rsync -av -f +_*.php -f +_*/ -f -_* src_dir/ dst_dir/
1.5 Test Methods
If you need to modify a command repeatedly while testing, you may have to keep deleting files from the target directory, depending on how the synchronization is done, which is inconvenient. One solution is to change the rsync command slightly and leave out the target directory: no data is actually synchronized, and rsync only lists the files/directories that would be synchronized. For example:
# Original command
rsync -av -f "+ *.php" -f "+ */" -f "- *" src_dir/ dst_dir/

# After modification
# This command does not actually sync data; it only outputs the list of files/directories that would be synchronized
rsync -av -f "+ *.php" -f "+ */" -f "- *" src_dir/
2. Operations and modifiers
2.1 Operation
Rule types are determined by operations: each operation can be regarded as a different rule type. There are 9 operations in total. Each operation has a long ID and a short ID; the short ID is generally used in configuration.
| Serial number | Long ID | Short ID | Operating Instructions | Note |
| ---- | ---- | ---- | ---- | ---- |
| 1 | exclude | - | Specify an exclude pattern | |
| 2 | include | + | Specify an include pattern | |
| 3 | merge | . | Specify a merge file to read more rules from | |
| 4 | dir-merge | : | Specify a per-directory merge file | |
| 5 | hide | H | Specify a pattern for hiding files from the transfer | |
| 6 | show | S | Specify that certain files will not be hidden | |
| 7 | protect | P | Specify that certain files will not be deleted | |
| 8 | risk | R | Specify that certain files will not be protected | |
| 9 | clear | ! | Clear the current include/exclude list | No match pattern |
The most basic and most commonly used operations are the first 3, which are easier to understand. They have been explained in the previous examples and will not be repeated here.
2.2 Modifier
Modifiers can only be used with the include/exclude operations (+/-), and there are 7 of them in total. When the operation is written with its short ID, the comma between the operation and the modifier can be omitted.
1. Modifier /
The match pattern of an include/exclude rule is normally matched against paths relative to the transfer directory. With this modifier, the relative path is converted to an absolute path before matching; the matching method and rules are otherwise unchanged. For example:
# Assume that the absolute path of src_dir is: /www/src_dir

# Command 2.2-1: This command will synchronize all php files under src_dir, but not include
# When scanning a file, its relative path is ''. Converted to an absolute path it becomes '/www/src_dir/'
# When checked against the '-f "-/ src_dir/"' rule it matches (see the matching pattern section later) and is then excluded
rsync -av -f "-/ src_dir/" -f "+ *.php" -f "- *" src_dir/

# Similarly, this command has the same effect
rsync -av -f "-/ /www/src_dir/" -f "+ *.php" -f "- *" src_dir/
2. Modifier !
Negates the match result: a successful match is treated as a failure, and a failed match is treated as a success. For example:
# This command only synchronizes php files under src_dir
# When the files and directories in src_dir are scanned, directories and non-php files fail to match; negation turns that into a successful match, so they are excluded
# php files end up as a failed match, hit no rule, and are kept by default
rsync -av -f "-! *.php" src_dir/

# Testing shows that the / and ! modifiers can be used at the same time
# Modified from command 2.2-1
# This command will only sync files
rsync -av -f "-/! src_dir/" -f "+ *.php" -f "- *" src_dir/

# But the following command reports an error
rsync -av -f "-!/ src_dir/" -f "+ *.php" -f "- *" src_dir/

# It works after changing to single quotes. Inside double quotes, an interactive bash shell most likely treats the ! followed by a non-space character as history expansion
# In other words, when multiple modifiers are used, their order does not matter
# Another lesson learned: use single quotes when in doubt
rsync -av -f '-!/ src_dir/' -f "+ *.php" -f "- *" src_dir/
3. Modifier C
Not studied in detail; skipped.
4. Modifier s
Not studied in detail; skipped.
5. Modifier r
Not studied in detail; skipped.
6. Modifier p
Not studied in detail; skipped.
7. Modifier x
Not studied in detail; skipped.
3. Matching pattern
Include and exclude rules carry a match pattern: in the filtering rule - *.php, the string *.php is the match pattern of the rule. It is checked against the transfer path (the path of the file or directory relative to the source directory being synchronized; with the / modifier, the absolute path). A match pattern is a string that describes features of a path. For example, *.php says that the name (file or directory) at the end of the path must end with .php. Its function and usage resemble regular expressions, but it is simpler.
Transfer path: refers to the relative path of a file or directory in src_dir, and the format is similar to the following:
config/ config/ config/ config/ routes/ routes/ routes/ routes/ routes/
Match patterns are checked against these paths.
Regarding match patterns, I have summarized 10 pattern rules in total:
- 1. A pattern that starts with / must match from the beginning of the path; otherwise it can match a name at any level of the path. E.g. /*.php
- 2. A pattern that ends with / matches only directories; otherwise it can match directories or files. E.g. config/
- 3. A / in the middle of a pattern is a path separator. E.g. subdir/
- 4. * matches any characters of any length, but not /. E.g. *.php
- 5. ** matches any characters of any length, including /. E.g. app/** matches the path "app/xx/xx/"
- 6. *** at the end of a pattern matches any characters of any length (including /), and also matches the directory itself. E.g. app/*** matches the paths "app" and "xxx/app"
- 7. ? matches any single character other than /
- 8. [] matches a class of characters. E.g. [a-z] matches one lowercase letter, and [0-9] matches one digit
- 9. By default a pattern must match through to the end of the path. E.g. foo matches "foo" and "xx/foo", but not "xx/foo1" or "foo/xx"
- 10. The matched part of the path must consist of complete (directory or file) names and cannot start in the middle of a name. E.g. foo does not match "xxx/afoo", and abc/foo does not match "subdir/aabc/foo"
The following examples illustrate the use of each rule:
4. Examples of usage scenarios
4.1 Scenario: Exclude some directories or files from synchronization
This requirement is relatively simple because all paths are included by default. You only need to set exclusion rules and exclude the corresponding directories or files.
# Exclude the app and vendor directories from synchronization
# Note: because of pattern rule 1, this command excludes directories or files named app or vendor at any level
rsync -av -f '- app' -f '- vendor' src_dir/

# This command only excludes app and vendor at the first level of src_dir
rsync -av -f '- /app/' -f '- /vendor' src_dir/

# Exclude a deeper subdirectory
rsync -av -f '- /app/Admin' -f '- /vendor' src_dir/

# Exclude hidden files/directories and php files
# Because of pattern rules 9, 10 and 4, the pattern `.*` requires that the last name (file or directory) of the path start with `.`
rsync -av -f '- .*' -f '- *.php' src_dir/
4.2 Scenario: Only the specified subdirectories are synchronized
Because a transfer path that hits no rule is included (synchronized), synchronizing only specified directories requires explicit filtering rules that exclude the paths you do not want. The overall idea is to first write the rules that include the paths to be synchronized, and then exclude the remaining paths. Start with the simple case of synchronizing first-level subdirectories.
Only synchronize first-level subdirectories
All of the following commands are equivalent, though they take different approaches. The effect: only the config directory (including all its subdirectories and files) is synchronized.
# Command 4.2-1: uses pattern rules 1 and 4
# Because of pattern rule 1, the -f '- /*' rule excludes all other files and subdirectories directly under the src_dir directory
# When the config directory is scanned recursively, its subdirectories and files hit no rule and are included by default
rsync -av -f '+ config' -f '- /*' src_dir/

# Command 4.2-2: uses pattern rules 4, 5 and 9
# -f '+ config' matches the config directory and includes it
# -f '+ config/**' matches the paths of all subdirectories and files under config and includes them (pattern rule 5)
# All remaining paths are matched by -f '- *' and excluded (pattern rules 4 and 9)
rsync -av -f '+ config/**' -f '+ config' -f '- *' src_dir/

# Command 4.2-3:
# -f '+ config/***' matches the config directory and all its subdirectories and files (pattern rule 6)
rsync -av -f '+ config/***' -f '- *' src_dir/

# Command 4.2-4:
# Uses the ! modifier to negate: paths other than the config directory and its subdirectories and files are excluded
# This is equivalent to synchronizing only the config directory
rsync -av -f '-! config/***' src_dir/
Extension: Only synchronize multiple first-level subdirectories
Only synchronize config and app directories (including all subdirectories and files)
# Note that the exclusion rules of the two commands differ; see pattern rule 1
rsync -av -f '+ config' -f '+ app' -f '- /*' src_dir/
rsync -av -f '+ config/***' -f '+ app/***' -f '- *' src_dir/
Only synchronize deeper subdirectories
Following the approach used for first-level subdirectories, you might write the command directly as rsync -av -f '+ app/Admin' -f '- /*' src_dir/. But this command does not work as expected. The reason is that when rsync scans the app path, it does not hit the -f '+ app/Admin' rule and is then excluded by the following rule. Nothing below it is scanned, so this command does not sync any files.
With that understood, here are corrected commands that synchronize only the app/Admin subdirectory (including all its subdirectories and files):
# Command 4.2-5:
# When the first-level subdirectory app is scanned, its path does not match the rule -f '+ app/Admin/', because of pattern rule 9
# The app path is included by hitting -f '+ app'. Without this rule, the command would synchronize no files or directories
# The two rules -f '- /*' -f '- /*/*' exclude all first-level and second-level subdirectories/files not included by the earlier rules
# Subdirectories/files under app/Admin/ hit no rule and are included by default
rsync -av -f '+ app/Admin/' -f '+ app' -f '- /*' -f '- /*/*' src_dir/

# Command 4.2-6: same effect as command 4.2-5
# Same idea as command 4.2-4
rsync -av -f '+ app' -f '-! app/Admin/***' -f '- /*' src_dir/

# Command 4.2-6: same effect as the previous command, but a different idea
# The -f '+ app' rule includes app
# The -f '+ app/Admin/***' rule includes the app/Admin/ directory and all its subdirectories and files
# The remaining paths are excluded by the -f "- *" rule
rsync -av -f '+ app' -f '+ app/Admin/***' -f "- *" src_dir/

# Command 4.2-7:
# The -f '-! app/***' rule uses negation, so only the app subdirectory is kept
# -f '+ app/Admin/' includes the app/Admin/ directory; the other second-level subdirectories are excluded by -f "- /*/*"
# Subdirectories and files under the app/Admin/ directory hit no rule and are kept
rsync -av -f '-! app/***' -f '+ app/Admin/' -f "- /*/*" src_dir/
Extension: Only synchronize multiple deep-level subdirectories
As with synchronizing a single deep subdirectory, many different commands are possible. To reduce errors, however, I recommend the approach of the second command 4.2-6 above ('+ app', '+ app/Admin/***', '- *'), which is relatively concise: add an include rule for every parent directory of each deep subdirectory, then an include rule ending in *** for the deep subdirectory itself, and finally one exclude rule.
rsync -av -f '+ /app' -f '+ /app/Admin/***' -f '+ /vendor' -f '+ /vendor/psy/***' -f "- *" src_dir/
4.3 Scenario: Quickly copy the directory structure
Sometimes you need to create a directory with the same subdirectory hierarchy as another directory, but without the files in it. The rsync command can do this quickly.
# Use the ! modifier to negate: every path that is not a directory is excluded (pattern rule 2)
rsync -av -f '-! */' src_dir/ dst_dir/
5. Summary
This article covers four main aspects of rsync filtering rules: configuration methods, internal workings, rule syntax, and usage techniques. The content comes from the official manual and from practical tests. If you find any mistakes, corrections are welcome. You are also welcome to share your rsync experience and tips in the comment area.
It should also be emphasized that this is not the entire content of rsync filtering rules. For example, the explanation of operations and modifiers is incomplete, and some other filter-related options of the command (such as --prune-empty-dirs) are not covered either. For more advanced features, please check the official manual.