RS, ORS, FS, and OFS in awk: a summary of their differences and connections

When learning awk, be sure to try things out by hand, because problems only show up in practice. Below is what I learned while practicing: a summary of the differences and connections between RS, ORS, FS, and OFS.

1. RS and ORS

1. RS is the record separator. Its default value is \n. The examples below show how it is used.


[root@krlcgcms01 mytest]# cat test1     # the test file

 111 222

 333 444

 555 666

2. The default RS is \n


[root@krlcgcms01 mytest]# awk '{print $0}' test1  # same as: awk 'BEGIN{RS="\n"}{print $0}' test1

111 222

333 444

555 666

In other words, you can think of the content of test1 as 111 222\n333 444\n555 666, which awk splits into records at each \n. See the next example.

3. A custom RS


[zhangy@localhost test]$ echo "111 222|333 444|555 666"|awk 'BEGIN{RS="|"}{print $0,RT}'

 111 222 |

 333 444 |

 555 666

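This example shows how RS works: each | ends a record. RT, which holds the text that matched RS for the current record, is a gawk extension and is not available in every awk.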
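The record splitting above can also be checked without RT, which keeps the command portable to awk implementations other than gawk. The following is a minimal sketch; the variable name records is only for illustration, and printf is used instead of echo so that no trailing newline ends up in the last record.

```shell
# Split the input on "|" and number each record with NR.
# RS is a single character here, which every POSIX awk supports.
records=$(printf '111 222|333 444|555 666' |
  awk 'BEGIN { RS = "|" } { print NR ": " $0 }')

printf '%s\n' "$records"
```

Each output line starts with the record number, so it is easy to see that awk read three records from a single line of input.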

4. RS may also be a regular expression


[zhangy@localhost test]$ echo "111 222a333 444b555 666"|awk 'BEGIN{RS="[a-z]+"}{print $1,RS,RT}'

 111 [a-z]+ a

 333 [a-z]+ b

 555 [a-z]+

Examples 3 and 4 show that RT holds the text that RS actually matched. When RS is a fixed string, RT is simply that string. For the last record, where nothing matched, RT is empty.

5. When RS is empty


[zhangy@localhost test]$ cat -n test2

 1  111 222

 2

 3  333 444

 4  333 444

 5

 6

 7  555 666

[zhangy@localhost test]$ awk 'BEGIN{RS=""}{print $0}' test2

111 222

333 444

333 444

555 666
[zhangy@localhost test]$ awk 'BEGIN{RS="";}{print "<",$0,">"}' test2  # the record boundaries are easier to see here

< 111 222 >
< 333 444     (this line and the next one are a single record)

333 444 >

< 555 666 >

This example shows that when RS is the empty string, awk treats one or more blank lines as the record separator, so each paragraph becomes one record.
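To confirm how many records and fields awk sees in this mode, you can print NR and NF for each record. This is only a sketch: the input is generated with printf and has the same shape as test2, and the variable name summary is made up for the example.

```shell
# Three paragraphs separated by one or more blank lines.
# With RS = "", each paragraph is one record, and a newline inside a
# paragraph also separates fields.
summary=$(printf '111 222\n\n333 444\n333 444\n\n\n555 666\n' |
  awk 'BEGIN { RS = "" } { print NR, NF } END { print "records:", NR }')

printf '%s\n' "$summary"
```

The second record reports four fields, because its two lines are read as one record and the newline between them counts as a field separator.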

6. ORS is the output record separator. Its default value is \n

Think of ORS as the reverse of RS: RS splits the input into records, and ORS is written after each record on output. That makes it easier to remember and understand. See the examples below.


[zhangy@localhost test]$ awk 'BEGIN{ORS="\n"}{print $0}' test1  # same as: awk '{print $0}' test1

111 222

333 444

555 666

[zhangy@localhost test]$ awk 'BEGIN{ORS="|"}{print $0}' test1

111 222|333 444|555 666|
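Because ORS is written after every record, the last example ends with a trailing |. If that is not wanted, one possible workaround (a sketch, not the only way) is to write the separator before every record except the first. The variable names joined_ors and joined_clean are made up for this example.

```shell
# ORS is appended after every record, including the last one.
joined_ors=$(printf '111 222\n333 444\n555 666\n' |
  awk 'BEGIN { ORS = "|" } { print $0 }')

# Alternative: emit "|" in front of every record except the first,
# then finish the line with a newline in END.
joined_clean=$(printf '111 222\n333 444\n555 666\n' |
  awk '{ printf "%s%s", (NR > 1 ? "|" : ""), $0 } END { print "" }')

printf '%s\n' "$joined_ors"
printf '%s\n' "$joined_clean"
```

The first command reproduces the output above, and the second joins the same records without the trailing separator.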

2. FS and OFS

1. FS specifies the field (column) separator


[zhangy@localhost test]$ echo "111|222|333"|awk '{print $1}'

 111|222|333

[zhangy@localhost test]$ echo "111|222|333"|awk 'BEGIN{FS="|"}{print $1}'

 111

2. FS can also be a regular expression


[zhangy@localhost test]$ echo "111||222|333"|awk 'BEGIN{FS="[|]+"}{print $1}'

111
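Printing only $1 does not show what the regular expression changes, so the sketch below also prints NF. It compares FS set in BEGIN, the equivalent -F option, and a single-character FS on the same input. The variable names are made up for this example.

```shell
line='111||222|333'

# FS as a regular expression: a run of one or more "|" is one separator.
with_begin=$(printf '%s\n' "$line" |
  awk 'BEGIN { FS = "[|]+" } { print NF ": " $1 "," $2 "," $3 }')

# The -F option sets FS before the program starts, same result.
with_flag=$(printf '%s\n' "$line" |
  awk -F'[|]+' '{ print NF ": " $1 "," $2 "," $3 }')

# A single "|" is taken literally, so "||" produces an empty field.
single_char=$(printf '%s\n' "$line" | awk -F'|' '{ print NF }')

printf '%s\n' "$with_begin" "$with_flag" "$single_char"
```

With the regular expression there are three fields. With a single | as FS there are four, because the empty string between the two adjacent separators counts as a field.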

3. When FS is empty


[zhangy@localhost test]$ echo "111|222|333"|awk 'BEGIN{FS=""}{NF++;print $0}'

1 1 1 | 2 2 2 | 3 3 3

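When FS is empty, awk (gawk here) treats each character of the record as a separate field. POSIX does not define this case, so other awk implementations may behave differently. The NF++ changes NF, which makes awk rebuild $0 with the default OFS (a space), so the individual fields become visible.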

4. When RS is set to something other than \n, \n becomes one of the field separators


[zhangy@localhost test]$ cat test1

 111 222

 333 444

 555 666

[zhangy@localhost test]$ awk 'BEGIN{RS="444";}{print $2,$3}' test1

 222 333

 666

In test1 there is a \n between 222 and 333. With RS set to 444, the first record spans two lines, and the default FS splits on newlines as well as on spaces and tabs, so 222 and 333 become two fields of the same record. Intuitively you would expect them to be one column on two separate rows.
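The same effect can be shown with a single-character RS, which does not depend on gawk's support for a multi-character RS. This is a sketch with made-up input; the variable name fields is only for illustration.

```shell
# RS is "|", so the first record contains a newline: "111 222\n333 444".
# With the default FS, that newline separates fields just like a space.
fields=$(printf '111 222\n333 444|555 666\n' |
  awk 'BEGIN { RS = "|" } { print NR ": NF=" NF " third=[" $3 "]" }')

printf '%s\n' "$fields"
```

The first record has four fields and its third field is 333, which sits on the second input line. The second record has only two fields, so its third field is empty.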

5. OFS is the output field separator


[zhangy@localhost test]$ awk 'BEGIN{OFS="|";}{print $1,$2}' test1

 111|222

 333|444

 555|666

[zhangy@localhost test]$ awk 'BEGIN{OFS="|";}{print $1 OFS $2}' test1

 111|222

 333|444

 555|666

test1 has only two columns. With 100 columns, writing every field out by hand would be far too tedious.


[zhangy@localhost test]$ awk 'BEGIN{OFS="|";}{print $0}' test1

 111 222

 333 444

 555 666

[zhangy@localhost test]$ awk 'BEGIN{OFS="|";}{NF=NF;print $0}' test1

 111|222

 333|444

 555|666

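Why does OFS take effect in the second command? awk only rebuilds $0 when a field or NF is assigned to, and the rebuild joins the fields with OFS. NF=NF is such an assignment, even though the value does not change. In the first command nothing is assigned, so $0 is printed exactly as it was read.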
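Another common way to trigger the rebuild is $1=$1, which assigns the first field to itself. The sketch below compares an untouched $0, a rebuilt $0, and a $0 rebuilt after a real field change. The variable names are made up for this example.

```shell
# No assignment: $0 is printed exactly as read, OFS is not used.
untouched=$(printf '111 222 333\n' |
  awk 'BEGIN { OFS = "|" } { print $0 }')

# Assigning a field to itself forces awk to rebuild $0 with OFS.
rebuilt=$(printf '111 222 333\n' |
  awk 'BEGIN { OFS = "|" } { $1 = $1; print $0 }')

# Changing any field has the same effect on the whole record.
edited=$(printf '111 222 333\n' |
  awk 'BEGIN { OFS = "|" } { $2 = "x"; print $0 }')

printf '%s\n' "$untouched" "$rebuilt" "$edited"
```

Only the last two commands print | between the fields, which matches the difference between print $0 and NF=NF;print $0 above.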