When learning awk, you have to practice it by hand; only through practice do you run into the real problems. Below are my notes from learning and practicing awk, summarizing the differences and connections between RS, ORS, FS, and OFS.
1. RS and ORS
1. RS is the record separator. Its default value is \n. See the usage below.
[root@krlcgcms01 mytest]# cat test1 //Test file
111 222
333 444
555 666
2. The default value of RS is \n
[root@krlcgcms01 mytest]# awk '{print $0}' test1 //Same as: awk 'BEGIN{RS="\n"}{print $0}' test1
111 222
333 444
555 666
In fact, you can think of the content of test1 as 111 222\n333 444\n555 666, split on \n. See the next example.
3. Customizing the RS separator
[zhangy@localhost test]$ echo "111 222|333 444|555 666"|awk 'BEGIN{RS="|"}{print $0,RT}'
111 222 |
333 444 |
555 666
Based on the above example, it is easy to understand the usage of RS.
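As a small sketch of the same idea, assuming some comma-separated input made up for illustration, RS can be set to a single character and NR used to number the resulting records. printf is used here instead of echo so that no trailing newline ends up inside the last record.

```shell
# Hypothetical input: three records separated by commas, no trailing newline.
# RS="," is a single character, so this does not rely on regex support in RS.
out=$(printf '111 222,333 444,555 666' |
  awk 'BEGIN { RS = "," } { print NR ": " $0 }')
echo "$out"
# 1: 111 222
# 2: 333 444
# 3: 555 666
```

With echo, the final \n would stay inside the third record, because \n is no longer the record separator.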
4. RS may also be a regular expression
[zhangy@localhost test]$ echo "111 222a333 444b555 666"|awk 'BEGIN{RS="[a-z]+"}{print $1,RS,RT}'
111 [a-z]+ a
333 [a-z]+ b
555 [a-z]+
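From examples 3 and 4, we can see that RT holds the text that RS actually matched. If RS is a fixed string, RT is the same as RS. For the last record RT is empty, because the input ends without another match for RS. Note that RT is a gawk extension and is not available in every awk.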
5. When RS is empty
[zhangy@localhost test]$ cat -n test2
1 111 222
2
3 333 444
4 333 444
5
6
7 555 666
[zhangy@localhost test]$ awk 'BEGIN{RS=""}{print $0}' test2
111 222
333 444
333 444
555 666
[zhangy@localhost test]$ awk 'BEGIN{RS="";}{print "<",$0,">"}' test2 //This example looks more obvious
< 111 222 >
< 333 444 //This line and the following line are one line
333 444 >
< 555 666 >
From this example, we can see that when RS is empty, awk treats one or more consecutive blank lines as the record separator, so each block of non-blank lines becomes one record.
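A common companion to an empty RS is setting FS to \n, so that every line of a block becomes one field. The following is only a sketch, using sample data shaped like test2 and fed in through printf:

```shell
# Sample data mirroring test2: blocks separated by one or more blank lines.
# RS=""  -> blank lines separate records
# FS="\n" -> each line inside a record is one field
out=$(printf '111 222\n\n333 444\n333 444\n\n\n555 666\n' |
  awk 'BEGIN { RS = ""; FS = "\n" } { print NR ": " NF " line(s), first = " $1 }')
echo "$out"
# 1: 1 line(s), first = 111 222
# 2: 2 line(s), first = 333 444
# 3: 1 line(s), first = 555 666
```

Here NF tells you how many lines each block contains, which is handy for multi-line records such as address blocks.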
6. ORS is the output record separator; its default value is \n
Think of ORS as the reverse of RS, which makes it easier to remember and understand. See the example below.
[zhangy@localhost test]$ awk 'BEGIN{ORS="\n"}{print $0}' test1 //Same as: awk '{print $0}' test1
111 222
333 444
555 666
[zhangy@localhost test]$ awk 'BEGIN{ORS="|"}{print $0}' test1
111 222|333 444|555 666|
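Notice that ORS="|" also leaves a trailing | and no final newline. If that is not wanted, one possible workaround (a sketch, not the only way) is to print the separator before every record except the first, and add the newline in END:

```shell
# Same three lines as test1, supplied through printf so the example is self-contained.
# The separator is emitted before each record except the first one.
out=$(printf '111 222\n333 444\n555 666\n' |
  awk '{ printf "%s%s", (NR > 1 ? "|" : ""), $0 } END { print "" }')
echo "$out"
# 111 222|333 444|555 666
```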
2. FS and OFS
1. FS specifies the field (column) separator
[zhangy@localhost test]$ echo "111|222|333"|awk '{print $1}'
111|222|333
[zhangy@localhost test]$ echo "111|222|333"|awk 'BEGIN{FS="|"}{print $1}'
111
2. FS can also be a regular expression
[zhangy@localhost test]$ echo "111||222|333"|awk 'BEGIN{FS="[|]+"}{print $1}'
111
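To see what the regular expression changes, it helps to compare NF for a literal | and for [|]+ on the same input. The -F option used below is simply the command-line way of setting FS:

```shell
# With a single literal "|", the empty string between "||" counts as a field.
a=$(echo "111||222|333" | awk -F'|' '{ print NF }')
# With the regex [|]+, a run of "|" characters is one separator.
b=$(echo "111||222|333" | awk -F'[|]+' '{ print NF }')
echo "$a $b"
# 4 3
```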
3. When FS is empty
[zhangy@localhost test]$ echo "111|222|333"|awk 'BEGIN{FS=""}{NF++;print $0}'
1 1 1 | 2 2 2 | 3 3 3
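When FS is empty, awk treats each character in the record as a separate field. (gawk behaves this way; POSIX leaves the result of an empty FS unspecified.) The NF++ changes the record, so $0 is rebuilt with the fields joined by OFS, which is why the characters are printed with spaces between them.
4. When RS is set to something other than \n, \n becomes one of the field separators (with the default FS)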
[zhangy@localhost test]$ cat test1
111 222
333 444
555 666
[zhangy@localhost test]$ awk 'BEGIN{RS="444";}{print $2,$3}' test1
222 333
666
There is a \n between 222 and 333. With RS set to 444, the first record is 111 222\n333, and because the default FS splits on spaces, tabs, and newlines, 222 and 333 are treated as two fields of the same record. By ordinary thinking, they would be fields on two different rows.
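This only happens with the default FS. As a sketch, using a made-up input and a single-character RS (;) to stay portable, setting FS to the regular expression [ ] makes awk split on single spaces only, so the newline stays inside the field:

```shell
# First record is "111 222\n333" in both commands.
# Default FS: spaces, tabs and newlines all separate fields -> 3 fields.
a=$(printf '111 222\n333;444' | awk 'BEGIN { RS = ";" } NR == 1 { print NF }')
# FS="[ ]": only a single space separates fields -> "222\n333" is one field.
b=$(printf '111 222\n333;444' | awk 'BEGIN { RS = ";"; FS = "[ ]" } NR == 1 { print NF }')
echo "$a $b"
# 3 2
```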
5. OFS is the output field separator
[zhangy@localhost test]$ awk 'BEGIN{OFS="|";}{print $1,$2}' test1
111|222
333|444
555|666
[zhangy@localhost test]$ awk 'BEGIN{OFS="|";}{print $1 OFS $2}' test1
111|222
333|444
555|666
test1 has only two columns. If there were 100 columns, writing them all out would be far too tedious.
[zhangy@localhost test]$ awk 'BEGIN{OFS="|";}{print $0}' test1
111 222
333 444
555 666
[zhangy@localhost test]$ awk 'BEGIN{OFS="|";}{NF=NF;print $0}' test1
111|222
333|444
555|666
Why does OFS take effect in the second command? awk keeps $0 exactly as it was read until a field or NF is assigned. An assignment such as NF=NF, even though it changes no value, makes awk rebuild $0 by joining the fields with OFS.
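Another common way to force the rebuild is $1=$1, which is often used to convert one separator into another. A minimal sketch, with colon-separated sample data made up for illustration:

```shell
# Read fields split on ":" and write them back joined by ",".
# $1 = $1 assigns a field to itself, which makes awk rebuild $0 using OFS.
out=$(printf 'a:b:c\nd:e:f\n' |
  awk 'BEGIN { FS = ":"; OFS = "," } { $1 = $1; print }')
echo "$out"
# a,b,c
# d,e,f
```

Without the $1 = $1 assignment, the lines would be printed unchanged, colons included.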