Introduction
This article covers how to read simple CSV files with Linux Bash scripts and process them, because it is easy to write.
How to use the cut command
A common pattern for reading and processing a CSV file in a Bash script is to read the file line by line from standard input and store each column in a variable using the cut command.
Script contents
#!/bin/bash

while read -r line
do
  # Split the line read from the CSV file into columns with the cut command
  col1=$(echo "${line}" | cut -d , -f 1)
  col2=$(echo "${line}" | cut -d , -f 2)
  col3=$(echo "${line}" | cut -d , -f 3)

  # The processing content is described here;
  # $colX refers to the columns of the line just read
  echo "col1:$col1 col2:$col2 col3:$col3"
done < "$1"
Contents of the CSV file
$ cat
a1,a2,a3
b1,b2,b3
c1,c2,c3
$
Execute the script with the CSV file as a parameter
$ ./
col1:a1 col2:a2 col3:a3
col1:b1 col2:b2 col3:b3
col1:c1 col2:c2 col3:c3
$
How to use IFS to store columns in variables
By changing IFS, the shell variable that holds the field delimiters, to a comma and giving the read command multiple variable names, you can write this more simply without the cut command.
Script contents
#!/bin/bash

# Store the columns of each line of the CSV file in multiple variables
while IFS=, read -r col1 col2 col3
do
  # The processing content is described here;
  # $colX refers to the columns of the line just read
  echo "col1:$col1 col2:$col2 col3:$col3"
done < "$1"
How to use IFS to store columns in an array (recommended)
You can also use the -a option of the read command to store the split columns in an array.
#!/bin/bash

while IFS=, read -r -a col
do
  echo "col1:${col[0]} col2:${col[1]} col3:${col[2]}"
done < "$1"
This method is the most recommended: because the columns are held in an array, it is easy to loop over them, add, delete, and otherwise process columns, and they can be referenced flexibly through Bash's parameter expansions.
#!/bin/bash

while IFS=, read -r -a col
do
  # Loop over the columns
  for c in "${col[@]}"
  do
    echo "loop:$c"
  done
  unset 'col[2]'           # delete the third column
  col+=(lastcol)           # append a column at the end
  echo "${col[@]}"         # all columns
  echo "${col[@]:1}"       # columns from the second one onward
  echo "${col[@]/#/col:}"  # prefix every column with "col:"
done < "$1"
$ ./
loop:a1
loop:a2
loop:a3
a1 a2 lastcol
a2 lastcol
col:a1 col:a2 col:lastcol
loop:b1
loop:b2
loop:b3
b1 b2 lastcol
b2 lastcol
col:b1 col:b2 col:lastcol
loop:c1
loop:c2
loop:c3
c1 c2 lastcol
c2 lastcol
col:c1 col:c2 col:lastcol
$
For space- or tab-delimited files
If the file is space-separated (SSV) or tab-separated (TSV), no IFS needs to be specified: the delimiter variable IFS defaults to space, tab, and newline, so the script can process such files as is.
#!/bin/bash

while read -r -a col
do
  echo "col1:${col[0]} col2:${col[1]} col3:${col[2]}"
done < "$1"
However, if a column is empty, consecutive spaces or tabs are treated as a single delimiter (and leading and trailing ones are stripped), so the values shift and land in the wrong variable positions.
$ cat
a1 a2 a3
b1  b3
  c3
$
$ ./
col1:a1 col2:a2 col3:a3
col1:b1 col2:b3 col3:
col1:c3 col2: col3:
$
To prevent this, replace the spaces or tabs with commas and read the result into a comma-separated array (this assumes exactly one delimiter character per column boundary).
#!/bin/bash

# Split on commas instead of whitespace
IFS=,
while read -r line
do
  # Replace every space with a comma, then split into an array;
  # the expansion is intentionally unquoted so that word splitting occurs
  col=(${line// /,})
  echo "col1:${col[0]} col2:${col[1]} col3:${col[2]}"
done < "$1"
#!/bin/bash

# Same as above, but for tab-separated files
IFS=,
while read -r line
do
  col=(${line//$'\t'/,})
  echo "col1:${col[0]} col2:${col[1]} col3:${col[2]}"
done < "$1"
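If you want one script to handle both space- and tab-separated files, a minimal sketch (not in the scripts above, but using Bash's ordinary character-class patterns) is to replace every blank character in the expansion:

#!/bin/bash

# Sketch: handle space- and tab-separated files with one script.
# [[:blank:]] matches either a space or a tab in the replacement pattern.
IFS=,
while read -r line
do
  col=(${line//[[:blank:]]/,})
  echo "col1:${col[0]} col2:${col[1]} col3:${col[2]}"
done < "$1"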
How to preprocess the CSV file before reading it
If you want to read the CSV file in reverse order, or replace specific characters before reading it, you can preprocess the file by feeding the result of a process substitution <() to standard input.
#!/bin/bash

# tac outputs the file last line first
while IFS=, read -r -a col
do
  echo "col1:${col[0]} col2:${col[1]} col3:${col[2]}"
done < <(tac "$1")
$ cat
a1,a2,a3
b1,b2,b3
c1,c2,c3
$
$ ./
col1:c1 col2:c2 col3:c3
col1:b1 col2:b2 col3:b3
col1:a1 col2:a2 col3:a3
$
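The process substitution can just as well run a character replacement, as mentioned above. Here is a minimal sketch that assumes, purely for illustration, a semicolon-separated input file and converts it to commas with sed before the loop reads it:

#!/bin/bash

# Sketch: replace a specific character before reading.
# Assumes (for illustration) a semicolon-separated input file,
# converted to commas by sed inside the process substitution.
while IFS=, read -r -a col
do
  echo "col1:${col[0]} col2:${col[1]} col3:${col[2]}"
done < <(sed 's/;/,/g' "$1")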
How to use the awk command
If you want to process or aggregate CSV files simply, the awk command may be easier to work with.
The awk command reads the file given as a parameter line by line from the beginning and automatically stores the fields, split by the delimiter, in the variables $1, $2, and so on, so you write just the processing to apply to each line. When processing CSV files, specify a comma as the separator with the -F option.
$ awk -F, '{print "col1:"$1,"col2:"$2,"col3:"$3}'
col1:a1 col2:a2 col3:a3
col1:b1 col2:b2 col3:b3
col1:c1 col2:c2 col3:c3
$
Furthermore, if the processing becomes complicated, it can be written in a separate file as an awk script and passed with the -f option.
{ print "col1:"$1,"col2:"$2,"col3:"$3 }
$ awk -F, -f
col1:a1 col2:a2 col3:a3
col1:b1 col2:b2 col3:b3
col1:c1 col2:c2 col3:c3
$
If the file is space-separated (SSV) or tab-separated (TSV), awk splits on whitespace by default, so no option is needed to read it, just as with the Bash scripts. However, consecutive delimiters are again collapsed and the field positions shift, so it is safer to specify the separator explicitly as a single-character class, which preserves empty fields.
$ awk -F'[ ]' '{print "col1:"$1,"col2:"$2,"col3:"$3}'
col1:a1 col2:a2 col3:a3
col1:b1 col2: col3:b3
col1: col2: col3:c3
$
$ awk -F'[\t]' '{print "col1:"$1,"col2:"$2,"col3:"$3}'
col1:a1 col2:a2 col3:a3
col1:b1 col2: col3:b3
col1: col2: col3:c3
$
The awk command is powerful, but yet another way to handle CSV files is the perl command, which can express even more complex processing.
Perl is installed by default on most Linux servers, so if you are interested, give it a try.
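As a minimal sketch of that approach (the file name file.csv here is only a placeholder), perl's autosplit options can reproduce the awk example above: -a splits each line into the @F array, -F, sets the separator to a comma, and -n loops over the input lines.

$ perl -F, -lane 'print "col1:$F[0] col2:$F[1] col3:$F[2]"' file.csv

On the sample file used throughout this article, this should print the same col1/col2/col3 lines as the awk example.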
This concludes the article on how to use Bash to read and process CSV files. I hope you find it useful!