SoFunction
Updated on 2025-04-09

How to use cut for text extraction in Linux

Introduction

The cut command, which ships with Linux, is a command-line utility that extracts parts of each line of text from files or standard input. It is useful when you want to pull a specific field or column out of a file or data stream, for example a comma-separated or tab-separated file.

Basic syntax

The cut command selects parts of each line either by position (bytes or characters) or by field. For fields, you specify a delimiter (a single character such as a tab, space, or comma) and the numbers of the fields you want to display.

cut OPTION... [FILE]...

Common options

  • -b, --bytes=LIST: Select by specifying a byte, a group of bytes, or a range of bytes

  • -c, --characters=LIST: Select by specifying a character, a group of characters, or a range of characters

  • -d, --delimiter=DELIM: Specifies the delimiter that will be used instead of the default "TAB" delimiter

  • -f, --fields=LIST: Select only these fields; also print any line that contains no delimiter character, unless the -s option is specified

  • --complement: Complement the selection. With this option, cut displays all bytes, characters, or fields except the selected ones

  • -s, --only-delimited: Do not print lines that contain no delimiter

  • --output-delimiter=STRING: By default, cut uses the input delimiter as the output delimiter. This option specifies a different output delimiter string
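As a quick illustration of --output-delimiter, here is a minimal sketch. The sample record is made up for illustration, and because this option is not part of POSIX, the sketch assumes GNU cut.

```shell
# Hypothetical passwd-style record, used only for illustration.
record='alice:x:1001:1001:Alice Example:/home/alice:/bin/bash'

# Select fields 1 and 7 (login name and shell) and join them with " -> "
# instead of the input delimiter ":". Assumes GNU cut.
result=$(printf '%s\n' "$record" | cut -d ':' -f 1,7 --output-delimiter=' -> ')
echo "$result"
```

The output delimiter may be a multi-character string, whereas the input delimiter given with -d must be a single character.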

Range selection

  • N: Nth byte, character or field, counting starting from 1

  • N-: From the Nth byte, character or field to the end of the line

  • N-M: Bytes, characters or fields from Nth to Mth (inclusive)

  • -M: From the first to the Mth (inclusive) bytes, characters or fields
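The four range forms above can be sketched side by side. The colon-separated sample line is an assumption chosen only to make the field numbers easy to follow.

```shell
# Hypothetical colon-separated line with five fields.
line='a:b:c:d:e'

single=$(printf '%s\n' "$line" | cut -d ':' -f 2)     # N   : only field 2
from_n=$(printf '%s\n' "$line" | cut -d ':' -f 3-)    # N-  : field 3 to end of line
n_to_m=$(printf '%s\n' "$line" | cut -d ':' -f 2-4)   # N-M : fields 2 through 4
up_to_m=$(printf '%s\n' "$line" | cut -d ':' -f -2)   # -M  : fields 1 through 2

printf '%s\n' "$single" "$from_n" "$n_to_m" "$up_to_m"
```

The same LIST syntax applies to -b and -c, and several ranges can be combined with commas (for example 1,3-4).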

Example usage

-f: Field selection

This option specifies which fields to extract. Fields are separated by a delimiter: a tab by default, or any single character specified with the -d option. Spaces are not treated as delimiters unless you request them with -d.

Example: To extract the first and third columns from a tab-separated file

cut -f 1,3 filename
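To try this without an existing file, the following sketch builds a small tab-separated sample first. The file contents and the tmpfile variable are made up for illustration.

```shell
# Build a small tab-separated sample in a temporary file (contents are made up).
tmpfile=$(mktemp)
printf 'id\tname\tcity\n1\tAlice\tBoston\n2\tBob\tDenver\n' > "$tmpfile"

# No -d is needed here: TAB is the default field delimiter.
result=$(cut -f 1,3 "$tmpfile")
printf '%s\n' "$result"

rm -f "$tmpfile"
```

The selected fields are printed joined by the same tab character that separated them in the input.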

-d: Delimiter

This option specifies the character that separates fields. By default, cut assumes that fields are separated by tab characters, but another single-character delimiter such as a comma, colon, or space can be specified

Example: To extract the first and third fields from a comma-separated (CSV) file

Contents of the CSV file

Name,Age,Location
Alice,30,New York
Bob,25,Los Angeles
Charlie,35,Boston

Command

cut -d ',' -f 1,3 filename

Sample output

Name,Location
Alice,New York
Bob,Los Angeles
Charlie,Boston
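Two behaviors are worth keeping in mind with CSV data, sketched below with made-up rows. First, cut always writes the selected fields in input order, so listing them as 3,1 does not swap the columns. Second, cut splits on every comma and does not understand CSV quoting.

```shell
# Hypothetical CSV row matching the sample file layout.
row='Alice,30,New York'

# The order in the field list does not matter: both print "Alice,New York".
listed_1_3=$(printf '%s\n' "$row" | cut -d ',' -f 1,3)
listed_3_1=$(printf '%s\n' "$row" | cut -d ',' -f 3,1)
printf '%s\n' "$listed_1_3" "$listed_3_1"

# A quoted value containing a comma is split in the middle (hypothetical row).
quoted='"Smith, John",42'
naive=$(printf '%s\n' "$quoted" | cut -d ',' -f 1)
printf '%s\n' "$naive"
```

For reordering columns or handling quoted fields, a tool such as awk or a dedicated CSV parser is a better fit than cut.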

-c: Character selection

This option allows for extracting specific characters from each line. You can specify the character position (or character range) to be extracted

Example: Extract characters 1 through 5 of each line

cut -c 1-5 filename

-b: Byte selection

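This option selects by byte position rather than character position. It is useful for byte-oriented data such as fixed-width records. Note that in a multi-byte encoding such as UTF-8, a byte range can cut a character in half.

Example: Extract bytes 1 through 5 of each line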

cut -b 1-5 filename
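The following sketch shows what byte selection does to a multi-byte character. It assumes UTF-8 input, where "é" is encoded as the two bytes C3 A9 (written here as octal escapes so the example does not depend on the encoding of the script itself), and uses od to display the resulting bytes in hexadecimal.

```shell
# "héllo" in UTF-8: h (68), é (c3 a9), l, l, o.
# Bytes 1-2 contain "h" plus only the first half of "é".
split=$(printf 'h\303\251llo\n' | cut -b 1-2 | od -An -tx1 | tr -d ' \n')

# Bytes 1-3 contain "h" plus the complete two-byte "é".
whole=$(printf 'h\303\251llo\n' | cut -b 1-3 | od -An -tx1 | tr -d ' \n')

echo "bytes 1-2: $split"
echo "bytes 1-3: $whole"
```

The trailing 0a in each hex dump is the newline that cut writes at the end of every output line.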

--complement: Inverted selection

This option inverts the selection: instead of printing the specified bytes, characters, or fields, cut prints everything except them

Example: Exclude the first column (field) and display the rest

cut -f 1 --complement filename
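A minimal sketch of --complement on a made-up comma-separated row follows. It assumes GNU cut, since --complement is not part of POSIX. When only leading fields are dropped, an open-ended range gives the same result portably.

```shell
# Hypothetical comma-separated row.
row='one,two,three,four'

# Drop field 1, keep the rest (assumes GNU cut).
without_first=$(printf '%s\n' "$row" | cut -d ',' -f 1 --complement)

# Drop the middle fields 2-3, keep fields 1 and 4 (assumes GNU cut).
without_middle=$(printf '%s\n' "$row" | cut -d ',' -f 2-3 --complement)

# Portable equivalent of dropping field 1: select field 2 onward.
portable=$(printf '%s\n' "$row" | cut -d ',' -f 2-)

printf '%s\n' "$without_first" "$without_middle" "$portable"
```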

-s: Suppress lines without delimiters

This option suppresses lines that do not contain the delimiter. Without it, cut -f prints such lines unchanged, which is rarely what you want when a file mixes delimited records with other text

Example: Extract the first field from a file and skip lines without delimiters

cut -d ',' -f 1 -s filename
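The difference is easiest to see by running the same selection with and without -s. The three input lines below are made up for illustration.

```shell
# Hypothetical input: two delimited lines and one line without any comma.
input=$(printf 'a,b,c\nno delimiter here\nd,e,f')

# Without -s, the undelimited line is passed through unchanged.
without_s=$(printf '%s\n' "$input" | cut -d ',' -f 1)

# With -s, the undelimited line is dropped.
with_s=$(printf '%s\n' "$input" | cut -d ',' -f 1 -s)

printf 'without -s:\n%s\n' "$without_s"
printf 'with -s:\n%s\n' "$with_s"
```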

Extract specific characters

Suppose you have a string and want to extract its first 3 characters

echo "abcdefg" | cut -c 1-3

Output

abc

Extract multiple character ranges

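To extract multiple ranges of characters (for example, characters 1-3 and 6-8), separate the ranges with commas. The string below has only seven characters, so the range 6-8 yields just characters 6 and 7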

echo "abcdefg" | cut -c 1-3,6-8

Output

abcfg

List processes using cut and ps

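You can use cut to extract specific information from the output of the ps command. Because ps aligns its columns with runs of spaces and cut treats every single space as a delimiter, squeeze the repeated spaces with tr -s first

For example: a command to extract the process ID (column 2) and the command line (column 11 onward) of each running process

ps aux | tr -s ' ' | cut -d ' ' -f 2,11-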
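Here is a sketch of why repeated spaces matter when cutting ps-style output. The sample line imitates the column layout of ps aux but is made up so that the result is reproducible. In that layout, column 1 is the user, column 2 the process ID, and column 11 the command.

```shell
# Hypothetical line in the style of "ps aux" output, padded with spaces.
sample='root         1  0.0  0.1 168000 11000 ?        Ss   10:00   0:01 /sbin/init'

# Every single space starts a new field, so "field 2" is the empty string
# between the first and second space after "root".
raw=$(printf '%s\n' "$sample" | cut -d ' ' -f 2)

# Squeezing runs of spaces first makes the field numbers match the columns.
squeezed=$(printf '%s\n' "$sample" | tr -s ' ' | cut -d ' ' -f 2,11)

echo "raw field 2: [$raw]"
echo "squeezed fields 2 and 11: [$squeezed]"
```

The same squeezing step applies to any column-aligned output, such as that of ls -l.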

Exclude fields with --complement

To exclude the first field (the user name) from each line of the passwd file

cut -d ':' -f 1 --complement /etc/passwd

Extract specific columns from the output of ls

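This command lists files and directories, but outputs only their names (column 9 onward in the default ls -l output). Because ls -l pads its columns with repeated spaces, tr -s squeezes them first; the leading "total" line has no ninth field and therefore produces an empty line

ls -l | tr -s ' ' | cut -d ' ' -f 9-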

Get the disk usage of files in the current directory

This outputs only the size column for each entry reported by du, excluding the path. du separates the size and the path with a tab, so the default delimiter of cut works without -d

du -h | cut -f 1
