Linux developers follow the philosophy of creating small programs that do one task and do it well. Linux text processing tools are a good example: they are lightweight and modular. Although these text manipulation tools differ in complexity and functionality, they all come in handy in environments where a graphical user interface isn’t available.
This article covers some of the best Linux tools for reading files and using regular expressions to operate on selected text, along with their most basic functionality and examples for better understanding.
1. grep
grep is a Linux text-processing utility that searches a file or text stream for lines matching a string or a pattern known as a regular expression. It belongs to a family of utilities that includes grep, egrep, and fgrep: fgrep searches for fixed strings only and is typically the fastest of the three, while plain grep is the easiest to start with.
The general syntax for using grep is as follows:
grep -options string filename
For example, to search for the word “root” in the /etc/passwd file:
grep root /etc/passwd
Some standard command-line examples to get started are:
Options | Example | Description |
---|---|---|
-c | grep -c <string> ~/.bashrc | Counts the lines that contain the string |
-i | grep -i <string> ~/.bashrc | Performs a case-insensitive search for the string |
-o | grep -o <string> file | Prints only the matched text rather than the whole line |
-l | grep -l "passwd" * | Prints the names of files in the current directory that contain the pattern |
-n | grep -n <string> file | Prints each matching line prefixed with its line number |
-e | grep -e string1 -e string2 file | Finds and prints lines that match any of several strings |
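The options above can be tried safely on a throwaway file. The following is a minimal sketch: the temporary file and its three lines are made up for illustration, and the alternation line uses -E, which tells grep to treat | as "or".

```shell
# Hypothetical sample file; its contents are invented for this example.
sample=$(mktemp)
printf '%s\n' 'export PATH=/usr/bin' 'alias ll="ls -l"' 'Export EDITOR=vi' > "$sample"

count=$(grep -c 'export' "$sample")          # lines containing "export" (case-sensitive)
count_i=$(grep -ci 'export' "$sample")       # same count, ignoring case
numbered=$(grep -n 'alias' "$sample")        # matching line prefixed with its line number
only=$(grep -o 'ls -l' "$sample")            # only the matched text, not the whole line
either=$(grep -cE 'alias|PATH' "$sample")    # lines matching either alternative

printf '%s\n' "$count" "$count_i" "$numbered" "$only" "$either"
rm -f "$sample"
```

With this sample data, the case-sensitive count is 1 and the case-insensitive count is 2, because the third line starts with a capital "E".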
Similarly, you can use the ^ metacharacter with grep to display only the lines that begin with certain characters.
For instance, the following command pipes the output of the env command into grep and displays the variables whose names begin with “HO”:
env | grep '^HO'
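Because real env output differs from machine to machine, here is a small sketch with made-up variable lines standing in for it. It also shows the $ metacharacter, which is not covered above and anchors a match to the end of a line.

```shell
# Made-up lines standing in for real `env` output.
vars=$(printf '%s\n' 'HOME=/home/john' 'HOSTNAME=box' 'SHELL=/bin/sh' 'OLDHOME=/tmp')

starts=$(printf '%s\n' "$vars" | grep '^HO')   # lines that begin with "HO"
ends=$(printf '%s\n' "$vars" | grep 'sh$')     # lines that end with "sh"

printf '%s\n' "$starts" "$ends"
```

Note that OLDHOME is skipped by the first search: it contains "HO" but does not begin with it.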
2. awk
awk is a powerful scripting language and command-line text-processing tool that scans its input line by line and compares each line against patterns. In its basic form, an awk program is an action enclosed in curly braces, optionally preceded by a pattern, with the whole program wrapped in single quotes and followed by the filename:
awk '{action}' filename
awk 'pattern {action}' filename
awk tests each line against the pattern, which is usually a regular expression, and runs the action on the lines that match. If you do not set a pattern, awk runs the action on every line, as shown below:
awk '{print $1}' awk_examples.txt
…where $1 refers to the first whitespace-separated field of each line in awk_examples.txt.
The following command assigns “Alice” to the second field, replacing “World,” and then prints the whole line ($0):
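The contents of awk_examples.txt are not shown in this article, so the sketch below uses three invented lines (name, department, amount) to illustrate the difference between running an action on every line and running it only on lines selected by a pattern.

```shell
# Invented data: name, department, amount, separated by spaces.
data=$(printf '%s\n' 'alice sales 300' 'bob ops 250' 'carol sales 400')

first=$(printf '%s\n' "$data" | awk '{print $1}')                  # no pattern: runs on every line
sales=$(printf '%s\n' "$data" | awk '$2 == "sales" {print $1}')    # pattern picks the lines first
total=$(printf '%s\n' "$data" | awk '{sum += $3} END {print sum}') # END runs once after all input

printf '%s\n' "$first" "$sales" "$total"
```

For this data, the pattern version prints only alice and carol, and the running total of the third field is 950.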
echo "Hello World" | awk '{$2="Alice"; print $0}'
Output:
Hello Alice
Similarly, you can pair a regular expression pattern with print $0 to emulate grep:
awk '/john/{print $0}' /etc/passwd
Output:
john:x:1001:1001::/home/john:/bin/sh
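Once a line is selected, awk can also pick it apart, which grep cannot do. The sketch below assumes two made-up passwd-style lines rather than a real /etc/passwd, and uses the -F option to set the field separator to a colon.

```shell
# Made-up passwd-style lines; a real /etc/passwd will differ.
passwd_lines=$(printf '%s\n' 'root:x:0:0:root:/root:/bin/bash' 'john:x:1001:1001::/home/john:/bin/sh')

# grep-like: print every line that matches the regular expression.
match=$(printf '%s\n' "$passwd_lines" | awk '/john/ {print $0}')

# With -F: the first field is the user name and the seventh is the login shell.
shells=$(printf '%s\n' "$passwd_lines" | awk -F: '$1 == "john" {print $1, $7}')

printf '%s\n' "$match" "$shells"
```

Comparing $1 exactly, instead of matching /john/ anywhere on the line, avoids accidental hits such as a user whose home directory merely contains "john".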
3. sort
sort is another Linux command-line utility that displays the contents of a text file in sorted order. For instance, you can pipe the output of the awk command into sort as follows:
awk '{print $1}' awk_examples.txt | sort > sort_text.txt
cat sort_text.txt
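Since the result depends on what awk_examples.txt contains, here is a self-contained sketch with a few invented names and numbers. It also shows three common sort options that are not covered above: -u, -r, and -n.

```shell
# Invented input with one duplicate entry.
names=$(printf '%s\n' 'carol' 'alice' 'bob' 'alice')

sorted=$(printf '%s\n' "$names" | sort)       # alphabetical order
unique=$(printf '%s\n' "$names" | sort -u)    # sorted, with duplicates removed
reversed=$(printf '%s\n' "$names" | sort -r)  # reverse order
nums=$(printf '%s\n' 10 9 100 | sort -n)      # numeric order instead of character order

printf '%s\n' "$sorted" "$unique" "$reversed" "$nums"
```

Without -n, the numbers would be compared character by character, so 10 and 100 would sort ahead of 9.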
4. sed
sed, the stream editor, takes its input as a stream of characters and performs filtering and text transformations (delete, substitute, and replace) on it.
Because it edits files non-interactively, you can use it in scripts. Its most basic use is substituting one string for another. The general syntax is:
sed 's/string/substitution/option' file
To practice and understand how this utility works, create a file containing a few random sentences.
Let’s replace every occurrence of the word “two” on each line of the file with “2” by appending the g (global) flag to the substitute command, as follows:
sed 's/two/2/g' sed_examples.txt > sed_examples2.txt
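To see what global replacement changes, compare the same substitution with and without it. This is a small sketch on an invented sentence rather than on sed_examples.txt.

```shell
# Invented sentence in which the search word appears twice.
line='two plus two is four'

first_only=$(printf '%s\n' "$line" | sed 's/two/2/')   # replaces only the first match on the line
all=$(printf '%s\n' "$line" | sed 's/two/2/g')         # replaces every match on the line

printf '%s\n' "$first_only" "$all"
```

The first command prints "2 plus two is four", while the second prints "2 plus 2 is four".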
Similarly, use the d command to delete a specific line from the output. The following deletes the second line:
sed '2d' sed_examples.txt
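Deletion is not limited to a single line number. The sketch below, which uses four made-up lines, shows the same idea applied to one line, to a range of lines, and to lines that match a pattern.

```shell
# Four made-up lines of input.
text=$(printf '%s\n' 'line one' 'line two' 'line three' 'line four')

without_second=$(printf '%s\n' "$text" | sed '2d')        # drop line 2
without_range=$(printf '%s\n' "$text" | sed '2,3d')       # drop lines 2 through 3
without_match=$(printf '%s\n' "$text" | sed '/three/d')   # drop lines matching a pattern

printf '%s\n' "$without_second" "$without_range" "$without_match"
```

In each case sed only changes what it prints; the input itself is left untouched unless you redirect the output to a file.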
You can also restrict the substitution to a single line by prefixing it with a line number (4 s/two/2/p) and print only the changed line as follows:
sed -n '4 s/two/2/p' sed_examples.txt
The -n option in the command above suppresses the automatic printing of every input line. You can take advantage of this to reproduce the behavior of grep with sed.
For instance, you can reduce the command above to just a regex address and the p command (/two/p), so that sed prints only the lines that match the pattern:
sed -n '/two/p' sed_examples.txt
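The effect of -n is easiest to see by running the same print command with and without it. The following sketch assumes four invented lines of input.

```shell
# Four invented lines; "two" appears on lines 2 and 4.
text=$(printf '%s\n' 'one apple' 'two pears' 'three plums' 'two figs')

grep_like=$(printf '%s\n' "$text" | sed -n '/two/p')      # only the lines matching /two/
line4=$(printf '%s\n' "$text" | sed -n '4 s/two/2/p')     # substitute on line 4 only, print it if changed
no_n=$(printf '%s\n' "$text" | sed '/two/p' | wc -l | tr -d ' ')  # line count when -n is left out

printf '%s\n' "$grep_like" "$line4" "$no_n"
```

Without -n, sed prints all four lines once and the two matching lines a second time, which is why the last count comes to 6.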
5. cut
cut is another command-line utility that extracts parts of each line of a file or text stream. It selects text by field, character, or byte position and writes the result to standard output.
The utility takes the following syntax:
cut <options> file
Use the -b option to extract a specified byte or range of bytes from each line:
cut -b 1 cut_examples.txt
Use the -c flag to extract text by specifying the positions of characters:
cut -c 1,3,5 cut_examples.txt
Lastly, you can extract text by field with the -f option, using -d to set the field delimiter (a space in this example):
cut -d " " -f 1 cut_examples.txt
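The three selection modes can be compared side by side on a single line. This sketch uses an invented line in place of cut_examples.txt.

```shell
# Invented line made of three space-separated fields.
line='Hello World 2024'

first_byte=$(printf '%s\n' "$line" | cut -b 1)       # byte 1
chars=$(printf '%s\n' "$line" | cut -c 1,3,5)        # characters 1, 3, and 5
field=$(printf '%s\n' "$line" | cut -d ' ' -f 2)     # second space-delimited field

printf '%s\n' "$first_byte" "$chars" "$field"
```

For this line the commands print "H", "Hlo", and "World". For plain ASCII text like this, bytes and characters are the same thing.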
Here’s a list of the ranges you can use with the character (-c) and byte (-b) options, with examples and descriptions:
Range | Example | Description |
---|---|---|
n- | cut -c 7- filename | Extracts from the nth character to the end of each line |
n-m | cut -b 7-15 filename | Extracts bytes n through m of each line |
-m | cut -c -7 filename | Extracts from the start of each line through the mth character |
Note that the same range syntax also works with the field (-f) option, where the numbers refer to fields rather than characters or bytes.
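The character and byte ranges from the table can be checked against a line whose positions are easy to count. The sketch below uses the first 16 letters of the alphabet as made-up input.

```shell
# Made-up input: a is position 1, g is position 7, o is position 15.
line='abcdefghijklmnop'

from7=$(printf '%s\n' "$line" | cut -c 7-)     # position 7 to the end of the line
mid=$(printf '%s\n' "$line" | cut -b 7-15)     # positions 7 through 15
upto7=$(printf '%s\n' "$line" | cut -c -7)     # start of the line through position 7

printf '%s\n' "$from7" "$mid" "$upto7"
```

The three commands print "ghijklmnop", "ghijklmno", and "abcdefg" respectively. Both ends of a range are inclusive.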
Manipulating Text With Linux Commands
Linux offers many programs and tools for working with files and text. You may not need to learn them all: once you have a good grip on one, it can often stand in for another, as when you use sed or awk in place of grep. That isn’t true of every tool, however.
Linux commands also have a steep learning curve, but once you develop the skill, they prove very useful to any Linux user, especially a system administrator.