curl -O -s https://www.labs.cs.uregina.ca/330/Pipe/Lab8.zip
unzip Lab8.zip
The purpose of this lab is to introduce you to a way that you can construct powerful Unix commands by chaining together several Unix commands.
Unix commands alone are powerful, but when you combine them together, you can accomplish complex tasks easily. One way you can combine Unix commands is through using pipes and filters on the command line.
The symbol | is the Unix pipe symbol that is used on the command line.
What it means is that the standard output of the command to the left of the pipe gets sent as standard input of the command to the right of the pipe.
Example 1: There is a program (or UNIX command) in UNIX which reports who is logged onto the system: who
If I wanted to print out the list of who is on my system, I would type:
$ who | lpr -Pcl115
The "|" is a pipe, and this type of pipe sends the stream of data to another program, in this case, a program called lpr which sends all incoming data to the printer in CL115.
Example 2:
$ cat weather.txt
input
string
shell signal
$ cat weather.txt | wc
3 4 26
In this example, at the first shell prompt, the contents of the file weather.txt are displayed.
In the next shell prompt, the cat command is used to display the contents of the weather.txt file, but the display is not sent to the screen; it goes through a pipe to the wc (word count) command.
The wc command then does its job and counts the lines, words, and characters of what it got as input.
If I wanted to store the information from the who command in a file, I could redirect standard output to a file:
$ who > current_users
Another type of redirection takes data from a file and puts it into a program (as standard input):
$ grep "smithp" < current_users
Here the contents of the file current_users are given to a program called "grep", which prints only the lines of its input that contain a particular string of characters.
A filter is a Unix command that does some manipulation of the text of a file. In this section, we will talk about three popular Unix filters: sed, awk, and grep.
$ cat weather.txt
input
string
shell signal
$ cat weather.txt | sed -e "s/string/signal/g"
input
signal
shell signal
$ cat weather.txt | sed -e "s/i/WWW/"
WWWnput
strWWWng
shell sWWWgnal
$
In this example, the first shell prompt displays the contents of the weather.txt file.
The second shell prompt uses the cat command to display the contents of the weather.txt file, and sends that display through a pipe to the sed command, which replaces every occurrence of "string" with "signal".
The third shell prompt uses the cat command on the weather.txt file and pipes the output to the sed command to change the first occurrence of an "i" on each line to "WWW".
It is important to note that, in this example, the weather.txt file itself was not changed. Only the display of its contents changed.
The Unix command awk is another powerful filter. You can use awk to manipulate the contents of a file.
Here is an example:
$ cat basket.txt
Layer1 = cloth
Layer2 = strawberries
Layer3 = fish
Layer4 = chocolate
Layer5 = punch cards
$ cat basket.txt | awk -F= '{print $1}'
Layer1
Layer2
Layer3
Layer4
Layer5
$ cat basket.txt | awk -F= '{print "HAS: " $2}'
HAS: cloth
HAS: strawberries
HAS: fish
HAS: chocolate
HAS: punch cards
$
The grep command prints only the lines of its input that contain a given pattern:
$ cat apple.txt
core worm seed
jewel
$ grep jewel apple.txt
jewel
$
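The filters above all share one shape: read text from an input stream, write the (possibly changed) text to an output stream. As a rough illustration, here is a minimal sketch of a grep-like filter in C. The function name filter_lines is hypothetical, and it matches plain substrings only, not the patterns that the real grep supports.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy to `out` only the lines of `in` that
   contain the substring `needle`. Returns the number of lines copied.
   Lines longer than the buffer are examined in pieces, which is
   acceptable for a sketch but not for a real tool. */
int filter_lines(FILE *in, FILE *out, const char *needle)
{
    char line[1024];
    int matches = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        if (strstr(line, needle) != NULL) {
            fputs(line, out);
            matches++;
        }
    }
    return matches;
}
```

Calling filter_lines(stdin, stdout, "jewel") from main would give a program that can sit on the right-hand side of a pipe, because it only ever touches standard input and standard output.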
The main system calls that will be needed for this lab are pipe() and dup2(), together with read(), write(), and close().
First, we will talk a little bit about this concept of pipes:
Think of a pipe as a special file that can store a limited amount of data in a first-in-first-out (FIFO) manner.
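To make the "limited amount of data" concrete, the following sketch fills a fresh pipe until it refuses more data and reports how many bytes it accepted. The function name pipe_capacity is hypothetical. The sketch assumes the write end can be switched to non-blocking mode with fcntl(), so that a full pipe produces an EAGAIN error instead of suspending the process. The number it returns depends on the system.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns how many bytes a new pipe accepts
   before it is full, or -1 on error. */
long pipe_capacity(void)
{
    int fd[2];
    char chunk[256] = {0};
    long total = 0;
    int full = 0;

    if (pipe(fd) == -1)
        return -1;

    /* Non-blocking write end: a full pipe gives EAGAIN, not a hang. */
    int flags = fcntl(fd[1], F_GETFL);
    if (flags != -1 && fcntl(fd[1], F_SETFL, flags | O_NONBLOCK) != -1) {
        for (;;) {
            ssize_t n = write(fd[1], chunk, sizeof chunk);
            if (n > 0) {
                total += n;                 /* chunk accepted */
            } else if (n == -1 && errno == EINTR) {
                continue;                   /* interrupted, try again */
            } else {
                full = (n == -1 &&
                        (errno == EAGAIN || errno == EWOULDBLOCK));
                break;
            }
        }
    }

    close(fd[0]);
    close(fd[1]);
    return full ? total : -1;
}
```

Nothing is ever read from the pipe here, so the writes simply pile up until the kernel's buffer for the pipe is exhausted.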
There are two kinds of pipes:
Unnamed pipes can only be used with related processes (e.g. parent/child, or child/child) and exist only as long as the processes using them.
Named pipes exist as directory entries that have file access permissions. They can, therefore, be used with unrelated processes.
These notes will focus on the unnamed pipes, which use the pipe system call.
First we will review the read(), write(), and close() system calls:
#include <unistd.h>
ssize_t read(int fildes, void *buf, size_t nbyte);
Data is read from the pipe using the unbuffered I/O read() system call.
The read() system call reads up to nbyte bytes from the open file associated with the file descriptor fildes into the buffer referenced by buf.
If the read call is successful, the number of bytes actually read is returned; this may be fewer than nbyte. A return value of 0 indicates end-of-file, and -1 indicates an error.
NOTE: All reads on a pipe start from the current position (i.e. seeking is not supported).
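Because read() may hand back fewer bytes than were asked for, code that wants "everything" usually calls it in a loop. The following is a minimal sketch of such a loop; the helper name read_all is hypothetical and is not part of any library.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep reading from fd until end-of-file or
   until buf (capacity cap) is full. Returns the total number of
   bytes read, or -1 on error. */
ssize_t read_all(int fd, char *buf, size_t cap)
{
    size_t total = 0;

    while (total < cap) {
        ssize_t n = read(fd, buf + total, cap - total);
        if (n == 0)
            break;                  /* end-of-file */
        if (n == -1) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal, retry */
            return -1;
        }
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```

On a pipe, the end-of-file case is reached only after every write descriptor for the pipe has been closed.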
#include <unistd.h>
ssize_t write(int fildes, const void *buf, size_t nbyte);
Data is written to the pipe using the unbuffered I/O write() system call.
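Using the file descriptor specified by fildes, the write() system call will attempt to write nbyte bytes from the buffer referenced by buf. If successful, it returns the number of bytes actually written, which may be fewer than nbyte; on error it returns -1.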
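Since write() only attempts to write the requested number of bytes, a caller that must deliver the whole buffer can retry with the remainder. Here is a minimal sketch; the helper name write_all is hypothetical.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: write all len bytes of buf to fd, retrying
   after partial writes. Returns 0 on success, -1 on error. */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n == -1) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal, retry */
            return -1;
        }
        p += n;                     /* skip what was already written */
        len -= (size_t)n;
    }
    return 0;
}
```

For the short messages used in this lab a single write() is normally enough, so the loop matters mostly for large buffers.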
#include <unistd.h>
int close(int fildes);
close() closes the file indicated by the file descriptor fildes.
An unnamed pipe is constructed using the pipe system call.
#include <unistd.h>
int pipe(int filedes[2]);
If successful, the pipe system call creates a pair of file descriptors, pointing to a pipe inode, and places them in the array pointed to by filedes.
The file descriptors reference the two ends of the pipe: filedes[0] is open for reading and filedes[1] is open for writing.
Example (halfpipe.cpp):
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>

#define BUFSIZE 50

int main(int argc, char *argv[])
{
    int f_des[2];
    static char message[BUFSIZE];   // static, so it starts out zero-filled

    // Print usage if wrong number of arguments
    if (argc != 2) {
        fprintf(stderr, "Usage: %s message\n", *argv);
        exit(1);
    }

    // Open a pipe and report error if it fails
    if (pipe(f_des) == -1) {
        perror("Pipe");
        exit(2);
    }

    // Use switch for fork, because parent doesn't need child's pid.
    switch (fork()) {
    case -1:    // Error
        perror("Fork");
        exit(3);

    case 0:     // Child
        // Close pipe out and read from pipe. Report errors if any.
        // Read at most BUFSIZE-1 bytes so message stays null-terminated.
        close(f_des[1]);
        if (read(f_des[0], message, BUFSIZE - 1) != -1) {
            printf("Message received by child: [%s]\n", message);
            fflush(stdout);
        } else {
            perror("Read");
            exit(4);
        }
        break;

    default:    // Parent
        // Close pipe in and write to pipe. Report errors if any.
        close(f_des[0]);
        if (write(f_des[1], argv[1], strlen(argv[1])) != -1) {
            printf("Message sent by parent: [%s]\n", argv[1]);
            fflush(stdout);
        } else {
            perror("Write");
            exit(5);
        }
    }
    exit(0);
}
Sample Run:
% a.out HELLO
Message sent by parent: [HELLO]
Message received by child: [HELLO]
In the parent process, the pipe file descriptor f_des[0] is closed and the message (the string referenced by argv[1]) is written to the pipe file descriptor f_des[1].
In the child process, the pipe file descriptor f_des[1] is closed and pipe file descriptor f_des[0] is read to obtain the message.
While the closing of the unused pipe file descriptors is not required, it is good practice.
Remember that a read on an empty pipe blocks until some data is present in the pipe, or until all the write file descriptors for the pipe have been closed, in which case read returns 0 to signal end-of-file. A read may also return fewer bytes than were requested.
The pipe file descriptors f_des[0] in the child and f_des[1] in the parent will be closed when each process exits.
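The role that closing plays in end-of-file detection can be seen in a reader that loops until read() returns 0. The sketch below reverses the roles of halfpipe.cpp (the child writes, the parent reads) purely for illustration. The function name bytes_until_eof is hypothetical.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: a child writes msg into a pipe, and the parent
   counts the bytes that arrive before end-of-file.
   Returns the count, or -1 on error. */
long bytes_until_eof(const char *msg)
{
    int fd[2];
    char buf[64];
    long total = 0;
    ssize_t n;

    if (pipe(fd) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }

    if (pid == 0) {                 /* child: the writer */
        close(fd[0]);
        if (write(fd[1], msg, strlen(msg)) == -1)
            _exit(1);
        close(fd[1]);
        _exit(0);
    }

    /* Parent: the reader. If this close were left out, the parent
       itself would still hold a write end and read() would never
       return 0. */
    close(fd[1]);
    while ((n = read(fd[0], buf, sizeof buf)) > 0)
        total += n;
    close(fd[0]);
    waitpid(pid, NULL, 0);

    return (n == -1) ? -1 : total;
}
```

In halfpipe.cpp the reader performs only one read, so it never waits for end-of-file; a reader that loops like this one depends on the unused write end being closed.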
Sometimes we may want to "tie" standard output and/or input to either end of the pipe. This is so that we can emulate things such as:
% last | sort
To do that, we can use the dup2 system call.
#include <unistd.h>
int dup2(int fd1, int fd2);
After successful return of dup or dup2, the [file descriptors (fd1 and fd2)] may be used interchangeably. They share locks, file position pointers and flags; for example, if the file position is modified by using lseek on one of the descriptors, the position is also changed for the other. (modified from the Linux man pages)
dup2 copies file descriptor table entries from fd1 to fd2, closing the fd2 entry first if necessary.
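As a small illustration of dup2 on its own, before it is combined with fork and exec, the sketch below points standard output at a pipe for a moment, prints some text, restores standard output, and then reads the text back from the pipe. The function name capture_stdout is hypothetical, and the sketch assumes the text is small enough to fit in the pipe without blocking.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: print text to stdout while stdout is tied to a
   pipe, then read it back into out (capacity cap).
   Returns the number of bytes captured, or -1 on error. */
ssize_t capture_stdout(const char *text, char *out, size_t cap)
{
    int fd[2];
    ssize_t n;

    fflush(stdout);                     /* empty stdio's buffer first */
    int saved = dup(STDOUT_FILENO);     /* remember the real stdout */
    if (saved == -1)
        return -1;
    if (pipe(fd) == -1) {
        close(saved);
        return -1;
    }

    /* Descriptor 1 now refers to the write end of the pipe. */
    if (dup2(fd[1], STDOUT_FILENO) == -1) {
        close(fd[0]);
        close(fd[1]);
        close(saved);
        return -1;
    }
    close(fd[1]);                       /* descriptor 1 is the only write end left */

    fputs(text, stdout);
    fflush(stdout);                     /* push the text into the pipe */

    dup2(saved, STDOUT_FILENO);         /* restore stdout; closes the last write end */
    close(saved);

    n = read(fd[0], out, cap);
    close(fd[0]);
    return n;
}
```

The same dup2 step, applied to standard output in one process and standard input in another, is what lets two unmodified programs be joined by a pipe.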
Example (pipeline.cpp):
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>

int main(void)
{
    int f_des[2];

    if (pipe(f_des) == -1) {
        perror("Pipe");
        exit(1);
    }

    switch (fork()) {
    case -1:
        perror("Fork");
        exit(2);

    case 0:     /* In the child */
        /* Tie standard output to the write end of the pipe */
        dup2(f_des[1], fileno(stdout));
        close(f_des[0]);
        close(f_des[1]);
        execlp("last", "last", (char *)NULL);
        exit(3);    /* Reached only if execlp fails */

    default:    /* In the parent */
        /* Tie standard input to the read end of the pipe */
        dup2(f_des[0], fileno(stdin));
        close(f_des[0]);
        close(f_des[1]);
        execlp("sort", "sort", (char *)NULL);
        exit(4);    /* Reached only if execlp fails */
    }
}
For completeness, named pipes are mentioned here. The following paragraph is taken from pg 132 of Interprocess Communication in UNIX:
UNIX provides for a second type of pipe called a named pipe or FIFO (we will use the terms interchangeably). Named pipes are similar in spirit to unnamed pipes but have additional benefits. When generated, named pipes have a directory entry. With the directory entry are file access permissions and the capability for unrelated processes to use the pipe file. Named pipes can be created at the shell level (on the command line) or within a program.
Example:
[1]% mknod PIPE p
[2]% ls -l PIPE
prw------- 1 me csfac 0 Oct 21 10:16 PIPE
[3]% cat lab7.txt >> PIPE &
[1] 9735
[4]% cat < PIPE
You can create named pipes in your program using the mknod() system call. See man 2 mknod for more details.
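The shell session above can be approximated inside a single program. The sketch below swaps in mkfifo(), the POSIX library function dedicated to creating FIFOs, in place of mknod(); that substitution is a choice made for this example only. The function name fifo_round_trip and any path passed to it are hypothetical.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: create a FIFO at path, have a child write msg
   into it, read it back in the parent, then remove the FIFO.
   Returns the number of bytes read, or -1 on error. */
long fifo_round_trip(const char *path, const char *msg,
                     char *out, size_t cap)
{
    if (mkfifo(path, 0600) == -1)       /* same permissions as "prw-------" */
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        unlink(path);
        return -1;
    }

    if (pid == 0) {                     /* child: the writer */
        int wfd = open(path, O_WRONLY); /* blocks until a reader opens */
        if (wfd == -1 || write(wfd, msg, strlen(msg)) == -1)
            _exit(1);
        close(wfd);
        _exit(0);
    }

    int rfd = open(path, O_RDONLY);     /* blocks until the writer opens */
    long total = 0;
    ssize_t n = 0;
    if (rfd != -1) {
        while ((size_t)total < cap &&
               (n = read(rfd, out + total, cap - (size_t)total)) > 0)
            total += n;
        close(rfd);
    }
    waitpid(pid, NULL, 0);
    unlink(path);                       /* remove the directory entry */

    return (rfd == -1 || n == -1) ? -1 : total;
}
```

The two processes here are related only because fork() is a convenient way to get a second process for the demonstration. Since the FIFO is opened by name, the writer could just as well be a separate, unrelated program.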