One student writes:
In regards to the fork() call, I am confused. My understanding is that the
fork() call creates a new process that is a duplicate of the process that
called it. The book seems unclear, and your answer to a student's question
left me uncertain of my understanding.
I don't understand the purpose of forking a process. I can understand
duplicating part of a process, but if you duplicate the entire process, you
will duplicate infinitely (as you may have suggested in a lecture).
The difference is the point at which control returns to the process.
Consider: you have a process P1 that includes the code
    ret_value = fork();
    if (ret_value == 0) . . .
"Process" includes a number of attributes, including a memory block, a
process ID, values of CPU registers, value of program counter, etc. That
process issues a function call, in the normal programming language sense of
a function. It calls the function fork(), and control will return to the
point of the call, like any other function.
Inside the code for the function fork(), we (the OS) make a complete copy
of P1, and we call it P2. P2 is a COPY, identical in every respect except
that it has a different PID and (of course as a copy) it is stored in a
different block of memory.
Draw a picture.
We have no way to know whether P1 or P2 will get to execute first. Just for
the sake of discussion, let's assume the child process P2 gets to execute
first. When P2 gets the CPU, what does it do? Like any other process
getting the CPU, it loads its program counter and begins executing the
instructions to which the program counter points. Where is that? Probably
at the end of the OS code for the function fork(). So it executes a
function return, returning a value, as a good function does. What value
does it return? The OS has set things up so in the child process P2, the
fork() code returns zero, so the child can know it is the child. To where
does it return? To the point of call, where the value zero is stored as
ret_value, and the process P2 goes merrily on its way. In particular, it
does NOT call fork() again, unless it is programmed in that way.
How about P1? Eventually, P1 gets the CPU and runs. When P1 gets the CPU,
what does it do? Like any other process getting the CPU, it loads its
program counter and begins executing the instructions to which the program
counter points. Where is that? Probably at the end of the OS code for the
function fork(). So P1 executes a function return, returning a value, as a
good function does. What value does it return? The OS has set things up so
in the parent process P1, the fork() code returns the process ID of the
child P2. To where does it return? To the point of call, where the child's
PID is stored as ret_value, and the process P1 goes merrily on its way.
Sometimes, we would program P1 to wait for P2 to finish (as in a command
shell). Sometimes, P1 loops and forks another child.
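
The two return paths described above can be sketched as follows. This is a
minimal illustration, assuming a POSIX system; the function name
fork_once_demo and the exit status 7 are hypothetical choices made for the
example, not anything from the book or lecture.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks once, lets the child exit with status 7,
   and returns that status as seen by the parent (or -1 on failure). */
int fork_once_demo(void)
{
    pid_t ret_value;
    int status;

    fflush(stdout);          /* keep buffered output from being copied into the child */
    ret_value = fork();

    if (ret_value < 0) {     /* fork() failed: no child was created */
        perror("fork");
        return -1;
    }
    if (ret_value == 0) {    /* child P2: fork() returned zero */
        printf("child:  fork() returned 0, my PID is %ld\n", (long)getpid());
        fflush(stdout);
        _exit(7);            /* arbitrary status the parent can recognize */
    }

    /* parent P1: fork() returned the PID of the child */
    printf("parent: fork() returned %ld\n", (long)ret_value);

    /* wait for the child to finish, as a command shell would */
    if (waitpid(ret_value, &status, 0) != ret_value)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling fork_once_demo() from main() prints one line from each process, in
either order, since we cannot know which process runs first.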
If P1 forks multiple children, and if it cares to know about those children,
it needs to store the values returned from each fork() call into distinct
array locations.
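
A sketch of that bookkeeping, assuming a POSIX system. The names
fork_many_demo, child_pids, and NUM_CHILDREN are hypothetical, and the
count of three children is arbitrary.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define NUM_CHILDREN 3       /* arbitrary number of children for the example */

/* Hypothetical helper: forks NUM_CHILDREN children, remembers each PID in
   its own array slot, and returns how many children were reaped with the
   expected exit status. */
int fork_many_demo(void)
{
    pid_t child_pids[NUM_CHILDREN];
    int created, i, reaped = 0;

    fflush(stdout);
    for (i = 0; i < NUM_CHILDREN; i++) {
        child_pids[i] = fork();      /* each return value gets a distinct slot */
        if (child_pids[i] < 0) {
            perror("fork");
            break;                   /* stop creating children on failure */
        }
        if (child_pids[i] == 0)
            _exit(i);                /* child: leave at once, do NOT keep looping */
    }
    created = i;

    /* parent: use the stored PIDs to wait for each child in turn */
    for (i = 0; i < created; i++) {
        int status;
        if (waitpid(child_pids[i], &status, 0) == child_pids[i] &&
            WIFEXITED(status) && WEXITSTATUS(status) == i)
            reaped++;
    }
    return reaped;
}
```

Note that each child exits inside the loop. If it did not, the child would
continue the loop itself and fork children of its own.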
Further
I don't see why fork() would ever reveal that the child process is running.
The parent process called the "child", so the parent process is running,
right?
fork() reveals to the parent either
    the PID of the child (implying that the child existed, at least
    momentarily)
OR
    an error code (usually -1), implying that the fork() call failed for
    some reason, e.g., the OS's process table is full.
It could be that
    ret_value = fork();
    if (ret_value == 0) { /* child */
        exit(0);
    }
    /* parent */
Then, the child was created, the parent gets its PID, but the child dies
before the parent has an opportunity to do anything.
Is the parent running? Consider
    ret_value = fork();
    if (ret_value == 0) { /* child */
        /* do lots of interesting work */
    } else { /* parent */
        exit(0);
    }
The child may conclude that it HAD a parent, but should be wary of assuming
the parent continues to exist or that the parent pays any attention to its
child.
In some operating systems, this code would cause the operating system to kill
the child when the parent exits, but that is not always the case. On
UNIX-like systems, the orphaned child keeps running and is adopted by a
system process.
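
The case where the parent exits first can be observed with a sketch like the
one below. It assumes a POSIX system on which an orphaned child keeps
running. The function name orphan_demo, the pipe used for reporting, and the
bounded polling loop are all hypothetical choices for the example. An extra
process level is used so that the caller can survive to see the result.

```c
#define _POSIX_C_SOURCE 200809L   /* for nanosleep() under strict compiler modes */
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the child P2 outlived its parent P1 and
   saw its parent PID change, 0 if it never saw a change, -1 on error. */
int orphan_demo(void)
{
    int fds[2];
    pid_t p1_pid;
    char answer = 0;

    if (pipe(fds) < 0)
        return -1;
    fflush(stdout);

    p1_pid = fork();                      /* create the "parent" P1 */
    if (p1_pid < 0)
        return -1;

    if (p1_pid == 0) {                    /* running as P1 */
        pid_t me = getpid();
        pid_t child = fork();             /* P1 creates the child P2 */
        if (child == 0) {                 /* running as P2 */
            struct timespec pause_10ms = {0, 10 * 1000 * 1000};
            int tries;
            close(fds[0]);
            /* poll (at most about 5 seconds) until P1 is no longer our parent */
            for (tries = 0; tries < 500 && getppid() == me; tries++)
                nanosleep(&pause_10ms, NULL);
            answer = (getppid() != me) ? 'y' : 'n';
            if (write(fds[1], &answer, 1) != 1)
                _exit(1);
            _exit(0);
        }
        _exit(0);                         /* P1 exits without waiting for P2 */
    }

    /* original caller: reap P1, then read what P2 reported */
    close(fds[1]);
    waitpid(p1_pid, NULL, 0);
    if (read(fds[0], &answer, 1) != 1) {
        close(fds[0]);
        return -1;
    }
    close(fds[0]);
    return answer == 'y';
}
```

The child here learns that its parent is gone only because it checks
getppid(). Nothing notifies it automatically.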
Consider also the main process of your program. Who is ITS parent? Usually
a command shell program of the operating system. Who is the parent of the
command shell? ... Clearly, at operating system start-up, there must be an
Adam process (usually called init) who is the original parent.
These processes are not human with feelings and connections, nor are they
magic. They are code, often relatively simple code. They behave as they
are coded.