
COSC 3250 / COEN 4820 Operating Systems
Midterm Examination 2
Wednesday, April 8, 2015

 

The second midterm examination appears below. It covers Chapters 6 - 12.

It was a 50 minute, closed-book exam.

Expectations:

Section and page number references in old exams refer to the edition of the text in use when that exam was given. You may have to do a little looking to map to the edition currently in use. Also, which chapters were covered on each exam varies a little from year to year.

Questions may be similar to homework questions from the book. You will be given a choice of questions. I try to ask questions I think you might hear in a job interview: "I see you took a class on Operating Systems. What can you tell me about ...?"

Preparation hints:

Read the assigned chapters in the text. Chapter summaries are especially helpful. Chapters 1 and 2 include excellent summaries of each of the later chapters.

Read and consider Practice Exercises and Exercises at the end of each chapter.

Review authors' slides and study guides from the textbook web site.

Review previous exams listed above.

See also Dr. Barnard's notes. He was using a different edition of our text, so chapter and section numbers do not match, but his chapter summaries are excellent.

Results on exam 1 suggest you may wish to maintain a vocabulary list.

Read and follow the directions:

  • Write on four of the five questions. If you write more than four questions, only the first four will be graded. You do not get extra credit for doing extra problems.
  • In the event this exam is interrupted (e.g., fire alarm or bomb threat), students will leave their papers on their desks and evacuate as instructed. The exam will not resume. Papers will be graded based on their current state.
  • Read the entire exam. If there is anything you do not understand about a question, please ask at once.
  • If you find a question ambiguous, begin your answer by stating clearly what interpretation of the question you intend to answer.
  • Begin your answer to each question at the top of a fresh sheet of paper [or -5].
  • Be sure your name is on each sheet of paper you hand in [or -5]. That is because I often separate your papers by problem number for grading and re-assemble to record and return.
  • Write only on one side of a sheet [or -5]. That is because I scan your exam papers, and the backs do not get scanned.
  • No electronic devices of any kind are permitted.
  • Be sure I can read what you write.
  • If I ask questions with parts
    1. . . .
    2. . . .
    your answer should show parts
    1. . . .
    2. . . .
  • Someone should have explained to you several years ago that "X is when ..." or "Y is where ..." are not suitable definitions (possibly excepting the cases in which X is a time or Y is a location). On Exam 2, "X is when ..." or "Y is where ..." earn -5 points.

 

Score Distribution:

[Histogram of scores]

 

Median: 78; Mean: 74.0; Standard Deviation: 17.8; Range: 14 - 100

These shaded blocks are intended to suggest solutions; they are not intended to be complete solutions.

Problem 1 Definitions
Problem 2 Tricky C - Deadlock
Problem 3 Hard Disk Scheduling
Problem 4 Large files
Problem 5 Memory Management

Problem 1 Definitions

Give careful definitions for each term (in the context of computer operating systems) (3 points each):

  1. Resource allocation graph
  2. Memory hierarchy
  3. Memory mapped files
  4. Page fault
  5. Page replacement
  6. Swapping
  7. RAID
  8. Sequential vs. direct file access
Specific questions were chosen from this collection:
  1. Active vs. Working set (in the context of memory management). Both concern memory addresses used by a program. The working set (Section 9.6.2) is the (fuzzy) set of pages the program is "naturally" using at the moment. The active set is the set of pages actually stored in memory at the moment. We hope that the active set contains the working set, more or less. When the working set is significantly larger than the active set, we get thrashing. I must cut you some slack here, though, because our text makes no mention of the term "active set."
  2. Address binding (8.1.2) is the tying of a virtual address to a physical address. When do we know the physical address of a particular line of code or variable? Addresses can be bound at language design time, at language implementation time, at compile time, at load time, or during execution. Virtual memory is a technique for binding addresses during execution.
  3. Atomic execution [p. 210]: An instruction is implemented as one uninterruptable unit.
  4. Busy waiting [p. 213]. Waiting while holding the CPU, typically waiting in a loop that does nothing. For example
          while (TestAndSet(&lock)) ;  // Does nothing
    A process in a Waiting state (in the sense of the Ready-Running-Waiting state diagram) is not busy waiting because it is not holding the CPU. (A compilable sketch appears at the end of this list.)
  5. Cooperating process [p. 203]: One that can affect or be affected by other processes executing in the system.
  6. CPU-I/O burst cycle [p. 262]: Processes typically alternate between periods of high CPU activity and periods of high I/O activity.
  7. Critical-Section problem [p. 206]. A solution must provide mutual exclusion, must allow each process to make progress, and must ensure that the time a process waits to enter its critical section is bounded.
  8. Deadlock vs. starvation [p. 217]. Both concern indefinite wait. "A set of processes is in a deadlocked state when every process in the set is waiting for an event that can be caused only by another process in the set." Starvation is also an indefinite wait not satisfying the definition of deadlock, often due to an accidental sequence of events.
  9. Deadlock necessary conditions [p. 318]: Mutual exclusion, Hold and wait, No preemption, Circular wait.
    "Wait ON" vs. "Wait FOR." One waits ON superiors, to take care of, or to serve; e.g., a store clerk waits ON customers. One waits FOR peers, e.g., you might wait for a friend to go to the lab. A server process might be said to "wait on" a client process, but process scheduling, mutual exclusion, deadlock, etc., involve processes waiting FOR each other.
  10. Demand paging (9.2). Pages of virtual memory are loaded into physical memory only as they are needed.
  11. Error-correcting code [p. 479]: Encoding data in such a way that a certain number of errors can be detected and a (smaller) number of errors can be corrected. (Circular statements such as "Error-correcting codes are codes for correcting errors" received zero points.) Encryption codes protect against unauthorized access.
  12. File (p. 504). A named collection of related information recorded on secondary storage.
  13. File access: Sequential vs. direct. Sequential access (11.2.1) processes information in the file strictly in order. Direct access (11.2.2) can access records in a file in any order. "File access" concerns access to information within one file, not to a list of files; that would be accessing a directory. It does not depend on all or part of the file being in memory.
  14. File-System mounting (11.4). The operating system attaches a file system (typically a disk or equivalent) at a specified location, the mount point, within the existing file system. For example, Windows mounts drives as letters in the MyComputer "folder." Did you ever wonder what Windows would do if you tried to mount more than 26 drives?
  15. Indexed file access. Similar to direct file access, except that an index records pointers to the blocks of a file. To find a record in the file, we search the index and use the pointer to access the file directly. Not the same as direct file access.
  16. Internal vs. external fragmentation [p. 363]. Both are wasted memory. Internal fragmentation is within a page, segment, partition, or disk block. External fragmentation is memory that is free, but it is in such small units that it cannot be allocated effectively. It is wasted memory outside a segment, partition, or disk block. Paging avoids external fragmentation.
  17. Locality [p. 428]. The concept that if a location in memory (or on disk) is referenced, the probability increases that nearby locations will be referenced soon. Locality makes the entire memory hierarchy effective.
  18. Logical address space vs. Physical address space (Sect. 8.1.3). A logical (virtual) address is generated by the CPU; a physical address is the one seen by the memory unit. The run-time mapping between the two address spaces is performed by the memory-management unit.
  19. Memory hierarchy. Processor registers > cache levels 1, 2, and 3 > main memory > disk cache > local disk storage > LAN storage > remote storage > off-line storage. Each level trades off speed, capacity, volatility, and cost.
  20. Memory mapped files [p. 430]. Using virtual memory techniques to treat a sequentially accessed file as routine memory access.
  21. Monitor vs. semaphore (for mutual exclusion). See Section 5.8 vs. 5.6. Both are means to enforce mutual exclusion. "The monitor construct ensures that only one process at a time can be active within the monitor." In Java, the keyword synchronized enforces a monitor-like discipline. Counting or binary semaphores provide mutual exclusion through indivisible operations wait() and signal().
  22. NAS = Network-Attached Storage [p. 471]; SAN = Storage-Area Network [p. 472].
  23. Page fault [p. 403]. The CPU attempts to access a memory location on a page that is not currently in physical memory (its page-table entry is marked invalid), which traps to the operating system.
  24. Page replacement (9.4). When a page fault occurs and no frame is free, the operating system selects a victim page to evict from memory. Algorithms include FIFO, Optimal, LRU, approximate LRU, and Second-chance.
  25. Preemption - Taking resources away from a process that holds them. Preempting a running process by a timer interrupt is one example, but in principle, the operating system could preempt any resource. Preemption is more than interruption. Preempting a running process takes away the CPU resource, but not necessarily its other resources.
  26. Preemptive scheduling (6.1.3). In non-preemptive scheduling, CPU-scheduling, a process keeps the CPU until the process releases the CPU by terminating or by switching to a waiting state. In preemptive scheduling, the CPU can be taken away, most commonly in response to a timer interrupt. Preemptive scheduling concerns one resource, the CPU.
  27. Priority inversion [p. 217]: A high priority process is forced to wait for a lower priority process to release some resource(s).
  28. Processor affinity vs. load balancing (6.5.2). Both concern multi-processor scheduling. Processor affinity suggests a process should remain on the same processor to exploit caching. Load balancing suggests a process should be moved from an overloaded processor to a lightly loaded processor.
  29. Proportional share real-time CPU scheduling [p. 289]: Allocate T shares among all applications. An application can receive N shares of time, ensuring that the application is allocated N/T of the total processor time.
  30. Race condition [p. 205]: Several processes access and manipulate the same data concurrently, and the outcome of the execution depends on the particular order in which the accesses take place.
  31. RAID [p. 484]. Redundant arrays of independent/inexpensive disks, used to provide data redundancy and rapid access.
  32. Rate-monotonic real-time CPU scheduling [p. 287]: Upon entering the system, each periodic task is given a static priority inversely based on its period. If a lower priority process is running, and a higher priority process becomes available to run, the lower priority process is interrupted.
  33. Readers-writers problem (5.7.2) is one of the classical problems of synchronization. Multiple reading processes and multiple writing processes need to share access to a single shared data resource.
  34. Resource allocation graph (7.2.2) is a graphical representation of processes i) requesting and ii) holding resources.
  35. Round-robin scheduling (6.3.4) is a preemptive scheduling algorithm in which the Ready List is treated as a FIFO queue. Your answer needed to mention both the order in which processes are scheduled (FIFO) and preemption.
  36. Semaphore [p. 213]: An integer variable accessed only through the atomic operations wait() and signal(). (See the sketch at the end of this list.)
  37. Swapping [p. 385]. Similar in concept to paging, except that the units are entire processes, rather than pages. "Swapping" concerns which process is in memory, not which process has the CPU. (Well, OK. Yes, we do say "context swap" for changing which is the running process. Ambiguous answer; bad question.)
  38. Starvation [p. 217]: A process may wait indefinitely.
  39. SSTF = Shortest-seek-time-first [p. 474]; C-SCAN [p. 476].
  40. Thrashing [p. 425]. Behavior of a process that does not have "enough" page frames allocated to it, characterized by frequent page faults.
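
To make items 4 (busy waiting) and 36 (semaphore) concrete, here is a minimal sketch in C contrasting a spin lock built on an atomic test-and-set with a blocking POSIX semaphore. The function names and the shared counter are mine, for illustration; atomic_flag_test_and_set(), sem_wait(), and sem_post() come from the C11 and POSIX libraries.

    #include <stdatomic.h>
    #include <semaphore.h>
    #include <stdio.h>

    atomic_flag lock = ATOMIC_FLAG_INIT;  /* spin lock built on test-and-set */
    sem_t sem;                            /* semaphore used as a mutex */
    int counter = 0;                      /* shared data needing mutual exclusion */

    void increment_with_spinlock(void) {
        while (atomic_flag_test_and_set(&lock))
            ;                             /* busy waiting: holds the CPU, does nothing */
        counter++;                        /* critical section */
        atomic_flag_clear(&lock);
    }

    void increment_with_semaphore(void) {
        sem_wait(&sem);                   /* wait(): the caller sleeps if the value is 0 */
        counter++;                        /* critical section */
        sem_post(&sem);                   /* signal() */
    }

    int main(void) {
        sem_init(&sem, 0, 1);             /* initial value 1: a binary semaphore */
        increment_with_spinlock();
        increment_with_semaphore();
        printf("counter = %d\n", counter);
        return 0;
    }

The spin lock holds the CPU while it waits; the semaphore lets a waiting process give up the CPU. That is exactly the distinction item 4 draws between busy waiting and the Waiting state.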

Problem 2 Tricky C - Deadlock

Consider the following program in C-like pseudo code proposed as a software solution to the mutual exclusion problem with two processes in a system with preemptive process scheduling:

 boolean blocked[2];
 int turn;
 void P (int id) {
    while (true) {
       blocked[id] = true;
       while (turn != id) {
          while (blocked[1-id]) {
             turn = id;
          }
       }
       /*  CRITICAL SECTION  */
       blocked[id] = false;
       /*  Remainder  */
    }
 }
 void main () {
    blocked[0] = false;
    blocked[1] = false;
    turn = 0;
    ForkAndStartAsProcess (P(0));
    ForkAndStartAsProcess (P(1));
 }

Find a counterexample that demonstrates that this solution is incorrect.

From the Stallings book. Repeated from 2007B - 183.

This problem is taken from William Stallings, Operating Systems, Fourth Edition, problem 5.5, p. 257. It is a software solution to the mutual exclusion problem proposed by H. Hyman, Comments on a Problem in Concurrent Programming Control, Communications of the ACM, January 1966. Even CACM was fooled on this one.

To be "not correct," you must show that it fails to solve the mutual exclusion problem. You can show the failure of any one of

  1. Mutual exclusion is preserved
  2. The progress requirement is satisfied
  3. The bounded-waiting requirement is met
We will show that the bounded-waiting requirement is not met because a deadlock is possible.

 

    /* Process 0  */
    while (true) {
       blocked[0] = true;
       while (turn != 0) {
          while (blocked[1]) {
             turn = 0;
          }
       }
       /*  CRITICAL SECTION  */
       blocked[0] = false;
       /*  Remainder  */
    }

    /* Process 1  */
    while (true) {
       blocked[1] = true;
       while (turn != 1) {
          while (blocked[0]) {
             turn = 1;
          }
       }
       /*  CRITICAL SECTION  */
       blocked[1] = false;
       /*  Remainder  */
    }

Deadlock: Suppose Process 0 runs first. It is interrupted just after its blocked[0] = true assignment, and Process 1 runs. Process 1 would be caught in its while (blocked[1-id]) loop, and Process 0, when it resumes, would be caught in its while (turn != id) loop. Both processes are prevented from entering their critical sections, and we have a deadlock.

 

    /* Process 0  */
    while (true) {
       blocked[0] = true;
       while (turn != 0) {
          while (blocked[1]) {
             turn = 0;
          }
       }  /* Caught here */
       /*  CRITICAL SECTION  */
       blocked[0] = false;
       /*  Remainder  */
    }

    /* Process 1  */
    while (true) {
       blocked[1] = true;
       while (turn != 1) {
          while (blocked[0]) {
             turn = 1;
          }  /* Caught here */
       }
       /*  CRITICAL SECTION  */
       blocked[1] = false;
       /*  Remainder  */
    }

An even stronger form of "incorrect" is to violate mutual exclusion.

Yes, one should check the values returned from fork(), but a) this is PSEUDO code, and b) poor programming style does not make a program incorrect.

Infinite loops are not problems in themselves; many processes run as infinite loops. MS Word must be an infinite loop; it sits there in an open window until you close it.
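
For contrast, Peterson's solution (Section 5.3 of our text) is the classical two-process software solution that does provide mutual exclusion, progress, and bounded waiting. In the same C-like pseudo code:

     boolean flag[2];          /* flag[id]: process id wants to enter */
     int turn;
     void P (int id) {
        while (true) {
           flag[id] = true;
           turn = 1 - id;      /* yield the turn to the OTHER process, once */
           while (flag[1-id] && turn == 1-id) {
              ;                /* wait only while the other process wants in
                                  AND holds the turn */
           }
           /*  CRITICAL SECTION  */
           flag[id] = false;
           /*  Remainder  */
        }
     }

Because each process sets turn in the other's favor exactly once before waiting, rather than inside the waiting loop, the two processes cannot spin forever fighting over turn the way Hyman's version allows.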

Problem 3 Scheduling Hard Disk Requests

  1. List five algorithms that might be used to schedule read/write requests on a modern rotating, mechanical hard disk.
  2. For each of your five algorithms in part a, describe the algorithm.
  3. For each of your five algorithms in part a, what are the advantages of the algorithm?
  4. For each of your five algorithms in part a, what are the disadvantages of the algorithm?
  5. How would you select the best one?
  6. In what ways is CPU process scheduling different from disk scheduling?
  7. For each of your five algorithms in part a, what would be the implications of using it to schedule read/write requests on a modern solid state drive?

Hint: Please organize your response according to the parts I have asked. Write acronyms out; avoid TLAs (Three Letter Abbreviations).

See Section 10.4. FCFS, SSTF, Scan, C-Scan, Look, C-Look.

In considering advantages and disadvantages, it helps to list important criteria, which might include average wait time, maximum wait time, requests serviced in unit time, "fairness", effort to implement. Then, you can rate each strategy on each criterion.

e. Simulation studies. Of course, you want the strategy with low average waiting time, high throughput, etc., but how do you know which one that is? Whenever you are tempted to say "better," "best," "optimal," etc., you always should be prepared to explain, "In what sense?"
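
As one illustration of such a simulation, here is a minimal sketch in C comparing total head movement for First-Come-First-Served and Shortest-Seek-Time-First on a single request queue; the queue and starting head position follow the text's running example in Section 10.4.

    #include <stdio.h>
    #include <stdlib.h>

    #define N 8

    /* Total head movement when requests are served in arrival order (FCFS). */
    int fcfs(const int req[], int n, int head) {
        int total = 0;
        for (int i = 0; i < n; i++) {
            total += abs(req[i] - head);
            head = req[i];
        }
        return total;
    }

    /* Total head movement when we always serve the closest pending request (SSTF). */
    int sstf(const int req[], int n, int head) {
        int pending[N], total = 0;
        for (int i = 0; i < n; i++)
            pending[i] = req[i];
        for (int left = n; left > 0; left--) {
            int best = 0;
            for (int i = 1; i < left; i++)
                if (abs(pending[i] - head) < abs(pending[best] - head))
                    best = i;
            total += abs(pending[best] - head);
            head = pending[best];
            pending[best] = pending[left - 1];   /* drop the served request */
        }
        return total;
    }

    int main(void) {
        int req[N] = {98, 183, 37, 122, 14, 124, 65, 67};  /* cylinder numbers */
        int head = 53;                                     /* starting head position */
        printf("FCFS: %d cylinders\n", fcfs(req, N, head));
        printf("SSTF: %d cylinders\n", sstf(req, N, head));
        return 0;
    }

On this queue, SSTF cuts total head movement from 640 cylinders to 236. It is measurements like these, over many queues and against explicit criteria, that let you defend a claim of "better."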

f. One concerns CPU time, and the other concerns seek time, but similar algorithms work. The biggest difference is that Shortest Job First requires (usually) unknowable information, while Shortest Seek Time First is feasible.

g. Algorithms based on head position make no sense for a solid state drive. An SSD should be scheduled FCFS, except that the SSTF principle suggests giving reads priority over writes, and reads and writes to the same block should be executed in order, at least apparently. See the box on p. 478. If the SSD controller has a queue of pending writes, and a read request arrives for a block with a pending write, the read request can be satisfied using the data in the pending write request, with no access to the SSD at all. The fastest reads are the ones that don't happen.
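
To make that last point concrete, here is a toy sketch in C of a controller write queue whose reads are satisfied from the newest matching pending write without touching the device. The structure and names are mine, for illustration only; a real controller is far more elaborate.

    #include <stdio.h>
    #include <string.h>

    #define BLOCK 16   /* tiny blocks, just for the example */
    #define QLEN  8

    struct pending { int block; char data[BLOCK]; };
    struct pending queue[QLEN];   /* pending writes, oldest first */
    int nqueued = 0;

    /* Queue a write to be flushed to the device later.
       (No overflow handling in this sketch.)           */
    void write_block(int block, const char data[BLOCK]) {
        queue[nqueued].block = block;
        memcpy(queue[nqueued].data, data, BLOCK);
        nqueued++;
    }

    /* Read a block, using the newest matching pending write if any,
       so reads and writes to the same block appear in order.        */
    void read_block(int block, char out[BLOCK]) {
        for (int i = nqueued - 1; i >= 0; i--) {
            if (queue[i].block == block) {
                memcpy(out, queue[i].data, BLOCK);  /* no device access at all */
                return;
            }
        }
        memset(out, 0, BLOCK);    /* ... otherwise issue a real device read here */
    }

    int main(void) {
        char msg[BLOCK] = "hello";
        char buf[BLOCK];
        write_block(42, msg);
        read_block(42, buf);      /* served from the write queue */
        printf("%s\n", buf);
        return 0;
    }

Searching the queue newest-first is what keeps a read issued after a write to the same block consistent.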

Request scheduling concerns time; wear leveling concerns space - selection of disk blocks. Request scheduling, by itself, does not affect wear leveling.

Note to self: Harder to grade than expected.

Problem 4 Large files

Consider a UNIX-like inode file allocation strategy:

 

Fig. 12.9 UNIX inode structure
  1. Should we allocate 32 bits or 64 bits to store (cylinder, track, sector) disk addresses for a modern rotating mechanical drive?
  2. Explain.
  3. Suppose that
    1. Disk block is 1 K bytes
    2. Large rectangle on the left represents a head block containing 1 K bytes,
    3. Accounting information at the top of the head block occupies 16 64-bit words
    4. Head block includes one single indirect pointer, one double indirect pointer, and one triple indirect pointer occupying one 64-bit word each
    5. All remaining space in the head block is used for direct block pointers, each occupying 64 bits
    6. Other pointer and data blocks in this figure represent 1 Kb disk blocks
    7. Disk addresses stored in pointer blocks each occupy 64 bits
    Can we store a 20 gigabyte database file using this file storage structure?

     

  4. Explain.
  5. What should I do?
  6. [Next year?] Data Recovery: Suppose one disk block used to store my 20 Gb database is completely unreadable. What data can I recover? Explain. What are implications for the internal file organization used by the database engine?
While the details fall short in this example, this concept of organizing files with increasing levels of indirection is the basis for most modern file systems.

Quantify. Scientists and engineers work with NUMBERS. Leave the qualitative descriptions to artists.

  1. 64. For now.

     

  2. I just bought a 4 Tb external drive. That is about 2^42 bytes. If we assume a 1 Kb (2^10 byte) sector, we need to address 2^32 [cylinder, track, sector] addresses. If we allocate 32 bits JUST RIGHT among cylinder, track, and sector addresses, we fit perfectly. If disk blocks are 512 bytes, 32 bits are not enough; if disk blocks are 2 K bytes, we might have a bit to spare. Point: TODAY's disks are in the range at which 32 bits are not enough. To plan for the future, we should use AT LEAST 64 bits for (cylinder, track, sector) addresses, which can address roughly tera, tera-byte disks. That would represent a squaring of disk capacities, compared with today's disks.

    Will you live to see tera, tera-byte disks? The first computer I bought, in about 1980, had two 128 Kb 5.25-inch floppy drives, for a storage capacity of 2^18 bytes. The cost, weight, physical volume, and power consumption were each within a factor of 2 of my new 4 Tb drive. 4 Tb is (2^18)^2.33, and the pace of technology is INCREASING.

    My forecast: Your kids will bring tera, tera-byte disks to Dennis's Operating Systems classes. Is THAT scary?

     

  3. No. Not quite.

     

  4. 20 Gb is about 2^32 bytes.

    The head block uses 19 64-bit words (152 bytes) for accounting and the three indirect pointers, leaving 872 bytes for 109 direct pointers, each addressing a 1 Kb block. That's 109 Kb, which I'll round up to 128 Kb = 2^17 bytes.

    The single indirect pointer points to a block with room for 128 pointers, each addressing a 1 Kb block. That's 128 Kb more, for a total of 2^18 bytes so far.

    The double indirect pointer points to a block with room for 128 pointers, each of which points to a block with room for 128 pointers, each addressing a 1 Kb block. That's 128 x 128 Kb more, for a total of 2^24 + 2^18 bytes so far. I hope you can see that the 2^18 does not really matter.

    The triple indirect pointer adds one more level: 128 x 128 x 128 Kb more, for a total of 2^31 + 2^24 + 2^18 bytes < 2^32.

    We fall short by only a factor of two. You cannot know that without working out the details; if you guess from qualitative reasoning, you have a good chance of guessing wrong. (A short program after this solution checks these numbers for any block size.)

     

  5. We could use more levels of indirection, or we could increase the disk block size. With 1 Kb disk blocks:

     

    Levels of          Approximate
    indirection        max file size
    0 (direct only)    2^17 = 128 Kb
    1                  2^18 = 256 Kb
    2                  2^24 = 16 Mb
    3                  2^31 = 2 Gb
    4                  2^38 = 256 Gb
    5                  2^45 = 32 Tb
    n                  2^(7n+10)

     

    With 4 Kb disk blocks:

     

    Levels of          Approximate
    indirection        max file size
    0 (direct only)    2^21 = 2 Mb
    1                  2^22 = 4 Mb
    2                  2^30 = 1 Gb
    3                  2^39 = 512 Gb
    4                  2^48 = 256 Tb
    5                  2^57 = 128 Pb
    n                  2^(9n+12)

     

    Yes, increasing the block size is attractive, but beware. If we increase the block size, we should increase the page frame size to match, which requires a complete paging system redesign (and its hardware support).

    Virtual memory may offer a strategy for accessing a large file (that's memory mapping, Problem 1 c), but virtual memory is not a strategy for storing a large file.

    "Store it in the cloud" is no solution because in the cloud, it is still a file, almost certainly stored in some variant of the UNIX inode file structure. Besides, you SAW the 20 Gb file stored on my local drive.

Problem 5 Memory Management

Suppose you are managing one Mb of contiguous storage. For the purposes of this problem, you may assume 1 Kb means 2^10 bytes, and 1 Mb means 2^20 bytes. Your memory manager begins in the state with Block 0 holding 400 Kb and Block 1 holding 100 Kb at the locations shown in the table below. Your memory manager receives the following sequence of pseudo code malloc and free requests:

    1. Block 2 = malloc (200 Kb);
    2. Block 3 = malloc (100 Kb);
    3. free (Block 2);
    4. free (Block 0);
    5. Block 4 = malloc (200 Kb);
    6. Block 5 = malloc (200 Kb);
    7. Block 6 = malloc (400 Kb);
    8. free (Block 1);
    9. free (Block 4);
    10. Block 7 = malloc (100 Kb);
    11. free (Block 5);
    12. free (Block 3);
    13. Block 8 = malloc (100 Kb);
    14. free (Block 6);
    15. Block 9 = malloc (500 Kb);

Assume that you are using a worst fit algorithm, with ties broken in favor of lower memory addresses.

Complete the following table for requests 4 - 15, showing where each 100 Kb block is placed in memory:

 

         Start     1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
  100Kb  -         -   B3  B3
  200Kb  -         -   -   -
  300Kb  Block 0   B0  B0  B0
  400Kb  Block 0   B0  B0  B0
  500Kb  Block 0   B0  B0  B0
  600Kb  Block 0   B0  B0  B0
  700Kb  -         B2  B2  -
  800Kb  -         B2  B2  -
  900Kb  -         -   -   -
  1 Mb   Block 1   B1  B1  B1

 

(Row labels name the upper end of each 100 Kb region; for example, the 300Kb row is addresses 200 - 300 Kb. A block's label is repeated in every row and every column it occupies; in the original exam, shading in color served the same purpose.)

List any other assumptions you make.

What do you conclude from this exercise?

Repeated from 2007B - 183.

Assume adjacent free blocks are collected into one.
Assume we do not do defragmentation (move blocks around to combine small free blocks).

Here is what I get:

 

         Start     1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
  100Kb  -         -   B3  B3  B3  B3  B3  B3  B3  B3  B3  B3  -   -   -   -
  200Kb  -         -   -   -   -   B4  B4  B4  B4  -   B7  B7  B7  B7  B7  B7
  300Kb  Block 0   B0  B0  B0  -   B4  B4  B4  B4  -   -   -   -   B8  B8  B8
  400Kb  Block 0   B0  B0  B0  -   -   B5  B5  B5  B5  B5  -   -   -   -   B9
  500Kb  Block 0   B0  B0  B0  -   -   B5  B5  B5  B5  B5  -   -   -   -   B9
  600Kb  Block 0   B0  B0  B0  -   -   -   B6  B6  B6  B6  B6  B6  B6  -   B9
  700Kb  -         B2  B2  -   -   -   -   B6  B6  B6  B6  B6  B6  B6  -   B9
  800Kb  -         B2  B2  -   -   -   -   B6  B6  B6  B6  B6  B6  B6  -   B9
  900Kb  -         -   -   -   -   -   -   B6  B6  B6  B6  B6  B6  B6  -   -
  1 Mb   Block 1   B1  B1  B1  B1  B1  B1  B1  -   -   -   -   -   -   -   -

This problem can become a mess if you make a mistake early.

For comparison, let's try First Fit:

 

         Start     1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
  100Kb  -         B2  B2  -   -   -   B5  B5  B5  B5  B5  -   -   -   -   B9
  200Kb  -         B2  B2  -   -   -   B5  B5  B5  B5  B5  -   -   -   -   B9
  300Kb  Block 0   B0  B0  B0  -   -   -   B6  B6  B6  B6  B6  B6  B6  -   B9
  400Kb  Block 0   B0  B0  B0  -   -   -   B6  B6  B6  B6  B6  B6  B6  -   B9
  500Kb  Block 0   B0  B0  B0  -   -   -   B6  B6  B6  B6  B6  B6  B6  -   B9
  600Kb  Block 0   B0  B0  B0  -   -   -   B6  B6  B6  B6  B6  B6  B6  -   -
  700Kb  -         -   B3  B3  B3  B3  B3  B3  B3  B3  B3  B3  -   B8  B8  B8
  800Kb  -         -   -   -   -   B4  B4  B4  B4  -   B7  B7  B7  B7  B7  B7
  900Kb  -         -   -   -   -   B4  B4  B4  B4  -   -   -   -   -   -   -
  1 Mb   Block 1   B1  B1  B1  B1  B1  B1  B1  -   -   -   -   -   -   -   -

First Fit appears to give similar results, but the average size of the largest free block is slightly larger for Worst Fit. From these charts, I could create a slightly different sequence of requests that Worst Fit could handle, but First Fit could not.
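
To check a run like this mechanically, here is a minimal sketch in C of the bookkeeping, assuming, as above, a free list kept in address order with adjacent free holes coalesced and ties broken in favor of lower addresses. The parallel-array representation is only for illustration; addresses and sizes are in Kb.

    #include <stdio.h>

    #define MAX 32

    /* Free list, sorted by address: hole i is [start[i], start[i]+size[i]). */
    int start[MAX] = {0, 600};    /* initial holes around Block 0 and Block 1 */
    int size[MAX]  = {200, 300};
    int nfree = 2;

    static void delete_hole(int i) {
        for (; i < nfree - 1; i++) {
            start[i] = start[i + 1];
            size[i]  = size[i + 1];
        }
        nfree--;
    }

    /* Worst fit: carve the request from the LARGEST hole;
       ties go to the lower address because the list is sorted. */
    int alloc(int want) {
        int best = -1;
        for (int i = 0; i < nfree; i++)
            if (size[i] >= want && (best < 0 || size[i] > size[best]))
                best = i;
        if (best < 0) return -1;          /* request fails */
        int addr = start[best];           /* take the low end of the hole */
        start[best] += want;
        size[best]  -= want;
        if (size[best] == 0) delete_hole(best);
        return addr;
    }

    /* Return a block to the free list, coalescing with its neighbors. */
    void release(int addr, int len) {
        int i = 0;
        while (i < nfree && start[i] < addr) i++;
        for (int j = nfree; j > i; j--) { /* insert, keeping address order */
            start[j] = start[j - 1];
            size[j]  = size[j - 1];
        }
        start[i] = addr; size[i] = len; nfree++;
        if (i + 1 < nfree && start[i] + size[i] == start[i + 1]) {
            size[i] += size[i + 1];       /* merge with the hole above */
            delete_hole(i + 1);
        }
        if (i > 0 && start[i - 1] + size[i - 1] == start[i]) {
            size[i - 1] += size[i];       /* merge with the hole below */
            delete_hole(i);
        }
    }

    int main(void) {
        int b2 = alloc(200);    /* request 1: expect 600 */
        int b3 = alloc(100);    /* request 2: expect 0   */
        release(b2, 200);       /* request 3 */
        release(200, 400);      /* request 4: free Block 0, which starts at 200 Kb */
        int b4 = alloc(200);    /* request 5: expect 100 */
        printf("B2 = %d, B3 = %d, B4 = %d\n", b2, b3, b4);
        return 0;
    }

Driving the remaining requests through alloc() and release() reproduces the Worst Fit table above. Changing the selection rule in alloc() gives Best Fit; First Fit depends also on how the free list is ordered.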

 

 