Final Examination

Your final examination will appear here.

It will be a two hour, closed book exam. It covers Chapters 1 - 12, 16, and 17. It is a comprehensive exam, with at least half the questions coming from material since the second midterm exam.

Expectations:

See course goals and objectives
See objectives at the beginning of each chapter
See Summary at the end of each chapter
See previous exams:
- midterm 2000,
- midterm #1 2005,
- midterm #2 2005,
- midterm #1 2007,
- midterm #2 2007,
- final A 2000,
- final B 2000,
- final 2005,
- final 2007

See also Dr. Barnard's notes. He was using a different edition of our text, so chapter and section numbers do not match, but his chapter summaries are excellent.

In the Table of Contents or the Index to the text, can you define/explain each term?

On the final exam, any definition of the form "ZZZ is when ..." looses half the available points, unless ZZZ refers to a time.

Directions:

Read and follow the directions.

Write on five of the six questions. Each question is worth 20 points. If you write more than five questions, only the first five will be graded. You do not get extra credit for doing extra problems.
Read the entire exam. If there is anything you do not understand about a question, please ask at once.
If you find a question ambiguous, begin your answer by stating clearly what interpretation of the question you intend to answer.
Begin your answer to each question at the top of a fresh sheet of paper. [Or -5]
Be sure I can read what you write.
Write only on one side of a sheet.
Be sure your name is on each sheet of paper you hand in. [Or -5]

Score Distribution:

Problem 1

Give careful definitions for each term:

Operating system
Thread
Deadlock
Atomic instruction
Directory
Direct file access
Working set size
RAID
Remote procedure call
Protection

Warning: "Z is when ..." earns -5 points.

Operating system (p. 3) A program that manages the computer hardware, provides a basis for application programs, and acts as an intermediary between the computer user and the computer hardware.

Thread (p. 127) Basic unit of CPU utilization; a thread ID, a program counter, a register set, and a stack.

Deadlock (p. 204) Two or more processes are waiting indefinitely for an event that can be caused only by one of the waiting processes.

Atomic instruction (p. 198) Executes as a unit. Either completes without interruption or has no effect.

Directory (see p. 386) A named collection of files that records information such as name, location, size, type for all files in that directory.

Direct file access (p. 383) A file is viewed as a numbered sequence of fixed-length blocks or records. Direct access refers to access to records within the file, not to files as units.

Working set size (p. 346) Set of pages in the most recent page references.

RAID (p. 468) Redundant Arrays of Inexpensive (or Independent) Disks

Remote procedure call (p. 825) A client-server mechanism that enables an application on one machine to call a procedure on another machine.

Protection (p. 531) Mechanisms for controlling the access for programs, processes, or users to the resources of a computer system. Protection is an internal problem. Protection != security.

Problem 2 Disk Scheduling

Suppose we have a disk drive with the following characteristics:

Platters:	1	Rotational speed:	100 rev./sec.
Tracks:	3000	Seek speed:	5 x 10⁴ tracks/sec.
Sectors:	1000	Data transfer rate:	10⁸ bits/sec.
Bytes per block:	4048

Assume that at time t = 0, the disk controller has received the following requests in the order shown.

Request # Track # Sector # Read/Write

1: 2222 1 Read

2: 45 2 Read

3: 2863 3 Write

4: 1234 4 Write

5: 1234 3 Read

6: 1234 1 Read

7: 510 432 Write

Assume no additional requests are received while these requests are served.

At what time, in seconds, does the disk complete service of these seven requests if the disk controller schedules these requests by

FCFS
SSTF
SCAN

You will need to make additional assumptions. State your assumptions explicitly.

See Section 12.4

Let's simplify this by introducing some variables. Let

x = time to read one sector
= 10^-5 sec.

y = time to seek one track
= 2 x 10^-5 sec.

Hence, we count in units of 10^-5 sec. and convert to seconds when we are done.

We assume

Head starts at track 2222, sector 0 (Unlikely, but it must start somewhere)

While we seek, the disk is rotating, so we must determine at which track the head is when seek gets to the correct track.

SCAN goes to track 2999 and starts scanning from there

Spreadsheet I used

FCFS: 0.19 sec.

Request # Track # Sector # Seek time Latency Complete

1: 2222 1 0 1 2

2: 45 2 4354 646 5003

3: 2863 3 5636 364 11004

4: 1234 4 3258 742 15005

5: 1234 3 0 998 16004

6: 1234 1 0 997 17002

7: 510 432 1148 982 19433

SSTF: 0.12 sec.

Request # Track # Sector # Seek time Latency Complete

1: 2222 1 0 1 2

3: 2863 3 1282 719 2003

4: 1234 4 3258 743 6004

5: 1234 3 0 999 7003

6: 1234 1 0 998 8001

7: 510 432 1448 983 10432

2: 45 2 930 640 12002

Assume: Shortest Seek Time First does not also optimize latency, so requests in the same track are served FCFS

SCAN: 0.12 sec.

Request # Track # Sector # Seek time Latency Complete

3: 2863 3 1826 177 2003

1: 2222 1 1282 716 4001

4: 1234 4 1976 27 6004

5: 1234 3 0 999 7003

6: 1234 1 0 998 8001

7: 510 432 1448 983 10432

2: 45 2 930 640 12002

SCAN starts at one end of the disk. We assume it starts at track 2222, goes to track 2999, and starts scanning from there. As our book explains it, LOOK would go from 2222 to 2863, without first going to the end.

As long as the transfer rate is comfortably larger than the rate at which bits fly under the head, transfer rate has no effect on your answer.
As stated, Read/Write does not affect on your answer.

It is coincidental that SSTF and SCAN are the same in this example.

The details of these computations are subject to MANY variations. I do not expect your answers to agree with mine.

You should arrange the requests in the correct order.

SSTF and SCAN should beat FCFS

Time should have correct order of magnitude.

Legacy: BAD QUESTION. Students should be able to do this, but there are too many possible errors to grade sensibly.

Problem 3 Caching

What is caching, in the context of computer operating systems?
List four different contexts in which caching is a possible solution
What are the advantages of caching?
What are the disadvantages of caching?

Keeping a local copy of remote information.

Memory <--> disk, OS disk directories <--> on-disk disk directories, disk files, local disk files <--> remote disk files, domain name service, TLB <--> in-memory page table, browser or gateway can cache web pages

Speed, reduced bus/network traffic, allow asynchronous updates. Advantages means more than one.

Consistency, coherency, cost, complexity of hardware and OS, small size, slower on a miss than direct access. Disadvantages means more than one.

See chapters on memory, files, distributed systems. Chapter 1 has a summary. See problems 1.13 & 1.14.
-4 if all your examples are memory caching of disk, which I would consider one context.

Caching as an aspect of virtual memory or networking, but it is not the same. Caching does not provide a larger address space, for example, although it enables virtual memory to do so.

Problem 4 Extend XINU File System

The UltraTiny File System (UTFS) implemented for Project 10 could only store a maximum of 13 files on a disk.

Explain why the UTFS design was limited to 13 files.
Extend the UTFS design to allow an arbitrary number of files, provided sufficient disk space. Comment on tradeoffs made by your design and other limitations that would still remain.

UTFS provides for a single directory block, that can only store the number of file metadata structs that fit into a single disk block.
"Because Dr. Brylow designed it that way" is no answer.

Several possibilities: Extend the dirblock structure to include a pointer to another dirblock, allowing a linked list of arbitrarily many files in the directory, similarly to the free space list. In this solution, you still have only one directory.
Alternatively, allow file nodes to themselves point to another dirblock, which would implement a tree-like structure of an arbitrary number fixed-size subdirectories.
Important tradeoffs to consider include

search time to find a file name,

search time to find an unused filenode when creating a new file,

potential distance between filenode metadata and file contents,

overhead for synchronizing memory-cached dirblocks with the disk, and

difficulty reclaiming emptied dirblocks.

All things being equal, the extended UTFS system would still have fixed length file names and 256 byte max file sizes.
It is only half the story to say we need more space. The other half is to describe the bookkeeping necessary to keep track of it.

Problem 5 TCP/IP

List the four layers of the TCP/IP networking stack. Describe the function of each layer.

Hint: Answer as you might in a job interview. I expect 1 - 2 sentences about each layer.

DoD Four-Layer Model:

The Network Access Layer is responsible for delivering data over the particular hardware media in use. Different protocols are selected from this layer, depending on the type of physical network.

The Internet Layer is responsible for delivering data across a series of different physical networks that interconnect a source and destination machine. Routing protocols are most closely associated with this layer, as is the IP Protocol, the Internet's fundamental protocol.

The Host-to-Host Layer handles connection rendezvous, flow control, retransmission of lost data, and other generic data flow management. The mutually exclusive TCP and UDP protocols are this layer's most important members.

The Process Layer contains protocols that implement user-level functions, such as mail delivery, file transfer and remote login.

Problem 6 Networking

Most WANs employ only a partially connected topology. Why is this so?
What are two formidable problems that designers must solve to implement a network-transparent system?

Cost. A fully connected network requires a link between every node in the network, whose cost is O(n²). For a WAN, this may be too costly as communication links between physically distant hosts may be expensive.
Security is easier to maintain if your sub-network has few points of entry.
Cost is much more a barrier than security. Cost = O(n²), and n is roughly one billion.

Intentional vague question. What will you do with it?
One such issue is making all the processors and storage devices seem transparent across the network. In other words, the distributed system should appear as a centralized system to users. The Andrew file system and NFS provide this feature: the distributed file system appears to the user as a single file system but in reality it may be distributed across a network.
Another issue concerns the mobility of users. We want to allow users to connect to the "system" rather than to a specific machine (although in reality they may be logging in to a specific machine somewhere in the distributed system).
You might instead talk about file/directory naming and access, protection, consistency.

I accepted some answers quite different from what I had in mind. Other answers I judged to be in the details, although true, and missing the big picture.

I hope you had an fruitful and enjoyable semester, and I hope that you will keep in touch with me as your career proceeds, whether that is soon or in a year or two.

Have a pleasant summer.

Marquette University. Be The Difference.

Marquette | Corliss |