
Questions from the Cards
Week 1: Introduction to C

  Dr. Brylow

Photos from 2018's lectures, Intro to C.

 

Class processes:

Why is the class taught this way (co-taught)?

George:
a) Much stronger class for you
b) Less work for us
c) More enjoyable for us and (we hope) for you.
Basically, in our best professional judgement, Dennis and I think this gives you the best value for your tuition.

Are these lectures recorded and stored on the website somewhere?

No. We expect you to be here and to participate.

Can we go to different lab sections?

Perhaps.

There is a fixed number of seats in the lab, and the 10 a.m. and Noon sections are pretty full. If we get to standing room only, priority for computers and chairs will go to those actually registered for the hour. But attendance is not tracked, if that's what you're asking, and in principle there is nothing wrong with occasionally attending a different section. This will be especially true once we break into teams: if both partners can make a single lab time, it is better to have you together than in separate sections.

Can you talk about the final exam in more detail?

See: .../~georgec/OpSys/examFinal19.html
See: .../~georgec/OpSys/moocsOnOperSys.html

Will you being in the lab at 9 P.M. on Monday every week be a review session of the week? Or more office hour type of thing?

An office hour type thing. But if there are lots of folks with the same problems or questions, there's a chalkboard in there that I'm not shy about using if we need to review or re-explain something complex.

Is it to our advantage to try to work ahead, or will it be helpful to work at the pace of our class?

In this particular semester, when we are fielding brand new versions of projects 3 through 8, there is some peril in getting too far ahead of the TAs. That said, I think there is a lot of advantage to starting the week-long projects on the first day they are assigned, which will generally be lab day on Fridays.

What is the best way to stay on top of this class?

Read the assigned readings until you really understand what they say and can solve the example exercises. Start right away on the projects, and plan to work on them a little bit each day, rather than planning for a marathon session the night before they are due. As a corollary to this, bring strong questions about the content and projects to class. Be a good project partner.

When will TA hours be posted?

TAs are still getting organized. (George and I still don't know who will be assigned as our grad TA.) I expect TA hours to be known by Wednesday.

How would I explain the importance of this course to a non-technical interviewer?

Well, you can always open with, "I built large sections of my own preemptive multitasking embedded operating system, including multicore concurrency primitives and priority scheduling." But you might be right that a non-technical interviewer might not grasp that. I really don't have any experience with non-technical interviewers. Is that a thing they're doing for students in our majors these days?

For non-technical folks, I think I would emphasize that your study of operating systems focused on the classical computer science theories that underlie virtually all modern computer platforms, but included hands-on experience building key components at the critical interface between raw hardware and systems software.

[George] Learned what really is going on behind the screens of our phones, tablets, computers, and embedded devices; met and overcame challenges, worked with a teammate; wrote and presented a skit (more on that in April).

Do we need to buy the textbook?

Yes

Introduction to C

Who is challenging Java as king of industrial coding?

See TIOBE Index for January 2019

Is address arithmetic really supposed to be that confusing?

It is arguably not intended to be confusing, but it is powerful and arcane enough to tend naturally toward complexity.

How does memory allocation help C?

I'm going to take some liberty here, and assume that you're asking how it is helpful for C to control its memory allocation, unlike other languages, like Java, that largely take that power away from the programmer.

This is an ongoing debate that has been raging since before most of you were born. I'm not going to take an explicit side, but instead I'll share with you two opposing quotations that are relevant. The first is from our textbook preface:

"ANSI C retains basic philosophy that programmers know what they are doing; it only requires that they state their intentions explicitly." -- K & R, 2nd ed

The second is from a respected programming language designer from a very dissimilar language tradition:

"One thing I've noticed with C/C++ programmers, particularly (which is, again, the pool from which most C# programmers will be drawn), is that many of them are convinced that they can handle dangerous techniques which experience shows they can't handle. They say things such as, 'I like doing my own memory management, because it gives me more control,' but their code continually suffers from memory leaks and other pointer-related problems that show quite clearly that they are not to be trusted with these things that give them 'more control'. This, in my view, is just one more reason why 'unsafe' features should not be built into mass-market languages like C#." -- Craig Dickson, on the Haskell mailing list

What is a makefile?

A set of rules explaining how to build a program from a complex set of source files. [George]: We'll see our first example in Ch. 3

Why doesn't C have multiple returns?

I'm not aware of any widely used, mainstream programming languages that support multiple return values; it is usually considered too confusing for most programmers. The closest approximation I can think of would be the Ada programming language's use of 'out' parameters, but that's still not a true multiple return value.

[George]: If by "multiple returns" you mean "Can return more than one value," I'd argue structs allow exactly that. Matlab explicitly supports returning multiple values. F# supports return of tuples of arbitrary types.

How do other programming languages communicate with C?

Most practical programming languages come with features that allow them to interact directly with C language libraries. In nearly all scenarios, the glue code marshals arguments and return values to match the C standard calling convention. In other words, other languages nearly always find a way to call C functions, or to provide a mechanism for C functions to call them. C remains the constant; everyone else accommodates C.

Lab issues

Do we pick partners, or are they assigned?

I will probably assign them.

Do you suggest installing GCC? For compiling on morbius?

Installing GCC is a great idea. It is free, and very flexible.

Is it necessary to use the Linux machines for the first few assignments?

Yes. You are welcome to develop and test code on other machines, but points will be assigned based solely on whether it compiles and runs correctly on our Linux machines, so you really *must* ensure that your programs work for us. We can't grade your work if it doesn't work on our computers.

How do I get putty and the command line stuff set up? I've never had to use it in a class before, so this is all new.

Putty is freely downloadable. Google "putty download".

How do I log into the MSCSnet network?
What is my login name and password for connecting to morbius?

See instructions George forwarded during class (and posted to D2L [Content]).

I need help setting up a computer for this class.

Stop by my office hours, and we'll see what we can do. However, resources for helping to set up other people's computers are finite. We make certain to provide functional, up-to-date Linux computers that are physically and remotely accessible to you 24/7 with all of the necessary software already installed and configured.

Can you share data on what the most popular more this connection times are to avoid being blocked four kicked out?

Don't think I followed that question.

How do I connect with putty to MSCSnet?

Type in the hostname "morbius.mscsnet.mu.edu", and use the login and password you discover following the instructions George forwarded (and posted to D2L [Content]).

Do we have to use terminal, or can we use an IDE?

IDEs are welcome. We have many installed on the Linux machines in the lab for your use.

Using an IDE on your own computer is more problematic -- you will have to learn how to extract your source code from this IDE, transfer it to the Linux systems, and you'll still have to know how to compile, run and submit the code through the Linux systems.

The advantage of learning the terminal is that it is a standard mechanism that will work remotely from anywhere in the world -- even when you have too limited bandwidth for a fancy graphical interface. While initially more intimidating, most users quickly discover that the terminal can be faster and more powerful in an experienced user's hands for some tasks. I'd recommend investing the time to become at least reasonably familiar with the terminal. I use both IDEs and terminals intermingled freely, trying always to pick the best tool for the job at hand.

How do you copy files from your current machine to the lab machines through SSH?

For Windows machines, I recommend the freely available "WinSCP" tool. For Mac users, or home Linux users, you already have scp installed. An example command from the Mac terminal would be:

scp hello.c brylow@morbius.mscsnet.mu.edu:cosc3250/

which would copy the file 'hello.c' from my current machine to the folder 'cosc3250' on the remote machine Morbius. Most O/Ses are smart enough to find the source file location if you drag its GUI icon into the terminal window after you type "scp ".

I have never used PuTTY before. Can you post installation instructions to D2L?

Yes, I'll post an example YouTube video showing some basics.

How do you do a full directory copy in UNIX?

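cp -R sourcedir destdir

where 'cp' is the copy command, '-R' tells it to copy recursively, and 'sourcedir' and 'destdir' stand in for the directory you are copying and the place you want the copy to go.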

Is TA-bot working?

I haven't set it up for automatic operation yet. Hopefully tonight.

I use a Mac, and I'm having some issues trying to run hello.c as seen in class through terminal.

Stop by for office hours.

Do we need to buy the raspberry pi?

No. We provide a pool of remotely accessible Pi hardware.

Best Regards, Dr. D

Why didn't the creators of C include Booleans?

They did not see the need. Nearly all programming languages represent the two Boolean values as integer values anyway; ANSI C just chooses not to make a big deal about the type boundaries, and does not predefine human-readable names for them.

The base type 'bool' was later added to C++, as well as the values 'true' and 'false', so a lot of others probably agreed with you. But, of course, in ANSI C, you can always do

        #define TRUE 1
        #define FALSE 0
at the top of any program, and use all of the Boolean value names you like.

What's our development workflow going to look like?

It will change over the course of the term.

For the first few intro projects, you'll be able to compile your single-source programs directly with gcc. As we move to more complex code, we'll transition to using the 'make' command with a Makefile to define the build process. When we're cross-compiling our O/S kernel for the ARM platform, we'll use the 'arm-console' utility to automatically check out a Raspberry Pi board, power it up, transfer your new kernel over, and interact directly with it via serial port connection. We'll talk more about this later.

I understand macros, but what does it mean for them to be expanded?

That is a programming language term that means we replace the #defined keyword with the predetermined value. In the example two questions up, that means each occurrence of the keyword 'FALSE' in my program below the #define line would be replaced with '0'.

What is #pragma once exactly and when should it be used?

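The #pragma directive is used to control platform-specific behavior for the compiler. The specific directive '#pragma once' is a widely supported but non-standard way to ask the compiler to include a header file only once per compilation, even if several files #include it. There are no examples where we will be using #pragmas in this course. But, for example, if you want to control whether a struct definition is stored as 'packed' or with each field aligned to a word-boundary in memory, that is typically a #pragma you can specify on many architectures.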

What's the command to create a new file on morbius?

If you just want to create a new file,

        touch newfile

would create a brand new file called 'newfile', with no contents.

If you meant how do you edit a new file, I'd recommend with a text editor, like 'vim' (through the terminal) or one of the many fine graphical editors installed on the Linux workstations.

Are there any larger pros in using C?

Larger than, "it rules the world"?

Sure. C programs are notoriously well-optimized by compilers, and thus are often hard to beat in raw execution speed. Sadly, because we'll be cross-compiling a raw O/S kernel for a different target architecture, we will normally have to turn off typical compiler optimizations in order to guarantee correct execution. (Trade-offs!)

Should we write our programs on morbius? Or do you suggest writing them on our computers first? Then running them on morbius?

Running them on Morbius or one of the other Linux workstations is absolutely mandatory, if you'd like to earn any points for your work. Where you write the program is much less relevant, but I would recommend familiarizing yourself with the basic process of writing them using command-line tools for these first few assignments. In later assignments, it is very unlikely that you will have the specialized tools installed on your own machines to compile or run our projects, and I would prefer you not get hung up on the mechanics of working with the Linux workstations when we get to the interesting stuff.

Can you explain SCP and how to do it?

SCP is the Secure Copy Protocol, and it uses encryption to safely copy files from one machine to another over a potentially hostile Internet. WinSCP provides a GUI for doing this, but the command-line 'scp' program available on nearly all Mac and Linux machines works like this:

        scp <source> <login>@<machine>:<destination>

I'm using the angle brackets above to mark off variables you need to fill in. A concrete example is:

        scp encoder-ring.c brylow@morbius.mscsnet.mu.edu:cosc3250/Project1/

which would prompt me for brylow's password on Morbius, and then copy the file encoder-ring.c from my computer to the cosc3250/Project1/ subdirectory on Morbius.

What should be read through by Friday’s class? Which chapter?
What chapter are we expected to have read?

K&R chapters 1-3 were recommended for this week.

What are good C compilers you would recommend?

GCC is what I always use. It is free, and quite good, although it still has its share of bugs, and the GCC developers sometimes take infuriating liberties with the C standards.

Can you go over passing arguments/parameters in C?

For now, calling other people's functions looks just like a method call in Java.

What is a general definition of macro?

The computing definition I get from Google is, "a single instruction that expands automatically into a set of instructions to perform a particular task." The #define macros in C can indeed expand from a single 'word' into a whole series of instructions, including parameters. But for this week, we'll just talk about using them to represent numeric constants.

Will TA-bot be running every night?

That is the goal.

How does getchar work? Since a char is a single variable, how do you use getchar to take in a string?

Sorry I didn't get to this question from Wednesday earlier, but I think we covered the first part of this today. The short answer is, "with a loop". The longer answer, which requires storing those characters into an array, we'll save for next week.

Do we have a turnin folder on D2L?

No. You must use the electronic turnin system on the Linux machines. There is a demonstration video here: www.youtube.com/watch?v=z2jPUpfpZDg&feature=youtu.be

Since there will be no class on Monday, are office hours still available?

I'll plan to be in the lab or office at 9 P.M. on Monday.

In what circumstances are 32-bit machines better than 64-bit machines?

There are a lot of applications, particularly in embedded systems, where a smaller word size might save money on the processor, space in the data structures, and total space for the compiled code.

When using bitwise (&) instead of (&&) does it fall through true if the bitwise result is 1?

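If the result of a bitwise operation is non-zero, it counts as "true" in C. Note, though, that & and && can disagree even when both operands are non-zero. Consider this snippet:

        int x = 1;                /* binary 01 */
        int y = 2;                /* binary 10 */
        if (x & y)  { /* not taken: 01 & 10 is 0, which is false */ }
        if (x && y) { /* taken: both operands are non-zero */ }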

Is lab optional?

Yes.

Why not use stdint.h?

The stdint.h header was introduced in the 1999 revision to ANSI C, and provides more explicit names for many of the types. When you #include this, you can declare variables using types like

        int32_t x;

which builds right into the type name how many bits of storage are used. In practice, this is a great idea, and much preferred over the more ambiguous original platform-dependent type names, like 'int'. If you have a C99-compliant compiler, I recommend learning more about these.

Since this relatively recent innovation has still only been around for less than half the overall lifetime of the C language, I'm still waiting to see if these catch on properly.

Can you add more 'c' puns?

Hmm. Perhaps you could give me a few pointers?

How to deal with segmentation fault (core dump) errors?

GDB, the GNU debugger, is a great tool in the Linux/GCC toolchain for exploring these kinds of errors in your program. A demonstration is rather beyond the scope of this portion of our course, but I'd recommend learning more about GDB if you're going to be working more with C or C++ in the future.

If you aren't up for such high powered tools, segmentation faults in C programs are almost always caused by pointer errors in your code. For this first assignment, if you're already seeing seg faults, it is probably in your handling of the argv argument passing array. Check your code for uninitialized pointer variables, or array out of bounds problems in your loops.

How do I strive to be a power of 2 mastermind such as yourself, Sir Brylow? Are there ways to boost learning process?

Twenty years of lecturing undergraduates on their powers of two certainly helps.

I think it is sufficient to be able to work through the first ten powers of two on your ten fingers: 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024. Once you've got those, any larger power of 2 can be decomposed into groups of 2^10 times whatever is left over.
      2^32 = 2^10 * 2^10 * 2^10 * 2^2. (Because 32 = 10 + 10 + 10 + 2.)
      The 2^2 = 4.
      The first 2^10 is ~about~ 1000. (Or 1K to computer scientists)
      Times the second 2^10 is ~about~ 1 million. (1M)
      Times the third 2^10 is ~about~ 1 billion. (1G)

Thus 2^32 is a little over 4 billion, or 4 gigabytes if you are counting bytes. The trickiest bit is probably memorizing the prefixes beyond giga: 2^40 is tera, 2^50 is peta, 2^60 is exa, then come zetta (2^70), yotta (2^80), etc. The difference between 1000 and 1024 starts to accumulate more the higher you go, but this is a pretty decent estimating trick for most tasks.

-Dr. D

How do operating systems run high-level languages?

We actually teach an entire course on that topic, COSC 4400 - Compiler Construction. Short answer -- powerful computer science tools transform high level language code into simpler forms that the machine can run, including hooking in any interfaces to the underlying O/S for I/O, memory management, process management, networking services, etc.

What are macros in a C file?

Shortcuts. The line "#include <stdio.h>" is a shortcut for copy/pasting all of the definitions from the standard I/O header file into your C program. Either option will generate the same program, but using the #include macro instead of copying all of the header file contents is far shorter and easier to read.

#include line with <...> vs. " . . " -- what is the difference?

The line

      #include <stdio.h>

tells the C preprocessor to look in the platform's "standard location" for the header file. On Linux systems, that usually starts in the directory /usr/include/, or some variant. Other platforms vary.

The line

      #include "project3.h"

would look for a project3.h header file in the directory where you are compiling the program. In other words, we use "" for header files that are local to the project, and usually written by the programmer. We use <> for header files that come with the O/S, and are installed as part of the system libraries. The two are generally not interchangeable.

How does the linker differ from the pre-processor?

The pre-processor does simple macro expansions in the .c source file. The linker is responsible for filling in final memory locations for any external resources referenced in the relocatable object code (".o") files. In the olden days, that included tracking down the pre-compiled machine code for referenced library functions, but now it is more common to link through platform-specific dynamic library mechanisms that both reduce the size of compiled binaries, and allow dynamic updates to the library implementations without having to recompile every executable binary file on the system. "Standalone" compilation is still available on systems where the dynamic libraries cannot be relied upon, or the code is being generated for execution on a different platform.

How does the linker find printf() if the program is already in machine code? Wouldn't that be easier to do in pre-processing?

The compiler produces so-called "relocatable object code" (.o) files, which is a fancy way of saying that the final memory addresses of all of the machine code and data aren't set in stone yet. (The linker does that.) As part of the process, the compiler leaves holes wherever there is a reference to an external function or data structure, along with meta information that will allow the linker to find the correct resource when building the final executable binary.

One can certainly envision a system that did much of this on the preprocessing end, instead. If I called a function like "printf()" in my source code, we could make a preprocessor that would look up the C source code for the implementation of the printf() function, glom that on to the top of my .c source file, and then run the whole thing, my code plus printf() code, through the compiler. If we make use of a lot of library functions, this could get expensive very quickly. Moreover, we'd be recompiling source code for basic, built-in operations again and again, slowing down our compile times and fattening our executable binaries substantially. Plus, some privileged operations, like directly talking to underlying hardware devices to accomplish output, can't be done directly by user code on a multiuser O/S anyway. And any updates to the system -- say, a security patch for my O/S's printf() function -- would only take effect in newly compiled programs. Things actually used to work something like this on a lot of platforms, decades ago, before shared libraries and dynamic linking became widespread O/S features.

You can read more about these ideas at en.wikipedia.org/wiki/Library_(computing)

I remember when you compile the -o is output but what happens to the intermediary files (e.g., .S)?

Under normal circumstances, those files exist only temporarily during the compilation process, and are automatically cleaned up. Some of the special gcc flags just tell it not to clean them up.

Can you explain the use of ** in C?

A double star, like the "char **argv" in my demo yesterday, is merely a pointer to a pointer type, in this case, of type char. You can also think of it as an array of arrays of characters, or basically an array of strings.

We'll see more helpful examples of this later.

When do pointers come into play?

Next week.

Are there any other resources to help with ANSI C?

Tons. The first hit on Google is www.learn-c.org/, which I've heard other students recommend. But our K&R book is truly the best resource I know. Read it, think about it, solve the exercises at the end of each chapter, and you'll have all the C knowledge you'll need.

What languages will be used? Just C?

C and a little ARM assembly.

What is vi for when opening the file? What is the difference with vim?

It is a widely-available, standard text editor on UNIX systems. See en.wikipedia.org/wiki/Vi. On most modern Linux systems, the version of vi installed is, in fact, a version of vim.

Will you go more in depth about the various commands within vi next? Either in class or in lab?

Yes. Come to lab tomorrow.

For partner assignments, do we choose our partners?

We will probably allow you to pick partners, with constraints, such as, "your partner must be in the other course number."

What is the optional final?

Read the syllabus carefully.

Why do operating systems not need to use floating-point?

Dennis's answer: Floating-point operations are essential for efficiently computing many scientific and mathematical problems, but operating systems themselves have little use for non-integer numbers. Process and resource index numbers, timeslices, memory sizes -- all of these quantities are discrete, integer type data. An O/S should fully support application programs that need to use floating-point operations, but the O/S code itself has no need of them.

George’s story: MANY years ago, I was asked to teach a class that included a unit on (IBM 360) assembly language programming, and I needed an example for me to learn. I was giving my numerical analysis class an assignment to program Lagrange polynomial interpolation (in Fortran), so I wrote Lagrange polynomial interpolation in assembly. When I finished, I showed it to a senior faculty colleague generally viewed as a god of IBM assembly programming. He was completely baffled; he had never seen floating-point opcodes.

Why don't we change int sizes?

The short answer is backward compatibility. So much software has been written in C, and changing how the compiler behaves -- even when it is technically fixing a widely-accepted bug -- might break a lot more than it fixes.

Should the int data type be avoided in regular use when you're expecting a four byte machine?

In embedded systems and other contexts where size is particularly relevant, it is not uncommon to typedef new names for the integer types that unambiguously note a size. It is now not uncommon to see types like "uint8_t" or "int64_t" in programs, which since the 1999 revision of the C standard are defined in stdint.h as an unsigned 8-bit int and a signed 64-bit int, respectively (C++ adopted them in its 2011 revision). This encodes into the source code a fixed width type that will not change if we compile the code on another platform.

At the same time, sometimes your code *needs* to change sizes based upon the platform. When we work on implementing malloc() and free() later, we'll see examples where the size of our data structure must scale according to the underlying word size of whatever machine we're on. C gets used for programming on both ends of this spectrum, so the basic int type remains the default choice for many programmers, and need not be avoided unless size constraints and platform portability are paramount.

Are function declarations made in a separate file? In Java, interfaces are in one file, and the methods are defined in separate files.

Either is possible in C.

In a single source .c file, function declarations (prototypes) are usually put up at the top, before any of the function definitions start.

In a larger project, where C code in multiple source files might need to reference the same set of functions, we usually collect all of the prototypes together into a header .h file, which can then be #included at the top of all of the project's source files. In this way, the declarations are written in only one file, but are visible to the code in all of the source files.

If you have a function definition in your program, will you ever need a declaration for the same function?

Declarations are optional. If a function definition occurs before any use (call) of the function is seen, then no declaration is needed. In practice this is hard to accomplish reliably. Functions can call one another, and sorting your function definitions in the file such that definitions always occur before uses can be literally impossible in the case of circular call chains, and highly impractical in many other situations.

Declarations are the fix for this. Put a set of clear declarations at the top of the code, or #include them from a shared header file, and you never have to worry about accidentally mentioning a function's name before its definition or declaration has been seen.

Can you explain prototypes in further detail?

I've said more about both in the previous couple of questions, but there are also very good explanations and examples in the textbook on pages 26, 29-30, 45, 72, 120, and 202.

When you compiled, you added -lm. Do you have to do this for every added library?

No, just the libraries that aren't considered part of the system's C library. This actually depends on the platform. I've worked on some platforms where the network socket layer functions were considered standard, and others where I had to put "-lnsl" on the gcc commandline to use those functions. On some platforms, the math library is considered part of the C library, but on our current Linux machines, "-lm" is necessary.

Appendix B in the textbook describes the 15 header files that are considered part of the ANSI standard C library, as well as giving an overview of what's in them. But I note that only math.h seems to require a special "-l" option to use on Linux. I'd hypothesize that this is because math operations can be greatly accelerated by certain types of co-processors, and Linux has adapted to a wide variety of processor platforms where these libraries may need to be more loosely coupled than the others. But, again, I seem to recall working on a pre-Linux platform where even the string.h functions required "-lstr" to be found.

The upshot is that the math library is really the only one you'll likely need to worry about this term, but when you use C out in the real world, you'll quickly see other libraries you may want to link to for graphics, networking, database access, etc.

When using macros to define values, does the compiler auto-assign this type to allocate space? What happens?

Macros themselves don't take space as far as the compiler is concerned. The preprocessor does all of the macro substitutions before the compiler runs, and then the compiler evaluates the resulting code with the macros removed.

On calculators, when is there an overflow?

Very platform-dependent. Calculator manufacturers determine this through both their hardware and software choices.

What happens if there are more C function arguments than available ARM registers?

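Under the standard 32-bit ARM calling convention, the first four word-sized arguments travel in registers r0 through r3; the compiler passes any further arguments on the stack.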

About the homework: how should we store the numbers parsed from the user input if we should not use arrays?

I store them as a single integer that I build up along the way. If the input is "543", I first see the '5' character, and store integer 5. Next I see the '4' character, and update my total to 54. Then I see the '3' character, and update my total to 543. This is called "Horner's Method". You can see it at work in the getint() function on p.97 in the textbook.

Historical note: Horner invented Horner's Rule as an algorithm for evaluating high-degree polynomials with complexity O(n), vs. the O(n^2) complexity implied by the standard notation for polynomials. Although OS's rarely use floating-point computations, we still owe intellectual debts.

Can you turn your homework in multiple times? If so, and something doesn't pass testing before the due date, can you fix the problem and resubmit?

Yes, and yes.

How do we know if a user is using octal base when entering an equation for homework one?

Octal numbers start with a leading '0', and continue with digits between '0' and '7'.

Would you suggest learning vi? Or using another text editor?

I suggest learning vi. It is universally available on UNIX-like systems, and is considered to be a favorite tool among power users. I use it for most file editing that doesn't require fancy typesetting.

Why don't they create special operating systems that handle larger byte sizes?

Byte size isn't really a function of the O/S. Processor hardware sets the stage for byte and word size, and programming languages and compilers determine mostly how numbers are handled by our programs.

What do the exams look like for this class? Will there be topics given as a guideline to study?

George says: You have 10+ years worth of exams and solutions for study. See the page for this semester’s Exam 1 (Feb. 21).

1) What universities have better architectures/systems?

There are several top-notch universities with large doctoral programs in computer science that include multiple, graduate-level courses in Operating Systems and advanced aspects of Operating Systems. These institutions need to support not just the basic O/S assignments like we do, but also more advanced projects, like virtual memory management, privilege level and system call mechanisms, internetworking systems, etc. By definition, some of these institutions are better equipped than Marquette. However, those laboratories also cost hundreds of thousands of dollars more to build. Our facilities cannot compete with MIT.

That said, even schools like Purdue, Indiana, Ole Miss, and SUNY Buffalo rely to some extent on hardware and software we built here at Marquette to teach their O/S courses. So there.

2) When will partners be used for assignments?

Beginning in Project 3.

3) Can you speak more in depth about the online course option on the syllabus?

No. I think the syllabus is pretty clear on this. E-mail Dr. Corliss if you have more specific questions. He is keeper of the syllabus.

Corliss: Also see more details at www.mscs.mu.edu/~georgec/OpSys/moocsOnOperSys.html

Corliss: The class web site is a rich source of course-related information, with many cross-links.

4) What are unions?

We'll talk about that in class later. If you want to read ahead, see K&R Section 6.8, "Unions".

5) How are unions related to Java's linked list?

Not really related in any obvious way I can think of.

6) What is address arithmetic?

We'll talk about that in class later. If you want to read ahead, see K&R Section 5.4, "Address Arithmetic".

Dr. Brylow

7) If I have a ton of Java experience, should I be alright in this course?

Yes. Most students in the course fall into this category.

8) I have a lot of experience with C. How long is the C tutorial phase?

I think we'll blow through seven chapters of K&R in the next three lectures, and then start in on the O/S book next Friday. We'll review more advanced C concepts as they resurface later in the course.

9) How much writing will be required?

There may be a couple of assignments where we expect you to write coherent English, but not more than a couple of pages. Bonus and extra credit points may entail additional writing.

10) Do we need the book?

I need the K&R book, and I refer to it frequently as a programmer. It is particularly useful when I want to look up concise examples of using complex language features, like variadic arguments or function pointers.

Do you need the K&R book? That's a more prosaic question. I once saw a student Google a common C question in lab, even while the K&R book was sitting right next to her on the desk, *and* as I was telling her what page the answer could be found on. She then proceeded to wade through three pages of confused and incorrect answers before settling on an example that would have worked in C++, but wasn't legal in ANSI C. So I suppose the answer depends on how high your tolerance is for paddling about on an ocean of ignorant and misleading quasi-answers when you could instead be consulting the most concise, authoritative and elegant language reference book in the entire computer science canon.

As for the Dinosaur Book, the structure of the course follows the contents pretty closely, and it is essential that you have access to someone's copy when you are studying for the exams, or trying to implement one of the exercises that we base a programming project upon.

Corliss wise-ass answer: Of course not. We'll be happy to see you again next Spring, and accept your tuition again.

Corliss: Yes, you absolutely need both textbooks. We assume you have read the text with some care. As a check, at .../examMid17A.html, you find links to 10 years of exams in this course. With a little exploration, you will find questions on textbook material beyond what is covered in class.

11) I forgot my MSCS password.

I do not know what it is. But you can follow the instructions at
      www.marquette.edu/mscs/resources-account.shtml#forget_password
or drop by to see the MSCS Systems Administrator, Steven Goodman, in CU 374.

12) Why is C a "small" language?

It comes out of an era when computer scientists strove to design elegant, efficient software because computers were so much less powerful. By the standards of that era, Java is a sinfully bloated behemoth of excessively wasteful carelessness.

But really what I mean is that C is simpler syntactically -- it has a smaller number of more powerful reserved words and constructs. You can compare the relative sizes of the language grammars here:

13) At what point will PuTTY be on the instructor computer images so you don't have to download it every class?

Presumably at the point that the set of faculty routinely doing in-class demonstrations on remote Linux servers becomes larger than one.

14) What type of file organization does C have?

We'll talk about that in class tomorrow.

15) Does C++ have garbage collection?

No, C++ also does not feature garbage collection. It is difficult for dialects of C to feature garbage collection because the expressive power of the language makes it easy to compose programs that resist correct reference counting and other garbage collection analyses.

16) What is a weak type?

Good explanation at en.wikipedia.org/wiki/Strong_and_weak_typing.

The bottom line is that while modern C compilers do offer reasonable static type checking, there is no type safety or memory safety, and the language allows the programmer to cast anything to a "void *" type -- basically a trapdoor around any future type safety features that might be added.

Questions from Friday, Jan. 20, 2017:

Dr. Brylow - Strings are null terminated

Answered orally in class on Monday, Jan. 23.

Questions from Wednesday, Jan. 20, 2016:

1) Will we go over the UNIX commands in lab?

Yes. In addition, there are links to tutorials and other resources in the Notes section of the homework specification.

2) Will we learn about #define?

Yes. Possibly Friday.

3) What does the "-o" do in "gcc -o hello hello.c"?

The "-o" option tells the compiler what file name to use for storing the executable program when compilation is complete. In this case, I asked it to build a program called, "hello".

The "-o" option is... well, optional. If I don't use it, as in "gcc hello.c", the compiler will choose a default executable name. On our Linux systems, that default is "a.out". Since that is not a very helpful name, I almost always use the "-o" option to specify a resulting executable file.

Warning: I have sometimes mistakenly typed something like, "gcc -o hello.c hello.c". This will happily compile your program, and then overwrite your source code file with the executable. That is almost never what you meant. If you cannot be precise with the "-o" option, don't use it; the gcc compiler does not politely ask if you would care to obliterate the target file name.

4) Can we build a file to contain useful helpers like string methods and include it for all projects?

Yes, and in a sense, that is what the standard C string library functions are for. (See "#include <string.h>".) However, they are just helper functions for common operations, like combining, copying and comparing strings. It is not the same as having built-in language support for syntax shortcuts like automatic concatenation with the "+" operator in Java, or high-level representational features like string immutability, boundary checking, and automatic allocation.

5) Is there a language that combines the low-level power of C with the features of Java?

I think Bjarne Stroustrup would claim that C++ is that language. But the last time I saw Stroustrup in person, it was in a room full of angry computer scientists who had come to his talk mostly hoping to hear him apologize for the unholy abomination that is C++. They were disappointed; he seemed to be having a great time.

There are many programming languages that have tried to fill that gigantic abstraction gap between C and Java; it is apparently a very hard problem.

6) How will not having prior knowledge of C hurt me? How much will prior Java knowledge contribute to this class?

At least three different groups of majors are coming to this course from different prior experiences. None of them, in general, have what I would consider to be extensive prior knowledge of C. Prior knowledge of Java will help because so many of the basic reserved words and constructs are similar between the two languages.

7) How many reserved words are there in Java and C? How many are shared?

C has 32 reserved words. Java has 50. Nearly all of the reserved words in C are also reserved words in Java; the exceptions I can think of are:

signed register sizeof typedef union extern

But there are also a few, such as "static", that mean different things in the two languages.

8) Which one is replacing the parts on the car, and which one is building new parts?

In the sense that so many other programming languages have first been implemented with C, and that modern languages like Java often have enormous collections of pre-built libraries, I claim that C allows for building new parts from first principles, whereas programming in higher-level languages can often be reduced to looking up which existing part, library or object needs to be plugged in to fix the problem.

9) What are embedded systems?

Embedded Systems are computer devices that are built into some other, larger system. Unlike general purpose computer systems, like laptops, PCs and server machines, embedded systems tend to be dedicated to a special purpose, and have more limited resources and/or closer interactions with the physical world. Examples include the computerized brains inside of medical devices, vehicle systems, home appliances, and a dizzying variety of consumer products.

10) Do you like C?

I respect C, and I like what I and my students can accomplish with C.

11) Were all the C puns deliberate?

You mean like:
"In the place of a Dark Lord you would have a Queen!
Not dark, but beautiful and terrible as the Morn!
Treacherous as the C! Stronger than the foundations of the Earth!
All shall love me and despair!"

No. Totally unintentional.

-- Dr. D

Questions from Friday, Jan. 22, 2016:

1) What is a "macro"?

Wikipedia states that, "a 'macroinstruction' is a rule or pattern that specifies how a certain input sequence should be mapped to a replacement output sequence." Macros look a lot like functions at first glance, with simple #defines being functions that take no arguments and always return the same thing. But macros are subtly different from general purpose functions, because they operate primarily on textual substitution. They are thus limited in their complexity, and cannot define recursive functions, for example.

2) Is there a certain C compiler we should be using for the class?

Yes. You should be using the GCC toolchains installed on the Systems Lab Linux machines. This will not matter much for these earlier assignments, but once we move to the ARM cross-compilers, it will matter more.

3) What is the point of putting an extra "precision" in "gcc -o precision precision.c"?

The "-o precision" part tells GCC that I want it to build an executable program called, "precision". If I don't put this, it will choose a default name for my program, usually "a.out".

4) Can you give an example of #ifdef DEBUG?

c.learncodethehardway.org/book/ex20.html

5) Is there a good list of printf() format specifiers?

Try "man 3 printf" on any of the Linux boxes, or see Table B-1 in Appendix B of your K&R textbook.

6) Do we need to save the command-line arg as an integer? And how? (HW1)

Yes. See K&R textbook Section 5.10 for accessing commandline args. See your textbook appendix for library function atoi(), or try "man atoi" on the Linux boxes.

7) If you get stuck in a loop with #includes, how do you get out?

I tried it just now, and it recursively tried to expand my looping #include several thousand times before it gave up with an error, "error: #include nested too deeply". I guess it has some protection built in now.

8) At what point should we have read all of the C textbook?

End of the week (#2).

9) In man page for atoi(), it took something like atoi(char str *restricted). What does the "restricted" part mean?

My man page for atoi() has:

    int atoi(const char *nptr);

The "const" is a promise to the programmer that atoi() won't destroy the string you are passing in as a parameter. ("Why?", you may ask. "Is that a thing?" Yes. There are other functions in the C library that alter your original arguments as part of their job.)

10) Why are there dinosaurs on the cover of our O/S textbook?

galvin.info/2007/03/13/history-of-the-operating-system-concepts-textbooks

4) How do I get rid of my "rm -rf" habit?

Only time and patience can help you free yourself of an "rm -rf" habit. For those of you who don't know, "rm" is the UNIX command for removing files. The "-r" makes it recursive, so that it also removes all of the subdirectories and files underneath a directory. And the "-f" is "force", which tells rm to delete the files without asking if you are sure.

The "rm -rf" addict is unable to prevent themselves from periodically deleting all of their files, usually minutes before an assignment deadline. Enthralled by the power of such merciless destruction of bits, some escalate their problem to "rm -rf ~/", which deletes their entire home directory, or the extreme "rm -rf /", which when executed as a superuser deletes the entire contents of a filesystem and renders a UNIX machine unbootable. This affliction is related to syndromes causing uncontrollable toothbrushing in the middle of meals, and an unhealthy attraction to narwhals.

Compulsory version control software can often help, as well as the assignment of project partners with sharp eyes and lightning-fast reflexes. The most extreme cases must be chained to specially-adapted versions of Mac OS 9 that pop up triple-tiered "Are you SURE?" dialog boxes for each individual deletion, and administer electric shocks through the mouse whenever you click "OK".

5) When does the narwhal bacon?

At midnight. TA-Bot bacons at 3 a.m.

Questions from Monday, Jan. 12, 2015:

1. Is there another lightweight language that is used in operating systems?

Objective C (which is not as similar to C as it might sound) has been used extensively in Apple operating systems, and C++ (which is awfully similar to C for most programming tasks) continues to be the most widely used language for O/S development.

There are many special-purpose, niche languages and many obscure, niche operating systems out there, but I am not aware of any lightweight languages that have been used, for example, to build more than one O/S.

2. When should we have completed Chapters 1-3?

Complete chapter 1 for Wednesday, 2 for Friday, and 3 for next Wednesday.

3. Is Assignment #1 a team assignment?

No. Line three of the assignment specification states, "This assignment is to be completed individually."

4. When will teams be created/assigned?

When project 2 is assigned.

5. Can a structure be in a structure?

Yes.

Questions from Monday, Jan. 13, 2014:

1) Java is used in a lot of front-end development to replace CICS screens; does C fit into this at all?

If you are referring to the IBM transaction system CICS, then I doubt C is being used to modernize those front-end user interfaces. GUI development is possible in C, and quite common in descendant languages C++, C# and Objective C. But these newer C derivatives and Java would offer a lot of advantages in GUI programming over traditional C.

2) What does the "-o" command switch for gcc mean?

In the class demo, where I used

        gcc -o hello hello.c

the "-o hello" part tells gcc that the compiled, executable program I want to be produced should be called "hello". Unlike in Java, where the name of the class defined in a .java file indicates what the ultimate .class file should be called, C carries no such information in its source files.

If I omit the "-o" option, gcc will pick a name for me, and the file will be called "a.out".

Take some care with the "-o" option. An easily made beginner's mistake is to say

        gcc -o hello.c hello.c

This would tell gcc to overwrite my source code file, hello.c, with the compiled program.

3) How do the additional compilation steps in C affect its overall completion time compared to a Java program?

The Java compiler is actually performing very similar steps to the C compiler, but in Java they are more automated, and the individual steps are not available as separate tools. Because C is used to build programs in a lot more complex ways than Java, it is necessary to have the compiler, assembler, linker, etc., available as individual stages. For example, when we get to building our embedded O/S in a few weeks, there will come a point when we need to incorporate assembly language files into our system image. The C compiler allows us to do this easily because we can instruct the build system to individually compile the .c files only to the object code step, and then include additional assembly functions at the linking stage. Do that with your fancy Java compiler!

But to get back to your question, actual compile time varies quite a bit with both languages, and depends on much more than just the number of steps involved.

4) What are the programming languages that are C's main competition?

That depends on the sphere of programming we're talking about. For general systems programming, nearly all of C's competitors are its own descendant languages. In embedded systems, languages like Ada, Esterel, and FORTH all have dedicated communities. In networking, Java has a fair amount of traction.

5) Can we use Microsoft Visual Studio?

Short answer: No.

Longer, more nuanced answer: You're welcome to try, but your submissions must compile and run correctly on our Linux machines for you to earn points. "But, it worked on *my* computer," will not be taken as a valid excuse.

Because C is such a low-level language, many basic features are platform dependent, by which I mean, they depend on the processor architecture and operating system. Even basic features like I/O may differ between Unix and Windows -- the function names that you should call, and how those functions behave. This is a big part of why it is not trivial to port any old program that is written for Windows over to Unix, or vice-versa.

Free software exists (see Cygwin) for allowing your Windows platform to look more like a Unix platform from a programmer's perspective. However, correctly installing and using that software is well beyond the scope of this course. Ultimately, you'll still need to check that your work compiles and runs on our platform.

Moreover, once we move on to the embedded O/S assignments, you will need access to a MIPS-architecture cross-compiler, because our O/S will run on MIPS hardware, not Pentium processors. I have not seen anyone use Visual Studio to do this sort of cross-platform development correctly.

6) Are C compilers made in assembly?

Nowadays, C compilers are written in C, and compiled with earlier versions of the compiler. I think the question you are getting at is, "How do you bootstrap a programming language compiler"? Some compilers are deliberately written in a different language than the one they implement, but it is considered a rite of passage for most compiler writers to get their compiler working in their own language. Frequently this is done iteratively, starting with a small kernel of the language, (perhaps compiled in another language, or even written in assembly,) and then working your way up in small increments toward the full language compiler.

When we build the MIPS-architecture cross-compilers used in the lab, we start with downloading the complete source code for a new version of gcc, written mostly in C with a little bit of assembly. We use the built-in version of Intel-flavored gcc already present on the Linux machine to build a "bootstrap" version of the MIPS-flavored gcc, a very simple version of the compiler that is then able to compile all the rest of the C code into the full-blown version.

7) For those of you that have read this far, I include the following optional reading for your entertainment:

research.microsoft.com/en-us/people/mickens/thenightwatch.pdf

It was sent to me yesterday by a recent graduate who now works as a systems developer for one of the world's best-known producers of commercial software. When I told him that I was considering passing the link on to this O/S class, he said:

"Do it. I have a very very real appreciation for that class now that I am actually dealing with the internals of a real world OS. All the concepts are right here...."

-Dr. D

Questions from Wednesday, Jan. 15, 2014:

1) Is gcc built into Terminal? and
2) What is the vi command in Putty?

The Unix Terminal, like the OS X Terminal, and the Windows cmd.exe command-line window, is merely an interface -- a text-based mechanism that allows a user to invoke programs stored elsewhere on the filesystem of the machine. Putty and ssh are mechanisms that allow a user to login to a terminal on a remote machine across the network.

Thus, when logged in via Putty, the vi command is still the vi command. Gcc is not built into Terminal. Gcc and Terminal are two separate applications that happen to both be installed on all of the machines in the lab. It is possible to install Linux without gcc, or without Terminal -- but such a machine would be useless to us in a course where you build systems-level software.

3) What is the difference between "\n" and "\r"?

One has ASCII value 10, the other has ASCII value 13. In old-fashioned teletype terminals, you needed both to start a new line. The "\r" carriage return would return the carriage (the typewriter head) to the farthest left edge of the page. The "\n" newline advanced to the next line. If you had only the "\r", your teletype would return to the beginning of the line, and begin overwriting characters you had previously typed. If you had only the "\n", the teletype would move down one line, but continue typing in the same column you were in.

Modern platforms do not require both "\r\n" to start a fresh line on the far left, but every time you hit the enter key in an editor on Windows, it still puts both ASCII characters into the file. Unix and OS X, on the other hand, usually only put a "\n" in the file.

When we begin developing our embedded O/S, we'll connect to the embedded processor using an old-fashioned serial link, and it will prove to be important to use "\r\n" at the end of each line. But for now, (and for most Unix programs in general,) only "\n" is required.

4) Is there a version of if-else statements in C?

Definitely. See the section on conditionals in your textbook for examples.

5) How do you implement Unicode symbols in C?

Since Unicode really came into vogue well after the C language was established, C does not have a built-in mechanism for dealing with character codes that are bigger than one byte. So in C, we use multiple chars to represent a UTF-8 encoded Unicode character. I'll talk more about this in class and lab tomorrow.

6) What did limits.h do?

By #including limits.h, a C program gains access to a number of #defined constants, such as CHAR_MIN (-128) and CHAR_MAX (127) that can be helpful when programming defensively against unintentional overflow and underflow.

7) Can you post code examples after class?

Yes. Thanks for the reminder. Lecture demos can be found in the directory ~brylow/cosc3250/demo/ on the Lab machines.

8) Is there a particular reason why '%' is used as the delimiter for printf(), or is it arbitrary?

I'd guess it is arbitrary. It doesn't signify anything special, and it is not like there is never any call to print an actual percent sign in strings. (To print a percent sign, you use the format specifier "%%", by the way.)

9) What would you recommend to edit my C programs?

I usually edit my C programs with vi, emacs, or Eclipse. Vi is great for fast editing of single files. Emacs excels at automatic formatting of bigger files, and more advanced options like searching, replacing, spell-checking, etc. Eclipse is a proper integrated development environment for large projects with multiple related files.

10) I've gotten lots more questions of the form, "Can't I use <insert some other tool here> instead of gcc/vi/Unix?"

To reiterate, I don't care what editor or compiler or O/S you use, as long as you can do the work correctly, and it isn't actively making you less effective. I think there is important pedagogical value to everybody learning the Unix and gcc development model, both because it is a prevalent and successful system you may be expected to be familiar with when you leave here, and because you need to be able to adapt quickly to unfamiliar tools and environments in the real world, too.

However, if you are determined to avoid becoming proficient in Unix-land, you're probably going to find it much harder to write an embedded operating system when we get to Project 3. Also, our measure of "correct", even on these early intro C assignments, is that your submitted code compile and run correctly on Morbius. Many of you who have grown up in happy, platform-independent Java world seem to have trouble believing me that it is going to be difficult for you to develop correct C code on some other O/S platform, and then get it to run properly on a highly dissimilar Linux machine. The same will go for building our embedded O/S, which requires very specialized tools to compile your code for a completely different processor architecture.

If you are already highly proficient at low-level C and assembly programming, have built your own cross-compilers, can develop a load-linking map for a foreign memory layout, have mastered some other automated build/make tool, and routinely debug ELF-encoded executables, by all means, apply that knowledge for this course and have a good time. If you haven't, or if you aren't even sure what all of those terms in the previous sentence mean, then I suggest you get up to speed on the tools I am offering -- because I've already done all that work so that you won't have to.

Corliss: You cannot be proficient in too many tools. A recent survey of salaries paid to Data Scientists showed a strong positive correlation between salary and number of tools in which one is proficient. Also, because of their histories, Unix tools tend to be more industrial strength. Over winter break, I had a need to work with a *.CSV file with 23M records. Excel couldn't take it. Word couldn't take it. I tried three other Windows editors. No can do. Vi handled it just fine. Loaded in 5-10 seconds; global search and replace (using regular expressions, of course) of 23M records in less than a minute. Proficiency with Unix tools can make you the tech guru of your team, often with salary advantages.

-Dr. D

Questions from Friday, Jan. 17, 2014:

1) Can you do: if (int x = getchar())?

This question can be answered by typing this in and trying it. The answer is "No."

2) How should we end our program? Should we ask the user if s/he wants to quit after every iteration?

This question can be answered by observing the reference implementation that I have provided. The answer to the second question is "No." You should definitely not prompt the user to quit after every iteration; if you do so, you will fail every single testcase I'm using for grading. Instead, you should end your program when a call to getchar() returns the end-of-file marker, EOF. Therefore, your program should vigilantly check the results of the getchar() call.

3) Should you declare "int x;" and then use x in a for-loop and set equal to value or initialize with value in declaration before loop?

For the sake of readability and reliability, I would tend towards clearly initializing any loop counters at the start of the for-loop.

4) Are there objects in C?

No. However, there are structs, which can be defined to contain both fields (data) and function pointers (behavior). But there are no classes, no inheritance, no protection modifiers, and no interfaces. The object-oriented features in C++ and Java grew, in part, out of a response to common practices in structuring large programs using C structs. The designer of C++ wanted to add language features to make it easier and safer to do exactly what he was already seeing in large C programs.

Corliss: Dr. Brylow surely answered correctly the question you intended. Let me answer the question you MIGHT consider. To me, "object-oriented" is in the mind of the designer/developer; it is not about dots in identifiers. The most beautiful object-oriented code I have ever seen was written in the late 1960's in an early dialect of Fortran, decades before the term "object-oriented" was coined. The programmers CLEARLY were THINKING about objects, their attributes/properties, and methods/function that act on them. Their programming language, like C, does not make it easy to express their object-oriented design in pretty code. Actually, that's what Dr. Brylow said in his last sentence.

5) Why is the feature included in C that allows any expression to become a statement? Example -- "42;" -- Why is this ever useful?

It probably isn't useful for short expressions. It becomes more useful with complex expressions, for example, when calling functions. Earlier versions of C did not include a "void" return type -- every function by default returned an integer. If you wanted to call a function that just printed something out in a particular way, but returned nothing useful, you wouldn't want to have to declare a fresh variable and assign it with the function call, just so that you could call the function. So, in C, any function can be called on a line by itself. The rest is probably just a simplifying generalization.

6) When would you use a bitwise operator instead of a logical operator?

You will use bitwise operators in Project 3, when you are testing for individual bit flags in a device driver that talks to the serial port hardware. In general, bitwise operators are needed when you want to manipulate individual bits in data, in memory, in images, etc. Bitwise operators are also useful for managing bit flags -- a technique for storing up to 8 individual Boolean values in a single byte. Logical operators are only useful when you are calculating Boolean conditions, like whether x is true AND y is true.

7) What's the difference between "&" and "&&"?

The bitwise-AND ("&") is an operator that calculates an answer by ANDing together each bit-position of two integer operands. The result can have many zeros and ones, depending on what the operands were.

The logical-AND ("&&") is an operator that calculates a Boolean true or false, based upon the total values of its two operands. The operands are also treated like Boolean true and false, even though "true" means all integer values that are not zero.

Additional examples and explanation on page 48 of your C book.

Corliss: Write some test cases and see for yourself.
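One possible pair of test cases, as a minimal sketch (the function names are invented here for illustration):

```c
/* Bitwise AND: combines the operands one bit position at a time. */
int bitwise_and(int a, int b)
{
	return a & b;
}

/* Logical AND: treats each operand as true (nonzero) or false (zero),
 * and always yields 1 or 0. */
int logical_and(int a, int b)
{
	return a && b;
}
```

With a = 6 (binary 0110) and b = 9 (binary 1001), no bit position is set in both operands, so 6 & 9 is 0. Both operands are nonzero, though, so 6 && 9 is 1.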

8) How do we enter the Euro, Yen, etc.?

The easiest way to enter one of the foreign currency symbols is to simply copy/paste it from somewhere else, like the project webpage that has all of them listed.

The second easiest way is to put them into a textfile with an editor. In vi, typing Ctrl-V u 20ac will generate a Euro symbol, and then you can use input redirection to use that file as test input.

Finally, depending on what platform you are connecting from, there are usually O/S-specific rules for generating these special symbols from the keyboard. If you're actually sitting in front of a Linux machine in the lab, you can type Ctrl-U and the hex code (20ac) to get the Euro symbol. On my Mac laptop, Option+shift+2 generates the Euro sign. Check out the "Entry methods" item on the Wikipedia page (en.wikipedia.org/wiki/Euro_sign#Entry_methods) for other examples.

9) Would scanf() work just as well as getchar() to put an input value for the character in the variable?

No. Scanf() is not a very useful function in any context where the input might be something other than what is expected. If you tell scanf() you want an integer, it will hang until somebody types in an integer. It cannot be used to detect errors as they are happening, it cannot put characters back if they belong to the next quantity, and it will not properly end your program when end-of-file is reached.

The sscanf() variant is more useful, once you have buffered up a line of input and can attempt several possible conversions. However, a lot of the topics involved with properly using sscanf() in this way are way beyond the intended difficulty level of this project. I advise steering well clear of the scanf() family of functions for this entire semester.

10) Is there a difference from ++x and x++?

Used in isolation, no. Used as part of any larger expression, yes. One increments the x before the overall expression is evaluated, and the other increments after.

Corliss: To help remember, the authors of C++ named their language "wrong." The expression "C++" takes C, improves it, and returns the OLD thing. C++ perhaps should have been called "++C," to return the IMPROVED thing. <My tongue is in my cheek, here.>
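A small sketch of the difference when the increment is part of a larger expression (here, an assignment). The function names are illustrative only.

```c
/* y receives the OLD value of x; x is incremented afterwards. */
int value_of_post_increment(int x)
{
	int y = x++;
	return y;
}

/* x is incremented first; y receives the NEW value. */
int value_of_pre_increment(int x)
{
	int y = ++x;
	return y;
}
```

Starting from x = 5, the first function returns 5 and the second returns 6. In both cases x itself ends up as 6.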

Questions from 1/16/2013 cards:

1) What's "two's complement"?

Two's Complement is what we call the most common representation we use for signed binary numbers, although technically it refers only to a mathematical operation we use as a calculation in the representation.

To review, binary is the number system we use on computers, because we have good technological solutions for representing electrical zeros and ones and the operations we perform on them. So each column in a binary number represents a particular power of two, and by adding them together we can represent any positive number.

So how should we represent negative numbers? One easy method would be to take one of the bit positions in our binary number -- say the leftmost bit -- and make it represent a plus or minus sign, instead of a power of two. This representation, called "signed-magnitude", was used in some early computers, but it suffers from several drawbacks, including the existence of a "negative zero" number that is a different bit pattern from "positive zero." The same goes for a similar representation called "one's complement."

GC: Current FLOATING POINT hardware representation has +0 != -0, which can lead to all manner of unpleasant surprises.

The two's complement representation is more complicated to implement, but it has the distinct advantage of using the same arithmetic hardware for adding positive and negative numbers as for unsigned binary arithmetic. Its drawback is the asymmetric range of representation, which always causes our range of numbers to have one more negative number than the positives.

For more detail, see en.wikipedia.org/wiki/Two's_complement or the textbook from COSC 2200 or EECE 2030.
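The "mathematical operation" mentioned above is: invert every bit, then add one. A minimal sketch on 8-bit patterns follows; the function name is invented for illustration, and the patterns are held in an unsigned char so that the arithmetic is well defined in C.

```c
/* Negate an 8-bit two's complement bit pattern:
 * flip all the bits, add one, and keep only the low 8 bits. */
unsigned char twos_complement_negate(unsigned char bits)
{
	return (unsigned char)(~bits + 1u);
}
```

The pattern for 1 (0x01) becomes 0xFF, which is how -1 is stored. The pattern 0x80 (-128) maps back to itself, because +128 has no 8-bit representation; that is the asymmetric range described above. Zero maps to zero, so there is no separate "negative zero."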

2) Why did you put a '$' in the command 'echo $PATH' during your demo on the computer?

The UNIX shell maintains a table of variables called an 'environment'. You can display your current environment in UNIX with the command 'env'. Amongst the variables included there is your PATH, the default list of directories that the shell should look in for commands when you enter them. The shell provides commands for setting, viewing and changing environment variables, but the syntax entails using a '$' in front of those environment variables.

You can read more about this in the 'Parameter Expansion' section of the bash shell manpage, using 'man bash'.

3) Is there no GUI on UNIX machines?

There is definitely a GUI on UNIX machines, as you will see when you attend this week's lab session. The GUI system on our modern Linux machines is descended from a system called X-Windows, which was already in use on UNIX machines at universities when the first version of Microsoft Windows was released for PCs.

However, while it is easy to use the UNIX GUI when you are sitting in front of a UNIX machine, it is slower and more difficult to use a UNIX GUI remotely when you are sitting in front of a Windows machine, like the computer that displays on the screen in the classroom. Special software needs to be installed and configured; we have this software installed on all of the Windows machines in the labs over in Cudahy, but it is not widely available on classroom machines on campus.

It is much easier for me to demonstrate things in class using the text-based remote interface for UNIX, and I can accomplish all of the same tasks, often more quickly. As a side bonus, if you are also able to become comfortable doing things using the command-line UNIX interface, you won't always have to trudge through the cold to the Systems Lab to do your homework this semester.

GC: One day while I was in Tanzania, the connection to Marquette had a latency of roughly 10-15 seconds and a rate of about one character every 1-2 seconds. I could type a Unix command, wait a bit, and the characters would appear one at a time, every couple seconds. Don't try working with a GUI at that rate. Actually, X-Windows is a low network traffic GUI because the remote machine is pretty sophisticated, so not much graphical information is sent across the network.

4) Are there other parameters for GCC that might affect our programs?

There are many, many more parameters for GCC that can have important effects. For example, the -O parameters change the level of compiler optimizations, and can alter the speed, size, and correctness(!) of your program. We don't have time to go through even a fraction of what is available. You can get an idea of what other GCC options can do by looking through the GCC man page, with 'man gcc'.

5) Why didn't they follow the K&R book when deciding int sizes on 64-bit machines?

My interpretation is that many, many programmers in the 1990s wrote C programs that depended upon type int being 4 bytes, even though K&R had warned them not to, because they had watched this same transition play out on a smaller scale when the community moved en masse from 16-bit to 32-bit processors. As 64-bit machines started to appear, some major C compiler writers reasoned that going with K&R's decision would break a solid majority of existing programs when ported to 64-bit machines, and would put us in the slightly uncomfortable position of having a 'long int' be actually shorter than a plain 'int'. So they kept 'int' at 4 bytes and let 'long' grow to 8 bytes, in direct contravention of K&R's design.

By the time the controversy grew to a community-wide argument, compiler writing practice had solidified around this position, and it was too late to go back and change everybody's programs and compilers.
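If you want to see what your own compiler decided, a short sketch like this one reports the sizes. The function name is made up, and the numbers printed depend on the platform and compiler, so none are promised here.

```c
#include <stdio.h>

/* Print the storage size, in bytes, of the basic integer types.
 * sizeof yields a size_t; it is cast to unsigned long so that
 * the classical "%lu" format can be used. */
void print_integer_sizes(void)
{
	printf("char  : %lu\n", (unsigned long)sizeof(char));
	printf("short : %lu\n", (unsigned long)sizeof(short));
	printf("int   : %lu\n", (unsigned long)sizeof(int));
	printf("long  : %lu\n", (unsigned long)sizeof(long));
}
```

The only things C guarantees are that sizeof(char) is 1 and that each type in this list is at least as large as the one before it.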

6) Can you explain "overflow" again?

Variables in C are allocated a fixed number of bytes for storage, based upon their declared type at compile time. A variable that is of type 'char' gets one byte of space, and that byte can be used to store 256 values. A signed char (which is the default on our x86 Linux machines) will store values from -128 to 127. An 'unsigned char' will store values from 0 to 255. In either case, if I perform a mathematical operation that would result in an answer that cannot be represented in those 8 bits, C does not produce an error message or detect the condition. Instead, it truncates the answer to 8 bits, and moves on. We call this "overflow" or "underflow", depending on which way it goes. Consider the sample program below:
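#include <stdio.h>

int main(void)
{
	char x = 127;	/* largest value a signed 8-bit char can hold */

	x = x + 1;	/* 128 does not fit in 8 signed bits */

	if (x == -128)
	{ printf("X overflowed! x = %d\n", x); }
	else
	{ printf("A miracle occurred, and x = %d\n", x); }

	return 0;
}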
It produces the answer: "X overflowed! x = -128", even though I just added 1 to 127.

Historically, there have been a lot of important software errors caused by C programmers planning poorly for variable size and failing to anticipate overflow in their programs.

7) What does the "MIPS embedded system" exactly mean?

MIPS is a processor architecture family, just as x86, PowerPC, and ARM are processor architecture families. Most of the students in this class have worked primarily with desktop and laptop PCs with Intel or AMD x86 family processors in past courses. After Apple computers switched from PowerPC chips for the G3 and G4 machines to x86 CPUs, x86 became the overwhelmingly dominant architecture for general purpose PCs.

While our first assignment is on x86-based Linux machines, most of our assignments will be constructing an embedded operating system for the MIPS processor that is embedded in commodity Linksys WRT54GL home networking appliances. I'll talk more about this in lab next week.

It is important to realize that while x86 dominates desktop, laptop, and server markets, the vast majority of embedded systems, including smart phones, medical devices and vehicle processors, use other architectures like MIPS, ARM, and others.

8) Why did you use "man 3 pow" instead of "man pow"?

The UNIX man pages contain many chapters, and some short names refer to several different topics. For example, when I was trying to look up the 'exp' standard library function, I typed "man exp". I got the first thing named 'exp' in the man pages, which was not the C library function I was looking for, but instead was a UNIX utility by the same name for calculating a different kind of expression.

When there are multiple sections that contain entries for the same name, you can specify which one you want by section number. I happened to know off the top of my head that C library functions were in section 3 of the man pages on current Linux machines. When I looked up the 'pow' function, I just specified the section number out of habit.

You can look up all the sections that contain a string in their name or their description using "man -k <item>", like "man -k exp".

9) Has anyone goofed up their kernel by #including a .h file that recursively #included itself?

I have.

GC: That is why it is common to guard #include statements with #ifndef statements.
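A sketch of the guard pattern, assuming a hypothetical header whose guard macro is DEMO_QUEUE_H. The header's contents appear twice below to simulate what the compiler sees when the same file is #included a second time.

```c
/* First inclusion: DEMO_QUEUE_H is not yet defined, so the body is compiled. */
#ifndef DEMO_QUEUE_H
#define DEMO_QUEUE_H

struct demo_queue { int head; int tail; };

#endif /* DEMO_QUEUE_H */

/* Second inclusion: DEMO_QUEUE_H is already defined, so the preprocessor
 * skips everything up to the matching #endif. */
#ifndef DEMO_QUEUE_H
#define DEMO_QUEUE_H

struct demo_queue { int head; int tail; };

#endif /* DEMO_QUEUE_H */
```

Without the guard, the second copy would be a redefinition of struct demo_queue and the compile would fail. A header that #includes itself would keep expanding until the preprocessor gives up.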

10) Would WinSCP work as well as Putty?

The last time I used WinSCP, it supported securely copying files to and from UNIX servers, but I did not see support for remotely logging in to a command-line.

So, you can use WinSCP to copy your homework files over, but you still need a program like Putty to remotely compile and run your files (and I *insist* that you check that your work compiles and runs on the UNIX machines -- I don't care if it runs on your computer, but not on mine,) or to turn them in.

Questions from Friday's cards.

You guys are off the scale with card questions this week. Keep it coming.

Friday's demonstration code is available on Morbius at ~brylow/os/demo/caesar.c. Note that you have permissions to read the file, but not to write to it.

1) What exactly does End Of File mean? Does it stop after 257 characters, or will it go on forever?

The C library getchar() function can return any one of 257 possible values -- these are the 256 characters in the extended ASCII character set, plus the EOF marker. That is the number of possibilities, not a count of how many characters it will read. In a while loop such as the one in the class demonstration today, it will indeed go on indefinitely until the program is killed or the EOF value is returned.

The EOF value is returned by getchar() when it reaches the end of the file it is reading from. If the input is coming from the user typing at a keyboard, that is a special case, and there is no true end of the file -- the user can walk away from the keyboard for an arbitrary period of time and then come back and start typing again. If the user wants to indicate "end of file" to a program that is reading from the keyboard, the key combination "Ctrl-D" signals to the operating system to send the EOF value to the program. (Ctrl-Z if your OS is Windows.)
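A minimal sketch of such a read-until-EOF loop. It uses getc() on a FILE pointer so that it can be tried on any stream; getchar() is equivalent to getc(stdin). The function name is invented for illustration.

```c
#include <stdio.h>

/* Count characters until end-of-file is reached.
 * Note that c is an int, not a char: it must be able to hold all
 * 256 character values plus the distinct EOF value. */
long count_chars(FILE *in)
{
	int c;
	long count = 0;

	while ((c = getc(in)) != EOF)
	{
		count++;
	}
	return count;
}
```

Calling count_chars(stdin) from main() and typing at the keyboard, the loop keeps going until Ctrl-D is pressed at the start of a line.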

2) If the code line "a[x] = x++;" is considered inappropriate, what is the simpler alternative?

Either
	a[x] = x;
	x++;
or
	x++;
	a[x] = x;
depending on which one you meant.

3) Could you please explain again about left shift and right shift?

These bitwise operators are carefully explained in section 2.9 on pages 48 and 49 in the K&R book. The shift operators shift bits left or right in the register, effectively multiplying (left shift) or dividing (right shift) by powers of two. If you've worked with a shift register in your hardware course, that is how this C operator is often implemented in the processor.
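A short sketch of the multiply/divide effect. The function names are illustrative, and unsigned operands are used because right-shifting a negative signed value is implementation-defined in C.

```c
/* Shift left by n bits: multiplies by 2 to the power n
 * (as long as no one-bits fall off the top). */
unsigned int shift_left(unsigned int value, unsigned int n)
{
	return value << n;
}

/* Shift right by n bits: divides by 2 to the power n,
 * discarding any remainder. */
unsigned int shift_right(unsigned int value, unsigned int n)
{
	return value >> n;
}
```

For example, 5 << 3 is 40 (5 times 8), 40 >> 3 is 5 again, and 7 >> 1 is 3 because the low bit is simply dropped.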

3) So x++ will cause programs to execute differently on different platforms? Will certain programs numerically have different outputs on different platforms?

Yes. C contains ambiguities that will allow legal programs to produce different answers on different platforms. This is not a feature.

The ++ increment operator is not a problem by itself. When combined into more complex expressions, such as "printf("%d %d\n", x++, x+5);" it is platform- and compiler-dependent whether the increment takes place before, after, or during other expression evaluations in the line. GC: Opportunities for different answers go up with the number of cores.

C specifies that x++ will increment x after the x is evaluated the first time, and that ++x will increment before. But C does *not* specify whether the increment takes place before or after the calculation of the x+5 expression. Java avoids this problem by specifying the order in which function call arguments and increment side-effects will be evaluated.

Thus, one should avoid combining the increment and decrement operators with more complex expressions in C.

4) What is the difference between

  printf("%s", &c);
and
  printf("%s", c);
What is the purpose of having '&' operator?

In this context, the '&' functions not as the bitwise AND operator, but as the 'address-of' operator. The first line above passes the *address* of character c as the second argument of printf(), whereas the second line merely passes the value of character c.

Given the print format specifier "%s", which prints a string of characters starting at a particular address, the '&c' version is the more correct of the two. Passing in a character value (with a range of values between -128 and 127) where a 32- or 64-bit address is expected will almost certainly lead to a program crash or garbage being printed out.

But the "%s" format specifier also assumes the pointer argument is the address of a null-terminated string. Character c is but a single character, and unless it happens to be the null character, '\0', or be adjacent to a null byte in memory, the top printf() call will also probably print something crazy.

We will talk about character strings and pointers on Wednesday and Friday next week.
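In the meantime, one safe way to print a single character with "%s" is to copy it into a small buffer that is explicitly null-terminated. This is only a sketch, and the helper name is made up:

```c
/* Build a proper one-character C string: the character itself,
 * followed by the terminating null byte that "%s" looks for. */
void char_to_string(char c, char out[2])
{
	out[0] = c;
	out[1] = '\0';
}
```

After char_to_string(c, buf), the call printf("%s", buf) is well defined. For a lone character, though, printf("%c", c) is simpler and needs neither the '&' nor the buffer.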

5) What is a situation where the bitwise operator NOT (~) would be useful?

When working with binary data, the bitwise complement of a number is often useful. If I had a binary mask constant that indicated which bits of a particular hardware register I wanted to set, I could clear those bits by ANDing the register with the bitwise complement of my mask:
	#define MASK 0x000F
	reg = reg & (~MASK);
or, more succinctly,
	reg &= (~MASK);

6) Does C have a do-while loop?

Yes. K&R p.63.
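A brief sketch of where do-while earns its keep, namely when the loop body must run at least once. The function below is an illustration written for this page, not code from the book.

```c
/* Count the decimal digits in n. Even 0 has one digit, so the body
 * has to execute before the condition is tested for the first time. */
int count_digits(unsigned int n)
{
	int digits = 0;

	do
	{
		digits++;
		n /= 10;
	} while (n > 0);

	return digits;
}
```

An ordinary while loop with the same condition would report zero digits for n = 0.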

7) Why does initialization of the variable in the for loop happen outside of the loop?

In Java, you can say "for (int x = 0; x < 10; x++) ...", which both declares and initializes a new loop variable, x, which only exists for the lifetime of the loop. ANSI standard C did not have this feature -- it was added to Java in part because people were sick of not having it in C. In C, you must declare the variable outside the for-loop. There simply is not support for declaring the variable inside the loop construct. But you still initialize the variable inside of the loop:
	int x;
	for (x = 0; x < 10; x++) ...
Newer revisions of the C language, starting with the 1999 "C99" standard, and C++ permit declaring the loop variable inside the for statement, but we'll be using classical C for this course.

8) If we write our programs on a PC, how do we transfer them so that we can turn them in to the class dropbox on Morbius?

First, a note of caution. Be careful about doing your homework on some other platform, like a Windows PC or a Mac. It is fine to use your PC to edit the file on your home machine, but you need to do your compiling and testing on a Systems Lab Linux machine, whether remotely or in person. Why? The C compilers and system libraries on other platforms are similar to ours, but often not identical. (Consider the Ctrl-D vs. Ctrl-Z issue above.) It is your responsibility to ensure that your submitted homework runs correctly on the lab systems. I will not regrade your homework if it turns out that it "works" on your system, but you didn't try it on the lab machines, and some platform dependency prevents it from working when I grade it.

That said, copying files from your PC to the lab machines can be done with WinSCP or similar programs on Windows, or the "scp" secure copy command from a Mac or home Linux box. Or, of course, you can carry your work to the lab on a flash drive, or mail it to yourself and save it from a browser in the lab.

9) How do you want our HW1 to handle different keys?

I don't understand this question. Please e-mail in more detail, or come see me in my office hours.

10) Unsure what the "int main(int argc, char *argv)" stuff is about.

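This is how command-line arguments are passed from the OS to your program. (The second parameter is properly declared as "char *argv[]" or "char **argv", an array of strings, rather than a single "char *".) There is extensive explanation and examples in section 5.10, starting at page 114 in K&R. Use the command-line arguments and the atoi() function to read in the cipher key for your program.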
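A minimal sketch of reading a numeric key from the command line. The helper name parse_key and the choice to fall back to 0 when no argument is given are assumptions made for this example, not requirements of the assignment.

```c
#include <stdlib.h>

/* argv[0] is the program name; argv[1] is the first real argument.
 * Return it converted to an int, or 0 if no argument was supplied. */
int parse_key(int argc, char *argv[])
{
	if (argc < 2)
	{
		return 0;
	}
	return atoi(argv[1]);
}
```

A program declared as "int main(int argc, char *argv[])" would simply pass its own argc and argv along to this helper. Note that atoi() returns 0 for input that is not a number, so it cannot tell "0" apart from garbage.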

11) I understand how mod works, but could you explain again how it is used to wrap around the alphabet?

I actually didn't really explain this in class -- it is one of the primary thinky-bits I want you to work on for the assignment. But I'll give you a hint. For my Caesar Cipher example in class, when I typed "xyz", I got "{|}", instead of my expected answer, "abc". That's because I did not have code to wrap the alphabet. This snippet of code would add three to my lower-case alphabet letters, with the desired wrap-around:
	if ((c >= 'a') && (c <= 'z'))
	{
		c = (((c - 'a') + 3) % 26) + 'a';
	}

Questions from previous years:

If a 'process' is one that is currently executing, and if a PC has at most four cores, then how can there be more than four processes running at once?

"Running" has two meanings, depending on the time scale you consider:

1. If we are thinking of time on the order of a few clock cycles, a process is "running" if it owns the Program Counter. In the state diagram of Figure 3.2, its state is "Running." Only one process (per processor) is in this running state at a time.

2. If we are thinking of time on the order of tenths of seconds or longer, an entire collection of processes may go through the state diagram of Figure 3.2 so fast that it seems as though they are all "running" simultaneously.

What is defined by "external" in "external storage?"

"External" can be a little fuzzy, but generally, we mean off the bus on which the processor chip resides.

What is a "script?"

A "program" written at a level that calls modules we often consider as independent executable units. Often, they glue together separate applications that were not intended to cooperate. Scripts may be written in special languages, such as DOS batch files or Unix shell scripts, or they may be written in general-purpose languages such as Perl or Python.

How does the wait(NULL) executed in the parent process ensure that it waits until the child is complete before executing the following printf()?

"wait(NULL)" generates an operating system call that tells the operating system that the calling process should wait (hopefully in a "Waiting" state in Figure 3.2) until one of its child processes has terminated. With a single child, as in the question, that means the parent does not continue until that child is complete. There are MANY variants of wait().

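A sketch of the fork-then-wait pattern on a POSIX system such as the lab Linux machines. The function name and the child's exit code of 7 are arbitrary choices for illustration. Unlike wait(NULL), this version passes a status pointer so the parent can see how the child ended.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork one child, block until it terminates, and return its exit code.
 * Returns -1 if fork() or wait() fails or the child ended abnormally. */
int run_child_and_wait(void)
{
	int status = 0;
	pid_t pid = fork();

	if (pid < 0)
	{
		return -1;		/* fork failed */
	}
	if (pid == 0)
	{
		_exit(7);		/* child: terminate immediately */
	}
	if (wait(&status) < 0)		/* parent: blocks here until the child is done */
	{
		return -1;
	}
	if (WIFEXITED(status))
	{
		return WEXITSTATUS(status);
	}
	return -1;
}
```

Any statement the parent executes after wait() returns, such as a printf(), is therefore guaranteed to run after the child has finished.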
What kind of hardware are we going to be programming in this class?

Linksys routers.

Will we use any kind of assembly language at all?

Several lines in one assignment.

Are all the tools we use in this class available on Linux?

They are all Linux tools in the lab. Some rely on special hardware and are not suitable for the general population of machines.

In what language is Windows programmed?

Mostly C. Portions are in many other languages.

What is the difference between specifying optimization zero and simply not including an optimization option?

See the default, which is NOT optimization level zero

 

 