KAIST
EE 209: Programming Structures for EE

Assignment 4: Assembly Language Programming

(This assignment is borrowed and slightly modified from Princeton COS 217)

Purpose

The purpose of this assignment is to help you learn about computer architecture, assembly language programming, and testing strategies. It will also give you the opportunity to learn more about the GNU/Unix programming tools, especially bash, emacs, gcc209, and gdb for assembly language programs.


Rules

Part b is the "on your own" part of the assignment, and accounts for 20% of the grade.


A Word Counting Program in Assembly Language

Part a: Translate to Assembly Language

The Unix operating system has a command named wc (word count). In its simplest form, wc reads characters from the standard input stream until end-of-file, and prints to the standard output stream a count of how many lines, words, and characters it has read. A word is a sequence of characters that is delimited by one or more white space characters.

Consider some examples. In the following, a space is shown as [s] and a newline character as [n].

If the file named proverb contains these characters:

Learning[s]is[s]a[n]
treasure[s]which[n]
accompanies[s]its[n]
owner[s]everywhere.[n]
--[s]Chinese[s]proverb[n]

then the command:

$ wc < proverb

prints this line to the standard output stream:

  5 12 82

If the file proverb2 contains these characters:

Learning[s]is[s]a[n]
treasure[s]which[n]
accompanies[s]its[n]
owner[s]everywhere.[n]
--[s][s][s]Chinese[s]proverb

(note that the last "line" does not end with a newline character) then the command:

$ wc < proverb2

prints this line to standard output:

  4 12 83
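The two examples can be reproduced with the system wc command. The following is a minimal sketch, assuming a POSIX wc and printf are available; the scratch directory is only for illustration, and the width of the padding that wc prints varies between implementations, so the counts are echoed unquoted to collapse it.

```shell
#!/bin/bash
# Work in a scratch directory so no existing files are overwritten.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# printf writes exactly the bytes given; proverb2 has three spaces after
# "--" and no trailing newline.
printf 'Learning is a\ntreasure which\naccompanies its\nowner everywhere.\n-- Chinese proverb\n' > proverb
printf 'Learning is a\ntreasure which\naccompanies its\nowner everywhere.\n--   Chinese proverb' > proverb2

# Reading from standard input keeps the file name out of the output.
counts1=$(wc < proverb)
counts2=$(wc < proverb2)

# Unquoted on purpose: word splitting collapses the padding.
echo $counts1   # prints: 5 12 82
echo $counts2   # prints: 4 12 83
```

Note that the line count is the number of newline characters, which is why proverb2 reports 4 lines although it shows five rows of text.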

The file mywc.c contains a C program that implements the subset of the wc command described above. Translate that program into assembly language, thus creating a file named mywc.s. It is acceptable to use global (i.e. bss section or data section) variables in mywc.s, but we encourage you to use local (i.e. stack) variables instead. Your assembly language program should have exactly the same behavior (i.e. should write exactly the same characters to the standard output stream) as the given C program.

Part b: Test

Design a test plan for your mywc program. Your test plan should include tests in three categories: (1) boundary testing, (2) statement testing, and (3) stress testing.

Create text files to test your programs. Name each file so that its prefix is mywc and its suffix is .txt. The command ls mywc*.txt should display the names of all mywc test files, and only those files.
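As an illustration of the naming rule, the sketch below creates a few small files that might serve as boundary tests. The file names and the choice of cases are hypothetical, not a required or complete test plan, and the files are created in a scratch directory.

```shell
#!/bin/bash
# Scratch directory for the demonstration only.
test_dir=$(mktemp -d)
cd "$test_dir" || exit 1

: > mywcEmpty.txt                   # zero characters
printf '\n' > mywcOneNewline.txt    # a single newline and no words
printf '   \t ' > mywcOnlyBlanks.txt  # white space only, no newline
printf 'word' > mywcNoNewline.txt   # one word, last line unterminated

# The pattern from the naming rule should match every test file.
ls mywc*.txt
file_count=$(( $(ls mywc*.txt | wc -l) ))
echo "$file_count test files"
```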

Describe your mywc test plan in your readme file. Your description should have this structure:

mywc boundary tests:

mywcXXX.txt: Description of the characteristics of that file, and how it tests boundary conditions of your mywc program.
mywcYYY.txt: Description of the characteristics of that file, and how it tests boundary conditions of your mywc program.
...

mywc statement tests:

mywcXXX.txt: Description of the characteristics of that file, and which statements of your mywc program it tests. Refer to the statements using the line numbers of the given mywc.c program.
mywcYYY.txt: Description of the characteristics of that file, and which statements of your mywc program it tests. Refer to the statements using the line numbers of the given mywc.c program.
...

Your descriptions of the test files should be of the form "This file contains such-and-such characteristics, and so tests lines such-and-such of the program." Please identify the lines of code tested by line numbers. The line numbers should refer to the given C code.

mywc stress tests:

mywcXXX.txt: Description of the characteristics of that file, and how it stress tests your mywc program.
mywcYYY.txt: Description of the characteristics of that file, and how it stress tests your mywc program.
...

Submit test files that contain no more than approximately 50000 characters. Submitting very large files could exhaust the course's allotted disk space on Moodle, and so could prohibit other students from submitting any files at all.
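One way to produce a stress test file that stays under the size limit is to generate it with a loop and then check its size with wc -c. This is a sketch under stated assumptions: the line content and the count of 900 lines are arbitrary choices, and a scratch file stands in for a real mywc*.txt file.

```shell
#!/bin/bash
# Scratch file for the demonstration; a real test file would follow the
# mywc*.txt naming rule.
stress_file=$(mktemp)

# Each generated line is 50 characters long, including the newline.
i=1
while [ "$i" -le 900 ]; do
  printf 'the quick brown fox jumps over the lazy dog %05d\n' "$i"
  i=$((i + 1))
done > "$stress_file"

# Arithmetic expansion strips any padding that wc adds.
size=$(( $(wc -c < "$stress_file") ))
echo "$size characters"   # prints: 45000 characters
```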

Submit test files that contain only printable ASCII characters. Specifically, make sure your computer-generated test files contain only characters having ASCII codes (in hexadecimal) 09, 0A, and 20 through 7E. Submitting test files that contain other characters would make it difficult for your grader to examine those files.
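The character restriction can be checked mechanically. The sketch below assumes the standard tr and wc utilities; the function name count_bad_bytes is hypothetical. It deletes every allowed character (octal 011, 012, and 040 through 176, which are hexadecimal 09, 0A, and 20 through 7E) and counts what remains.

```shell
#!/bin/bash
# Print the number of bytes in file $1 that fall outside the allowed set.
# LC_ALL=C makes tr treat the input as single bytes.
count_bad_bytes() {
  LC_ALL=C tr -d '\011\012\040-\176' < "$1" | wc -c
}

# Demonstration on two scratch files.
good_file=$(mktemp)
bad_file=$(mktemp)
printf 'tabs\tand spaces are fine\n' > "$good_file"
printf 'carriage return is not\r\n' > "$bad_file"

good_count=$(( $(count_bad_bytes "$good_file") ))
bad_count=$(( $(count_bad_bytes "$bad_file") ))
echo "$good_count $bad_count"   # prints: 0 1
```

A result of 0 means the file contains only characters from the allowed set.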

Finally, create a Bash shell script named testmywc.sh to automate your mywc test plan. A Bash shell script is simply a text file that contains commands, and that has been made executable via the chmod command, for example, chmod u+x testmywc.sh.

The testmywc.sh script should build a mywc program from the given C code, build a mywc program from your assembly language code, execute both programs, and compare the output.
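The overall shape of such a script might look like the sketch below. It is not a reference solution: the executable names mywc_c and mywc_s and the function names are hypothetical, and invoking gcc209 with only the -o option is an assumption, so adjust the build commands to whatever the course environment requires.

```shell
#!/bin/bash
# Sketch of testmywc.sh. Names mywc_c, mywc_s, build_both, and
# compare_one are hypothetical.

# Build one executable from the given C code and one from the assembly code.
build_both() {
  gcc209 mywc.c -o mywc_c && gcc209 mywc.s -o mywc_s
}

# Run two programs on the same input file and compare their output
# byte by byte. Returns 0 on a match and 1 otherwise.
compare_one() {
  ref=$1; mine=$2; input=$3
  out_ref=$(mktemp); out_mine=$(mktemp)
  "$ref" < "$input" > "$out_ref"
  "$mine" < "$input" > "$out_mine"
  if cmp -s "$out_ref" "$out_mine"; then
    echo "PASS $input"; result=0
  else
    echo "FAIL $input"; result=1
  fi
  rm -f "$out_ref" "$out_mine"
  return "$result"
}

# Only attempt the real run when both source files are present.
if [ -f mywc.c ] && [ -f mywc.s ]; then
  build_both || exit 1
  for f in mywc*.txt; do
    [ -e "$f" ] || continue
    compare_one ./mywc_c ./mywc_s "$f"
  done
fi
```

Writing the outputs to temporary files and comparing them with cmp, rather than comparing command substitutions, keeps trailing newlines significant, which matters because the two programs must write exactly the same characters.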

It is acceptable for your testmywc.sh script to call other scripts that you create. Each such script should have a name that is prefixed with testmywc. The command ls testmywc* should display the names of all mywc test scripts, and only those files.

Feel free to use the test_regex.sh Bash shell script from Assignment 2 as a model.


Logistics

Develop on lab machines. Use emacs to create source code. Use gdb to debug.

Do not use a C compiler to produce any of your assembly language code. Doing so would be considered an instance of academic dishonesty. Instead produce your assembly language code manually.

We encourage you to develop "flattened" C code (as described in precepts) to bridge the gap between the given "normal" C code and your assembly language code. Using flattened C code as a bridge can eliminate logic errors from your assembly language code, leaving only the possibility of translation errors.

We also encourage you to use your flattened C code as comments in your assembly language code. Such comments can clarify your assembly language code substantially.

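You should submit the files named in the commands below: mywc.s, testmywc.sh, any Bash scripts that testmywc.sh calls, your mywc*.txt test files, and your readme file.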

Your readme file should contain:

Submit your work electronically using the commands:

mkdir 20091234_assign4
mv mywc.s testmywc.sh anyBashScriptsCalledByTestmywc.sh mywc*.txt readme 20091234_assign4
tar zcf 20091234_assign4.tar.gz 20091234_assign4

Please do not submit emacs backup files, that is, files that end with '~'.
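Before submitting, it can help to list the archive and confirm that no backup files slipped in. The sketch below assumes a tar that supports the z option; the function name archive_is_clean is hypothetical, and the demonstration uses empty stand-in files in a scratch directory.

```shell
#!/bin/bash
# Succeeds only if the archive contains no file whose name ends with '~'
# (an emacs backup file).
archive_is_clean() {
  ! tar ztf "$1" | grep '~$' > /dev/null
}

# Demonstration in a scratch directory with stand-in files.
work_dir=$(mktemp -d)
cd "$work_dir" || exit 1
mkdir 20091234_assign4
touch 20091234_assign4/mywc.s 20091234_assign4/readme
tar zcf 20091234_assign4.tar.gz 20091234_assign4

if archive_is_clean 20091234_assign4.tar.gz; then
  echo "archive is clean"
else
  echo "archive contains backup files"
fi
```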


Grading

As always, we will grade your work on quality from the user's and programmer's points of view. To encourage good coding practices, we will deduct points if gcc209 generates warning messages.

Comments in your assembly language programs are especially important. Each assembly language function -- especially the main function -- should have a comment that describes what the function does. Local comments within your assembly language functions are equally important. Comments copied from corresponding "flattened" C code are particularly helpful.

Your assembly language code should use .equ directives to avoid "magic numbers." In particular, you should use .equ directives to give meaningful names to:

Testing is a substantial aspect of the assignment. Approximately 20% of the grade will be based upon your mywc test plan as described in your readme file, and as implemented by your test scripts and data files.