SI540: Fall 1999

HW3 Solution Set

Junho Song(with notes from Prof. Paul Resnick)


Written exercise

1. As an archivist, you've been given the job of making some old computer records accessible to historians. The data is stored on an ancient (30-year-old) magnetic tape. Miraculously, you still have a tape reader available, the tape doesn't break, and you're able to transfer the files in question to a modern storage device (a hard disk). You know that the files in question contain text, in the English language, but apparently they were encoded in some representation other than ASCII. So, for now, each file is just a pile of bits. Describe what steps you would take to try to discover the representation scheme that was used, and thus recover the English text. (2 points)

Basically, you have to use a trial-and-error method. Since we do not know how many bits are used for encoding a character (assuming each character was encoded separately – > no compression), we have to try to figure out whether 7, 8, or even 16 bits encoding was used. For this task, we can try dividing bit string into certain bit size assuming the text might have been encoded with that bit size and test whether it can generate any meaningful text, looking up dictionary.

For testing bit string with specific bit size, we have to match characters to the bit strings and test whether it makes up a English word. If not, we have to start from the beginning again. For guessing in matching characters with bit string, we can also use frequencies of appearance of alphabet in English text. (e.g. If a certain bit string appears more than others, we can assume that they do not represent scarcely used character ‘Z’.) Once we have matching characters enough, we can relatively easily fill out rest characters.

In order to minimize this trial-and-error task, you could employ heuristics like frequency matching on bit strings. If the seven bit decoding is correct, there should be some variability in bit string just like variability of characters in English text. If not, we can move on to other bit sizes.

It is useful to have a program that can do all or part of the job mentioned above automatically so that we can choose one among few possible results that the program generates.


2. E4.2 How many different alternatives can be represented by the following numbers of bits?

a. One: either 0 or 1 ---> 21 = 2

b. Two: 00 01 10 11 ---> 22 = 4

c. Three: 000 001 010 011 100 101 110 111 ---> 23 = 8

d. n: 2n


3. E4.4 You are given the task of finding a representation for a circle in a drawing application. The circle is to be displayed on the screen, and the user can edit the location and radius of the circle, both of which are expressed in units of pixels.

  1. Specify a data representation for the circle, trying to minimize the number of bits required.

(Technique 1: Vector Representation)

If you think about a circle’s property, you have center point coordinates (x,y) and radius(unit of pixels in this question). These should be represented in bit strings. In doing so, you should think about some other issues related to representation. What is the maximum size of circle you might want to represent? If it is small, you can use fewer bits to represent the radius.

(Technique 2: Bitmap Representation)

Alternatively, you could use a bitmap representation, where pixels around the circumference would be black(1) and others would be white(0). Like all bitmaps, you can run a compression algorithm to reduce the number of bits in the representation.


b. Describe how data processing would draw a circle on the screen, starting with stored data, following your representation.

(Technique 1: Vector Representation)

First, the computer software has to interpret encoded data of the circle in order to reproduce it from center coordinates and radius. In order to do it, software has to know how many bits are devoted to each of the three properties. Then, the computer software can automatically calculate the position of each pixel on the circumference. In order to generate a bitmap, the bitmap image of the calculation result will be sent to the output device for display.

From author’s note: More specifically, the software would locate the pixel position of the center, and then use that point and the radius of the circle to calculate the location of the pixels on the circumference of the circle and turn them on. (In more mathematical terms, the equation for a circle with center at point (a,b) and radius r is (x-a)2 + (y-b) 2 = r2. The software would enter various values of x between (a-r) and (a+r) into this equation to find the corresponding y values, and then plot them on bitmap.)

(Technique 2: Bitmap Representation)

The representation is already a bitmap (after decompressing) and can be sent directly to the output device.

4. E4.5 For a personal computer you use every day:

Suggested answers from the textbook author

a. Building blocks of a PC:

Application software: Word processor, browser software, spreadsheet, etc.

Infrastructure software: The operating system (Windows, DOS, Macintosh OS).

Storage: Memory (RAM), floppy disk, hard disk.

Communications: Modem, local area network connection.

b. Microprocessor: Supports all of the above building blocks.

Floppy disk: Supports storage (and sometimes infrastructure and application software).

Hard disk: Supports storage, infrastructure software(virtual memory paging), and application software(storing application itself and its data).

Network interface card: Supports communications and infrastructure software.

Modem: Supports communications.

5. "Your Honor…" Write Up

You should describe what functions of browser and operating system summarizing the article you read. You have to argue whether the browser is a legitimate part of the operating system or not with supporting ideas. In doing so, you should think about and describe what are the cost/benefit from the perspective of various parties (consumers, software developers, administrators and etc.)


Explanation exercise

Explain how images and sounds are represented as bit strings in computers. As always, include a couple of sentences saying who you talked to, how you explained it, and how the explainee responded.

Generally, each pixel of the image is represented in bit strings containing information about that pixel color. Usually, first few bit string contains information about how much color depth it uses so that the correct interpretation of the image can be performed later. So, depending on the color depth of the image, size of the bit strings for a pixel is determined.

Sound is sampled using a specific rate. For each sample, a pre-specified number of bits encodes the pressure level. For higher pitched sounds, more frequent samples are needed.