Rob's View of Right-Shifting in 2C

Background

Rob is taking 311, and gave me the following thoughts on 2C. His background is in math, so I suppose math types like to think along these lines.

Here was the problem. Recall that bit-shifting an int to the right K bits is equivalent to dividing the value by 2^K (using integer division, which truncates any fractional part).

When you shift a non-negative int to the right, the bits fall off the least significant end (the right side), while 0's are shifted in on the most significant end (the left side).

However, when you shift a negative int right, 1's are shifted in from the most significant end. Shifting in the 1's in from the left makes sure you are dividing the negative int correctly by 2^K. However, it's far from obvious that shifting in the sign bit for negative ints (in 2C) is the "right" thing to do.

Representing Signed Numbers

Rob suggested the following approach to represent numbers in 2C. Given some bit pattern, let U be the value of the bitstring if the bits are interpreted in UB. Let S be the value if the bitstring is interpreted in 2C.

We can define the value of S by using the value of U. We'll see why this is useful momentarily. Assume N bits are used in the bitstring representing the number.

   S =   U,          if U < 2^N-1 - 1 (non-negative S)
         U - 2^N,     if U >= 2^N-1 - 1 (negative S)

We'll call this "Rob's canonical format" or RCF, for short.

So the question is why is the signed value U - 2^N? Recall from reading the class notes on 2C, that one way to negate a value, X, (other than flipping bits and adding 1) is to subtract from 2^N where you consider 2^N and X is interpreted as UB numbers. In other words, - X = 2^N - (UB) X That is, the negation of the value of X is 2^N - (UB) X where (UB) X value of X if you interpret the bits of X (which were in 2C) as if they were UB. It looks like a casting of signed to unsigned.

For example, if N = 4 and X is -2 then, you'd write X as 1110 in 2C. When performing the subtraction 2^N - (UB) X, you'd write 10000 - 1110 = 0010 where 10000 is 2⁴, and we interpret 1100 as 12, instead of -2 (i.e., interpreting it in UB instead of 2C). The subtraction yields 2, which is the negation of -2. Recall the subtraction is negating the value of X.

However, if we want the value of X, instead of its negated value (which is positive), then we need to to negate the formula, as in: -(2^N - X).

Thus, recall that -X (where X is -2) is computed by 10000 - 1110 = 0010 (which is +2), then negating both sides gives us -(10000 - 1110) = -0010, which is negative 2, the original value of X.

The formula we've written (-(2^N - X)) is of the form -(A - B) which is equal to B - A.

Thus, -(2^N - X) can be rewritten as: X - 2^N, which is the same as the definition above (except we used U instead of X).

This will be the Rob's canonical format for negative int values using N bits. That is, we can determine the value of S (when it's negative) by knowing the value of U---S = U - 2^N.

Dividing by 2

Let's see what happens when you divide S by 2, assuming S is negative.

  S / 2 =  ( U - 2^N ) / 2
        =  U/2 - 2^N/2     Distribute division
        =  U/2 - 2^N-1      Simplify
        =  U/2 - 2^N-1 + ( 2^N - 2^N )   Add well-chosen 0
        =  U/2 + ( - 2^N-1 + 2^N ) - 2^N  Use associativity
        =  ( U/2 + 2^N-1 ) - 2^N Simplify

We've now rewritten S/2 in Rob's canonical format, which has the form: X - 2^N. In this case, X = U/2 + 2^{N - 1} .

Let's think about what U/2 + 2^{N - 1} looks like. If you have an unsigned number U and you divide it by 2, and it's written in UB, then effectively, you shift the bits of the unsigned number to the right by 1. Thus, U/2 is just a right-shifted version of U (zeroes are shifted in from the left, since it's unsigned).

Adding 2^{N - 1} now puts a "1" in bit position b_N-1. So, this part acts like the "shifting" in of the 1 bit for the MSb.

So we can think of dividing a signed value, S by 2, as right-shifting the unsigned version U by 1 (which is U/2) and shifting in 1 from the left (which is equivalent, in this case, to adding 2^{N - 1}).

Dividing by 2^k

Dividing by 2^k just means repeating this process k times (which should effectively, move the bits over 1 at a time, while shifting in additional 1's).

Conclusion

A lot of how we understand how things work depends on representing the information in a way that we can reason about. Thus, even though it doesn't seem obvious why right-shifting a negative value requires 1's to be shifted in from the left, and even though it doesn't seem obvious why U - 2^N is a way to think of negative signed values, the representation helps us figure out why you add a 1 to right shifts.

Mathematics provides us ways to reason about the world, and this is one way it can help us understand 2C better, and after all, don't you want to understand 2C better?