Most people agree that you use unsigned binary (UB) as the representation system for unsigned int. For machines with 32-bit words, you use 32-bit UB.
However, the intuitive choice for signed int is not the way machines stored signed int. Most people would think of using signed magnitude (SM). However, SM and 1C (one's complement) have problems. Both representation systems have two representations for 0. On the whole, this isn't such a big problem, but it does make hardware more complicated.
We're going to study two's complement (2C) representation. This is the representation system that is used to represent signed int on machines. If you declare an int variable, the computer stores the values using 2C.
At first, you're going to wonder why we use 2C. It's going to seem like a bizarre representation. Yet, it has several nice properties. In particular, it has a single zero, and you can add two 2C numbers using the same hardware as you do for adding two UB numbers, and get the correct 2C answer (provided the answer doesn't overflow, i.e., it lies within the range of permissible 32-bit 2C values). You can't do that with 1C or SM. You'd have to build specialized hardware to do addition for 1C or SM that's different from the hardware for UB.
Here's a great example of a useful representation. 2C has nice properties as a representation, which is why we use it.
We want to negate B. Thus, if B is negative, then -B is non-negative. If B is non-negative, then -B is negative.
We are negating the 2C representation of B. That is, we're telling you how to manipulate the bits so the corresponding value is the negative of the value before changing the bits.
To negate in 2C, you flip the bits and add 1.
In symbols, -B = ~B + 1 where ~B = \bN-1 \bN-2...\b0. Thus, ~B is B with all bits flipped (thus, \bi means having bit bi flipped).
To add 1, we treat the ~B as if it were a UB number, and then add 1 as if both numbers were UB. If ~B is all 1's, then the result is an N+1 bit number, which is 1 followed by N zeroes. In this case, we ignore the 1, and assume the result is N zeroes.
Does this happen for 2C? Can we flip bits and add 1 to B, then do it again, and get back B? It's not obvious that flipping bits and adding 1 twice gets back the original bitstring. However, it does, and we'll see why.
The most common mistake is flipping bits and adding 1 when converting a non-negative number to 2C. That's wrong! If you have a non-negative number, just write it in UB, and you're done! You only flip bits and add one if the number you're trying to convert is negative.
As an exercise, convert all numbers from -8 to 7 to 4-bit 2C.
A function is 1-1 if it maps a unique y for every x. In other words, no two x values map to the same y value. A function is onto if every y in the codomain comes from some x in the domain. It may be possible that two or more x values map to the same y for an onto function. A function can be both 1-1 and onto.
If a function is both 1-1 and onto, then it has an inverse. We write the inverse with a -1 superscript. Thus, the inverse of f is written as f-1.
What is an inverse? We know that f maps x to y. Is there a function that maps in the reverse direction? That is, is there a function that maps y back to x? The answer is yes, provided that f is 1-1 and onto. If it is, then x = f-1(y).
Why talk about functions with inverses? We want the ability to negate representations. For SM, we flip the sign bit. For 1C we flip all the bits. We also want the property that negating a bitstring (in a representation system) twice gives back the same bitstring. Clearly, negating in SM twice gives back the same bitstring (we flip the sign bit twice), and the same with 1C (we flip all bits twice).
Negation is a function that maps a bitstring to another bitstring that represents the negated value of the original bitstring's value. To get the original value back, we want to apply the same negation operation a second time. In effect, we want to map y back to x.
Thus, negation is its own inverse function. If neg is the negation function (in some representation system), then we want neg = neg-1. That is, we want the inverse of the negation function to be itself. That way, applying the negation function twice, gives back the original bitstring.
Let's see an example of this. For example, consider 10100. The rightmost 1 is b2 (recall we number bits right to left starting at index 0). Flip all bits left of this. We get 01100. This gives you the same result as flipping the bits (01011) and adding 1 (01100).
Why do both methods produce the same result? It's not hard to understand, once you think about it. Let's look closely at what's happening. In particular, pay attention to the rightmost 1 and all bits to its right. I've highlighted them here 10100. I've highlighted bits 100. It's always a 1 followed by zero or more 0's.
Now see what happens when we flip bits. We get 01011. In words, we go from 1 followed by several zeroes to 0 followed by several 1s.
We add 1. This produces 01100. Thus, we're back to 1 followed by several zeroes. However, none of the bits left of these 3 changed! That is b4b3 didn't change values when we added 1. But they were flipped when we originally flipped bits. Thus, flipping bits and adding 1 has the effect of flipping all bits left of the rightmost 1. That's because when you flip the rightmost 1 and all 0's you get 0 followed by all 1's, but adding 1 to this gets you back to 1 followed by zeroes!
It makes sense once you understand how binary addition works.
We know that if we flip all bits of a number, then do it a second time, we get back the same bitstring (this is negation in 1C). It should be fairly obvious that if we flip a subset of those bits, and then flip it again, we get back the original bitstring. In this case, we flip all bits bk where k>i and i is the smallest index where bi=1. That is, we flip bits left of the rightmost 1. If you do that twice, you get back the same bitstring.
I find it's easier to convince people that negating twice gives you back the same 2C representation, if you just notice that you are flipping bits left of the rightmost 1, instead of saying that it's flipping bits and adding 1.
Practically speaking though, it's faster to flip bits and add 1 on most processors than to intentionally try to flip bits left of the rightmost 1. Even though both produce exactly the same result, you want to "code it up" using the flip bits, add 1 method.
-B = 1(0N) - BWe write 1(0N) to mean 1 followed by N zeroes.
That is, to negate B, subtract from 1, followed by N zeroes. Do this subtraction as if both numbers were unsigned.
Why is subtracting from 1(0N) the same as flipping bits and adding 1?
Consider any number, e.g., X = 1011. Flip the bits, and you get ~X = 0100. Add the two together (X + ~X), and you get 1111. Thus adding a bitstring (say, X) to a negated bitstring (i.e., ~X) produces a bitstring that is all 1's.
More precisely, adding (X + ~X) equals 1N (assuming X is an N bit number) where 1N is our notation for N ones.
Now add 1 to (X + ~X) and you get 1(0N). For example 1111 + 1 = 10000. That is, 1 + 1N = 1(0N) (read: adding 1 to N ones gives you 1 followed by N zeroes).
Thus, if we have B + -B = 1(0N), where -B = ~B + 1, as defined earlier.
Let's see why negating twice gives us back the original value. That is, why --B = B.
Suppose Y = 1(0N). Thus, to negate B means to compute Y - B. Suppose that results in C. Thus, C = Y - B.
How do we negate C? We use the definition from before. Subtract from Y! That means, we need Y - C. However, we already know that C = Y - B. With a little algebra (add B to both sides, subtract C from both sides), you get B = Y - C. That is, negating C gives you back B again.
So, once you know that negation is subtracting from Y, then it's not hard to see, with a little algebra, that subtracting twice from the same number gives you the original number back.
The inverse of adding one is subtracting one. That is, if you add 1 to a number, then to get back the original, you subtract one. Thus, the inverse is: flip-1 o addOne-1 = flip o subtractOne. Thus, you subtract 1, then flip bits. In this case, we subtract using modular arithmetic. Thus, subtract 1 from 0000 results in 1111.
Since negation is its own inverse, then this inverse function is equivalent to flipping bits and adding 1.
If you drive backwards in a car, most odometers will not change. However, if you pedal backwards on an exercise bike, some odometers will go in reverse. Let's imagine you have such a bike. Let's assume the odometer shows it's distance in binary.
You start of at 0000. If you pedal forward 3 km (let's pedal in metric), the odometer will read 0011. On the other hand, if you start of at 0000 and pedal backwards three kilometers, you would get 1111 (that's back one step), then 1110 (back two steps), and finally 1101 (back three steps). Thus, 1101 could be thought of as -3 since we went backwards 3 kilometers.
Is 1101 the representation for -3 in 2C? Let's see!
Try negating 3 in 2C. Start with 0011. Flip bits to get 1100. Add 1 to get 1101. That's the same number as we got by pedalling backwards 3 kilometers. It turns out this is true of any negative value. If you want to get, say, -7, pedal backwards 7 kilometers (starting at 0000) on a binary odometer, and that gives you the 2C representation.
(Of course, you might pick a number that can't be represented in the number of bits you give it. For example, with 4 bits 2C, you can't represent -20. If you pedal backwards 20 km, you'll end up with a representation that's not -20. This makes sense. After all, with only 4 bits, you can only represent 16 different values, and -20 isn't one of them).
What if you had 50 + 51? Normal decimal addition says you get 101. However, since you only have two digits, the answer has to be 01.
Modular arithmetic does computation modulo N, for some N. For example, with 2 decimal digits, we're going math modulo 100. Thus, 50 + 51 is really (50 + 51) % 100 = 1 where is the remainder operator. All operations in arithmetic modulo N are done with the remainder operator followed by N. That way, all answers fall within 0 and N-1.
Modular arithmetic is basically outside the scope of this course, but deals with doing math where numbers are confined within a range, and values wrap around. Thus, in modular arithmetic modulo 100, -1 is the same is 99, -2 is the same as 98, and so forth. All integers fall into equivalence classes. The number of equivalence classes is equal to N.
Courses in number theory and perhaps abstract algebra would cover this kind of arithmetic, and is the basis behind 2C.
Notice that negating 0 in 2C returns back 0! That's very nice because that means we have a single representation for 0.
There's another number with the same property. That is, there's another N-bit number which you can write in 2C, where you flip its bits and add 1, and it gives you back the same representation. Do you know which representation it is?
It's 1, followed by N-1 0's (which we write as 1(0N - 1)).
Does it really work? We'll try it on a 4-bit 2C number.
It makes sense for 0000 to stand for 0 in 4-bit 2C. But what value should the representation, 1000, be? First of all, it makes sense for this number to represent a negative value because the sign bit is 1. If you're given a representation of a negative number in 2C, how do you find out what number it is? Flip the bits and add 1, then convert to base 10, and add a negative sign.
For example, if you were asked what value is 1100, you'd flip the bits to get 0011, then add 1 to get 0100. Treat this as a UB number, and convert to base 10 to get 4. Add a negative sign to get -4. Thus, 1100 is the 4-bit 2C representation of -4.
We run into a problem with 1000. Flip the bits and adding 1 just gives us back 1000. What are we supposed to do?
But let's follow the rule, which says that we should convert this number to base 10. We do so by treating 1(0N - 1) as an unsigned int. So, with 4 bits, you have 1000, which is 8 in base 10. According to the algorithm, we add a negative sign, which results in -8.
That turns out to be the value we want representation, 1000, to map to. Again, think about the binary odometer. If you go back 8 kilometers from 0000, you will end up at 1000. To convince yourself of that, realize that pedalling backwards 1 kilometer gets you to 1111.
Now focus on the last three bits. It's 111 (or 7). It will take 7 more kilometers backwards to get that to 000. Thus, 1 km back (to get to 111), plus 7 more km (to get back to 1000) for a total of 8 steps, so 1000 is -8 (or 8 kilometers backward).
One nice thing about 2C, so far, is that we have a single zero representation. Recall that both SM and 1C have two representations for zero (positive and negative zero), and that two zero representations generally creates problems that a single zero representation doesn't.
Does 2C solve the addition problem? That is, can we add two values in 2C get the correct 2C representation for the sum?
The answer turns out to be yes.
Take two numbers like -1 + 2, and you will see normal addition works. For example, 1111 is -1. 0010 is +2. Add the two and you get 0001 (again, we always throw out the additional bit that would otherwise make it a 5 bit answer).
Now why does adding two numbers represented in 2C work? Let's consider adding X + Y. If both are non-negative, then it's unsigned addition, so that clearly works. If one is negative, and the other is positive, then just imagine the odometer set at the negative value, and adding a positive number. Intuitively, it still adds correctly.
The only one that may not intuitively make sense is when you add two negative numbers. Let X and Y both be negative numbers written in 2C.
Therefore, we can write X as [1(0N) - -X] and Y as [1(0N) - -Y]. Recall that one way to write 2C for negative values is to subtract the positive value from 1 followed by N zeroes (for an N bit 2C representation).
If you add the two, you get [1(0N+1) - (-X + -Y)]. This is almost 2C, except that you subtract from 1 followed by N+1 zeroes, instead of N zeroes. In other words, we're writing this value in 2C for N + 1 bits, rather than N bits.
The key for this to work is to realize that adding two negative values results in a negative value. In particular, it's the negation of (-X + -Y) (where -X and -Y are both positive because X and Y are both negative).
However, as long as -X + -Y doesn't get too large, this is equivalent to (1(0N) - (-X + -Y). That is, it's equivalent to writing the value in 2C for N bits.
To understand that, let's look at an example. Consider -1. In 4 bits, you write this as 1111. You can also write this as (10000 - 1). So, if you add -1 to -1, that's the same as (10000 - 1) + (10000 - 1), which is (100000 - 2). Now 100000 - 2, is 11110, which is 5 bits. As long as -X and -Y aren't too large, the most significant bit of the addition will always be 1. So, if we simply ignore that 1, then 100000 - 2 (the 2C representation for -2 using 5 bits) is the "same" as 10000 - 2 (the 2C representation for -2 using 4 bits), which is 1110.
It's not that unreasonable to ignore that additional, leftmost 1. Consider representing -2 using 4 bits. That's 1110. Consider representing -2 with 6 bits. That's 111110. The only difference between the two representations is the additional 1's that appear in the upper bits.
In fact, to convert any number in 2C from N bits to N + K bits, just add K bits to the left of the value, where the bit's value is taken from the sign bit, i.e., the MSb. This is called sign extension.
Clearly, sign extension works for positive/zero values since you merely add 0's to the left. Thus, 0010 maps to the same value as 000010. Adding more zeroes doesn't change what value it maps to.
It turns out it works when the sign bit is negative. We just saw an example where 1110 maps to -2, and 111110 also maps to -2 (using 2C in 4 and 6 bits respectively).
Thus, (10N+1 - (-X + -Y) is really 2C using N + 1 bits, which is fortunately equivalent to 2C using N bits provided (-X + -Y) isn't too large.
To determine the number of values, realize that half the representations have 1 as the MSb, and the other half have 0. All numbers that have 1 as the MSb are indeed negative. In particular, there are no negative zeroes.
The smallest negative number is -1. Since there are 2N-1 negative representations with N bits (which is half the total 2N representations), then the largest magnitude negative number must be -2N-1. Notice there's not a "- 1" at the end.
Why not? Suppose I tell you the smallest number of K consecutive ints is 0. What's the largest number? It's K - 1. However, if the smallest number is 1, then the largest is K. Same thing applies here. All we've done is add a negative sign in front. We have 2N-1 values, so K = 2N-1. Thus, the range is from -1 to -2N-1, where a minus sign is placed in front of each number.
There are also 2N-1 representations with 0 as the MSb. However, one of those representations is zero. That leaves 2N-1 - 1 representations for positive values, starting at +1. The range of positive values is therefore 1 to 2N-1 - 1, inclusive.
That means that there's one more negative value than positive value. In particular, we can't negate -2N-1. It lies outside the range of positive values, i.e., it overflows. That's the price we pay for only having a single zero. However, this is a minor problem given the other nice properties of 2C.
However, 2C has all of the nice properties. It has a single zero. You can add two 2C numbers using hardware for unsigned int addition.
This is the representation that's used when you declare:
int x ; // This is stored in 2C
| Number of Bits | Number of Values | Min Value | Max Value |
| 16 | 216 | -(215 - 1) | 215 - 1 |
| 32 | 232 | -231 | 231 - 1 |
| N | 2N | 2N-1 | 2N-1 - 1 |
This chart differs from 1C and SM, because there's a 1-1 mapping of representations to values, i.e., there aren't two representations for the value 0. This affects the minimum and maximum value given N bits.
The mistake that some people make is to take non-negative values and flip bits and add 1. Thus, when asked to represent +2 in 2C, some people write 0010 (which is +2), then flip bits to get 1101, then add 1 to 1110. They've succeeded in writing -2!
The act of negating a representated number in 2C requires you to flip bits and add 1. Equivalently, you can subtract 1, and flip bits, but most prefer to add. Equivalently, you can flip all bits left of the rightmost 1.