Character data isn't just alphabetic characters, but also numeric characters, punctuation, spaces, and so on. Most keys on the central part of the keyboard (except modifier keys like shift and caps lock) produce characters.
As we've discussed with signed and unsigned ints, characters need to be represented. In particular, they need to be represented in binary. After all, computers store and manipulate 0's and 1's (and even those 0's and 1's are just abstractions---the implementation is typically voltages).
Unsigned binary and two's complement are used to represent unsigned and signed ints, respectively, because they have nice mathematical properties; in particular, you can add and subtract as you'd expect.
However, there aren't such properties for character data, so assigning binary codes to characters is somewhat arbitrary. The most common character representation is ASCII, which stands for American Standard Code for Information Interchange.
There are two reasons to use ASCII. First, we need some way to represent characters as binary numbers (or, equivalently, as bitstring patterns). There's not much choice about this since computers represent everything in binary.
If you've noticed a common theme, it's that we need representation schemes for everything. However, most importantly, we need representations for numbers and characters. Once you have that (and perhaps pointers), you can build up everything you need.
The other reason we use ASCII is because of the letter "S" in ASCII, which stands for "standard". Standards are good because they allow for common formats that everyone can agree on.
Unfortunately, there's also the letter "A", which stands for American. ASCII is clearly biased toward the English language character set. Other languages may have their own character sets, even though English dominates most of the computing world (at least in programming and software).
Since the letters and the digits each occupy contiguous ranges of ASCII codes, it's usually easy to determine whether a character is lowercase or uppercase (by checking whether its ASCII code lies in the lowercase or uppercase range), to determine whether it's a digit, or to convert a digit character to its int value. Here is the ASCII table, with the codes given in decimal (a short C sketch of these checks follows the table).
0 nul 16 dle 32 sp 48 0 64 @ 80 P 96 ` 112 p
1 soh 17 dc1 33 ! 49 1 65 A 81 Q 97 a 113 q
2 stx 18 dc2 34 " 50 2 66 B 82 R 98 b 114 r
3 etx 19 dc3 35 # 51 3 67 C 83 S 99 c 115 s
4 eot 20 dc4 36 $ 52 4 68 D 84 T 100 d 116 t
5 enq 21 nak 37 % 53 5 69 E 85 U 101 e 117 u
6 ack 22 syn 38 & 54 6 70 F 86 V 102 f 118 v
7 bel 23 etb 39 ' 55 7 71 G 87 W 103 g 119 w
8 bs 24 can 40 ( 56 8 72 H 88 X 104 h 120 x
9 ht 25 em 41 ) 57 9 73 I 89 Y 105 i 121 y
10 nl 26 sub 42 * 58 : 74 J 90 Z 106 j 122 z
11 vt 27 esc 43 + 59 ; 75 K 91 [ 107 k 123 {
12 np 28 fs 44 , 60 < 76 L 92 \ 108 l 124 |
13 cr 29 gs 45 - 61 = 77 M 93 ] 109 m 125 }
14 so 30 rs 46 . 62 > 78 N 94 ^ 110 n 126 ~
15 si 31 us 47 / 63 ? 79 O 95 _ 111 o 127 del
The characters with codes 0 through 31 are generally not printable (they are control characters). Code 32 is the space character.
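For example, here's a minimal C sketch that uses these contiguous ranges to classify a character and to convert a digit character to its int value (the test characters are chosen arbitrarily):

    #include <stdio.h>

    int main(void) {
        char c = 'G';
        char d = '7';

        /* Uppercase letters occupy the contiguous range 'A'..'Z' (codes 65..90). */
        if (c >= 'A' && c <= 'Z') {
            printf("%c is uppercase\n", c);
        }

        /* Lowercase letters occupy 'a'..'z' (codes 97..122). */
        if (c >= 'a' && c <= 'z') {
            printf("%c is lowercase\n", c);
        }

        /* Digits occupy '0'..'9' (codes 48..57), so subtracting '0'
           converts a digit character to its int value. */
        if (d >= '0' && d <= '9') {
            printf("%c has the int value %d\n", d, d - '0');  /* 55 - 48 = 7 */
        }
        return 0;
    }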
Also note that there are only 128 ASCII characters, so only 7 bits are required to represent an ASCII character. However, since the smallest addressable unit on most computers is a byte, a byte is used to store an ASCII character. The MSb (most significant bit) of an ASCII character is 0.
ASCII has sometimes been extended by using the MSb, which provides 128 additional codes. Here is the same table again, with the codes written in hexadecimal.
00 nul 10 dle 20 sp 30 0 40 @ 50 P 60 ` 70 p
01 soh 11 dc1 21 ! 31 1 41 A 51 Q 61 a 71 q
02 stx 12 dc2 22 " 32 2 42 B 52 R 62 b 72 r
03 etx 13 dc3 23 # 33 3 43 C 53 S 63 c 73 s
04 eot 14 dc4 24 $ 34 4 44 D 54 T 64 d 74 t
05 enq 15 nak 25 % 35 5 45 E 55 U 65 e 75 u
06 ack 16 syn 26 & 36 6 46 F 56 V 66 f 76 v
07 bel 17 etb 27 ' 37 7 47 G 57 W 67 g 77 w
08 bs 18 can 28 ( 38 8 48 H 58 X 68 h 78 x
09 ht 19 em 29 ) 39 9 49 I 59 Y 69 i 79 y
0a nl 1a sub 2a * 3a : 4a J 5a Z 6a j 7a z
0b vt 1b esc 2b + 3b ; 4b K 5b [ 6b k 7b {
0c np 1c fs 2c , 3c < 4c L 5c \ 6c l 7c |
0d cr 1d gs 2d - 3d = 4d M 5d ] 6d m 7d }
0e so 1e rs 2e . 3e > 4e N 5e ^ 6e n 7e ~
0f si 1f us 2f / 3f ? 4f O 5f _ 6f o 7f del
The difference between the ASCII code of an uppercase letter and the code of its corresponding lowercase letter is 20 in hex (32 in decimal); in other words, the two codes differ only in bit b5. This makes it easy to convert lowercase to uppercase (and back) in hex or binary.
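For instance, assuming you already know the character is a letter, a small C sketch of the conversion might look like this:

    #include <stdio.h>

    int main(void) {
        char upper = 'A';             /* 0x41 */
        char lower = upper ^ 0x20;    /* toggle bit b5: 0x61, which is 'a' */
        char back  = lower ^ 0x20;    /* toggle it again: back to 'A' */

        printf("%c %c %c\n", upper, lower, back);   /* prints: A a A */

        /* Equivalently, subtract 0x20 (32 decimal) to go from lower to upper. */
        printf("%c\n", 'g' - 0x20);                 /* prints: G */
        return 0;
    }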
In C, a char can be declared signed or unsigned. This may seem like a particularly odd feature. What does it mean to have a signed or unsigned char?
This is where it's useful to think of a char as a one-byte int. When you cast a char to an int (of any size), the rules for sign extension may apply. In particular, if the MSb of the char is 1, then casting it to an int may cause that 1 to sign-extend, which can be surprising if you're not expecting it.
Of course, you may ask, "but how would the MSb get the value 1?" Recall that char is one of the data types you can manipulate with the bitwise and bitshift operators. This means you can set or clear any bit of a char, including its MSb.
Another way the MSb might become 1 is by casting an int down to a char. Usually, this means truncating off the upper bytes, leaving the least significant byte. Since an int can have any bit pattern, there's a possibility that the least significant byte has a 1 in bit position b7.
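Here's a small C sketch of both situations; it assumes char is signed on your compiler, which is common but not guaranteed by the C standard:

    #include <stdio.h>

    int main(void) {
        /* Truncating an int down to a char keeps only the least significant byte. */
        int  big = 0x1234F3;          /* the low byte is 0xF3, whose b7 is 1 */
        char c   = (char) big;        /* c now holds the bit pattern 1111 0011 */

        /* Casting back up to int sign-extends if char is signed (the common
           case), so the 1 in b7 is copied into all of the upper bits. */
        int back = c;
        printf("%d\n", back);                     /* typically prints -13 */

        /* An unsigned char zero-extends instead. */
        unsigned char u = (unsigned char) big;
        printf("%d\n", u);                        /* prints 243, i.e., 0xF3 */
        return 0;
    }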
You should think of a char as both an ASCII character and an 8-bit int. This duality is important because char is the only basic data type that is exactly 1 byte.
One problem with ASCII is that it's biased toward the English language, which creates problems for speakers of other languages. One common solution is simply for programmers in other countries to write their programs in ASCII (that is, essentially in English).
Other countries have used different solutions, in particular, using all 8 bits of a byte to represent their alphabets, giving up to 256 codes, which is plenty for most alphabet-based languages (recall that you also need to represent digits, punctuation, etc.).
However, Asian languages, which are word-based rather than letter-based, often have far more symbols than 8 bits can represent: 8 bits gives only 256 codes, which is far smaller than the thousands of symbols used in those languages.
Thus, a newer character set called Unicode is becoming more prevalent. It is a 16-bit code, which allows for 65,536 different codes. This is enough to encode the popular Asian languages (Chinese, Korean, Japanese, etc.). It also turns out that ASCII codes are preserved. What does this mean? To convert ASCII to Unicode, take each one-byte ASCII code and zero-extend it to 16 bits. The result is the Unicode version of that ASCII character.
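As a tiny C sketch of that zero-extension (using unsigned short to stand in for a 16-bit code unit, purely for illustration):

    #include <stdio.h>

    int main(void) {
        char ascii = 'A';                           /* one-byte ASCII code 0x41 */

        /* Zero-extend the one-byte code to 16 bits. */
        unsigned short unicode = (unsigned char) ascii;
        printf("U+%04X\n", (unsigned) unicode);     /* prints: U+0041 */
        return 0;
    }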
The biggest consequence of moving from ASCII to Unicode is that text files double in size. The second consequence is that endianness begins to matter again. With single bytes, there's no need to worry about endianness; with two-byte quantities, there is.
While C and C++ still primarily use ASCII, Java already uses Unicode. This is why Java needs a separate byte type: a char in Java is no longer a single byte, but a 2-byte Unicode representation.
Suppose you type 123 in a text editor and save it. The file does NOT store the number 123. Instead, it stores the ASCII codes for the characters '1', '2', and '3' (which are 31, 32, 33 in hex, or 0011 0001, 0011 0010, 0011 0011 in binary).
ASCII (text) files store bytes. Each byte is the ASCII code for some character in the character set. You can think of a text editor as a translator: it translates those binary numbers into symbols on the screen. Thus, when it sees 41 in hex, that's the ASCII code for 'A', and so 'A' gets displayed.
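To make this concrete, here's a short C sketch that prints the byte actually stored for each character of the string "123":

    #include <stdio.h>

    int main(void) {
        const char *text = "123";

        /* Print the byte stored for each character, in hex. */
        for (int i = 0; text[i] != '\0'; i++) {
            printf("'%c' is stored as the byte 0x%02X\n",
                   text[i], (unsigned char) text[i]);
        }
        return 0;
    }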
Some people think that if they type 0's and 1's in a text editor, they are writing bits out to a binary file. This is not true. The file contains the ASCII codes for the characters '0' and '1'.
There are hex editors, which let you type in hex (or, less commonly, in binary). The hex pairs are translated to binary, so when you type F3, the byte 1111 0011 is written to the file (the space is there only to make the binary easier to read).
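The difference shows up clearly in a small C sketch (the file names here are made up for illustration): writing the two characters 'F' and '3' to a file is not the same as writing the single byte 0xF3.

    #include <stdio.h>

    int main(void) {
        /* A text editor saving the two characters 'F' and '3' stores two
           bytes: their ASCII codes, 0x46 and 0x33. */
        FILE *as_text = fopen("as_text.txt", "w");
        if (as_text != NULL) {
            fputc('F', as_text);
            fputc('3', as_text);
            fclose(as_text);
        }

        /* A hex editor writing F3 stores the single byte 1111 0011 (0xF3). */
        FILE *as_byte = fopen("as_byte.bin", "wb");
        if (as_byte != NULL) {
            fputc(0xF3, as_byte);
            fclose(as_byte);
        }
        return 0;
    }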