Your parking spot is based on the first 3 digits of your student ID number.
What problem can occur? If you received your social security number (student ID) in Maryland, the first three digits of your student ID is likely to be in the low 200's.
Thus, there's a much higher chance someone will be in your parking spot. There's also a chance that parking spots numbered in the 900's might be mostly empty since few students may have student IDs with the first 3 digits in that range.
This is basically the direct-mapped scheme. The advantages of this scheme are that it's simple to find your parking spot. It can only be in one location.
How do we decide which slot the cache line associated with address A31-0 should go into?
Here's how we do it:
Since we have 128 slots, we need to specify which one slot we need the cache line to go in. This requires lg 128 = 7 bits. Where do we get the bits from? Directly from the address itself.
Bits A4-0 is still the offset. The slot number are the next 7 bits, Bits A11-5. The remaining bits, A31-12 is the tag.
Unlike a hash table, we do not resolve such collisions by finding the next free slot. Instead, we overwrite the value in the slot.
Just think of the parking lot analogy. A direct mapped scheme works poorly if many students have permits for the same spot, while other spots have very few permits.
The data you have might simply map to the same slot, and you could have cache lines going in and out all the time.
If the data isn't in the slot, it's obvious that this is the slot to get evicted.
The problem with picking the high bits is that data and program tend to occupy a small section of memory. Because of that, they tend to share the same high bits.
This isn't so surprising. Consider a 10,000 page book, where each page has 5 digits (thus, there are pages 03401 and pages 00023). Grab any 100 consecutive pages, and the top two digits are highly like to be the same.
This can happen with data too. All the data will have very similar high bits. You want the bits as low as possible, so you get enough variation so that data can go into each slot with equal likelihood (so you hope).
You can't pick the low bits, because you need them to serve as the offset.
The scheme can suffer from many addresses "colliding" to the same slot, thus causing the cache line to be repeatedly evicted, even though there may be empty slots that aren't being used, or being used with less frequency.