【My Study Note】Character Encoding
Character encoding assigns binary values to characters so that we as humans can read them. We definitely wouldn’t want to see all the text in our emails and web pages rendered as long sequences of zeros and ones. This is where character encodings come in handy.
You can think of a character encoding as a dictionary: it’s a way for your computer to look up which human-readable character a given binary value should represent.
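To make the dictionary analogy concrete, here is a toy sketch in Python. The byte values happen to match ASCII (introduced below), but the dictionary itself is just an illustration, not how a real decoder works:

```python
# A toy "encoding dictionary": a few binary values and the
# characters they stand for (these values follow ASCII).
encoding_table = {
    0b01100001: "a",
    0b01100010: "b",
    0b01100011: "c",
}

# Decoding is then just a lookup: binary value in, character out.
print(encoding_table[0b01100001])  # prints: a
```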
ASCII
One of the oldest character encoding standards still in use is ASCII. It covers the English alphabet, digits, and punctuation marks.
In the ASCII-to-binary table, for example, a lowercase a maps to 0110 0001 in binary (decimal 97). There is a mapping like this for every character in the English alphabet, as well as for digits and some special symbols.
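You can verify these mappings yourself with Python’s built-in ord(), format(), and chr():

```python
# ord() gives a character's numeric code, and format()
# renders that code as 8 binary digits.
code = ord("a")
print(code)                 # 97
print(format(code, "08b"))  # 01100001

# chr() goes the other way: from a binary value back to a character.
print(chr(0b01100001))      # a
```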
The great thing about ASCII was that it only needed 128 of the 256 possible values a byte can hold. It lasted for a very long time, but eventually 256 possible values weren’t enough.
UTF-8
Then came UTF-8, the most prevalent encoding standard used today. UTF-8 is built on the Unicode Standard.
Along with being backward compatible with the ASCII table, UTF-8 also lets us use a variable number of bytes per character. What do I mean by that? Think of any emoji. It’s not possible to represent an emoji with a single byte, since one byte can only hold 256 different values. Instead, UTF-8 allows us to store a character in more than one byte (up to four), which means endless emoji fun.
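A small sketch using Python’s str.encode() shows the variable width in action. The sample characters below are just my own picks:

```python
# str.encode() returns the UTF-8 bytes for a string.
for text in ["a", "é", "€", "🐍"]:
    data = text.encode("utf-8")
    print(text, len(data), data.hex(" "))

# "a" fits in 1 byte (the same value as in ASCII), "é" takes 2 bytes,
# "€" takes 3, and an emoji like the snake needs 4.
```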