Computer memory stores all data in numeric (binary) form; there is no way to store characters directly. Each character is therefore assigned a numeric code. The best-known such code is ASCII (American Standard Code for Information Interchange), which was adopted as a standard in the 1960s. Basic ASCII represents each character with 7 bits, giving 128 possible characters numbered from 0 to 127. Because computers usually store each character in an 8-bit byte, later extensions use the eighth bit to encode 256 possible characters.
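As a rough illustration of the mapping between characters and codes, the following Python sketch relies on the built-in ord() and chr() functions. The helper name ascii_code is hypothetical and only serves this example.

```python
def ascii_code(ch: str) -> int:
    """Return the code of a single character, rejecting anything outside 7-bit ASCII."""
    code = ord(ch)  # numeric code of the character
    if code > 127:
        raise ValueError(f"{ch!r} is not a 7-bit ASCII character")
    return code


print(ascii_code("A"))                 # 65
print(chr(97))                         # a
print(format(ascii_code("A"), "07b"))  # 1000001 (the 7 bits of code 65)
```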
Codes 0 to 31 are not used for printable characters. They are called control characters because they trigger actions such as a carriage return (CR) or a line feed (LF).
Codes 65 to 90 stand for uppercase letters and codes 97 to 122 stand for lowercase letters.
(Changing the 6th bit, counting from the least significant bit, switches between uppercase and lowercase; this is equivalent to adding 32 to, or subtracting 32 from, the ASCII code in base 10.)
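The relationship between uppercase and lowercase codes can be sketched in Python as follows. This is a minimal example, assuming bits are counted from 1 starting at the least significant bit; the names CASE_BIT and toggle_case are hypothetical.

```python
CASE_BIT = 0b100000  # 6th bit from the right, worth 32 in base 10


def toggle_case(ch: str) -> str:
    """Flip the case of an ASCII letter; return any other character unchanged."""
    if ch.isascii() and ch.isalpha():
        return chr(ord(ch) ^ CASE_BIT)  # XOR flips only the 6th bit
    return ch


print(toggle_case("A"))     # a
print(toggle_case("z"))     # Z
print(ord("a") - ord("A"))  # 32
```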
|Control character||ASCII code (decimal)||Hexadecimal code|
|SOH (Start of heading)||1||01|
|STX (Start of text)||2||02|
|ETX (End of text)||3||03|
|EOT (End of transmission)||4||04|
|TAB (Horizontal tabulation)||9||09|
|LF (Line Feed)||10||0A|
|VT (Vertical tabulation)||11||0B|
|FF (Form feed)||12||0C|
|CR (Carriage return)||13||0D|
|SO (Shift out)||14||0E|
|SI (Shift in)||15||0F|
|DLE (Data link escape)||16||10|
|DC1 (Device control 1)||17||11|
|DC2 (Device control 2)||18||12|
|DC3 (Device control 3)||19||13|
|DC4 (Device control 4)||20||14|
|NAK (Negative acknowledgement)||21||15|
|SYN (Synchronous idle)||22||16|
|ETB (End of transmission block)||23||17|
|EM (End of medium)||25||19|
|FS (File separator)||28||1C|
|GS (Group separator)||29||1D|
|RS (Record separator)||30||1E|
|US (Unit separator)||31||1F|
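A few rows of this table can be checked with a short Python sketch. The dictionary CONTROL_CHARS and the helper describe are hypothetical names chosen for this example, and only three control characters that have a standard Python escape sequence are included.

```python
# Control characters that Python can write as escape sequences
CONTROL_CHARS = {"TAB": "\t", "LF": "\n", "CR": "\r"}


def describe(name: str) -> str:
    """Format the decimal and hexadecimal code of a named control character."""
    code = ord(CONTROL_CHARS[name])
    return f"{name}: decimal {code}, hexadecimal {code:02X}"


for name in CONTROL_CHARS:
    print(describe(name))
# TAB: decimal 9, hexadecimal 09
# LF: decimal 10, hexadecimal 0A
# CR: decimal 13, hexadecimal 0D
```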
ASCII was developed for use with the English language. It has no accented characters or other language-specific characters, so a different code is needed to encode them. ASCII was therefore extended to 8 bits (one byte) in order to encode more characters; this is known as extended ASCII. Such a code assigns the values 0 to 255 to uppercase and lowercase letters, digits, punctuation marks and other symbols (including accented characters, in the case of ISO Latin-1, also known as ISO 8859-1).
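The limitation of 7-bit ASCII can be seen in a small Python sketch. The sample word is an arbitrary choice, and the "latin-1" and "ascii" codec names are those of Python's standard library.

```python
text = "café"

# ISO Latin-1 fits every character of this word into one byte (0 to 255)
latin1_bytes = text.encode("latin-1")
print(list(latin1_bytes))  # [99, 97, 102, 233]

# Plain ASCII has no code for the accented letter
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)  # False
```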
Extended ASCII code is not standardized, and varies depending on which platform is used.
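To illustrate how the codes above 127 differ between platforms, the following sketch encodes the same accented letter with two 8-bit code pages available in Python's standard library. The choice of cp437 (the original IBM PC code page) and cp1252 (a Windows code page) is an assumption made for this example.

```python
letter = "é"

oem_code = letter.encode("cp437")[0]    # code page of the original IBM PC
ansi_code = letter.encode("cp1252")[0]  # Windows Western European code page

print(oem_code, ansi_code)  # 130 233

# The same byte value therefore decodes to different characters
same_byte = bytes([ansi_code])
decoded_oem = same_byte.decode("cp437")
decoded_ansi = same_byte.decode("cp1252")
print(decoded_oem == decoded_ansi)  # False
```

Codes 0 to 127 are the same in both code pages; only the upper half differs.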
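The two most commonly used extended ASCII character sets are the OEM character set built into the original IBM PC (code page 437) and the ANSI character set used by Windows (code page 1252).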
EBCDIC (short for Extended Binary-Coded Decimal Interchange Code), developed by IBM, is a separate 8-bit character code that is not based on ASCII. Though widespread on IBM computers, it has not been as successful as ASCII.
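The difference between the two codes can be sketched in Python. This example assumes the cp037 codec from Python's standard library, which is one of several EBCDIC variants; the helper name compare_codes is hypothetical.

```python
def compare_codes(ch: str) -> tuple[int, int]:
    """Return the (ASCII, EBCDIC cp037) byte values of a single character."""
    return ch.encode("ascii")[0], ch.encode("cp037")[0]


for ch in "Aa0":
    ascii_byte, ebcdic_byte = compare_codes(ch)
    print(f"{ch}: ASCII {ascii_byte:02X}, EBCDIC {ebcdic_byte:02X}")
# A: ASCII 41, EBCDIC C1
# a: ASCII 61, EBCDIC 81
# 0: ASCII 30, EBCDIC F0
```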
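Unicode is a character encoding standard first published in 1991. It was originally designed as a 16-bit code, but it has since been extended beyond 65,536 characters: each character is assigned a unique number (a code point), which is stored using an encoding such as UTF-8, UTF-16 or UTF-32, no matter what operating system or programming language is used.
It includes almost all current alphabets (among them Arabic, Armenian, Cyrillic, Greek, Hebrew, and Latin) and is compatible with ASCII: its first 128 code points are identical to the ASCII codes.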
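As an illustration, the sketch below prints the Unicode code point of a few characters together with the number of bytes each one occupies in two common encodings. The sample characters and the helper name code_point_info are arbitrary choices for this example.

```python
def code_point_info(ch: str) -> tuple[str, int, int]:
    """Return the code point label and the UTF-8 and UTF-16 sizes in bytes."""
    label = f"U+{ord(ch):04X}"
    return label, len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))


for ch in ["A", "é", "€", "😀"]:
    print(ch, *code_point_info(ch))
# A U+0041 1 2
# é U+00E9 2 2
# € U+20AC 3 2
# 😀 U+1F600 4 4   (a code point that does not fit in 16 bits)

# ASCII text is unchanged when encoded as UTF-8
print("ASCII".encode("utf-8") == "ASCII".encode("ascii"))  # True
```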