Unicode is a standard for encoding text across multiple writing systems. MariaDB 5.5 supports a number of character sets for storing Unicode data:
Character Set | Description | Added |
---|---|---|
ucs2 | UCS-2, each character is represented by a 2-byte code with the most significant byte first. Fixed-length 16-bit encoding. | |
utf8 | UTF-8 encoding using one to three bytes per character. Basic Latin letters, numbers, and punctuation use one byte. European and Middle Eastern letters mostly fit into two bytes. Korean, Chinese, and Japanese ideographs use three bytes. Supplementary characters cannot be stored. | |
utf8mb3 | Currently an alias for utf8. | MariaDB 5.5 |
utf8mb4 | Same as utf8, but stores supplementary characters in four bytes. | MariaDB 5.5 |
utf16 | UTF-16, same as ucs2, but stores supplementary characters in 32 bits. Variable-length encoding using 16 or 32 bits per character. | MariaDB 5.5 |
utf32 | UTF-32, fixed-length 32-bit encoding. | MariaDB 5.5 |
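
The byte counts in the table can be checked outside the database. The following is a minimal sketch in Python that uses the standard library codecs `utf-8`, `utf-16-be`, and `utf-32-be` as stand-ins for the encodings behind utf8mb4, utf16, and utf32. The helper names `encoded_sizes` and `fits_in_utf8mb3` are hypothetical and exist only for this illustration. It never connects to MariaDB, so it shows the encodings themselves rather than server behavior.

```python
# Illustrative only: Python's codecs stand in for the encodings that the
# MariaDB character sets are based on. No database is involved.

# One sample code point per size class described in the table.
SAMPLES = {
    "Basic Latin letter": "\u0041",          # A
    "European letter": "\u00e9",             # e with acute accent
    "CJK ideograph": "\u4e2d",               # a Chinese ideograph
    "Supplementary character": "\U0001F600",  # an emoji outside the BMP
}


def encoded_sizes(ch):
    """Return the number of bytes one character needs in each encoding."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        # The "-be" variants put the most significant byte first and
        # do not prepend a byte order mark.
        "utf-16": len(ch.encode("utf-16-be")),
        "utf-32": len(ch.encode("utf-32-be")),
    }


def fits_in_utf8mb3(text):
    """Rough check (an assumption of this sketch): a three-byte UTF-8
    character set can only hold code points up to U+FFFF."""
    return all(ord(ch) <= 0xFFFF for ch in text)


for label, ch in SAMPLES.items():
    print(f"U+{ord(ch):04X} {label}: {encoded_sizes(ch)}, "
          f"fits in three-byte utf8: {fits_in_utf8mb3(ch)}")
```

The supplementary character is the only sample that needs four bytes in UTF-8 and two 16-bit units in UTF-16, which is why it requires utf8mb4, utf16, or utf32 rather than utf8 or ucs2.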
© 2019 MariaDB
Licensed under the Creative Commons Attribution 3.0 Unported License and the GNU Free Documentation License.
https://mariadb.com/kb/en/unicode/