UTF-8 Encoder/Decoder
Free online UTF-8 encoder/decoder tool with Unicode code point conversion
What is UTF-8 Encoding?
UTF-8 is a variable-width character encoding that can represent every character in the Unicode standard. It uses one to four bytes per character and is the dominant encoding on the World Wide Web. ASCII characters (U+0000 to U+007F, decimal 0-127) take one byte each, with the same byte values as in ASCII, which makes UTF-8 backward compatible with ASCII.
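The one-to-four-byte behavior can be checked with a short Python sketch. The helper name `utf8_width` and the sample characters are chosen here for illustration only; they are not part of this tool.

```python
def utf8_width(ch: str) -> int:
    """Return how many bytes UTF-8 needs to encode a single character."""
    return len(ch.encode("utf-8"))


# One sample character from each width class (illustrative choices).
samples = ["A", "é", "你", "😀"]

for ch in samples:
    encoded = ch.encode("utf-8")
    # Show the code point, the byte count, and the raw bytes in hex.
    print(f"U+{ord(ch):04X} {ch!r}: {len(encoded)} byte(s) -> {encoded.hex(' ')}")
```

The ASCII letter encodes to the single byte 41, the same value it has in ASCII, while the other samples need two, three, and four bytes.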
How to Use This UTF-8 Encoder/Decoder
Enter text to see its Unicode code points in U+XXXX format. Enter code points to decode them back to text. This tool helps you inspect and debug character encoding issues, especially with multilingual content and special characters.
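The two conversions described above can be approximated in a few lines of Python. This is a minimal sketch, not the tool's actual implementation: the function names are hypothetical, and it assumes code points are separated by whitespace.

```python
def text_to_code_points(text: str) -> str:
    """Format each character as U+XXXX (at least four hex digits)."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def code_points_to_text(notation: str) -> str:
    """Decode whitespace-separated U+XXXX tokens back into text."""
    chars = []
    for token in notation.split():
        if not token.upper().startswith("U+"):
            raise ValueError(f"expected a token like U+0041, got {token!r}")
        # int(..., 16) parses the hex digits; chr() rejects values above U+10FFFF.
        chars.append(chr(int(token[2:], 16)))
    return "".join(chars)


print(text_to_code_points("A你"))
print(code_points_to_text("U+0041 U+4F60"))
```

Running both directions on the same input is a quick way to confirm that a string contains the characters you expect, rather than look-alikes.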
What is a Unicode code point?
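A code point is the unique number that the Unicode standard assigns to a character. For example, U+0041 is 'A' and U+4F60 is '你'. Code points range from U+0000 to U+10FFFF; within that range, U+D800 to U+DFFF is reserved for UTF-16 surrogates and does not represent characters on its own.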
How is UTF-8 different from UTF-16?
UTF-8 uses 1-4 bytes per character and is ASCII-compatible. UTF-16 uses 2 or 4 bytes per character and is not ASCII-compatible, because even ASCII characters take 2 bytes. UTF-8 is the most common encoding for web content, while UTF-16 is used internally by some systems, such as the Windows API and JavaScript strings.
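The size difference is easy to see by encoding the same text both ways. In this sketch the helper name `encoded_sizes` is hypothetical, and the `utf-16-le` codec is used so that no byte order mark is added to the count.

```python
def encoded_sizes(text: str) -> dict:
    """Return the encoded length in bytes for UTF-8 and UTF-16."""
    return {
        "utf-8": len(text.encode("utf-8")),
        # "utf-16-le" fixes the byte order and omits the 2-byte BOM.
        "utf-16": len(text.encode("utf-16-le")),
    }


for sample in ["A", "你", "😀", "Hello, 世界"]:
    print(repr(sample), encoded_sizes(sample))
```

ASCII-heavy text is smaller in UTF-8, while text made mostly of characters in the U+0800 to U+FFFF range, such as Chinese, is smaller in UTF-16.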
Why do some characters show as question marks or boxes?
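An empty box usually means the font does not contain a glyph for that Unicode code point. The character is correctly encoded but cannot be displayed by the available fonts on your system. A question mark or the replacement character '�' (U+FFFD) more often points to an encoding problem: the bytes were decoded with the wrong encoding, or the text passed through a conversion that could not represent the character.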
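Besides missing glyphs, a literal '?' or '�' (U+FFFD) can be introduced before the text ever reaches a font, by a lossy conversion or a decoding mismatch. The Python sketch below reproduces both cases; the helper name `has_replacement_char` and the sample strings are illustrative assumptions.

```python
def has_replacement_char(text: str) -> bool:
    """Report whether text contains U+FFFD, a sign that decoding failed earlier."""
    return "\ufffd" in text


# Lossy conversion: ASCII cannot represent these characters,
# so each one becomes a literal question mark.
lossy = "你好".encode("ascii", errors="replace")
print(lossy)

# Decoding mismatch: Latin-1 bytes read as UTF-8 are invalid,
# so the decoder substitutes U+FFFD.
latin1_bytes = "café".encode("latin-1")
misread = latin1_bytes.decode("utf-8", errors="replace")
print(misread, has_replacement_char(misread))

# The opposite mismatch (UTF-8 bytes read as Latin-1) raises no error
# and produces mojibake instead.
mojibake = "é".encode("utf-8").decode("latin-1")
print(mojibake)
```

If the code points you inspect include U+FFFD or U+003F where another character was expected, the original character was lost upstream, and changing the font will not bring it back.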