How to Decode UTF-8 in Python?

UTF-8 is a widely used character encoding that can represent every character in the Unicode standard. It is commonly used for files and for data exchanged over networks. Python stores text as Unicode strings (str), while data read from files or sockets arrives as bytes, so UTF-8 encoded bytes must be decoded before they can be used as text. Python provides several ways to do this, such as the built-in decode() method of bytes objects.

This post covers the following ways to decode UTF-8 in Python:

  • Method 1: Using the decode() Method
  • Method 2: Using the open() Function
  • Method 3: Using the codecs.open() Function

Method 1: Using the decode() Method

The “decode()” method converts a bytes object into a string (str) by interpreting the bytes with the specified encoding. It works opposite to the “encode()” method. It accepts the name of the encoding, such as UTF-8 or ASCII, that was used to produce the bytes and returns the original string.

The following example will help you understand this method:

Code:

string_value = b"Python Guide!"
decoded_string = string_value.decode("utf-8")
print(decoded_string)
  • The byte string is initialized by using the prefix “b” at the start of the string literal.
  • The “decode()” method decodes the bytes object into a string, with the “utf-8” encoding passed as an argument.

Output:

The byte string has been decoded to a string, so the program prints “Python Guide!”.
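
The example above contains only ASCII characters, so the effect of UTF-8 is not visible in it. The following is a minimal sketch, using illustrative variable names that are not part of the original example, of a round trip through “encode()” and “decode()” with a non-ASCII character, and of one way to handle bytes that are not valid UTF-8 through the “errors” argument:

```python
# Illustrative names; not taken from the example above.
text = "café"

# encode() turns the str into UTF-8 bytes; "é" needs two bytes.
encoded = text.encode("utf-8")
print(len(text), len(encoded))  # 4 characters, 5 bytes

# decode() reverses the step and restores the original str.
decoded = encoded.decode("utf-8")
print(decoded == text)

# The byte 0xff never occurs in valid UTF-8, so strict decoding fails.
bad_bytes = b"Python \xff Guide!"
try:
    bad_bytes.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# errors="replace" substitutes U+FFFD for each undecodable byte instead.
replaced = bad_bytes.decode("utf-8", errors="replace")
print(strict_failed)
```

Whether to fail loudly or to substitute a replacement character depends on the application; strict decoding is the default and is usually the safer choice when the data is expected to be valid.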

Method 2: Using the open() Function

The “open()” function opens a file in modes such as ‘r’ (read), ‘w’ (write), or ‘b’ (binary). In text mode, it decodes the file’s bytes using the encoding given by the “encoding” parameter. The following example passes encoding="utf-8" to the “open()” function to decode the input file:

Code:

with open(r"C:\Users\p\Documents\program\example.txt", "r", encoding="utf-8") as f:
    contents = f.read()
    print(contents)
  • The “open()” function is called inside a “with” statement with the complete file path, the mode “r”, and encoding="utf-8", so the file’s bytes are decoded into a string as they are read. The file is closed automatically when the block ends.
  • The “f.read()” method reads the file’s entire contents.

Output:

The file has been decoded using the “UTF-8” encoding.
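
The example above depends on a file that exists only on the author’s machine. The following is a self-contained sketch of the same idea, assuming a temporary file and a made-up sample text (both hypothetical stand-ins for “example.txt”), that also shows the difference between reading the raw bytes and reading the decoded text:

```python
import os
import tempfile

# Hypothetical sample text; a temporary file stands in for example.txt.
sample = "naïve café\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.txt")

    # Write the text as UTF-8 so there is something to decode.
    # newline="" keeps "\n" unchanged on every platform.
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(sample)

    # Binary mode returns the raw, still-encoded bytes.
    with open(path, "rb") as f:
        raw = f.read()

    # Text mode with encoding="utf-8" decodes the bytes while reading.
    with open(path, "r", encoding="utf-8") as f:
        contents = f.read()

# The decoded text is shorter than the byte sequence because
# "ï" and "é" each occupy two bytes in UTF-8.
print(len(raw), len(contents))
```

Passing the encoding explicitly avoids relying on the platform’s default encoding, which may not be UTF-8.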

Method 3: Using the codecs.open() Function

The “codecs” module in Python provides functions for encoding and decoding data. The code below decodes a UTF-8 encoded file using the “codecs.open()” function of this module:

Code:

import codecs
with codecs.open(r"C:\Users\p\Documents\program\example.txt", "r", "utf-8") as f:
    contents = f.read()
    print(contents)
  • The module named “codecs” is imported.
  • The “codecs.open()” function opens a file named “example.txt” in read mode using the utf-8 encoding.
  • The “read()” method reads the file’s contents and returns them as a decoded string.

Output:

The file has been decoded using the “UTF-8” encoding.
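
As with the previous method, the example above needs a file on disk. The following is a hedged, self-contained sketch that creates a temporary file (a hypothetical stand-in for “example.txt”) and reads it back with “codecs.open()”. It assumes that some newer Python releases may report “codecs.open()” as deprecated in favor of the built-in “open()”, so the warning is silenced for the demonstration:

```python
import codecs
import os
import tempfile
import warnings

# Hypothetical sample text used only for this demonstration.
sample = "Python Guide: café"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.txt")

    # Create the file directly from UTF-8 bytes.
    with open(path, "wb") as f:
        f.write(sample.encode("utf-8"))

    with warnings.catch_warnings():
        # Assumption: newer Python versions may emit a DeprecationWarning
        # for codecs.open(); ignore it here so the demo stays quiet.
        warnings.simplefilter("ignore", DeprecationWarning)

        # codecs.open() decodes the file's bytes as UTF-8 while reading.
        with codecs.open(path, "r", "utf-8") as f:
            contents = f.read()

print(contents == sample)
```

For new code, the built-in “open()” function with the “encoding” parameter gives the same result and is generally the simpler choice.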

Conclusion

To decode UTF-8 in Python, the “decode()” method, the “open()” function, and the “codecs.open()” function of the codecs module are used. The “decode()” method takes the encoding of the byte string, such as UTF-8 or ASCII, and returns the original string. Similarly, the “open()” and “codecs.open()” functions decode UTF-8 encoded files when the encoding is specified. This post presented multiple ways to decode UTF-8 in Python with examples.