In Python 2, if a .py file contains Chinese (or any other non-ASCII) characters, you must add a comment line declaring the file's encoding; otherwise Python 2 assumes ASCII by default. (Python 3 no longer has this problem, because its default source encoding is UTF-8.)
The encoding comment must be placed on the first or second line. Generally, the first two lines of a Python file should be written like this:
    #!/usr/bin/python
    # -*- coding: UTF-8 -*-
The first line specifies the Python interpreter, and the second line declares the encoding of the file. The encoding declaration can be written in any of the following forms:
1. The form with an equal sign:
    #!/usr/bin/python
    # coding=<encoding name>
2. The most common form, with a colon (most editors recognize it correctly):
    #!/usr/bin/python
    # -*- coding: <encoding name> -*-
3. The vim form:
    #!/usr/bin/python
    # vim: set fileencoding=<encoding name> :
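All three forms work because the interpreter only looks for a comment containing "coding" followed by ":" or "=" on the first or second line. The sketch below checks that with the regular expression given in PEP 263. The helper name declared_encoding and the sample strings are illustrative assumptions, and the check is simplified (it does not handle a byte order mark, for instance).

```python
import re

# Pattern from PEP 263: a comment containing "coding" followed by ":" or "=".
CODING_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def declared_encoding(source_text):
    """Return the encoding named on line 1 or 2, or None if there is none."""
    for line in source_text.splitlines()[:2]:
        match = CODING_RE.match(line)
        if match:
            return match.group(1)
    return None


# Hypothetical file headers, one per declaration style.
equal_sign = "#!/usr/bin/python\n# coding=gbk\n"
emacs_style = "#!/usr/bin/python\n# -*- coding: utf-8 -*-\n"
vim_style = "#!/usr/bin/python\n# vim: set fileencoding=latin-1 :\n"
too_late = "#!/usr/bin/python\n\n# -*- coding: utf-8 -*-\n"  # declaration on line 3

for header in (equal_sign, emacs_style, vim_style, too_late):
    print(declared_encoding(header))
```

The last sample returns None because its declaration sits on the third line, where it is ignored.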
The encoding declaration in the header serves the following purposes:
1. If the code contains Chinese comments, the declaration is required.
2. More capable editors (such as my Emacs) read the header declaration and treat it as the encoding of the source file.
3. The interpreter uses the header declaration to decode unicode literals such as u"Life is short" when it initializes them, so the header declaration must match the encoding the file is actually stored in.
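The requirement that the declaration match the storage format can be illustrated with a small sketch. It is written for Python 3, whose standard library exposes header detection as tokenize.detect_encoding. The helper read_source, the sample source text, and the choice of gbk are assumptions made for illustration only.

```python
import io
import tokenize


def read_source(data):
    """Decode raw source bytes using the encoding declared in their header."""
    # detect_encoding looks for a BOM or a coding comment in the first two lines.
    encoding, _ = tokenize.detect_encoding(io.BytesIO(data).readline)
    return encoding, data.decode(encoding)


# Hypothetical source file: declared as gbk and also stored as gbk.
source = '# -*- coding: gbk -*-\ns = u"人生苦短"\n'
stored = source.encode("gbk")

encoding, text = read_source(stored)
matches = (text == source)  # declaration and storage agree, so the text survives

# The same bytes decoded with a codec that does not match the storage format.
try:
    stored.decode("utf-8")
    mismatch_fails = False
except UnicodeDecodeError:
    mismatch_fails = True

print(encoding, matches, mismatch_fails)
```

When the declared codec and the stored bytes agree, the literal is recovered intact; decoding the same bytes with a different codec either fails, as here, or silently produces the wrong characters.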
Setting the default decoding format
    import sys  # import the sys module (this is not the first time sys is loaded)
    reload(sys)  # reload sys: initialization deletes sys.setdefaultencoding, so it must be reloaded
    sys.setdefaultencoding('utf-8')  # call setdefaultencoding
Pay particular attention to the second line, reload(sys), which cannot be omitted; without it the code fails to run. Why is the reload needed, and why can't the function be called directly? Because setdefaultencoding is deleted once the system has called it during startup. By the time you import sys, the function no longer exists, so the sys module must be reloaded to make setdefaultencoding available again. Only then can the code call it to change the interpreter's default character encoding.
In the Lib folder under the Python installation directory there is a file named site.py, in which you can find the call chain main() -> setencoding() -> sys.setdefaultencoding(encoding). Because site.py is loaded automatically every time the Python interpreter starts, main() runs on every startup, and setdefaultencoding is deleted as soon as it has been called.
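The effect of that startup step can be observed directly. The following is a minimal sketch; the helper name default_encoding_state is an assumption. On a Python 2 interpreter started normally it should report ascii and False. On Python 3 it reports utf-8 and False, because the function was removed from the language altogether.

```python
import sys


def default_encoding_state():
    """Return the default encoding and whether setdefaultencoding is exposed."""
    return sys.getdefaultencoding(), hasattr(sys, "setdefaultencoding")


encoding, has_setter = default_encoding_state()
print("default encoding: %s" % encoding)
print("setdefaultencoding available: %s" % has_setter)
```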
The default encoding (the value reported by sys.getdefaultencoding()) is used whenever a decoding method is not specified explicitly. For example, take the following code:
    #! /usr/bin/env python
    # -*- coding: utf-8 -*-
    s = '中文'  # note that s is of type str here, not unicode
    s.encode('gb18030')
This code re-encodes s as gb18030, that is, it performs a unicode -> str conversion. Because s is itself a str, Python first decodes s to unicode automatically and then encodes the result as gb18030. Since that decoding happens implicitly and no codec was specified, Python decodes with the default encoding. In many cases the default encoding is ASCII, and if s is not ASCII-encoded the decoding fails. In the example above, my default encoding is ascii while s is encoded the same way as the file, in utf8, so the code raises an error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 0: ordinal not in range(128)
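The hidden decode step can be made visible with a short sketch that runs on both Python 2 and Python 3. The helper implicit_encode is a hypothetical stand-in for what Python 2 does internally when encode is called on a str; it is not the interpreter's actual code.

```python
# -*- coding: utf-8 -*-
raw = u"中文".encode("utf-8")  # the bytes a utf-8 source file stores for '中文'


def implicit_encode(data, target, default="ascii"):
    """Mimic Python 2's str.encode: decode with the default codec, then encode."""
    return data.decode(default).encode(target)


try:
    implicit_encode(raw, "gb18030")
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

print(error_message)
```

The failure comes from the decode half of the operation, which is why the error is a UnicodeDecodeError even though the code only asked to encode.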
For this situation, there are two methods to correct the error:
The first method is to state the encoding of s explicitly:
    #! /usr/bin/env python
    # -*- coding: utf-8 -*-
    s = '中文'
    s.decode('utf-8').encode('gb18030')
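The explicit decode-then-encode pattern can be wrapped in a small helper so that the source codec is always named. This is a sketch that runs on both Python 2 and Python 3; the function name transcode is an assumption, not part of any library.

```python
# -*- coding: utf-8 -*-
def transcode(data, source_encoding, target_encoding):
    """Decode bytes with an explicitly named codec, then encode to the target."""
    return data.decode(source_encoding).encode(target_encoding)


utf8_bytes = u"中文".encode("utf-8")
gb_bytes = transcode(utf8_bytes, "utf-8", "gb18030")

# Decoding the result with gb18030 gives back the original text.
round_trip = gb_bytes.decode("gb18030")
print(len(utf8_bytes), len(gb_bytes), round_trip == u"中文")
```

Because nothing here depends on the interpreter's default encoding, the result is the same no matter how the interpreter was configured.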
The second method is to change the default encoding to match the file encoding:
    #! /usr/bin/env python
    # -*- coding: utf-8 -*-
    import sys
    reload(sys)  # Python 2.5 deletes sys.setdefaultencoding during initialization, so sys must be reloaded
    sys.setdefaultencoding('utf-8')
    s = '中文'
    s.encode('gb18030')