Python - encoding and decoding in python2 and python3

When Python 2 is compiled and installed, the configure option --enable-unicode=ucs2 or --enable-unicode=ucs4 specifies whether a unicode character is stored in 2 bytes or 4 bytes. Python 3 (3.3 and later) has no such option: sys.maxunicode is always 1114111, the full Unicode range, and this cannot be changed.

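Does python2 default to the ucs2 standard? That depends on how the interpreter was built, so it is safest to check the running interpreter.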

To check the unicode range of the current Python environment:

import sys
print(sys.maxunicode)
# 65535 means the ucs2 standard: 2 bytes represent a unicode character
# 1114111 means the ucs4 standard: 4 bytes represent a unicode character

Encoding and decoding:

A codec is essentially a mapping. For example, ASCII encodes 'a' as 97, which is stored in the computer as 01100001. When the character is displayed, the computer looks 97 up in the ASCII table, finds 'a', and shows 'a'.
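The mapping can be checked with Python's built-in ord and chr. The following is a minimal sketch for Python 3; the variable names are chosen only for illustration.

```python
# Map a character to its code and back (illustrative sketch).
ch = 'a'
code = ord(ch)               # character -> integer code
bits = format(code, '08b')   # the same number written as 8 binary digits
back = chr(code)             # integer code -> character
print(code, bits, back)
```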

 

ASCII and UTF-8:

ASCII: one byte (8 bits) represents a character, and the first bit is always 0, so the character set is too small for most languages;

unicode: a coding system designed to represent any language. To avoid wasting memory, for example on the part that corresponds to ASCII, unicode data is stored with a variable-length encoding. This makes decoding harder, because it is not obvious how many bytes make up one character;

utf-8: a prefix encoding designed for the variable-length encoding of unicode. The prefix of the first byte determines how many bytes represent a character. If the first bit of the byte is 0, a single byte represents the character; if the first bit is 1, the number of consecutive leading 1s is the number of bytes that represent the character. Each of the following bytes in the same character begins with 10.
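The prefix rule can be sketched as a small function that reads only the first byte. This is an illustrative sketch for Python 3, not part of any library; the name utf8_sequence_length is hypothetical.

```python
def utf8_sequence_length(first_byte):
    """Return how many bytes the UTF-8 character starting with first_byte uses.

    first_byte is an integer from 0 to 255. Hypothetical helper for illustration.
    """
    if first_byte & 0b10000000 == 0:
        return 1  # 0xxxxxxx: one byte (the ASCII range)
    if first_byte & 0b11100000 == 0b11000000:
        return 2  # 110xxxxx: two leading 1s, two bytes
    if first_byte & 0b11110000 == 0b11100000:
        return 3  # 1110xxxx: three leading 1s, three bytes
    if first_byte & 0b11111000 == 0b11110000:
        return 4  # 11110xxx: four leading 1s, four bytes
    # 10xxxxxx is a continuation byte and cannot start a character
    raise ValueError("not a valid UTF-8 leading byte")


encoded = '你'.encode('utf-8')  # three bytes: e4 bd a0
print(utf8_sequence_length(encoded[0]), len(encoded))
```

Because the length is known from the first byte alone, a decoder never has to guess where one character ends and the next begins.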

 

While a program is running, data in memory is represented as unicode. An encoding such as utf-8 is applied when the data is stored to disk or sent over a network.

In python2 there are two string types: str and unicode. str holds byte data; unicode holds unicode data.

# coding:utf-8
# python2
s1 = '你'
s2 = u'你'
print(type(s1)) # str
print(repr(s1)) # '\xe4\xbd\xa0'
print(type(s2)) # unicode
print(repr(s2)) # u'\u4f60'

Byte data is used to store and transmit data; unicode data is used to display plain text.

Both gbk and utf-8 are encoding rules that turn unicode data into byte data.
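As a minimal sketch of this point, the same unicode character can be encoded with either rule in Python 3. The bytes differ, but decoding each with its own rule gives back the same text. The variable names are illustrative.

```python
text = '你'

utf8_bytes = text.encode('utf-8')  # three bytes under utf-8
gbk_bytes = text.encode('gbk')     # two bytes under gbk

print(repr(utf8_bytes))
print(repr(gbk_bytes))

# Each byte string decodes back to the same unicode text
# only when the matching rule is used.
print(utf8_bytes.decode('utf-8') == gbk_bytes.decode('gbk'))
```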

 

python3 also has only two string types: str and bytes.

str holds unicode data and corresponds to unicode in python2; bytes holds byte data and corresponds to str in python2.

python3 never decodes bytes automatically. Text is always unicode and is represented by str; binary data is represented by bytes.
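A short sketch of what "no automatic decoding" means in practice under Python 3: mixing str and bytes raises TypeError until the bytes are decoded explicitly. The variable names are illustrative.

```python
text = 'abc'   # str: unicode text
data = b'abc'  # bytes: raw byte data

try:
    combined = text + data  # python3 refuses to guess an encoding
except TypeError as exc:
    combined = None
    print('TypeError:', exc)

# Decoding explicitly makes the intent clear and works.
combined_ok = text + data.decode('ascii')
print(combined_ok)
```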

 

Problem: the following code, saved as utf-8 with no coding declaration, raises an error in python2 but prints normally in python3


print('你好')

Reason: the python interpreter behaves like a text editor in that it has to decode the source file before it can read it. python2 defaults to ASCII and python3 defaults to UTF-8, which can be checked as follows:

import sys
# python2 prints ascii, python3 prints utf-8
print(sys.getdefaultencoding())

So when the python2 interpreter runs a utf-8 file, decoding fails and an error is raised. The python3 interpreter's default encoding is utf-8, so no error occurs.

Adding # coding: utf-8 on the first line tells the interpreter not to decode the file with the default encoding, but with utf-8 instead.

With that declaration, utf-8 encoded code runs normally in python2:

# coding:utf-8
# python2
import sys
print(sys.getdefaultencoding()) # ascii
print('你好') # 你好

However, when the program is run from the Windows command line with python xxx.py, the output can be garbled, because the cmd window uses gbk encoding by default. Recall that a string in python2 is of type str, which corresponds to bytes in python3, while a string in python3 is unicode.

Data conversion relationship:

bytes (python2 str) -> decode -> str (python2 unicode) -> encode -> bytes (python2 str)
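The chain can be walked through in Python 3 as a minimal sketch. The byte values are the utf-8 bytes of '你' shown earlier; the variable names are illustrative.

```python
raw = b'\xe4\xbd\xa0'         # bytes, as read from disk or the network
text = raw.decode('utf-8')    # decode: bytes -> str (unicode text)
again = text.encode('utf-8')  # encode: str -> bytes

print(type(raw).__name__, type(text).__name__, type(again).__name__)
print(again == raw)
```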

So in the cmd window, a python2 string corresponds to bytes in python3. As noted earlier, byte data is used to store and transmit data, and unicode data is used to display plain text, so the bytes have to be decoded to unicode before they display correctly. If no decoding is done, the utf-8 bytes reach the cmd window unchanged, the window interprets them as gbk by default, and garbled text is shown. Decoding explicitly with utf-8 in the code makes the text display correctly:

# coding:utf-8
# python2
import sys
print(sys.getdefaultencoding()) # ascii
a = '你好'
print(type(a)) # str (equivalent to bytes in python3)
print(a.decode('utf-8')) # 你好
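The garbled output can be reproduced without a cmd window. The sketch below, written for Python 3, decodes utf-8 bytes with gbk to imitate what the window does; errors='replace' is used only so that the example cannot raise on a byte pair gbk does not define.

```python
text = '你好'
utf8_bytes = text.encode('utf-8')

# Imitate the cmd window: interpret utf-8 bytes as if they were gbk.
garbled = utf8_bytes.decode('gbk', errors='replace')

# Decode with the rule that was actually used to encode.
correct = utf8_bytes.decode('utf-8')

print(garbled)
print(correct)
```

The bytes are the same in both cases; only the rule used to decode them changes what is displayed.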

In python3 a string is already of type str (unicode), so it can be displayed directly without decoding.
