2000: python2
2008: python3
2010: python2.7
2020: python2.7 end of life
python3 new things
ipaddress
venv
asyncio
unicode
future
from __future__ import absolute_import, division, generators, unicode_literals, print_function, nested_scopes, with_statement
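The from __future__ import statement is not an ordinary import statement: it is a directive to the compiler, so it must be put at the very top of the module (only the docstring, comments, and blank lines may come before it).
After importing from __future__, we can use new language features in an old version of python.
All the features listed above can be imported in both python2.7 and python3; in python3 they are already the default, so the line is harmless there.
First, we add these imports and make the code work in python2. Then we port it to python3.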
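To check the claim that these features are all standard by python3, the __future__ module itself can be inspected. This is only a small sketch; the names FEATURES and mandatory_in are made up for the example.

```python
import __future__

# The feature names used in the import line above.
FEATURES = ['absolute_import', 'division', 'generators', 'unicode_literals',
            'print_function', 'nested_scopes', 'with_statement']

# Map each feature to the (major, minor) release where it became the default.
mandatory_in = {}
for name in FEATURES:
    feature = getattr(__future__, name)
    mandatory_in[name] = feature.getMandatoryRelease()[:2]
    print('{0}: default since {1}'.format(name, mandatory_in[name]))
```

Every feature reports a release no later than 3.0, which is why the same import line is accepted by both interpreters.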
str, bytes, unicode
In python2, str and bytes are the same type, and both are different from unicode.
>>> type(bytes('a'))
<type 'str'>
>>> type(str('a'))
<type 'str'>
>>> type(u'a')
<type 'unicode'>
In python3, they are different. str is unicode by default, and bytes is binary data. (bytearray is a mutable version of bytes.)
Unicode is a character set (also called a "character map"): it assigns every character a number, called a code point.
UTF-8 (Unicode Transformation Format, 8-bit) is an encoding: it defines how those code points are written as bytes.
If UTF-8 is a train, then the unicode text is the payload.
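A small sketch of the difference, assuming python3 (where a plain string literal is unicode). The variable names are only for illustration.

```python
text = '成都'

# Unicode gives each character a code point, independent of any byte layout.
code_points = [hex(ord(ch)) for ch in text]

# An encoding turns the same code points into concrete bytes.
# Different encodings produce different bytes for the same text.
as_utf8 = text.encode('utf-8')
as_utf16 = text.encode('utf-16-be')

print(code_points)
print(len(as_utf8), len(as_utf16))
```

The code points stay the same whichever encoding is chosen; only the bytes on the wire change.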
An example in python3, converting between bytes and a unicode string:
>>> cd = '成都'
>>> byte_utf8 = cd.encode('utf-8')
>>> byte_utf8
b'\xe6\x88\x90\xe9\x83\xbd'
>>> byte_gb = cd.encode('gb2312')
>>> byte_gb
b'\xb3\xc9\xb6\xbc'
>>> byte_utf8.decode('utf-8')
'成都'
>>> byte_gb.decode('gb2312')
'成都'
In python2.7, importing unicode_literals makes every unprefixed string literal a unicode object.
>>> a = 'aa'
>>> type(a)
<type 'str'>
>>> from __future__ import unicode_literals
>>> a = 'aa'
>>> type(a)
<type 'unicode'>
>>> type(str(a))
<type 'str'>
Importing unicode_literals does NOT change the str type; it only makes newly written string literals unicode objects.
So don't use str() to convert an object to text. Instead, use unicode().
But there is no unicode() in python3. To stay compatible with python3, add the following code.
import sys
if sys.version_info >= (3, 0):
    def unicode(*args, **kwargs):
        return str(*args, **kwargs)
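There is another issue: in python3, calling str() on a bytes object does not decode it, so the b prefix and the quotes end up inside the result.
Converting data with bytes() is also different: in python3, bytes() needs an encoding when it is given a string.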
>>> b = b'bbb'
>>> str(b)
"b'bbb'"
In python3, some network libraries use bytes as their input/output data type, such as telnetlib and paramiko.
So you may need to convert between bytes and unicode.
The following functions are suggested; they work in both python2 and python3.
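def to_unicode(data):
    # bytes and bytearray hold encoded data, so decode them
    if isinstance(data, (bytes, bytearray)):
        return data.decode('utf-8')
    else:
        return unicode(data)

def to_bytes(data):
    # already binary: return it as bytes without encoding it again
    if isinstance(data, (bytes, bytearray)):
        return bytes(data)
    # text (or any other object): convert to unicode, then encode explicitly
    return unicode(data).encode('utf-8')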
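The usual pattern is to convert only at the boundary: encode just before writing to the byte stream and decode right after reading. Below is a minimal sketch of that idea. io.BytesIO is only a stand-in for a real socket or telnet session, and write_text, read_text, and the command string are hypothetical names for the example.

```python
import io


def write_text(stream, text):
    """Encode text to UTF-8 right before it goes onto the byte stream."""
    stream.write(text.encode('utf-8'))


def read_text(stream):
    """Decode bytes to text right after they come off the byte stream."""
    return stream.read().decode('utf-8')


# Stand-in for a network connection that only accepts and returns bytes.
fake_connection = io.BytesIO()
write_text(fake_connection, u'show version 成都\n')

fake_connection.seek(0)
reply = read_text(fake_connection)
print(reply)
```

Inside the program everything stays unicode; bytes exist only on the wire.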
safer relative import
Suppose the package is:
mypackage/
    __init__.py
    submodule1.py
    submodule2.py
and the code below is in submodule1.py:
# Python 2 only:
import submodule2
# Python 2 and 3:
from . import submodule2
# Python 2 and 3:
# To make Py2 code safer (more like Py3) by preventing
# implicit relative imports, you can also add this to the top:
from __future__ import absolute_import
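To try the explicit relative import without touching a real project, the sketch below builds the same package layout in a temporary directory and imports it. It assumes python3; the VALUE and RESULT names inside the generated files are invented for the demonstration.

```python
import importlib
import os
import shutil
import sys
import tempfile

# Build mypackage/ with the three files from the layout above.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, 'mypackage')
os.mkdir(pkg_dir)

files = {
    '__init__.py': '',
    'submodule2.py': 'VALUE = 42\n',
    # The explicit relative import that works in python2 and python3.
    'submodule1.py': 'from . import submodule2\nRESULT = submodule2.VALUE\n',
}
for name, source in files.items():
    with open(os.path.join(pkg_dir, name), 'w') as handle:
        handle.write(source)

# Make the temporary directory importable and load submodule1.
sys.path.insert(0, root)
importlib.invalidate_caches()
submodule1 = importlib.import_module('mypackage.submodule1')
print(submodule1.RESULT)

# Clean up the temporary files; the loaded module stays in memory.
sys.path.remove(root)
shutil.rmtree(root)
```

Replacing the generated line with a bare import submodule2 fails under python3 with an ImportError, because the package directory itself is not on sys.path.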
print
In python3, there is no print keyword, only the print() function.
After from __future__ import print_function, the statement form print "xx" raises a SyntaxError in python2.
So always use print().
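Because print() is a normal function, it takes keyword arguments that the old statement could not express cleanly. A short sketch, assuming python3 (in python2.7 the same code needs from __future__ import print_function as the first statement); the buffer is only there so the output can be inspected.

```python
import io

# Collect the output in memory instead of writing to the terminal.
buffer = io.StringIO()

# sep joins the arguments, end replaces the trailing newline,
# and file redirects the output.
print('a', 'b', 'c', sep='-', end='!\n', file=buffer)
print('no newline', end='', file=buffer)

output = buffer.getvalue()
print(repr(output))
```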
division
python2
>>> 5 / 2
2
>>> 5 // 2
2
python3
>>> 5 / 2
2.5
>>> 5 // 2
2
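After from __future__ import division, python2 behaves the same as python3.
It's better to always write // explicitly when an integer (floor) result is wanted, and to keep / for results that may have a fractional part.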
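One detail worth checking when porting: // rounds toward negative infinity, which is not the same as truncating. A small sketch, assuming python3 (or python2.7 with the division import); the variable names are only for illustration.

```python
# True division always gives a float.
true_div = 7 / 2

# Floor division rounds down, also for negative numbers.
floor_pos = 7 // 2
floor_neg = -7 // 2

# Truncation toward zero is a different operation.
truncated = int(-7 / 2)

# divmod() returns the floor quotient and the remainder together.
pages, leftover = divmod(17, 5)

print(true_div, floor_pos, floor_neg, truncated, pages, leftover)
```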
exception
Work both in python 2 and 3:
def test():
    raise Exception('exception in test')
try:
    test()
except Exception as e:
    print(str(e))
iterable objects instead of lists
Some functions and methods that return lists in python2 return iterators or other lazy iterable objects in python3.
range()
dict.keys()
dict.values()
dict.items()
map()
filter()
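Two consequences show up often when porting; the sketch below assumes python3, and the sample data is made up. An iterator from map() can be consumed only once, and dict.keys() is a live view rather than a snapshot.

```python
# map() returns an iterator: the second pass finds it already exhausted.
squares = map(lambda n: n * n, [1, 2, 3])
first_pass = list(squares)
second_pass = list(squares)

# dict.keys() returns a view that follows later changes to the dict.
inventory = {'apple': 3, 'pear': 5}
keys = inventory.keys()
inventory['plum'] = 7
sorted_keys = sorted(keys)

# Wrap the call in list() when a real, independent list is needed.
key_snapshot = list(inventory.keys())

print(first_pass, second_pass, sorted_keys)
```

Wrapping the call in list() gives the python2 behavior back and works in both versions.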
range() and xrange()
The lazy xrange() of python2 has become range() in python3; the python2 range(), which built a full list, no longer exists.
import sys
if sys.version_info >= (3, 0):
    def xrange(*args, **kwargs):
        # range() is already lazy in python3; returning it directly keeps len() and indexing working
        return range(*args, **kwargs)
In new code, don't use xrange(); always use range().
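A quick sketch of why range() is enough in python3: it is lazy like the old xrange(), yet still supports len(), indexing, and membership tests without building a list. The sizes here are arbitrary.

```python
# No list of a billion integers is created here.
big = range(10 ** 9)

size = len(big)
last = big[-1]
has_million = 10 ** 6 in big

# Convert to a list only when a real list is required.
small = list(range(0, 10, 3))

print(size, last, has_million, small)
```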
next() function and .next() method
Always use the next() function; there is no .next() method in python3 (it was renamed to __next__()).
>>> my_generator = (letter for letter in 'abcdefg')
>>> next(my_generator)
'a'
>>> next(my_generator)
'b'
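For a hand-written iterator class, one common way to support both versions is to define __next__() and alias next to it. The Countdown class below is a hypothetical example, not taken from any library.

```python
class Countdown(object):
    """Iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

    # python2 looks for .next(), so point it at the same method.
    next = __next__


counter = Countdown(2)
first = next(counter)
second = next(counter)

# A default as the second argument avoids the StopIteration exception.
finished = next(counter, 'finished')

print(first, second, finished)
```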
for loop namespace
i = 1
print('before: i =', i)
print('comprehension: ', [i for i in range(5)])
print('after: i =', i)
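In python2, i=4 after the list comprehension, because the comprehension variable leaks into the enclosing scope; in python3, i is still 1.
So don't reuse the name of an existing variable as the loop variable of a comprehension.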
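Note that only comprehensions got their own scope; an ordinary for loop still rebinds the variable in python3. A small sketch, assuming python3, with a hypothetical function name:

```python
def loop_variable_scope():
    i = 1

    # The comprehension has its own scope in python3, so i is untouched.
    squares = [i * i for i in range(5)]
    after_comprehension = i

    # A plain for loop still assigns to the enclosing i.
    for i in range(5):
        pass
    after_for_loop = i

    return squares, after_comprehension, after_for_loop


result = loop_variable_scope()
print(result)
```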
other modules
try:
    import queue
    from urllib.parse import urlparse, urlencode
    from urllib.request import urlopen, Request
    from urllib.error import HTTPError
except ImportError:
    import Queue as queue
    from urlparse import urlparse
    from urllib import urlencode
    from urllib2 import urlopen, Request, HTTPError
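Once the names are imported this way, the rest of the code does not need to care which version it runs on. A short sketch using only the URL helpers; the example.com address and the query values are placeholders.

```python
try:
    from urllib.parse import urlparse, urlencode  # python3
except ImportError:
    from urlparse import urlparse                 # python2
    from urllib import urlencode

# Build a query string from ordered pairs, then take the URL apart again.
query = urlencode([('q', 'python 3'), ('page', 2)])
url = 'https://example.com/search?' + query
parts = urlparse(url)

print(url)
print(parts.netloc, parts.path, parts.query)
```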
reference
How To Port Python 2 Code to Python 3
https://www.digitalocean.com/community/tutorials/how-to-port-python-2-code-to-python-3
The key differences between Python 2.7.x and Python 3.x with examples
http://sebastianraschka.com/Articles/2014_python_2_3_key_diff.html
Writing code that runs under both Python2 and 3
https://wiki.python.org/moin/PortingToPy3k/BilingualQuickRef
Cheat Sheet: Writing Python 2-3 compatible code
http://python-future.org/compatible_idioms.html
What's REALLY New in Python 3
https://powerfulpython.com/blog/whats-really-new-in-python-3/
Unicode HOWTO
https://docs.python.org/3/howto/unicode.html