New Features in Python 3.7 | 隔叶黄莺 Yanbin Blog

2020-07-24

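I came to Python late. Having tried the main new features of Python 3.8, I am now working backwards to try the features that Python 3.7 introduced. To love a language is to know its real history, so I am digging through Python's past one release at a time.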

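First, a look at how Python versions are used on websites: Usage statistics of Python Version 3 for websites. These figures only cover websites built with Python; Python 3 is presumably used on a large scale in other fields as well. Judging by websites alone, Python 2 still has a lot of legacy code, Python 3 deployments are mostly on 3.6, and upgrading Python still has a long way to go. I am cautiously migrating from 3.7 to 3.8 myself; AWS Lambda already supports 3.8, so new code can go straight to 3.8 without any legacy burden. [Two screenshots from the website usage statistics]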

Python 3.7.0 was released on 2018-06-27. The article Cool New Features in Python 3.7 covers the new features in detail; this post picks a few of them to try out.

`breakpoint()` enters the debugger

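This feature does not seem very useful: any IDE can do breakpoint debugging these days, a few temporary print statements will do in a pinch, and a breakpoint() call left behind in the code is just clutter. Still, it is a new feature, so here is a quick look. Adding a breakpoint() line to the code makes execution stop at that point and, by default, enter a PDB (Python Debugger) session.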

# bug.py
e = 1
f = 2
breakpoint()
r = e / f
print(r)

Run it with python3.7 bug.py and you will see

$ python3.7 bug.py
> /Users/yanbin/bug.py(4)<module>()
-> r = e / f
(Pdb) e
1
(Pdb) c
0.5

As with normal PDB usage, entering a variable name shows its value, and c continues execution.

breakpoint() is shorthand for the older import pdb; pdb.set_trace().

To skip every breakpoint() stop in the code, set PYTHONBREAKPOINT=0

$ PYTHONBREAKPOINT=0 python3.7 bug.py
0.5
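
For context, breakpoint() does not call PDB directly: it calls sys.breakpointhook(), and the default hook is what consults PYTHONBREAKPOINT. The sketch below swaps in a recording hook to show that the call and its arguments are simply forwarded. The names logging_hook and calls are made up for this illustration and are not part of the original example.

```python
import sys

calls = []

def logging_hook(*args, **kwargs):
    # Hypothetical hook: record the call instead of opening a debugger.
    calls.append((args, kwargs))

original_hook = sys.breakpointhook
sys.breakpointhook = logging_hook
try:
    breakpoint()                      # forwarded to logging_hook()
    breakpoint("checkpoint", step=2)  # arguments are passed through unchanged
finally:
    # Always restore the default hook so later breakpoint() calls behave normally.
    sys.breakpointhook = original_hook

print(calls)  # [((), {}), (('checkpoint',), {'step': 2})]
```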

Not something you are very likely to need, is it?

Data classes

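This is a broad trend. In the Java world, Play Framework used to generate getter/setter methods for public fields automatically, others rely on Lombok, and Java 14 finally introduced record classes; Scala has case class and Kotlin has data class (spoken like a typical Java developer). Python now has a similar facility: @dataclass frees us from writing self.field_name = field_name line by line in the __init__ constructor, and it also generates several other double-underscore methods automatically.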

from dataclasses import dataclass, field

@dataclass(order=True)
class Country:
    name: str
    population: int
    area: float = field(repr=False, compare=False)
    coastline: float = 0

    def other_method(self):
        pass

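The code above creates a Country data class. Each field needs a type annotation, optionally with a default value or a field() description. Python collects the fields and generates a constructor equivalent to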

class Country:

    def __init__(self, name, population, area, coastline=0):
        self.name = name
        self.population = population
        self.area = area
        self.coastline = coastline

......

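Attributes are accessed as c.name. The decorator also generates __repr__ and __eq__ and, because of order=True, the comparison methods __lt__, __le__, __gt__ and __ge__. __ne__ is not generated separately, since Python derives != from __eq__.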
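
A short sketch of the generated methods in action, reusing the Country class from above. The instances use made-up sample values, chosen only to show that area is left out of both the repr and the comparisons.

```python
from dataclasses import dataclass, field

@dataclass(order=True)
class Country:
    name: str
    population: int
    area: float = field(repr=False, compare=False)
    coastline: float = 0

# Made-up sample values, for illustration only.
a = Country("A", 10, 1.0)
a_other_area = Country("A", 10, 99.0)
b = Country("B", 5, 1.0)

print(a)                       # Country(name='A', population=10, coastline=0)
print(a == a_other_area)       # True: area is excluded via compare=False
print(a < b)                   # True: compared field by field, name first
print(sorted([b, a])[0].name)  # A
```

Ordering compares the fields as a tuple in declaration order, so name decides first and population only breaks ties.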

Previously, collections.namedtuple was the way to get similar behavior. The Ultimate Guide to Data Classes in Python 3.7 is a detailed introduction to dataclass.

One drawback is that a data class instance cannot be serialized by json directly; json.dumps(c) fails with TypeError: Object of type Country is not JSON serializable.
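
One common workaround, sketched here with made-up sample values, is to convert the instance to a plain dict with dataclasses.asdict() before handing it to json.dumps().

```python
import json
from dataclasses import asdict, dataclass, field

@dataclass(order=True)
class Country:
    name: str
    population: int
    area: float = field(repr=False, compare=False)
    coastline: float = 0

c = Country("A", 10, 1.0)  # made-up sample values

try:
    json.dumps(c)
except TypeError as exc:
    error = str(exc)
    print(error)

# asdict() recursively converts the data class into a dict of its fields,
# including fields declared with repr=False.
payload = json.dumps(asdict(c))
print(payload)  # {"name": "A", "population": 10, "area": 1.0, "coastline": 0}
```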

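Typing enhancements and postponed evaluation of annotations

Type hints were introduced in Python 3.5, and Python keeps evolving in this area. In Python 3.7 the following code still fails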

class Tree:
    def __init__(self, left: Tree, right: Tree) -> None:
        self.left = left
        self.right = right

When the constructor's signature is evaluated, the Tree class has not been fully defined yet, so the error is

NameError: name 'Tree' is not defined

The fix is to put the type hints in quotes so that they are treated as strings; this does not affect code completion in an IDE.
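
A minimal sketch of the quoted form: the hints are stored as plain strings, and typing.get_type_hints() can resolve them later. Passing localns explicitly is a choice made for this sketch so that the lookup does not depend on where the snippet is run.

```python
import typing

class Tree:
    def __init__(self, left: "Tree", right: "Tree") -> None:
        self.left = left
        self.right = right

# The quoted hints are kept as strings, so no NameError is raised.
raw = Tree.__init__.__annotations__
print(raw)  # {'left': 'Tree', 'right': 'Tree', 'return': None}

# Resolve the strings to real classes on demand.
hints = typing.get_type_hints(Tree.__init__, localns={"Tree": Tree})
print(hints["left"] is Tree)  # True
```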

In Python 3.7 it is not even necessary to write left: 'Tree'; adding from __future__ import annotations is enough

from __future__ import annotations

class Tree:
    def __init__(self, left: Tree, right: Tree) -> None:
        self.left = left
        self.right = right

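This version runs without error: from __future__ import annotations postpones the evaluation of annotations such as left: Tree.

Another example: an annotation does not have to be a type or a string; it can be an arbitrary expression, such as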

# anno.py
def greet(name: print("Now!")):
    print(f"Hello {name}")

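An annotation like name: print("Now!") is evaluated when the function definition is executed and the function is placed in the namespace, so merely importing the module prints the message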

>>> import anno
Now!
>>> anno.greet.__annotations__
{'name': None}

Because print("Now!") returns None, the annotation recorded for name is None as well.

Likewise, from __future__ import annotations also prevents print("Now!") from being evaluated. With anno.py changed to

from __future__ import annotations

def greet(name: print("Now!")):
    print(f"Hello {name}")

try it again in the Python REPL

>>> import anno
>>> anno.greet.__annotations__
{'name': "print('Now!')"}
>>> anno.greet("Marty")
Hello Marty

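print("Now!") is never evaluated at all.

Type hints can also be more complex, and an IDE can still work out the actual type, as PyCharm's completion popup shows.

The str_type() function returns the str type, so name. completes with the methods of str.

Improved time precision

Python 3.7 adds xxx_ns() variants of several functions in the time module. They return nanoseconds as an int rather than the float used before; a float is inherently imprecise, whereas Python's unbounded int is a better fit. The functions are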

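clock_gettime_ns(): returns the time of the specified clock
clock_settime_ns(): sets the time of the specified clock
monotonic_ns(): returns the time of a relative clock that cannot go backwards (for example, because of daylight saving time)
perf_counter_ns(): returns the value of a performance counter, a clock meant for measuring short intervals
process_time_ns(): returns the sum of the system and user CPU time of the current process (sleep time is not included)
time_ns(): returns the number of nanoseconds since January 1, 1970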
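
A minimal sketch of two of the functions listed above. elapsed_ns is a hypothetical helper written for this illustration, not something the time module provides.

```python
import time

def elapsed_ns(func, *args):
    """Hypothetical helper: run func(*args) and return (result, elapsed nanoseconds)."""
    start = time.perf_counter_ns()
    result = func(*args)
    return result, time.perf_counter_ns() - start

total, duration = elapsed_ns(sum, range(100_000))
print(total)           # 4999950000
print(type(duration))  # <class 'int'>

# Wall-clock time as an integer count of nanoseconds since the epoch.
now_ns = time.time_ns()
print(type(now_ns))    # <class 'int'>
```

Because both readings are ints, the subtraction is exact and no float rounding creeps into the measured interval.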

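Dictionary order is guaranteed

Starting with Python 3.7, a dictionary is guaranteed to iterate in the order its keys were inserted. In Python 3.6 the order was preserved in practice, but only as an implementation detail that should not be relied on.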

>>> {"one": 1, "two": 2, "three": 3}
{'one': 1, 'two': 2, 'three': 3}

Most of the time we should not depend on dictionary order anyway. The change can be thought of as moving from a HashMap to a LinkedHashMap.
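
A small sketch of what insertion order means in practice: updating an existing key keeps its position, while deleting a key and adding it again moves it to the end.

```python
d = {"one": 1, "two": 2, "three": 3}

d["four"] = 4   # new keys are appended at the end
d["one"] = 100  # updating a value does not move the key
del d["two"]
d["two"] = 2    # a re-inserted key counts as new and goes last

print(list(d))  # ['one', 'three', 'four', 'two']
```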

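`async` and `await` finally become keywords

Python 3.5 introduced the async and await syntax without making them keywords, perhaps as a transition, so before Python 3.7 async and await could still be used as variable or function names. From Python 3.7 on, they can no longer be used for anything else.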
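
A quick way to check this on Python 3.7 or later, sketched with the standard keyword module and compile(). The filename "<demo>" is just a label for this illustration.

```python
import keyword

print(keyword.iskeyword("async"), keyword.iskeyword("await"))  # True True

# Using async as a variable name is now rejected at compile time.
try:
    compile("async = 1", "<demo>", "exec")
    rejected = False
except SyntaxError:
    rejected = True

print(rejected)  # True
```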

`asyncio.run()` simplifies the event loop

Before Python 3.7, running a coroutine required using the event loop explicitly, for example

# Before Python 3.7
import asyncio

async def hello_world():
    print("Hello World!")

loop = asyncio.get_event_loop()
loop.run_until_complete(hello_world())
loop.close()

Python 3.7 adds asyncio.run(), and the code becomes

import asyncio

async def hello_world():
    print("Hello World!")

asyncio.run(hello_world())

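Here asyncio.run() does the work of the three loop lines in the earlier code. A slight inconvenience of asyncio.run() is that it always needs a single entry-point coroutine to run.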
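
In practice the entry point is often a main() coroutine that gathers the real work. A minimal sketch, where square and main are hypothetical names chosen for this illustration:

```python
import asyncio

async def square(n):
    # Yield to the event loop once, standing in for real asynchronous work.
    await asyncio.sleep(0)
    return n * n

async def main():
    # Run the three coroutines concurrently; results keep the argument order.
    return await asyncio.gather(square(1), square(2), square(3))

results = asyncio.run(main())
print(results)  # [1, 4, 9]
```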

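Context variables (ContextVar)

These are similar to thread-local storage, and are easier to understand next to the following Java concepts

ThreadLocal, with set(value) and get()
MDC.getCopyOfContextMap() and setContextMap in logging frameworks (such as SLF4J)

To make this clearer, it helps to demonstrate it in a multithreaded environment, using the ThreadPoolExecutor code below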

from contextvars import ContextVar
from concurrent.futures import ThreadPoolExecutor
from threading import current_thread
import time

name = ContextVar("name", default='world')

def task(num):
    time.sleep(2)
    if name.get() == 'world':
        name.set(f'world #{num}')
    print(f'{current_thread().name}: name = {name.get()}')

with ThreadPoolExecutor(3) as executor:
    for i in range(10):
        executor.submit(task, i)

Ten tasks reuse three threads; a task resets name to world #<num> only when name.get() still returns the default world. Running it prints

ThreadPoolExecutor-0_2: name = world #2
ThreadPoolExecutor-0_0: name = world #0
ThreadPoolExecutor-0_1: name = world #1
ThreadPoolExecutor-0_2: name = world #2
ThreadPoolExecutor-0_0: name = world #0
ThreadPoolExecutor-0_1: name = world #1
ThreadPoolExecutor-0_2: name = world #2
ThreadPoolExecutor-0_0: name = world #0
ThreadPoolExecutor-0_1: name = world #1
ThreadPoolExecutor-0_2: name = world #2

Whenever a thread is reused, the task still sees the value set earlier, which means the value of name is bound to the current thread.
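
The same per-thread behavior can be checked deterministically with plain threading.Thread objects. In this sketch, worker and seen are hypothetical names; each thread sets its own value and the main thread still sees the default afterwards.

```python
import threading
from contextvars import ContextVar

name = ContextVar("name", default="world")
seen = {}

def worker(label):
    # The value set here lives in this thread's own context.
    name.set(f"world #{label}")
    seen[label] = name.get()

threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(seen.items()))  # [(0, 'world #0'), (1, 'world #1'), (2, 'world #2')]
print(name.get())            # world
```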

The second snippet shows how to run code under a specified context

import contextvars
from threading import current_thread 
from concurrent.futures import ThreadPoolExecutor

name = contextvars.ContextVar("name", default='name1')
address = contextvars.ContextVar("address", default='address1')

def task(num):
    print(num, current_thread().name, name.get(), address.get())

name.set('name2')
address.set('address2')
ctx = contextvars.copy_context()

with ThreadPoolExecutor(1) as executor:
    executor.submit(task, 1)
    executor.submit(lambda : ctx.run(task, 2))
    executor.submit(task, 3)

The output is:

1 ThreadPoolExecutor-0_0 name1 address1
2 ThreadPoolExecutor-0_0 name2 address2
3 ThreadPoolExecutor-0_0 name1 address1

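A single-thread pool is used so that every task reuses the same thread. Each result is explained in turn:

The first task uses the thread's own default values of name and address, name1 and address1
The second task runs task inside the context obtained beforehand from the main thread with contextvars.copy_context(), so it prints the main thread's values name2 and address2
The third task again prints the defaults name1 and address1, which shows that the previous task did not overwrite the context variables of the current thread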
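
The isolation also works in the other direction: a value set inside ctx.run() stays in that copied context. The single-threaded sketch below shows this, along with Token-based reset; rename is a hypothetical helper introduced only for this illustration.

```python
import contextvars

name = contextvars.ContextVar("name", default="name1")

def rename(new_value):
    # Runs inside the copied context, so the change stays there.
    name.set(new_value)
    return name.get()

ctx = contextvars.copy_context()
inside = ctx.run(rename, "name2")

print(inside)      # name2
print(name.get())  # name1: the surrounding context is untouched
print(ctx[name])   # name2: the copied context remembers the change

# set() returns a Token that restores the previous state.
token = name.set("temporary")
name.reset(token)
print(name.get())  # name1
```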

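Summary

Those are basically the new features that are useful to me. One more developer tip: python3.7 -X importtime my_script.py shows the import time of every module that my_script.py pulls in, which helps to find very slow module loads and optimize them.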
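
The report is written to stderr, so it can also be captured from a script. A rough sketch, assuming the current interpreter is Python 3.7 or later; import json is an arbitrary stand-in for a real script.

```python
import subprocess
import sys

# Run a child interpreter with -X importtime and capture its stderr.
result = subprocess.run(
    [sys.executable, "-X", "importtime", "-c", "import json"],
    capture_output=True,
    text=True,
)

# Each report line starts with "import time:" followed by timing columns.
report = [line for line in result.stderr.splitlines()
          if line.startswith("import time:")]

for line in report[-5:]:
    print(line)
```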

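Overall there are no big surprises; it is a minor release after all, and anyone looking for surprises should read the What's New for Python 3.0. Of the features above, the more useful ones are data classes and context variables, but since data classes cannot be serialized to JSON directly, using them in a REST API still means converting to a dict first and then serializing to JSON.

Permalink: https://yanbin.blog/python-3-7-what-is-new/, from 隔叶黄莺 Yanbin Blog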
