Python Debugging Methods and Performance Profiling

Shared by LinuxBoy on 2019-04-01 01:04:43

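Debugging with pdb

python -m pdb myscript.py  # note: this starts myscript.py over from the beginning, under the debugger

You can also set a breakpoint inside the program like this:
import pdb; pdb.set_trace()

You can change the value of a variable while debugging, but remember to prefix the statement with an exclamation mark. For example, to change the value of final, type !final="newvalue"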

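Supported commands:
    p           print the value of a variable or expression
    n           next: run to the next line, stepping over function calls
    s (step)    step: run to the next line, stepping into function calls
    c           continue until the next breakpoint
    l           list the source code around the current line
    a (args)    print the arguments of the current function
    condition bpnumber [condition]    attach a condition to breakpoint bpnumber; with no condition, the breakpoint becomes unconditional
    clear/disable/enable    clear/disable/enable breakpoints
    q           quit the debugger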
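
The commands above are normally typed at the (Pdb) prompt. As a minimal sketch, the same session can be scripted by handing pdb.Pdb a text stream of commands, which makes the effect of the ! prefix easy to see. The function name compute and the variable values are made up for illustration.

```python
import io
import pdb


def compute():
    final = "oldvalue"
    # Scripted debugger input: print the variable, overwrite it using the
    # "!" prefix, then continue. Typed interactively, these would be the
    # same three lines at the (Pdb) prompt.
    commands = io.StringIO('p final\n!final = "newvalue"\nc\n')
    output = io.StringIO()
    debugger = pdb.Pdb(stdin=commands, stdout=output,
                       nosigint=True, readrc=False)
    debugger.set_trace()  # same effect as pdb.set_trace(), on this instance
    return final, output.getvalue()


final, transcript = compute()
print(transcript)  # what pdb wrote, including the result of "p final"
print(final)       # the value after the debugger changed it
```

In everyday use a plain import pdb; pdb.set_trace() is enough; the explicit Pdb instance is only there so that the example runs without a keyboard.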

Profiling with the Python profiler

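One way is to call the profiler from the code:
    if __name__ == "__main__":
        import profile
        profile.run("foo()")
Another way is to run it from the command line:
    python -m profile prof1.py
The profile report is divided into the columns ncalls, tottime, percall, cumtime, percall and filename:lineno(function):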

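ncalls	number of times the function was called
tottime	total time spent in the function itself, excluding time spent in the functions it calls
percall	average time per call, equal to tottime/ncalls
cumtime	total time spent in the function, including time spent in the functions it calls
percall	average time per call, equal to cumtime/ncalls
filename:lineno(function)	the file that contains the function, the line number of the function, and the function name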
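
To see these columns on a real report, here is a small sketch. The functions helper and foo are hypothetical workloads, and profile.Profile().runcall() is used in place of profile.run("foo()") so that the example does not depend on foo living in the __main__ module.

```python
import io
import profile
import pstats


def helper(n):
    # A small CPU-bound loop so that the report has something to show.
    total = 0
    for i in range(n):
        total += i * i
    return total


def foo():
    return [helper(1000) for _ in range(20)]


profiler = profile.Profile()
profiler.runcall(foo)  # profile one call of foo()

# Render the report into a string instead of printing it directly.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).print_stats()
report = buffer.getvalue()
print(report)
```

In the printed report, the helper row should show 20 in the ncalls column, and the cumtime of foo includes the time spent in helper.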

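Custom reports with pstats

profile covers one of our needs. Another need is to view the output in different forms, and this is handled by the Stats class. It lives in a separate module, pstats, so that module has to be imported. The Stats constructor takes the name of a file that holds profile output. Stats can sort the profile results, control what is printed, and so on. For example, the earlier program can be changed as follows: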

    # … (omitted)
    if __name__ == "__main__":
        import profile
        profile.run("foo()", "prof.txt")
        import pstats
        p = pstats.Stats("prof.txt")
        p.sort_stats("time").print_stats()

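With pstats in place, the profile output is sorted by the time each function consumes. Stats has a number of methods, and combining them produces different kinds of profile reports. A brief overview of these methods: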

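strip_dirs()	removes the leading path information from file names
add(filename,[…])	adds more profile output files to the Stats instance
dump_stats(filename)	saves the statistics held by the Stats instance to a file
sort_stats(key,[…])	the most important method; sorts the profile output
reverse_order()	reverses the order of the data in the Stats instance
print_stats([restriction,…])	prints the Stats report to stdout
print_callers([restriction,…])	prints information about the functions that called the specified function
print_callees([restriction,…])	prints information about the functions that the specified function called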

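The most important methods here are sort_stats and print_stats. With these two we can browse almost all of the information in a suitable form, so they are described in more detail below.

sort_stats() takes one or more string arguments, such as "time" or "name", that say which column to sort by. This is very useful: sorting by time shows which functions spend the most time in their own code, and sorting by cumulative time shows which functions take the most time in total, including the functions they call. Optimization can then be aimed at the right targets. sort_stats accepts the following keys: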

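'ncalls'	call count
'cumulative'	cumulative time of the function, including the functions it calls
'file'	file name
'module'	file name
'pcalls'	primitive call count (calls that were not made through recursion)
'line'	line number
'name'	function name
'nfl'	name/file/line
'stdname'	standard name
'time'	internal time of the function (excluding time spent in the functions it calls)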
Another important method is print_stats, which prints the report in the order set by the most recent call to sort_stats.

cProfile

python -m cProfile -s time test.py
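
cProfile is the C implementation of the profiler and exposes the same interface as profile, so it can also be driven from code. A minimal sketch, with busy as a hypothetical workload:

```python
import cProfile
import io
import pstats


def busy(n):
    total = 0
    for i in range(n):
        total += i * i
    return total


profiler = cProfile.Profile()
profiler.enable()   # start collecting
busy(50000)
profiler.disable()  # stop collecting

buffer = io.StringIO()
# Sort by internal time, the same key as "-s time" on the command line.
pstats.Stats(profiler, stream=buffer).sort_stats("time").print_stats(5)
report = buffer.getvalue()
print(report)
```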

timeit

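Suppose that one day, on a whim, we want to know how long it takes to append an element to a list, or how long it takes to raise an exception. Using profile for that would be like using a sledgehammer to crack a nut. The timeit module is the better choice here.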
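timeit has a friendly programming interface and an equally friendly command-line interface. Start with the programming interface. The timeit module contains a class Timer, whose constructor looks like this:
    class Timer([stmt='pass' [, setup='pass' [, timer=<timer function>]]])
The stmt argument is the code snippet to be timed, given as a string. The setup argument prepares the environment in which stmt runs. The timer argument lets the caller supply a timing function with the precision they need.

timeit.Timer has three methods, briefly described here:
    timeit([number=1000000])
timeit() runs the setup statement from the Timer constructor once, then executes stmt number times and returns the total elapsed time.
    repeat([repeat=3 [, number=1000000]])
repeat() calls timeit() repeat times, passing number each time, and returns a list with the total elapsed time of each run. Since Python 3.7 the default value of repeat is 5.
    print_exc([file=None])
print_exc() is used in place of the standard traceback, because it also prints the source line of the timed snippet where the error occurred. For example: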

>>> t = timeit.Timer("t = foo()\nprint t")    <- the code snippet being timed
>>> t.timeit()
Traceback (most recent call last):
  File "<pyshell#12>", line 1, in -toplevel-
    t.timeit()
  File "E:/Python23/lib/timeit.py", line 158, in timeit
    return self.inner(it, self.timer)
  File "<timeit-src>", line 6, in inner
    foo()    <- this is what the standard output looks like
NameError: global name 'foo' is not defined
>>> try:
        t.timeit()
    except:
        t.print_exc()

Traceback (most recent call last):
  File "<pyshell#17>", line 2, in ?
  File "E:/Python23/lib/timeit.py", line 158, in timeit
    return self.inner(it, self.timer)
  File "<timeit-src>", line 6, in inner
    t = foo()    <- this is the output of print_exc(), which makes the error easier to locate
NameError: global name 'foo' is not defined
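
The session above comes from Python 2.3. As a rough Python 3 equivalent, the sketch below times a list append with timeit() and repeat(), then triggers the same NameError and captures the print_exc() output. The statement strings and the number values are arbitrary choices for illustration.

```python
import io
import timeit

# Time appending one element to a list; setup runs once per timing run.
timer = timeit.Timer(stmt="items.append(1)", setup="items = []")
total = timer.timeit(number=100000)           # one float: total seconds
runs = timer.repeat(repeat=3, number=100000)  # list of three totals
print("one run: %.4f s, best of 3: %.4f s" % (total, min(runs)))

# foo is deliberately undefined so that the timed snippet fails.
broken = timeit.Timer("t = foo()")
buffer = io.StringIO()
try:
    broken.timeit(number=1)
except NameError:
    broken.print_exc(file=buffer)  # traceback that shows the snippet source
trace_text = buffer.getvalue()
print(trace_text)
```

Taking min(runs) rather than the average is a common choice, since the slower runs mostly reflect interference from other processes.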

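Besides the programming interface, timeit can also be used from the command line, which is very convenient:
    python -m timeit [-n N] [-r N] [-s S] [-t] [-c] [-v] [-h] [statement ...]
The options are defined as follows:
    -n N/--number=N    number of times to execute statement
    -r N/--repeat=N    number of times to repeat the call to timeit(), default 3 (5 since Python 3.7)
    -s S/--setup=S     statement that sets up the environment for statement, default "pass"
    -t/--time          use time.time() as the timing function; the default on all platforms except Windows
    -c/--clock         use time.clock() as the timing function; the default on Windows
    -v/--verbose       print the timing values with more precision
    -h/--help          short usage help
The -t and -c options and their defaults apply to older Python versions; time.clock() was removed in Python 3.8.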
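
As a sketch of the command-line interface, the snippet below runs python -m timeit through subprocess with the -n, -r and -s options and prints what it reports. Launching it from Python is only a convenience for the example; the same arguments can be typed directly in a shell.

```python
import subprocess
import sys

# Equivalent shell command:
#   python -m timeit -n 1000 -r 3 -s "items = []" "items.append(1)"
result = subprocess.run(
    [sys.executable, "-m", "timeit",
     "-n", "1000",         # execute the statement 1000 times per run
     "-r", "3",            # repeat the run 3 times
     "-s", "items = []",   # setup statement
     "items.append(1)"],   # statement to time
    capture_output=True,
    text=True,
    check=True,
)
summary = result.stdout.strip()
print(summary)
```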
