Python Revisited Day 09 (Debugging, Testing, and Profiling)

9.1 Debugging
- 9.1.1 Handling Syntax Errors
- 9.1.2 Handling Runtime Errors
- 9.1.3 Scientific Debugging
9.2 Unit Testing
9.3 Profiling

9.1 Debugging

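Making regular backups is a key part of programming. No matter how reliable our machine and operating system are, and no matter how small the chance of failure, failures can still happen. Backups are usually coarse-grained: the backup files are hours old, or even days old.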

9.1.1 Handling Syntax Errors



if True
    print("stupid!!!")
else:
    print("You will never see me...")

  File "C:/Py/modeltest.py", line 5
    if True
          ^
SyntaxError: invalid syntax

In the example above, the colon after "if True" is missing, which is why Python reports an error.



try:
    s = "Tomorrow is a new day, {0}"
    s2 = "gone with the wind..."
    print(s.format(s2)
except ValueError as err:
    print(err)

  File "C:/Py/modeltest.py", line 10
    except ValueError as err:
         ^
SyntaxError: invalid syntax

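In the example above, the line named in the error message is actually fine. The real mistake is the missing closing parenthesis in the print call. Python does not notice the problem on that line, because an open parenthesis allows a statement to continue onto the next line, so the error is reported on the following line instead.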
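To check where Python reports a syntax error without crashing the script, one option is to compile the source text and catch SyntaxError. The sketch below is only an illustration: the helper name report_syntax_error is hypothetical, and the snippet is the same one as above stored in a string. Newer interpreters (Python 3.10 and later) give a more precise message for an unclosed parenthesis, so the reported line and wording may differ from the traceback shown above.

```python
def report_syntax_error(source):
    """Return (lineno, message) if source has a syntax error, else None."""
    try:
        # compile() only parses the code; nothing is executed.
        compile(source, "<example>", "exec")
    except SyntaxError as err:
        return err.lineno, err.msg
    return None


# The same mistake as above: the print( call is never closed.
broken = (
    "try:\n"
    "    s = 'Tomorrow is a new day, {0}'\n"
    "    s2 = 'gone with the wind...'\n"
    "    print(s.format(s2)\n"
    "except ValueError as err:\n"
    "    print(err)\n"
)

# Adding the missing parenthesis makes the snippet compile.
fixed = broken.replace("print(s.format(s2)\n", "print(s.format(s2))\n")

if __name__ == "__main__":
    print(report_syntax_error(broken))  # a (line, message) pair; varies by version
    print(report_syntax_error(fixed))   # None
```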

9.1.2 Handling Runtime Errors

pass

9.1.3 Scientific Debugging

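If a program runs but does not behave as expected or required, it contains a bug: a logic error that must be eliminated. The best way to deal with this kind of error is to prevent it in the first place by using TDD (test-driven development). Even so, some bugs will always slip through, so debugging remains a skill that must be learned and mastered even when using TDD.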

To eliminate a bug, we must take the following steps:

1. Reproduce the bug
2. Locate the bug
3. Fix the bug
4. Test the fix
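The four steps above can be sketched with a tiny, made-up example. The functions average_buggy() and average_fixed() and the bug itself are hypothetical, chosen only to show the workflow: a minimal check reproduces the bug, reading the failing result points to the faulty division, and re-running the same check confirms the fix.

```python
def average_buggy(values):
    # Bug: divides by a hard-coded count instead of len(values).
    return sum(values) / 2


def average_fixed(values):
    # Fix: divide by the actual number of items.
    return sum(values) / len(values)


def reproduces_bug(func):
    """Step 1: a minimal input that exposes the wrong behavior."""
    return func([1, 2, 3]) != 2


if __name__ == "__main__":
    # Steps 1 and 2: reproduce the bug and narrow it down to the division.
    print(reproduces_bug(average_buggy))  # True: 6 / 2 is 3.0, not 2
    # Steps 3 and 4: apply the fix and re-run the same check.
    print(reproduces_bug(average_fixed))  # False: 6 / 3 is 2.0
```

Keeping the reproducing check around as a permanent test is one way to make sure the bug does not return.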

Further reading (blog post, in Chinese): Pycharm Debug调试心得 - 放下扳手&拿起键盘

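9.2 Unit Testing

Unit testing means testing individual functions, classes, and methods to make sure they behave as expected.

This is what we have been doing so far: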

if __name__ == "__main__":
    import doctest
    doctest.testmod()

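Another way to run doctests is to create a separate test program with the unittest module. The unittest module can build test cases from doctests without knowing anything about what the program or module contains, other than the fact that it contains doctests.

We create a program named docunit.py: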



def test(x):
    """
    >>> test(-1)
    'hahahaha'
    >>> test(1)
    'lalalala'
    >>> test('1')
    'wuwuwuwuwuwu'
    """
    s1 = "hahahahha"
    s2 = "lalalalala"
    s3 = "wuwuwuwuwuwu"
    try:
        if x <= 0:
            return s1
        else:
            return s2
    except TypeError:
        # Comparing a str with an int raises TypeError in Python 3
        return s3

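Note that if we run the tests, the first two examples will fail, because the expected output in the docstring does not match what the function actually returns.

Next, we create another program: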



import doctest
import unittest
import docunit

suite = unittest.TestSuite()
suite.addTest(doctest.DocTestSuite(docunit))
runner = unittest.TextTestRunner()
print(runner.run(suite))

Note that the third import is our own program, docunit.py. The output is:

<unittest.runner.TextTestResult run=1 errors=0 failures=1>
F
======================================================================
FAIL: test (docunit)
Doctest: docunit.test
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Ana\lib\doctest.py", line 2198, in runTest
    raise self.failureException(self.format_failure(new.getvalue()))
AssertionError: Failed doctest test for docunit.test
  File "C:\Py\docunit.py", line 7, in test
----------------------------------------------------------------------
File "C:\Py\docunit.py", line 9, in docunit.test
Failed example:
    test(-1)
Expected:
    'hahahaha'
Got:
    'hahahahha'
----------------------------------------------------------------------
File "C:\Py\docunit.py", line 11, in docunit.test
Failed example:
    test(1)
Expected:
    'lalalala'
Got:
    'lalalalala'
----------------------------------------------------------------------
Ran 1 test in 0.000s

FAILED (failures=1)

Process finished with exit code 0

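Note, however, that with this approach the file name of our program must be a valid module name, because the test program has to import it.

The unittest module defines four key concepts. A test fixture is the term for the code needed to set up a test (and to clean up after it); a typical example is creating an input file for a test and, at the end, deleting the input file and the resulting output file. A test suite is a collection of test cases. A test case is the basic unit of testing. A test runner is an object that executes one or more test suites.

Typically, a test suite is created by subclassing unittest.TestCase, where every method whose name begins with "test" is a test case. If we need to do any setup, we can do it in a method called setUp(); similarly, any cleanup can go in a method called tearDown(). Inside the tests there are many unittest.TestCase methods available to us, including assertTrue(), assertEqual(), assertAlmostEqual() (useful for testing floating-point numbers), assertRaises(), and more, along with their inverses such as assertFalse() and assertNotEqual(). The old aliases failIfEqual() and failUnlessEqual() mean the same as assertNotEqual() and assertEqual(), but they have long been deprecated and should be avoided in new code.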

Here is an example. I could not think of anything clever, so this is about the simplest one possible; it is only meant to show how unittest is used.



import unittest


class List(list):
    def plus(self, other):
        # Merge both lists and drop duplicates
        return list(set(self + other))


class TestList(unittest.TestCase):
    def setUp(self):
        self.list1 = List(range(3))
        self.list2 = list(range(2, 5))

    def test_list_add(self):
        addlist = self.list1 + self.list2
        self.assertEqual(
            addlist, [0, 1, 2, 2, 3, 4]
        )

    def test_list_plus(self):
        pluslist = self.list1.plus(self.list2)
        self.assertNotEqual(
            pluslist, [0, 1, 2, 2, 3, 4]
        )

        def process():
            # list2 is a plain list, so it has no plus() method
            self.list2.plus(self.list1)

        self.assertRaises(
            AttributeError, process  # the second argument to assertRaises must be a callable
        )

    def tearDown(self):
        """
        I am not sure whether this does anything useful.
        (del self only unbinds the local name; setUp() rebuilds
        the lists before each test anyway.)
        """
        del self


if __name__ == "__main__":
    suite = unittest.TestLoader().loadTestsFromTestCase(
        TestList
    )
    runner = unittest.TextTestRunner()
    print(runner.run(suite))
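As a complement to the example above, the file-based fixture described earlier (create an input file before a test, delete it afterward) might look roughly like the sketch below. The function count_lines() and the class TestCountLines are hypothetical names invented for this illustration; only the unittest, tempfile, and os calls are standard library.

```python
import os
import tempfile
import unittest


def count_lines(path):
    """Hypothetical function under test: count the lines in a text file."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for _ in f)


class TestCountLines(unittest.TestCase):

    def setUp(self):
        # Fixture setup: create a temporary input file before each test.
        fd, self.path = tempfile.mkstemp(suffix=".txt")
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write("a\nb\nc\n")

    def tearDown(self):
        # Fixture cleanup: delete the input file after each test.
        os.remove(self.path)

    def test_count(self):
        self.assertEqual(count_lines(self.path), 3)

    def test_missing_file(self):
        # assertRaises can also be used as a context manager.
        with self.assertRaises(FileNotFoundError):
            count_lines(self.path + ".missing")


def run_tests():
    suite = unittest.TestLoader().loadTestsFromTestCase(TestCountLines)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

Because setUp() and tearDown() run around every test method, each test gets a fresh file and nothing is left behind on disk.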

For more assertion methods, the following blog post (in Chinese) is quite detailed:

python的unittest单元测试框架断言整理汇总 - 黑面狐

9.3 Profiling

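A few sound Python programming habits go a long way toward better performance:

- When a read-only sequence is needed, prefer a tuple over a list.
- Use generators instead of creating large tuples or lists and then iterating over them.
- Prefer Python's built-in data structures (dicts, lists, tuples) over custom ones of your own.
- When building a large string out of small strings, do not concatenate the small strings one by one; accumulate them in a list and join the list into a single string at the end.
- Finally, if an object is reached many times through attribute access, or looked up many times in a data structure, it is better to create a local variable that refers to it and use that.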
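Two of the tips above (joining a list of strings, and using a generator) can be sketched as follows. The function names are made up for this illustration, and the timings will vary between machines and Python versions; CPython can sometimes optimize repeated += on strings, so the measured gap may be smaller than expected.

```python
import timeit


def build_by_concat(parts):
    # Repeated concatenation: may copy the growing string each time.
    result = ""
    for part in parts:
        result += part
    return result


def build_by_join(parts):
    # Accumulate the pieces in a list, then join once at the end.
    return "".join(parts)


def total_with_generator(n):
    # A generator expression yields one value at a time; no big list is built.
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    parts = [str(i) for i in range(10000)]
    for func in (build_by_concat, build_by_join):
        sec = timeit.timeit(lambda: func(parts), number=100)
        print("{0}: {1:.4f} sec".format(func.__name__, sec))
    print(total_with_generator(10))  # 285
```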

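In a Jupyter notebook, %%time reports the time taken by a single run of a cell, while %%timeit runs the cell many times and reports the average time. The number of runs is chosen automatically unless you specify it.

Using the timeit module: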

import timeit


def function_a(x, y):
    for i in range(10000):
        x + y


def function_b(x, y):
    for i in range(10000):
        x * y


def function_c(x, y):
    for i in range(10000):
        x / y


if __name__ == "__main__":
    repeats = 1000
    X = 123.123
    Y = 43.432
    for function in ("function_a", "function_b",
                     "function_c"):
        t = timeit.Timer("{0}(X, Y)".format(function),
                         "from __main__ import {0}, X, Y".format(function))
        sec = t.timeit(repeats) / repeats
        print("{function}() {sec:.6f} sec".format(**locals()))

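The first argument to timeit.Timer() is the statement to be timed, given as a string. The second argument is also a string of executable code: the setup statement, which runs once before the timing starts and is used here to import the function and its arguments.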

function_a() 0.000386 sec
function_b() 0.000384 sec
function_c() 0.000392 sec
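timeit.Timer() also accepts a callable in place of a statement string, which avoids the "from __main__ import ..." setup string. The sketch below is one possible variation, not the approach used above; the helper name time_callable and the default of 20 repeats are assumptions made for this illustration.

```python
import timeit


def function_a(x, y):
    for i in range(10000):
        x + y


def time_callable(func, *args, repeats=20):
    """Return the average seconds per call, timing a callable directly."""
    # The lambda wraps the call so that no setup string is needed.
    timer = timeit.Timer(lambda: func(*args))
    return timer.timeit(number=repeats) / repeats


if __name__ == "__main__":
    sec = time_callable(function_a, 123.123, 43.432)
    print("function_a() {0:.6f} sec".format(sec))
```

The lambda adds a small constant call overhead to every measurement, so this form suits comparisons between functions better than absolute timings of very short statements.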

The cProfile module gives a more convenient and more detailed picture of where the running time goes:

import cProfile
import time


def function_a(x, y):
    for i in range(10000):
        function_f(x, y)
    function_d()


def function_b(x, y):
    for i in range(10000):
        function_f(x, y)
    function_d()
    function_d()


def function_c(x, y):
    for i in range(10000):
        function_f(x, y)
    function_d()
    function_d()
    function_d()


def function_d():
    time.sleep(0.01)


def function_f(x, y):
    x * y


if __name__ == "__main__":
    repeats = 1000
    X = 123.123
    Y = 43.432
    for function in ("function_a", "function_b",
                     "function_c"):
        cProfile.run("for i in range(1000): {0}(X, Y)"
                     .format(function))

         10003003 function calls in 16.040 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.007    0.007   16.040   16.040 <string>:1(<module>)
     1000    3.878    0.004   16.033    0.016 modeltest.py:13(function_a)
     1000    0.006    0.000   10.241    0.010 modeltest.py:31(function_d)
 10000000    1.915    0.000    1.915    0.000 modeltest.py:34(function_f)
        1    0.000    0.000   16.040   16.040 {built-in method builtins.exec}
     1000   10.235    0.010   10.235    0.010 {built-in method time.sleep}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

         10005003 function calls in 28.183 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.008    0.008   28.183   28.183 <string>:1(<module>)
     1000    4.873    0.005   28.175    0.028 modeltest.py:18(function_b)
     2000    0.015    0.000   20.903    0.010 modeltest.py:31(function_d)
 10000000    2.399    0.000    2.399    0.000 modeltest.py:34(function_f)
        1    0.000    0.000   28.183   28.183 {built-in method builtins.exec}
     2000   20.887    0.010   20.887    0.010 {built-in method time.sleep}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

         10007003 function calls in 38.968 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.008    0.008   38.968   38.968 <string>:1(<module>)
     1000    5.004    0.005   38.959    0.039 modeltest.py:24(function_c)
     3000    0.024    0.000   31.498    0.010 modeltest.py:31(function_d)
 10000000    2.457    0.000    2.457    0.000 modeltest.py:34(function_f)
        1    0.000    0.000   38.968   38.968 {built-in method builtins.exec}
     3000   31.474    0.010   31.474    0.010 {built-in method time.sleep}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

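ncalls: the number of calls

tottime: the total time spent in the function itself, excluding time spent inside the functions it calls

percall: the average time per call, that is, tottime / ncalls

cumtime: the cumulative time spent in the function, including time spent inside the functions it calls

percall (second column): the average time per call including the functions it calls, that is, cumtime / ncalls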
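The reports above are ordered by "standard name". When looking for the slowest call chains, it is often more useful to sort by one of the columns just described, for example cumtime. The following is a minimal sketch using cProfile.Profile together with pstats; the helper name profile_report and the limit of 5 rows are assumptions made for this illustration, and the profiled functions are shortened versions of the ones above.

```python
import cProfile
import io
import pstats


def function_f(x, y):
    return x * y


def function_a(x, y):
    for i in range(1000):
        function_f(x, y)


def profile_report(func, *args, limit=5):
    """Profile one call and return the report text, sorted by cumtime."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    # Write the report into a string instead of printing it directly.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(limit)
    return stream.getvalue()


if __name__ == "__main__":
    print(profile_report(function_a, 123.123, 43.432))
```

Sorting by "tottime" instead would highlight the functions that are expensive in their own right, rather than those that are slow because of what they call.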
