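Python Notes

Closures

Definition: if an inner function references a variable from an enclosing (but not global) scope, and the outer function returns that inner function, the inner function is called a closure.

A closure requires three conditions:

The function has an inner function
The inner function references a variable from an enclosing (but not global) scope
The function returns the inner function

Example 1: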

def addx(x):
    def adder(y):
        return x+y
    return adder

c = addx(8)
print(c.__name__, type(c))
print(c(10))
print(c(11))

Output:

adder <class 'function'>
18
19

Incorrect example 2:

def foo(a):
    def bar():
        a += 1  # raises UnboundLocalError: a is treated as local to bar
        return a
    return bar

d = foo(1)
print(d(), d(), d(), d())

Error message:

File "D:/PY_workspace/simple_code/closure.py", line 16, in bar
    a += 1
UnboundLocalError: local variable 'a' referenced before assignment

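Reason:

Python treats every name that appears on the left-hand side of an assignment inside a function as local to that function. Because bar() contains a += 1, a is treated as a local variable of bar. When print calls d() and execution reaches a += 1, Python has to read the current value of the local a before adding 1 to it, but that local has not been assigned yet, so the error is raised.

One fix is to make a a container (for example a list) and modify its contents instead of rebinding the name: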

def foo():
    a = [1]
    def bar():
        a[0] = a[0] + 1
        return a[0]
    return bar

d = foo()
print(d(), d(), d(), d())

Output:

2 3 4 5
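
In Python 3 the same counter can also be written with the nonlocal statement, which tells Python that the name belongs to the enclosing function rather than to the inner one. A minimal sketch; the names make_counter and bump are hypothetical and not part of the original notes:

```python
def make_counter(start):
    count = start

    def bump():
        # nonlocal makes "count" refer to the variable in make_counter,
        # so the augmented assignment no longer creates a new local name.
        nonlocal count
        count += 1
        return count

    return bump


counter = make_counter(1)
results = [counter(), counter(), counter(), counter()]
print(results)  # [2, 3, 4, 5]
```

The container trick above is still useful when the code has to run on Python 2, which has no nonlocal.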

Example 3:

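origin = [0, 0]  # origin of the coordinate system
legal_x = [0, 50]  # legal range of x coordinates
legal_y = [0, 50]  # legal range of y coordinates


def create(pos=origin):
    pos = list(pos)  # copy, so the shared origin list is not modified by moves

    def player(direction, step):
        # A real implementation should first validate direction and step (for example,
        # no diagonal moves and no negative step), and then validate the new x and y
        # coordinates. That is omitted here because the focus is on closures.
        new_x = pos[0] + direction[0] * step
        new_y = pos[1] + direction[1] * step
        pos[0] = new_x
        pos[1] = new_y
        # Note: do not write pos = [new_x, new_y] here; that would make pos local to player, as explained above
        return pos

    return player


player = create()  # create a piece named player, starting at the origin
print(player.__name__)
print(player([1, 0], 10))  # move 10 steps in the positive x direction
print(player([0, 1], 20))  # move 20 steps in the positive y direction
print(player([-1, 0], 10))  # move 10 steps in the negative x direction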

Output:

player
[10, 0]
[10, 20]
[0, 20]

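What closures are good for:

1 After a closure has run, it still holds on to its execution environment, so state persists between calls.

2 A closure can produce different results depending on the local variables of the enclosing scope. This works like configuration: change the outer variable and the closure behaves differently.

Reference: Closures in Python (Python中的闭包)

Checking whether several keys are all in a dict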

if all(k in request.data for k in ("foo", "bar")):
    print('all in request.data')
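
The snippet above depends on a request object. The same check can be tried on a plain dict; in this sketch, payload and required are hypothetical names, and the keys-view comparison is an equivalent alternative rather than something from the original notes:

```python
payload = {"foo": 1, "bar": 2, "baz": 3}
required = ("foo", "bar")

# Generator form, as in the note above.
all_present = all(k in payload for k in required)

# Equivalent set-based form: a dict keys view supports set comparisons.
also_present = payload.keys() >= set(required)

# Listing the missing keys is often more useful for an error message.
missing = [k for k in ("foo", "qux") if k not in payload]

print(all_present, also_present, missing)  # True True ['qux']
```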

Forcing subclasses to override a parent-class method

class BaseSerializer(Field):
    def to_internal_value(self, data):
        raise NotImplementedError('`to_internal_value()` must be implemented.')

    def to_representation(self, instance):
        raise NotImplementedError('`to_representation()` must be implemented.')

    def update(self, instance, validated_data):
        raise NotImplementedError('`update()` must be implemented.')

    def create(self, validated_data):
        raise NotImplementedError('`create()` must be implemented.')

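Write a default method that raises NotImplementedError first; if a subclass does not override it, calling the method raises the error.

Parallel assignment

Fluent Python (《流畅的Python》), p. 64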
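
With NotImplementedError the mistake only shows up when the missing method is actually called. The standard library's abc module is an alternative that fails earlier, at instantiation time. A minimal sketch; the class names BaseExporter, JsonExporter and BrokenExporter are hypothetical:

```python
from abc import ABC, abstractmethod


class BaseExporter(ABC):
    @abstractmethod
    def export(self, data):
        """Subclasses must override this method."""


class JsonExporter(BaseExporter):
    def export(self, data):
        return "json:" + str(data)


class BrokenExporter(BaseExporter):
    pass  # forgot to override export()


try:
    BrokenExporter()  # fails here, before any method is called
    instantiation_error = None
except TypeError as exc:
    instantiation_error = exc

print(JsonExporter().export(1))  # json:1
print(type(instantiation_error).__name__)  # TypeError
```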

city, year, pop, chg, area = ('Tokyo', 2003, 32450, 0.66, 8014)

Another elegant use is swapping the values of two variables without a temporary variable:

b, a = a, b

Tuple unpacking

Parallel assignment is the most recognizable form of tuple unpacking. Fluent Python (《流畅的Python》), p. 65

>>> lax_coordinates = (33.9425, -118.408056)
>>> latitude, longitude = lax_coordinates # tuple unpacking

Nested tuple unpacking:

metro_areas = [
('Tokyo','JP',36.933,(35.689722,139.691667)),
]

for name, cc, pop, (latitude, longitude) in metro_areas:
    print(name, cc, pop, latitude, longitude)

Named tuples:

import collections
Card = collections.namedtuple('Card', ['rank', 'suit'])
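
A short usage sketch for the Card type: fields can be read by name or by index, and an instance unpacks like any other tuple. The sample values are made up for illustration:

```python
import collections

Card = collections.namedtuple('Card', ['rank', 'suit'])

card = Card('7', 'diamonds')
rank, suit = card          # unpacks like a plain tuple
as_dict = card._asdict()   # mapping of field name to value

print(card.rank, card[1])  # 7 diamonds
print(rank, suit)          # 7 diamonds
```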

Slicing

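s[a:b:c] takes items from s between a and b at intervals of c. c can also be negative, in which case items are taken in reverse order. The general form is seq[start:stop:step].

>>> s = 'bicycle'
>>> s[::3]
'bye'
>>> s[::-1]
'elcycib'
>>> s[::-2]
'eccb'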

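Variable naming

`_` denotes an element to be ignored. When a project needs internationalization, however, `ign` (ignore) is recommended instead, because `_` is commonly used in Python as the name of the translation function.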

Working with times and time intervals

pip install python-dateutil

Getting a variable-length interval, such as next Monday:

from dateutil.relativedelta import relativedelta
from datetime import datetime
next_monday = datetime.now() + relativedelta(weekday=0)
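
dateutil is a third-party package, so as a dependency-free cross-check the same idea can be sketched with the standard library alone. One edge case to keep in mind: to my understanding, relativedelta(weekday=0) leaves the date unchanged when the starting day is already a Monday. The helper next_weekday and its include_today flag are hypothetical names that make that choice explicit:

```python
from datetime import date, timedelta


def next_weekday(day, weekday, include_today=True):
    """Return the next date falling on `weekday` (0=Monday ... 6=Sunday)."""
    days_ahead = (weekday - day.weekday()) % 7
    if days_ahead == 0 and not include_today:
        days_ahead = 7  # skip to the following week
    return day + timedelta(days=days_ahead)


wednesday = date(2018, 5, 9)   # a Wednesday
monday = date(2018, 5, 14)     # the Monday after it

print(next_weekday(wednesday, 0))                    # 2018-05-14
print(next_weekday(monday, 0))                       # 2018-05-14
print(next_weekday(monday, 0, include_today=False))  # 2018-05-21
```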

pip install mysql-python

Error:

Microsoft Visual C++ 14.0 is required

Installing the C++ Build Tools takes about 6 GB of disk space, so I gave up on that. A workable alternative:

easy_install http://www.voidspace.org.uk/python/pycrypto-2.6.1/pycrypto-2.6.1.win32-py2.7.exe

How do I install PyCrypto on Windows?

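pip index mirror

One-off switch: pip install django -i https://pypi.douban.com/simple
Permanent switch: create a pip folder under C:\Users\lenovo (lenovo is the name of the user currently logged in on your machine), create a file named pip.ini inside it, with the content: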

[global]
index-url = https://pypi.douban.com/simple

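Classic classes and new-style classes

In Python 2, classes are classic classes by default; only classes that explicitly inherit from object are new-style classes. Attribute lookup under multiple inheritance in classic classes goes down the left side of the inheritance tree first and then backtracks, i.e. depth-first search. In Python 3 all classes are new-style. Their lookup order is often described as breadth-first (search horizontally first, then move up); more precisely, it follows the C3 linearization. This order is called the Method Resolution Order (MRO).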
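
The lookup order can be inspected directly through the __mro__ attribute. A minimal sketch with a diamond-shaped hierarchy; the classes A, B, C and D are hypothetical:

```python
class A:
    def who(self):
        return 'A'


class B(A):
    pass


class C(A):
    def who(self):
        return 'C'


class D(B, C):
    pass


mro_names = [cls.__name__ for cls in D.__mro__]
print(mro_names)  # ['D', 'B', 'C', 'A', 'object']
print(D().who())  # C
```

A depth-first lookup (D, B, A, ...) would have found A.who first; the new-style order checks C before the shared base A.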

groupby

Doc: when using groupby to group a list, first sort the list with the same key that is used for grouping.

you are supposed to apply groupby to a list which is already sorted using the same key as groupby itself

Python version:

from itertools import groupby
queryset = self.filter_queryset(self.get_queryset()).order_by('category')
data = []
original_data = sorted(self.get_serializer(queryset, many=True).data, key=lambda x: x['category'])
for k, g in groupby(original_data, key=lambda x: x['category']):
    records = list(g)
    data.append({'category': k, 'records': records})

Django version (the queryset is already ordered by order_by('category')):

from itertools import groupby
queryset = self.filter_queryset(self.get_queryset()).order_by('category')
data = []
for k, g in groupby(self.get_serializer(queryset, many=True).data, key=lambda x: x['category']):
    records = list(g)
    data.append({'category': k, 'records': records})
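
Why the sorting matters can be seen without Django: groupby only merges adjacent items with the same key. A minimal sketch on plain dicts; the rows data is made up for illustration:

```python
from itertools import groupby
from operator import itemgetter

rows = [
    {'category': 'b', 'id': 1},
    {'category': 'a', 'id': 2},
    {'category': 'b', 'id': 3},
]
by_category = itemgetter('category')

# Without sorting, 'b' shows up as two separate groups.
unsorted_keys = [k for k, _ in groupby(rows, key=by_category)]

# Sorting by the same key first yields one group per category.
sorted_rows = sorted(rows, key=by_category)
grouped = [
    {'category': k, 'records': list(g)}
    for k, g in groupby(sorted_rows, key=by_category)
]

print(unsorted_keys)                      # ['b', 'a', 'b']
print([g['category'] for g in grouped])   # ['a', 'b']
```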

Reading a JSON file containing Chinese text into a dict

import json
import codecs
with codecs.open('data.json', 'r', 'utf-8') as f:
    data = json.load(f)

Reading Excel files

Install xlrd

pip install xlrd

Reading:

import xlrd


def excel_table_by_name(file_path, col_name_index=0, by_name=u'Sheet1'):
    """
    :param file_path: path to the Excel file
    :param col_name_index: index of the row that holds the column headers
    :param by_name: name of the worksheet
    :return: the data in the Excel sheet as a list of dicts, or None on failure
    """
    try:
        table = xlrd.open_workbook(file_path).sheet_by_name(by_name)
        rows = table.nrows  # number of rows
        col_names = table.row_values(col_name_index)  # values of the header row
        data = []
        # data rows start right after the header row
        for row_num in range(col_name_index + 1, rows):
            row_value = table.row_values(row_num)
            if row_value:
                app = {}
                for i in range(len(col_names)):
                    app[col_names[i]] = row_value[i]
                data.append(app)
        return data
    except Exception as e:
        return None


tables = excel_table_by_name('city_code.xlsx', 0, '大陆省市区乡镇列表')

if tables:
    for row in tables:
        print(row['城市代码'], row['省'], row['市'])

Checking whether one list is contained in another

set(list_one).issubset(list_two)
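
Converting to a set discards duplicates, so this check ignores how many times an item occurs. If counts matter, collections.Counter is one option. A small sketch; the sample lists are made up:

```python
from collections import Counter

list_one = [1, 2, 2]
list_two = [3, 2, 1]

# Set-based check: every distinct item of list_one appears in list_two.
is_subset = set(list_one).issubset(list_two)

# Count-aware check: Counter subtraction keeps only positive remainders,
# so an empty result means list_two has enough copies of every item.
respects_counts = not (Counter(list_one) - Counter(list_two))

print(is_subset, respects_counts)  # True False
```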

sort

A list's sort method does not return the sorted list; it only sorts the original object in place (and returns None).

context['fieldsets'].sort(key=itemgetter('title'))
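
A quick way to see the difference between sort and the built-in sorted; the fieldsets list here is a hypothetical stand-in for context['fieldsets']:

```python
from operator import itemgetter

fieldsets = [{'title': 'b'}, {'title': 'a'}]

# list.sort works in place and returns None.
result = fieldsets.sort(key=itemgetter('title'))

# sorted returns a new list and leaves the original untouched.
descending = sorted(fieldsets, key=itemgetter('title'), reverse=True)

print(result)                             # None
print([f['title'] for f in fieldsets])    # ['a', 'b']
print([f['title'] for f in descending])   # ['b', 'a']
```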

any, all

Regular expressions (re)

Splitting into groups

import re

data = '2018-05-09 10:17:23,966 [INFO]- "OPTIONS /group/?page_size=0&search=1 HTTP/1.1" 200 0'
m = re.match(r'^(.*)\[(.*)]- (.*)$', data)
print(m.groups())
print(m.group(0))
print(m.group(1), m.group(2), m.group(3))

Output:

('2018-05-09 10:17:23,966 ', 'INFO', '"OPTIONS /group/?page_size=0&search=1 HTTP/1.1" 200 0')
2018-05-09 10:17:23,966 [INFO]- "OPTIONS /group/?page_size=0&search=1 HTTP/1.1" 200 0
2018-05-09 10:17:23,966  INFO "OPTIONS /group/?page_size=0&search=1 HTTP/1.1" 200 0

Extracting numbers

import re
data = 'ab23cc3dd4'
arr = re.split(r'(\d+)', data)
print(arr)

Output:

['ab', '23', 'cc', '3', 'dd', '4', '']

Greedy matching

(.*) is a greedy match; (.*?) is a non-greedy match.
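
The difference is easiest to see on a string with more than one possible closing delimiter. A minimal sketch; the sample text is made up:

```python
import re

text = '<a><b>'

# Greedy: .* grabs as much as possible, up to the last '>'.
greedy = re.match(r'<(.*)>', text).group(1)

# Non-greedy: .*? stops at the first '>' that lets the match succeed.
lazy = re.match(r'<(.*?)>', text).group(1)

print(greedy)  # a><b
print(lazy)    # a
```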

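Nested for loops in Python

With nested for loops, to break out of the inner loops back to the outermost for loop, move every loop except the outermost one into a function body and use return to jump back to the outermost loop.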

    def xxx(x):
        for y in ('a', 'b', 'c'):
            for z in ('x', 'y', 'z'):
                if z == 'z':
                    return  # leaves both inner loops at once
                else:
                    print('{} {} {}'.format(x, y, z))

    for x in range(1, 3):
        xxx(x)

    1 a x
    1 a y
    2 a x
    2 a y

list

Among list methods, append adds a single item, while extend concatenates another list onto the current list.
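
A short illustration of the difference; the variable names are arbitrary:

```python
appended = [1, 2]
appended.append([3, 4])   # the whole list becomes one nested item

extended = [1, 2]
extended.extend([3, 4])   # each element is added individually

print(appended)  # [1, 2, [3, 4]]
print(extended)  # [1, 2, 3, 4]
```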

Using the sorted function

Works on a list or a queryset

    sorted_list = sorted(self.queryset, key=lambda x: x.time())
    permission_list = sorted(permission_list, key=lambda x: int(x['content_type'])) 
    # Converted to int for comparison because as str the order would be 1 11 12... 2 21... 3

Changing the displayed precision of a Decimal without changing its type (the format method turns a Decimal into a str)

    '{0:.2f}'.format(amount) # the type becomes str, but the display format changes as intended

    >>> round(14.22222223, 2)
    14.22
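
The round call above is shown on a float. For an actual Decimal, the quantize method rounds to a fixed number of places and keeps the Decimal type. A minimal sketch; the amount value and the choice of ROUND_HALF_UP are assumptions for illustration:

```python
from decimal import Decimal, ROUND_HALF_UP

amount = Decimal('14.22222223')

# quantize rounds to the exponent of its argument and returns a Decimal.
rounded = amount.quantize(Decimal('0.01'), rounding=ROUND_HALF_UP)

# In Python 3, round() with an explicit digit count also returns a Decimal.
also_rounded = round(amount, 2)

print(rounded, type(rounded).__name__)  # 14.22 Decimal
```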

Using constants in Python

constants.py file:

class InExCategory(object):
    PurchaseCourses = 1
    BalanceRecharge = 2
    ClassHourExpense = 3
    AdmissionDeduct = 4
    RecommendDeduct = 5
    BalanceWithdraw = 6
    ReturnClassHour = 7

Or simply

PurchaseCourses = 1
BalanceRecharge = 2
ClassHourExpense = 3
AdmissionDeduct = 4
RecommendDeduct = 5
BalanceWithdraw = 6
ReturnClassHour = 7

The class is used to group and distinguish kinds of constants.
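
On Python 3.4+ the standard library's enum module is another way to group such constants; members keep their integer values but also carry a name and cannot be reassigned. This is an alternative sketch, not what the notes above use, and it repeats only the first three members:

```python
from enum import IntEnum


class InExCategory(IntEnum):
    PurchaseCourses = 1
    BalanceRecharge = 2
    ClassHourExpense = 3


# IntEnum members compare equal to plain ints.
is_purchase = InExCategory.PurchaseCourses == 1

# A stored integer can be turned back into a readable name.
label = InExCategory(2).name

print(is_purchase, label)  # True BalanceRecharge
```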

Understanding metaclasses (`metaclass`)

StackOverFlow

Difference between two lists

Method 1:

    s = set(temp2)
    temp3 = [x for x in temp1 if x not in s]

Method 2:

    list(set(temp1) - set(temp2))
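
The two methods are not interchangeable: method 1 keeps the original order and duplicates, while method 2 removes duplicates and gives no ordering guarantee. A small sketch with made-up data:

```python
temp1 = ['b', 'a', 'b', 'c']
temp2 = ['c']

# Method 1: order and duplicates of temp1 are preserved.
s = set(temp2)
ordered = [x for x in temp1 if x not in s]

# Method 2: duplicates collapse and the order is arbitrary.
unordered = list(set(temp1) - set(temp2))

print(ordered)            # ['b', 'a', 'b']
print(sorted(unordered))  # ['a', 'b']
```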

Get difference between two lists

Code protection

Rename the corresponding .pyc file in the __pycache__ folder (views.cpython-35.pyc -> views.pyc) and use it in place of the original .py file.

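python -m py_compile /path/to/script.py  # script.py is the script to compile to .pyc
# for several .py files, use /path/to/{script1,script2,...}.py instead
# for a whole directory, use: python -m compileall /path/to/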

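import py_compile
py_compile.compile(r'/path/to/script.py')  # compiles a single file; for a directory of .py files use compileall.compile_dir
# Prefer a raw string here to avoid escaping hassles: without the "r" prefix, backslashes (as in Windows paths) would have to be escaped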

Modifying `migrate` records

delete from tisanems.django_migrations 
where id=19
limit 1

`mysql` error when adding `unique=True`

Q:`django.db.utils.OperationalError: (1061, "Duplicate key name 'storage_allotapply_no_eb237ab5_uniq'")`

A:`mysql> DROP INDEX [index_name] ON [table_name];` i.e.:

    use dbname;
    DROP INDEX storage_allotapply_no_eb237ab5_uniq ON storage_allotapply;
    # drop all of the duplicate indexes in one go

Django database migration error: duplicate key

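Finding out which key caused a KeyError

try:
    ...  # code that may raise KeyError (elided in the original note)
except KeyError as e:
    # e.args[0] is the key that was missing
    return error_response(1, 'Failed to get parameter {}'.format(e.args[0]))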
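
The fragment above relies on a project helper and an enclosing function. A self-contained sketch of the same idea; the function read_params and its return shape are hypothetical:

```python
def read_params(data):
    """Return (params, error); error names the missing key, if any."""
    try:
        return {'name': data['name'], 'age': data['age']}, None
    except KeyError as e:
        # e.args[0] holds the key that was not found.
        return None, 'Failed to get parameter {}'.format(e.args[0])


ok = read_params({'name': 'x', 'age': 3})
failed = read_params({'name': 'x'})

print(ok)      # ({'name': 'x', 'age': 3}, None)
print(failed)  # (None, 'Failed to get parameter age')
```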

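Glossary

BDFL Short for Benevolent Dictator For Life.
CRUD Acronym for Create, Read, Update, Delete, the four basic operations in any application that stores records.
DRY Short for Don't Repeat Yourself, a software engineering principle.
dunder Shortcut for pronouncing the names of special methods and attributes written with leading and trailing double underscores (i.e. `__len__` is read as "dunder len").
EAFP Acronym for "it's easier to ask forgiveness than permission". The saying is attributed to computing pioneer Grace Hopper; Python programmers use the acronym for a dynamic programming style, for example accessing an attribute without first testing that it exists and catching the exception if it does not. The docstring of the hasattr function describes how it works like this: "calls getattr(object, name) and catches AttributeError".
LBYL Look before you leap. This coding style explicitly tests preconditions before calling a function or looking up an attribute or key. In contrast with the EAFP style, it is characterized by many if statements. In a multithreaded environment, the LBYL style risks introducing a race condition between "the looking" and "the leaping". For example, the code if key in mapping: return mapping[key] can fail if another thread removes the key from the mapping after the test but before the lookup. This can be solved with locks or by using the EAFP style.
ORM Short for Object-Relational Mapper, an API that provides access to database tables and records as Python classes and objects, with method calls performing database operations. SQLAlchemy is a popular standalone Python ORM; Django and Web2py come with their own ORMs.
PyPI The Python Package Index (https://pypi.python.org/), where more than 60 000 packages are available. Also known as the Cheese Shop (see the Cheese Shop entry). To avoid confusion with PyPy, PyPI should be pronounced "pie-P-eye".
YAGNI Acronym for You Ain't Gonna Need It, a slogan meaning: do not implement functionality that is not immediately needed on the basis of predictions about future requirements.
monkey patching Dynamically modifying a module, class, or function at runtime, usually to add features or fix bugs. A monkey patch works in memory and does not modify the source code, so it affects only the currently running instance of the program. Because monkey patches break encapsulation and tend to couple the program tightly to implementation details of the patched code, they are regarded as temporary workarounds, not a recommended way of integrating code.
mixin method A concrete implementation of a method in an abstract base class or mixin class.
mixin class A class designed to be subclassed together with one or more other classes in a multiple-inheritance class tree. A mixin class should never be instantiated, and a concrete subclass of a mixin class should also subclass another non-mixin class.
duck typing A form of polymorphism in which a function can operate on any object that implements the appropriate methods, regardless of the object's class or explicitly declared interface.
decorator A callable object A that returns another callable object B and is invoked using the syntax @A right before the definition of a callable C. When the Python interpreter reads such code, it calls A(C) and binds the resulting B to the variable previously assigned to C, effectively replacing the definition of C with B. If the target callable C is a function, then A is a function decorator; if C is a class, then A is a class decorator.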
IDLE An Integrated Development Environment for Python. IDLE is a basic editor and interpreter environment which ships with the standard distribution of Python.
Zen of Python Listing of Python design principles and philosophies that are helpful in understanding and using the language. The listing can be found by typing “import this” at the interactive prompt.

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!