Installation:

pip3 install module_name

Import:

import module
from module.xx.xx import xx
from module.xx.xx import xx as rename
from module.xx.xx import *

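2. random

random.random

random.random() returns a random floating-point number n with 0 <= n < 1.0.

random.randint

random.randint(a, b) returns a random integer n in the given range, a <= n <= b; both endpoints are included.

random.randrange

random.randrange(start, stop, step) picks a random number from the sequence that runs from start up to (but not including) stop in increments of step. For example, random.randrange(10, 100, 2) picks from the sequence [10, 12, 14, 16, ..., 96, 98], which gives the same result as random.choice(range(10, 100, 2)).

random.shuffle

random.shuffle(lst) shuffles the elements of a list in place.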
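The four functions can be tried side by side. This is a minimal sketch; the seed value and the variable names are arbitrary choices made for the illustration, not part of the original notes.

```python
import random

random.seed(42)  # arbitrary fixed seed so that reruns produce the same sequence

x = random.random()                   # float in [0.0, 1.0)
n = random.randint(1, 6)              # integer in [1, 6], both ends included
even = random.randrange(10, 100, 2)   # one of 10, 12, ..., 98

cards = list(range(10))
random.shuffle(cards)                 # shuffles in place and returns None

print(x, n, even, cards)
```

Note that shuffle modifies the list it is given, so keep a copy if the original order is still needed.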

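3. Serialization

json converts between strings and Python's basic data types (the format is shared by many languages).
pickle converts between bytes and Python objects, including Python-specific types (usable only from Python).

The json module provides four functions: dumps, dump, loads, load.

The pickle module provides the same four functions (files used with pickle must be opened in binary mode for both writing and reading): dumps, dump, loads, load.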

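import pickle

data = {'k1': 123, 'k2': 456}

# pickle.dumps serializes the object to a bytes value
p = pickle.dumps(data)
print(p)

# pickle.dump serializes the object and writes it to a file opened in binary mode
with open('D:/result.pk', 'wb') as f:
    pickle.dump(data, f)

import json

# JSON requires double quotes inside the string, so the single quotes go on the outside
data = '{"k1": 123, "k2": 456}'

# json.loads converts a JSON string to basic Python data types
p = json.loads(data)
print(p)

# json.load reads JSON text from a file and converts it to basic Python data types
with open('D:/db', 'r') as f:
    p = json.load(f)
print(p)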
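The snippet above covers loads and load only. As a rough sketch of the full round trip through all four json functions, the example below writes to a temporary directory; the file name db.json and the sample dictionary are made up for the illustration.

```python
import json
import os
import tempfile

data = {'k1': 123, 'k2': [1, 2, 3], 'k3': None}

text = json.dumps(data)        # Python object -> JSON string
restored = json.loads(text)    # JSON string -> Python object

# dump/load do the same through a file object (text mode for json).
path = os.path.join(tempfile.mkdtemp(), 'db.json')  # hypothetical file name
with open(path, 'w', encoding='utf-8') as f:
    json.dump(data, f)
with open(path, 'r', encoding='utf-8') as f:
    from_file = json.load(f)

print(text)
print(restored == data, from_file == data)
```

pickle follows the same four-function pattern, except that dumps returns bytes and the file must be opened with 'wb' or 'rb'.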

4. time & datetime

import time

print(time.time())                              # current timestamp
print(time.mktime(time.localtime()))            # struct_time -> timestamp
print(time.gmtime())                            # UTC struct_time; accepts an optional timestamp
print(time.localtime())                         # local struct_time; accepts an optional timestamp
print(time.strptime('2014-11-11', '%Y-%m-%d'))  # string -> struct_time
print(time.strftime('%Y-%m-%d'))                # formats the current time by default
print(time.ctime())                             # current time as a readable string

Example: converting a string to a timestamp

tm = time.strptime('2016-11-8', '%Y-%m-%d')
print(time.mktime(tm))

import datetime

current_time = datetime.datetime.now()
print(current_time)                       # current time, e.g. 2016-11-08 14:42:20.335935 (used most often)
print(current_time.timetuple())           # returns a struct_time
print(current_time.replace(2014, 9, 12))  # e.g. 2014-09-12 14:42:20.335935: the date is replaced, the time of day is kept
print(datetime.datetime.now() - datetime.timedelta(days=5))   # 5 days earlier than now
print(datetime.datetime.now() - datetime.timedelta(hours=5))  # 5 hours earlier than now

Format placeholders:

%Y  Year with century as a decimal number.
%m  Month as a decimal number [01,12].
%d  Day of the month as a decimal number [01,31].
%H  Hour (24-hour clock) as a decimal number [00,23].
%M  Minute as a decimal number [00,59].
%S  Second as a decimal number [00,61].
%z  Time zone offset from UTC.
%a  Locale's abbreviated weekday name.
%A  Locale's full weekday name.
%b  Locale's abbreviated month name.
%B  Locale's full month name.
%c  Locale's appropriate date and time representation.
%I  Hour (12-hour clock) as a decimal number [01,12].
%p  Locale's equivalent of either AM or PM.
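The placeholders work in both directions: strptime parses a string with them and strftime formats with them. The sketch below, using a sample date chosen only for illustration, converts a string to a timestamp and back, then does date arithmetic with timedelta.

```python
import time
from datetime import datetime, timedelta

FORMAT = '%Y-%m-%d %H:%M:%S'

# String -> struct_time -> timestamp (interpreted in the local time zone).
tm = time.strptime('2016-11-08 14:42:20', FORMAT)
ts = time.mktime(tm)

# Timestamp -> string again, using the same placeholders.
text = time.strftime(FORMAT, time.localtime(ts))

# datetime offers the same parsing plus arithmetic with timedelta.
dt = datetime.strptime('2016-11-08 14:42:20', FORMAT)
five_days_earlier = dt - timedelta(days=5)
five_hours_later = dt + timedelta(hours=5)

print(ts, text)
print(five_days_earlier, five_hours_later)
```

The numeric timestamp depends on the machine's time zone, which is why the example prints it rather than assuming a value.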

5. logging

What each log level means:

Level	When it’s used
`DEBUG`	Detailed information, typically of interest only when diagnosing problems.
`INFO`	Confirmation that things are working as expected.
`WARNING`	An indication that something unexpected happened, or indicative of some problem in the near future (e.g. ‘disk space low’). The software is still working as expected.
`ERROR`	Due to a more serious problem, the software has not been able to perform some function.
`CRITICAL`	A serious error, indicating that the program itself may be unable to continue running.

import logging
 
logging.basicConfig(filename='example.log',level=logging.INFO)
logging.debug('This message should go to the log file')
logging.info('So should this')
logging.warning('And this, too')

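In the example above, level=logging.INFO sets the logging threshold to INFO: only records at INFO level or higher are written to the file. The first message (DEBUG) is therefore not recorded. To record debug messages as well, change the level to DEBUG.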
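The threshold behavior can be checked without touching a file. In this sketch an in-memory stream stands in for the log file, and the logger name level-demo is a made-up name for the illustration.

```python
import io
import logging

stream = io.StringIO()                     # in-memory stand-in for a log file
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(levelname)s:%(message)s'))

logger = logging.getLogger('level-demo')   # hypothetical logger name
logger.setLevel(logging.INFO)              # threshold: INFO and above
logger.addHandler(handler)
logger.propagate = False                   # keep the demo records away from the root logger

logger.debug('dropped: DEBUG is below INFO')
logger.info('kept')
logger.warning('kept too')

recorded = stream.getvalue().splitlines()
print(recorded)
```

Only the INFO and WARNING records reach the stream; the DEBUG call is filtered out by the logger before any handler sees it.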

Adding a timestamp

import logging
logging.basicConfig(format='%(asctime)s %(message)s', datefmt='%m/%d/%Y %I:%M:%S %p')
logging.warning('is when this event was logged.')

# Output
12/12/2010 11:46:36 AM is when this event was logged.

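Log format fields

%(name)s	Name of the logger
%(levelno)s	Log level as a number
%(levelname)s	Log level as text
%(pathname)s	Full path of the module that issued the logging call, if available
%(filename)s	File name of the module that issued the logging call
%(module)s	Name of the module that issued the logging call
%(funcName)s	Name of the function that issued the logging call
%(lineno)d	Source line number of the logging call
%(created)f	Time the record was created, as a UNIX timestamp (floating-point number)
%(relativeCreated)d	Time the record was created, in milliseconds since the logging module was loaded
%(asctime)s	Time the record was created, as a string. The default format is "2003-07-08 16:49:45,896"; the digits after the comma are milliseconds
%(thread)d	Thread ID, if available
%(threadName)s	Thread name, if available
%(process)d	Process ID, if available
%(message)s	The message supplied by the user

A record is written only when its level is greater than or equal to the configured log level.

Log record format: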


Key fields: %(lineno)d, line number; %(module)s, module name; %(process)d, process ID

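Recording logs with the logging module involves four main classes:

Logger exposes the interface that application code uses directly;

Handler sends the log records (created by loggers) to the appropriate destination;

Filter provides a finer-grained facility for deciding which log records to output (rarely used);

Formatter specifies the final output layout of log records.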

Printing logs to both the screen and a log file

import logging

# create the logger
logger = logging.getLogger('TEST-LOG')  # get the logger first
logger.setLevel(logging.DEBUG)          # overall log level for this logger

# create console handler and set level to debug
ch = logging.StreamHandler()            # output to the screen
ch.setLevel(logging.DEBUG)

# create file handler and set level to warning
fh = logging.FileHandler("access.log")  # output to a file
fh.setLevel(logging.WARNING)

# create formatter
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')

# add formatter to ch and fh
ch.setFormatter(formatter)
fh.setFormatter(formatter)

# add ch and fh to logger
logger.addHandler(ch)  # send the logger's records to the given destinations
logger.addHandler(fh)

# 'application' code
logger.debug('debug message')
logger.info('info message')
logger.warning('warn message')
logger.error('error message')
logger.critical('critical message')


6. Special variables in modules

__doc__ holds the docstring of the file (module)

__file__ holds the path of the file


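__name__ equals '__main__' only when the current file is the one being executed directly; in an imported file it holds the module name.

# runs only in the main file, not when the file is imported
def run():
    print('run')

if __name__ == '__main__':
    run()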
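To see the three variables on a real module, the sketch below inspects the standard library's json module; json is chosen only as a convenient example, and any importable module would do.

```python
import json  # used purely as an example of an imported module

print(json.__name__)   # 'json': the import name, because json was imported, not run
print(json.__file__)   # path of the file the module was loaded from
print((json.__doc__ or '').strip().splitlines()[0])  # first line of the module docstring


def run():
    return 'run'


# __name__ equals '__main__' only in the file that was started directly.
if __name__ == '__main__':
    print(run())
```

If this file were imported by another script, the last block would be skipped because __name__ would then hold the file's module name.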

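7. sys

sys.argv           list of command-line arguments; the first element is the path of the program itself
sys.exit(n)        exit the program; use exit(0) for a normal exit
sys.version        version information of the Python interpreter
sys.maxsize        largest value of a native integer index (Python 3 replacement for Python 2's sys.maxint)
sys.path           module search path; initialized from the PYTHONPATH environment variable
sys.platform       name of the operating system platform
sys.stdout.write('please:')
val = sys.stdin.readline()[:-1]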

Progress percentage:

import sys
import time


def view_bar(num, total):
    rate = float(num) / float(total)
    rate_num = int(rate * 100)
    r = '\r%d%%' % (rate_num, )  # \r moves the cursor back to the start of the line
    sys.stdout.write(r)
    sys.stdout.flush()  # flush the buffer so the text appears immediately


if __name__ == '__main__':
    for i in range(0, 100):
        time.sleep(0.1)
        view_bar(i, 100)

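8. os

Key functions and attributes:

os.getcwd()                 get the current working directory, i.e. the directory the Python script is working in
os.chdir("dirname")         change the script's working directory; like cd in a shell
os.curdir                   the current directory: ('.')
os.pardir                   the parent directory as a string: ('..')
os.makedirs('dir1/dir2')    create nested directories recursively
os.removedirs('dirname1')   remove the directory if it is empty, then move up to the parent and remove it too if it is empty, and so on
os.mkdir('dirname')         create a single directory; like mkdir dirname in a shell
os.rmdir('dirname')         remove a single empty directory; raises an error if it is not empty; like rmdir dirname in a shell
os.listdir('dirname')       return a list of all files and subdirectories in the directory, including hidden files
os.remove()                 delete a file
os.rename("oldname","new")  rename a file or directory
os.stat('path/filename')    get information about a file or directory
os.sep                      the platform's path separator: "\\" on Windows, "/" on Linux
os.linesep                  the platform's line terminator: "\r\n" on Windows, "\n" on Linux
os.pathsep                  the string that separates entries in a search path such as PATH: ";" on Windows, ":" on Linux
os.name                     string naming the current platform: Windows -> 'nt'; Linux -> 'posix'
os.system("bash command")   run a shell command; its output is shown directly
os.environ                  the system environment variables
os.path.abspath(path)       return the normalized absolute version of path
os.path.split(path)         split path into a (directory, file name) tuple
os.path.dirname(path)       return the directory part of path, i.e. the first element of os.path.split(path)
os.path.basename(path)      return the final file name of path, i.e. the second element of os.path.split(path); empty if path ends with / or \
os.path.exists(path)        True if path exists, False otherwise
os.path.isabs(path)         True if path is an absolute path
os.path.isfile(path)        True if path is an existing file, False otherwise
os.path.isdir(path)         True if path is an existing directory, False otherwise
os.path.join(path1[, path2[, ...]])  join several path components; components before the last absolute one are discarded
os.path.getatime(path)      time of last access of the file or directory at path
os.path.getmtime(path)      time of last modification of the file or directory at path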
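A few of these calls are easiest to understand together. The sketch below works inside a temporary directory; the names dir1, dir2, a.txt and keep.txt are invented for the illustration.

```python
import os
import tempfile

base = tempfile.mkdtemp()                      # scratch directory for the demo
open(os.path.join(base, 'keep.txt'), 'w').close()  # keeps base non-empty (see removedirs)

nested = os.path.join(base, 'dir1', 'dir2')    # join uses the platform separator
os.makedirs(nested)                            # creates dir1 and dir2 in one call

file_path = os.path.join(nested, 'a.txt')      # hypothetical file name
with open(file_path, 'w') as f:
    f.write('hello')

head, tail = os.path.split(file_path)
print(tail == os.path.basename(file_path))     # True
print(head == os.path.dirname(file_path))      # True
print(os.path.isfile(file_path), os.path.isdir(nested))
print(os.listdir(nested))

os.remove(file_path)
os.removedirs(nested)   # removes dir2, then dir1; stops at base because it is not empty
print(os.path.exists(os.path.join(base, 'dir1')))
```

The keep.txt file is there on purpose: removedirs keeps climbing until it meets a non-empty directory, so without it the scratch directory itself would be removed as well.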


9. hashlib

Used for hash (message digest) operations

import hashlib

# ######## md5 ########
h = hashlib.md5()
# help(h.update)
h.update(bytes('admin', encoding='utf-8'))
print(h.hexdigest())  # digest as a hexadecimal string
print(h.digest())     # digest as raw bytes

# ######## sha1 ########
h = hashlib.sha1()
h.update(bytes('admin', encoding='utf-8'))
print(h.hexdigest())

# ######## sha256 ########
h = hashlib.sha256()
h.update(bytes('admin', encoding='utf-8'))
print(h.hexdigest())

# ######## sha384 ########
h = hashlib.sha384()
h.update(bytes('admin', encoding='utf-8'))
print(h.hexdigest())

# ######## sha512 ########
h = hashlib.sha512()
h.update(bytes('admin', encoding='utf-8'))
print(h.hexdigest())

Adding a custom key (salt) before hashing

import hashlib

# ######## md5 ########

# the key is hashed first, so the digest covers key + content
h = hashlib.md5(bytes('898oaFs09f', encoding="utf-8"))
h.update(bytes('admin', encoding="utf-8"))
print(h.hexdigest())

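Python also ships with an hmac module, which processes the key and the content further internally before hashing

import hashlib
import hmac

# digestmod is required since Python 3.8; sha256 is used here
h = hmac.new(bytes('898oaFs09f', encoding="utf-8"), digestmod=hashlib.sha256)
h.update(bytes('admin', encoding="utf-8"))
print(h.hexdigest())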
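As a compact recap, the sketch below shows the one-shot form of hashing, the fact that repeated update() calls are equivalent to hashing the concatenated input, and a keyed digest compared with hmac.compare_digest. The key and message reuse the sample values from the text; choosing SHA-256 for the keyed digest is an assumption made for the example.

```python
import hashlib
import hmac

# One-shot form of the update()/hexdigest() sequence.
md5_hex = hashlib.md5(b'admin').hexdigest()
print(md5_hex)

# update() can be called repeatedly; the result equals hashing the concatenation.
h = hashlib.sha256()
h.update(b'ad')
h.update(b'min')
same = h.hexdigest() == hashlib.sha256(b'admin').hexdigest()
print(same)

# Keyed digest; compare_digest avoids leaking timing information when checking it.
key = b'898oaFs09f'
tag = hmac.new(key, b'admin', digestmod=hashlib.sha256).hexdigest()
expected = hmac.new(key, b'admin', digestmod=hashlib.sha256).hexdigest()
print(hmac.compare_digest(tag, expected))
```

MD5 and SHA-1 appear here only because the notes use them; for new work a member of the SHA-2 family is the safer default.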

10. re


match

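# match: matches from the start of the string; returns a match object on success and None on failure
# match(pattern, string, flags=0)

# pattern: the regular expression

# string: the string to match against

# flags: matching mode

re.I (full name: IGNORECASE): ignore case (the full name is given in parentheses, likewise below)
re.M (full name: MULTILINE): multi-line mode; changes the behavior of '^' and '$'
re.S (full name: DOTALL): dot-matches-all mode; changes the behavior of '.'
re.L (full name: LOCALE): makes the predefined classes \w \W \b \B \s \S depend on the current locale
re.U (full name: UNICODE): makes the predefined classes \w \W \b \B \s \S \d \D depend on Unicode character properties
re.X (full name: VERBOSE): verbose mode; the pattern may span several lines, whitespace is ignored, and comments may be added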

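# without groups
import re

origin = 'had adfasdf 19'  # sample text; it ends in a digit so the grouped pattern below can match
r = re.match(r"h\w+", origin)
print(r.group())      # the whole matched text
print(r.groups())     # the captured groups (none here)
print(r.groupdict())  # the named groups (none here)

# with groups
# Why use groups? To extract specific parts of a successful match
# (the whole pattern must match first, then the grouped parts are pulled out)

r = re.match(r"h(\w+).*(?P<name>\d)$", origin)
print(r.group())      # the whole matched text
print(r.groups())     # all captured groups
print(r.groupdict())  # only the groups that were given a name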

search

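# search: scans the whole string and returns the first match; returns None if nothing matches

# search(pattern, string, flags=0)

findall

# findall: returns all non-overlapping matches as a list; with one group in the pattern each item is a string; with several groups each item is a tuple

# empty matches are included in the result

# findall(pattern, string, flags=0)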

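sub

# sub: replaces the parts of the string that match the pattern

sub(pattern, repl, string, count=0, flags=0)

# pattern: the regular expression

# repl: the replacement string, or a callable

# string: the string to match against

# count: maximum number of matches to replace (0 means all)

# flags: matching mode

split

# split: splits the string at the places where the pattern matches

split(pattern, string, maxsplit=0, flags=0)

# pattern: the regular expression

# string: the string to match against

# maxsplit: maximum number of splits (0 means no limit)

# flags: matching mode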
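The differences between these functions show up clearly on one small input. This is an illustrative sketch; the sample strings are made up.

```python
import re

text = 'abc 123 def 45'

at_start = re.match(r'\d+', text)             # None: the string does not start with a digit
first = re.search(r'\d+', text)               # scans the whole string, first hit only
numbers = re.findall(r'\d+', text)            # every non-overlapping match
pairs = re.findall(r'(\w)(\d)', 'a1 b2')      # several groups -> list of tuples
masked = re.sub(r'\d+', '#', text, count=1)   # replace only the first match
parts = re.split(r'[,;]\s*', 'a, b;c', maxsplit=1)  # split at most once

print(at_start, first.group(), numbers, pairs, masked, parts)
```

Raw strings (the r prefix) keep the backslashes in \d and \w intact, which is why they are the usual way to write patterns.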

Common regular expressions

IP address:
^(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3}$
Mobile phone number:
^1[3458][0-9]\d{8}$
Email:
[a-zA-Z0-9_-]+@[a-zA-Z0-9_-]+(\.[a-zA-Z0-9_-]+)+
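As a usage sketch, the IP address pattern can be wrapped in a small validator. The function name is_ipv4 and the test inputs are hypothetical; the pattern is written as a raw string so that \d and \. keep their backslashes.

```python
import re

IPV4 = re.compile(
    r'^(25[0-5]|2[0-4]\d|[0-1]?\d?\d)'
    r'(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3}$'
)


def is_ipv4(text):
    """Return True if text looks like a dotted-quad IPv4 address."""
    return IPV4.match(text) is not None


for candidate in ['192.168.1.1', '256.1.1.1', '1.2.3']:
    print(candidate, is_ipv4(candidate))
```

Each octet alternative caps the value at 255, so 256.1.1.1 is rejected, and the {3} repetition requires exactly four dot-separated parts.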

11. ConfigParser

Used to create and modify common configuration files (in the format shown below). In Python 3.x the module was renamed to configparser.

[DEFAULT]
ServerAliveInterval = 45
Compression = yes
CompressionLevel = 9
ForwardX11 = yes

[bitbucket.org]
User = hg

[topsecret.server.com]
Port = 50022
ForwardX11 = no

Generating it with Python:

import configparser

config = configparser.ConfigParser()
config["DEFAULT"] = {'ServerAliveInterval': '45',
                     'Compression': 'yes',
                     'CompressionLevel': '9'}

config['bitbucket.org'] = {}
config['bitbucket.org']['User'] = 'hg'
config['topsecret.server.com'] = {}
topsecret = config['topsecret.server.com']
topsecret['Port'] = '50022'     # mutates the parser
topsecret['ForwardX11'] = 'no'  # same here
config['DEFAULT']['ForwardX11'] = 'yes'
with open('example.ini', 'w') as configfile:
    config.write(configfile)

Reading it with Python:

>>> import configparser
>>> config = configparser.ConfigParser()
>>> config.sections()
[]
>>> config.read('example.ini')
['example.ini']
>>> config.sections()
['bitbucket.org', 'topsecret.server.com']
>>> 'bitbucket.org' in config
True
>>> 'bytebong.com' in config
False
>>> config['bitbucket.org']['User']
'hg'
>>> config['DEFAULT']['Compression']
'yes'
>>> topsecret = config['topsecret.server.com']
>>> topsecret['ForwardX11']
'no'
>>> topsecret['Port']
'50022'
>>> for key in config['bitbucket.org']: print(key)
...
user
serveraliveinterval
compression
compressionlevel
forwardx11
>>> config['bitbucket.org']['ForwardX11']
'yes'

Getting all sections

import configparser

config = configparser.ConfigParser()
config.read('xxxooo', encoding='utf-8')
ret = config.sections()
print(ret)

Getting all key-value pairs of a given section

import configparser

config = configparser.ConfigParser()
config.read('xxxooo', encoding='utf-8')
ret = config.items('section1')
print(ret)

Getting all keys of a given section

import configparser

config = configparser.ConfigParser()
config.read('xxxooo', encoding='utf-8')
ret = config.options('section1')
print(ret)

Getting the value of a given key in a given section

import configparser

config = configparser.ConfigParser()
config.read('xxxooo', encoding='utf-8')

v = config.get('section1', 'k1')
# v = config.getint('section1', 'k1')
# v = config.getfloat('section1', 'k1')
# v = config.getboolean('section1', 'k1')

print(v)

Checking, adding and removing sections

import configparser

config = configparser.ConfigParser()
config.read('xxxooo', encoding='utf-8')

# check
has_sec = config.has_section('section1')
print(has_sec)

# add a section
config.add_section("SEC_1")
with open('xxxooo', 'w') as f:
    config.write(f)

# remove a section
config.remove_section("SEC_1")
with open('xxxooo', 'w') as f:
    config.write(f)

Checking, removing and setting key-value pairs within a section

import configparser

config = configparser.ConfigParser()
config.read('xxxooo', encoding='utf-8')

# check
has_opt = config.has_option('section1', 'k1')
print(has_opt)

# remove
config.remove_option('section1', 'k1')
with open('xxxooo', 'w') as f:
    config.write(f)

# set
config.set('section1', 'k10', "123")
with open('xxxooo', 'w') as f:
    config.write(f)
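The same operations can be tried end to end without creating a file. In this sketch the configuration text is an inline sample invented for the illustration (it reuses the section1, k1, k10 and SEC_1 names from the snippets), and a StringIO buffer stands in for the output file.

```python
import configparser
import io

config = configparser.ConfigParser()
config.read_string("""
[DEFAULT]
Compression = yes

[section1]
k1 = 123
""")

print(config.sections())                              # DEFAULT is never listed
print(config.getint('section1', 'k1'))                # typed getter
print(config.getboolean('section1', 'compression'))   # falls back to DEFAULT

config.set('section1', 'k10', '456')   # values must be strings
config.remove_option('section1', 'k1')
config.add_section('SEC_1')

buffer = io.StringIO()                 # stands in for open('...', 'w')
config.write(buffer)
print(buffer.getvalue())
```

Changes made with set, remove_option or add_section live only in memory until write is called, which is why each snippet above ends by writing the file again.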

12. xml

1. Parsing XML

Parsing a string into an XML object with ElementTree.XML

from xml.etree import ElementTree as ET

# open the file and read the XML content
with open('xo.xml', 'r') as f:
    str_xml = f.read()

# parse the string into an XML element; root is the root node of the XML file
root = ET.XML(str_xml)

Parsing a file directly into an XML object with ElementTree.parse

from xml.etree import ElementTree as ET

# parse the XML file directly
tree = ET.parse("xo.xml")

# get the root node of the XML file
root = tree.getroot()
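Parsing can be tried without a file on disk. The sketch below is illustrative only: the contents of xo.xml are not shown in these notes, so the sample document (country, rank and year elements, echoing the names used in the later snippets) is an assumption.

```python
from xml.etree import ElementTree as ET

# Assumed sample; the real xo.xml is not shown in the text.
XML_TEXT = """
<data>
    <country name="A"><rank>2</rank><year>2023</year></country>
    <country name="B"><rank>69</rank><year>2026</year></country>
</data>
"""

root = ET.XML(XML_TEXT)   # parse from a string literal instead of a file
print(root.tag)

for child in root:                      # second level
    print(child.tag, child.attrib)
    for item in child:                  # third level
        print(' ', item.tag, item.text)

years = [node.text for node in root.iter('year')]   # all year nodes at any depth
print(years)
```

ET.XML returns the root element directly, whereas ET.parse returns a tree object whose root is obtained with getroot().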

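2. Working with XML

a. Traversing the whole XML document

from xml.etree import ElementTree as ET

############ Parsing option 1 ############
"""
# open the file and read the XML content
str_xml = open('xo.xml', 'r').read()

# parse the string into an XML element; root is the root node of the XML file
root = ET.XML(str_xml)
"""
############ Parsing option 2 ############

# parse the XML file directly
tree = ET.parse("xo.xml")

# get the root node of the XML file
root = tree.getroot()


### Operations

# top-level tag
print(root.tag)


# iterate over the second level of the XML document
for child in root:
    # tag name and attributes of the second-level node
    print(child.tag, child.attrib)
    # iterate over the third level of the XML document
    for i in child:
        # tag name and text of the third-level node
        print(i.tag, i.text)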

b. Iterating over specific nodes in the XML

from xml.etree import ElementTree as ET

############ Parsing option 1 ############
"""
# open the file and read the XML content
str_xml = open('xo.xml', 'r').read()

# parse the string into an XML element; root is the root node of the XML file
root = ET.XML(str_xml)
"""
############ Parsing option 2 ############

# parse the XML file directly
tree = ET.parse("xo.xml")

# get the root node of the XML file
root = tree.getroot()


### Operations

# top-level tag
print(root.tag)


# iterate over all year nodes in the XML
for node in root.iter('year'):
    # tag name and text of the node
    print(node.tag, node.text)

c. Modifying node content

Nodes are modified in memory only, so the file itself is not affected. To make the changes persistent, the in-memory content has to be written back to a file.

Parse from a string, modify, save:

from xml.etree import ElementTree as ET

############ Parsing option 1 ############

# open the file and read the XML content
with open('xo.xml', 'r') as f:
    str_xml = f.read()

# parse the string into an XML element; root is the root node of the XML file
root = ET.XML(str_xml)

############ Operations ############

# top-level tag
print(root.tag)

# loop over all year nodes
for node in root.iter('year'):
    # increment the content of the year node by one
    new_year = int(node.text) + 1
    node.text = str(new_year)

    # set attributes
    node.set('name', 'alex')
    node.set('age', '18')
    # delete an attribute
    del node.attrib['name']


############ Save to file ############
tree = ET.ElementTree(root)
tree.write("newnew.xml", encoding='utf-8')

Parse from a file, modify, save:

from xml.etree import ElementTree as ET

############ Parsing option 2 ############

# parse the XML file directly
tree = ET.parse("xo.xml")

# get the root node of the XML file
root = tree.getroot()

############ Operations ############

# top-level tag
print(root.tag)

# loop over all year nodes
for node in root.iter('year'):
    # increment the content of the year node by one
    new_year = int(node.text) + 1
    node.text = str(new_year)

    # set attributes
    node.set('name', 'alex')
    node.set('age', '18')
    # delete an attribute
    del node.attrib['name']


############ Save to file ############
tree.write("newnew.xml", encoding='utf-8')

d. Deleting nodes

Parse from a string, delete, save:

from xml.etree import ElementTree as ET

############ Parse from a string ############

# open the file and read the XML content
with open('xo.xml', 'r') as f:
    str_xml = f.read()

# parse the string into an XML element; root is the root node of the XML file
root = ET.XML(str_xml)

############ Operations ############

# top-level tag
print(root.tag)

# iterate over all country nodes under data
for country in root.findall('country'):
    # get the content of the rank node under each country node
    rank = int(country.find('rank').text)

    if rank > 50:
        # delete this country node
        root.remove(country)

############ Save to file ############
tree = ET.ElementTree(root)
tree.write("newnew.xml", encoding='utf-8')

Parse from a file, delete, save:

from xml.etree import ElementTree as ET

############ Parse from a file ############

# parse the XML file directly
tree = ET.parse("xo.xml")

# get the root node of the XML file
root = tree.getroot()

############ Operations ############

# top-level tag
print(root.tag)

# iterate over all country nodes under data
for country in root.findall('country'):
    # get the content of the rank node under each country node
    rank = int(country.find('rank').text)

    if rank > 50:
        # delete this country node
        root.remove(country)

############ Save to file ############
tree.write("newnew.xml", encoding='utf-8')
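Modification and deletion can be checked in memory before anything is written to disk. The following is a sketch under assumptions: the sample data is invented (the real xo.xml is not shown), and the attribute name checked is hypothetical.

```python
from xml.etree import ElementTree as ET

# Assumed sample data; element names follow the snippets in this section.
root = ET.XML(
    '<data>'
    '<country name="A"><rank>2</rank><year>2023</year></country>'
    '<country name="B"><rank>69</rank><year>2026</year></country>'
    '</data>'
)

for node in root.iter('year'):
    node.text = str(int(node.text) + 1)   # increment the year
    node.set('checked', 'yes')            # hypothetical attribute name

# findall() returns a list, so removing nodes while looping over it is safe.
for country in root.findall('country'):
    if int(country.find('rank').text) > 50:
        root.remove(country)

result = ET.tostring(root, encoding='unicode')   # serialize without writing a file
print(result)
```

ET.tostring with encoding='unicode' returns a str, which is convenient for inspecting the result; tree.write is still needed to persist it.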

3. Creating an XML document

Method 1

from xml.etree import ElementTree as ET


# create the root node
root = ET.Element("family")


# create the elder son node
son1 = ET.Element('son', {'name': 'son1'})
# create the younger son
son2 = ET.Element('son', {"name": 'son2'})

# create two grandsons under the elder son
grandson1 = ET.Element('grandson', {'name': 'son11'})
grandson2 = ET.Element('grandson', {'name': 'son12'})
son1.append(grandson1)
son1.append(grandson2)


# add both sons to the root node
root.append(son1)
root.append(son2)

tree = ET.ElementTree(root)
tree.write('oooo.xml', encoding='utf-8', short_empty_elements=False)

Method 2

from xml.etree import ElementTree as ET

# create the root node
root = ET.Element("family")


# create the elder son
# son1 = ET.Element('son', {'name': 'son1'})
son1 = root.makeelement('son', {'name': 'son1'})
# create the younger son
# son2 = ET.Element('son', {"name": 'son2'})
son2 = root.makeelement('son', {"name": 'son2'})

# create two grandsons under the elder son
# grandson1 = ET.Element('grandson', {'name': 'son11'})
grandson1 = son1.makeelement('grandson', {'name': 'son11'})
# grandson2 = ET.Element('grandson', {'name': 'son12'})
grandson2 = son1.makeelement('grandson', {'name': 'son12'})

son1.append(grandson1)
son1.append(grandson2)


# add both sons to the root node
root.append(son1)
root.append(son2)

tree = ET.ElementTree(root)
tree.write('oooo.xml', encoding='utf-8', short_empty_elements=False)

Method 3

from xml.etree import ElementTree as ET


# create the root node
root = ET.Element("family")


# create the elder son node
son1 = ET.SubElement(root, "son", attrib={'name': 'son1'})
# create the younger son
son2 = ET.SubElement(root, "son", attrib={"name": "son2"})

# create one grandson under the elder son
grandson1 = ET.SubElement(son1, "age", attrib={'name': 'son11'})
grandson1.text = 'grandson'


et = ET.ElementTree(root)  # build the document object
et.write("test.xml", encoding="utf-8", xml_declaration=True, short_empty_elements=False)

XML saved the native way has no indentation by default. To get indentation, change how the document is saved:

from xml.etree import ElementTree as ET
from xml.dom import minidom


def prettify(elem):
    """Convert the node to a string and add indentation."""
    rough_string = ET.tostring(elem, 'utf-8')
    reparsed = minidom.parseString(rough_string)
    return reparsed.toprettyxml(indent="\t")

# create the root node
root = ET.Element("family")


# create the elder son
# son1 = ET.Element('son', {'name': 'son1'})
son1 = root.makeelement('son', {'name': 'son1'})
# create the younger son
# son2 = ET.Element('son', {"name": 'son2'})
son2 = root.makeelement('son', {"name": 'son2'})

# create two grandsons under the elder son
# grandson1 = ET.Element('grandson', {'name': 'son11'})
grandson1 = son1.makeelement('grandson', {'name': 'son11'})
# grandson2 = ET.Element('grandson', {'name': 'son12'})
grandson2 = son1.makeelement('grandson', {'name': 'son12'})

son1.append(grandson1)
son1.append(grandson2)


# add both sons to the root node
root.append(son1)
root.append(son2)


raw_str = prettify(root)

with open("xxxoo.xml", 'w', encoding='utf-8') as f:
    f.write(raw_str)

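13. shutil

High-level module for working with files, folders and archives

shutil.copyfileobj(fsrc, fdst[, length])
Copies the contents of one file object into another

import shutil

with open('old.xml', 'r') as fsrc, open('new.xml', 'w') as fdst:
    shutil.copyfileobj(fsrc, fdst)

shutil.copyfile(src, dst)
Copies a file's contents

shutil.copyfile('f1.log', 'f2.log')

shutil.copymode(src, dst)
Copies only the permission bits; content, group and owner are left unchanged

shutil.copymode('f1.log', 'f2.log')

shutil.copystat(src, dst)

Copies only the status information, including: mode bits, atime, mtime, flags

shutil.copystat('f1.log', 'f2.log')

shutil.copy(src, dst)
Copies the file and its permissions

shutil.copy('f1.log', 'f2.log')

shutil.copy2(src, dst)
Copies the file and its status information

shutil.copy2('f1.log', 'f2.log')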

shutil.ignore_patterns(*patterns)
shutil.copytree(src, dst, symlinks=False, ignore=None)
Recursively copies a folder

shutil.copytree('folder1', 'folder2', ignore=shutil.ignore_patterns('*.pyc', 'tmp*'))

shutil.copytree('f1', 'f2', symlinks=True, ignore=shutil.ignore_patterns('*.pyc', 'tmp*'))

shutil.rmtree(path[, ignore_errors[, onerror]])
Recursively deletes a folder and everything in it

shutil.rmtree('folder1')

shutil.move(src, dst)
Recursively moves a file or folder; similar to the mv command, and in the simple case just a rename.

shutil.move('folder1', 'folder3')
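The copy, move and delete calls are easy to try safely inside a temporary directory. This is only a sketch; the folder and file names mirror those in the text but are otherwise made up.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()                  # scratch area for the demo
src = os.path.join(base, 'folder1')
os.makedirs(src)
for name in ('f1.log', 'junk.pyc'):
    with open(os.path.join(src, name), 'w') as f:
        f.write('data')

# Copy a single file together with its permission bits.
shutil.copy(os.path.join(src, 'f1.log'), os.path.join(src, 'f2.log'))

# Copy the tree, skipping compiled files.
dst = os.path.join(base, 'folder2')
shutil.copytree(src, dst, ignore=shutil.ignore_patterns('*.pyc'))
copied = sorted(os.listdir(dst))
print(copied)

# Rename/move the copy, then delete the original tree.
moved = shutil.move(dst, os.path.join(base, 'folder3'))
shutil.rmtree(src)
remaining = sorted(os.listdir(base))
print(remaining)
```

rmtree deletes without asking and cannot be undone, so it is worth double-checking the path before calling it on real data.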

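shutil.make_archive(base_name, format, ...)

Creates an archive (for example zip or tar) and returns its file path

base_name: file name of the archive, or a path to it. With a bare file name the archive is saved in the current directory; otherwise it is saved at the given path,
e.g. www => saved in the current directory
e.g. /Users/wupeiqi/www => saved in /Users/wupeiqi/
format: archive type: "zip", "tar", "bztar", "gztar"
root_dir: path of the folder to archive (defaults to the current directory)
owner: user, defaults to the current user
group: group, defaults to the current group
logger: used for logging, usually a logging.Logger object

# archive the files under /Users/wupeiqi/Downloads/test into the program's current directory
import shutil
ret = shutil.make_archive("wwwwwwwwww", 'gztar', root_dir='/Users/wupeiqi/Downloads/test')


# archive the files under /Users/wupeiqi/Downloads/test into the /Users/wupeiqi/ directory
import shutil
ret = shutil.make_archive("/Users/wupeiqi/wwwwwwwwww", 'gztar', root_dir='/Users/wupeiqi/Downloads/test')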
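A round trip makes the role of base_name and root_dir clearer. The sketch below works in a temporary directory with invented names (test, a.log, backup, restored) and uses the zip format; shutil.unpack_archive is the standard-library counterpart for extraction.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()
source_dir = os.path.join(base, 'test')    # hypothetical folder to archive
os.makedirs(source_dir)
with open(os.path.join(source_dir, 'a.log'), 'w') as f:
    f.write('hello')

# base_name carries no extension; make_archive appends one for the format.
archive = shutil.make_archive(os.path.join(base, 'backup'), 'zip', root_dir=source_dir)
print(os.path.basename(archive))

# unpack_archive is the inverse; the format is inferred from the extension.
target = os.path.join(base, 'restored')
shutil.unpack_archive(archive, target)
restored = os.listdir(target)
print(restored)
```

Paths inside the archive are relative to root_dir, so the restored folder contains a.log directly rather than a nested test directory.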

shutil handles archives by calling the zipfile and tarfile modules. In detail:

zipfile

import zipfile

# compress
z = zipfile.ZipFile('laxi.zip', 'w')
z.write('a.log')
z.write('data.data')
z.close()

# extract
z = zipfile.ZipFile('laxi.zip', 'r')
z.extractall()
z.close()

tarfile

import tarfile

# compress
tar = tarfile.open('your.tar', 'w')
tar.add('/Users/wupeiqi/PycharmProjects/bbs2.log', arcname='bbs2.log')
tar.add('/Users/wupeiqi/PycharmProjects/cmdb.log', arcname='cmdb.log')
tar.close()

# extract
tar = tarfile.open('your.tar', 'r')
tar.extractall()  # an extraction path can be given
tar.close()

Learning Python, Day 6: 1. Installing and importing modules, 3. Serialization, 4. time & datetime, 6. Special variables in modules, 7. sys, 8. os, 9. hashlib, 10. re, 11. ConfigParser, 12. xml, 13. shutil
