Regular Expression Basics (Assertions)
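1. Matching Word Boundaries
 \b matches a word boundary: a position with a word character on one side and no word character on the other (either a non-word character or the start or end of the string). A word character is any character that \w matches. For example:
 import re
 print(re.findall(r"\b\w+\b", "a sentence\tcontains\na lot of words"))
 # => ['a', 'sentence', 'contains', 'a', 'lot', 'of', 'words']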
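
 A common use of \b is whole-word search. The following is a minimal sketch; the sample text and the variable names loose and strict are made up for illustration and do not come from the original example.

```python
import re

text = "cat concatenate cat's bobcat cat_food"

# Without \b, "cat" is also found inside longer words.
loose = re.findall(r"cat", text)

# With \b on both sides, only standalone occurrences match.
# "cat_food" is rejected because "_" counts as a word character (\w matches it).
strict = re.findall(r"\bcat\b", text)

print(len(loose), len(strict))  # 5 2
```

 The two strict matches are the bare "cat" and the "cat" in "cat's", since the apostrophe is not a word character.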

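 A word boundary matches a position, not text. Elements that match positions like this are called anchors; other commonly used anchors are ^ and $.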
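
 To see that an anchor consumes no characters, one can list where \b matches and what it matched. This is only an illustrative sketch; the string "ab cd" and the names matches and positions are arbitrary choices.

```python
import re

matches = list(re.finditer(r"\b", "ab cd"))

# Every match is the empty string; only its position carries information.
positions = [m.start() for m in matches]
print(positions)                           # [0, 2, 3, 5]
print(all(m.group() == "" for m in matches))  # True
```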
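 By default ^ matches only at the start of the whole string. To make it also match at the position right after each newline, the simplest way is to put (?m), the multiline mode modifier, at the front of the regular expression. For example:
 string = "first line\nsecond line\r\nlast line"
 lineBeginWordRegex = r"(?m)^\w+"
 print(re.findall(lineBeginWordRegex, string))
 # => ['first', 'second', 'last']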
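
 For comparison, here is a small sketch of the same search without multiline mode, and with the mode passed as the re.MULTILINE flag instead of the inline (?m) modifier. The variable names are illustrative only.

```python
import re

text = "first line\nsecond line\r\nlast line"

# Without multiline mode, ^ matches only at the very start of the string.
without_flag = re.findall(r"^\w+", text)

# The inline modifier and the flag argument are two spellings of the same mode.
inline_flag = re.findall(r"(?m)^\w+", text)
flag_arg = re.findall(r"^\w+", text, re.MULTILINE)

print(without_flag)  # ['first']
print(inline_flag)   # ['first', 'second', 'last']
print(flag_arg)      # ['first', 'second', 'last']
```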

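 Substituting at ^ and $
 Because ^ and $ match positions, a substitution at them inserts the replacement text at the start or end of each line without removing any characters.
 plainText = "line1\nline2\nline3"
 # Both replacement strings are empty here, so the text comes back unchanged.
 print(re.sub(r"(?m)$", "", re.sub(r"(?m)^", "", plainText)))
 # => line1
 #    line2
 #    line3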
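
 With non-empty replacement strings the insertion becomes visible. In this sketch the markers "> " and ";" are arbitrary values chosen for illustration and are not taken from the original example.

```python
import re

plain_text = "line1\nline2\nline3"

# ^ in multiline mode matches at the start of every line: insert a prefix there.
prefixed = re.sub(r"(?m)^", "> ", plain_text)

# $ in multiline mode matches at the end of every line: insert a suffix there.
wrapped = re.sub(r"(?m)$", ";", prefixed)

print(wrapped)
# > line1;
# > line2;
# > line3;
```

 Note that if the input ended with a newline, $ would match both before that final newline and at the very end of the string, so the suffix would be inserted twice.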

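 Use r"(?m)^\s+" to remove whitespace at the start of each line, and r"(?m)\s+$" to remove whitespace at the end of each line. Since \s also matches newlines, blank lines are removed as well.
 withSpace = " begin\n between\t\n\nend"
 beginSpaceRegex = r"(?m)^\s+"
 trimmedLeadingSpace = re.sub(beginSpaceRegex, "", withSpace)
 print(trimmedLeadingSpace)
 # => begin
 #    between    (the trailing tab is still there)
 #    end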
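
 The trailing-whitespace pattern can be sketched the same way. The second pattern, which uses [ \t] instead of \s, is an alternative added here for illustration (it is not from the original text) for the case where blank lines should be kept.

```python
import re

with_space = " begin\n between\t\n\nend"

# \s+ also matches newlines, so the tab and the newline after it are removed
# together, which swallows the blank line.
trimmed_trailing = re.sub(r"(?m)\s+$", "", with_space)

# Restricting the class to spaces and tabs leaves every newline in place.
trimmed_keep_blank = re.sub(r"(?m)[ \t]+$", "", with_space)

print(repr(trimmed_trailing))    # ' begin\n between\nend'
print(repr(trimmed_keep_blank))  # ' begin\n between\n\nend'
```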
