Using Regular Expressions with C++ regex (Part 2)

2014-11-24 08:48:33
\W             not word                  any character that is not an alphanumeric or underscore character
\character     character                 the given character as it is, without interpreting its special meaning within a regex expression. Any character can be escaped except those which form any of the special character sequences above. Needed for: ^ $ \ . * + ? ( ) [ ] { } |
[class]        character class           the target character is part of the class
[^class]       negated character class   the target character is not part of the class

Note that in a C++ string literal the backslash (\) is itself an escape character, so every backslash in a pattern has to be written twice:

#include <regex>

std::regex e1("\\d");   // \d -> matches a digit character
std::regex e2("\\\\");  // \\ -> matches a backslash character
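As an alternative to doubling every backslash, a C++11 raw string literal hands the pattern to std::regex unchanged. The following is only a minimal sketch; the helper names is_digit_escaped, is_digit_raw and is_backslash are made up for illustration and are not part of the library.

```cpp
#include <regex>
#include <string>

// Hypothetical helpers for illustration only.
// The first two regexes hold the same pattern (backslash followed by d);
// only the way the C++ literal is written differs.
bool is_digit_escaped(const std::string& s) {
    static const std::regex re("\\d");    // ordinary literal, backslash doubled
    return std::regex_match(s, re);
}

bool is_digit_raw(const std::string& s) {
    static const std::regex re(R"(\d)");  // raw literal, no C++ escaping applied
    return std::regex_match(s, re);
}

// The regex sees two backslashes, which match one literal backslash.
bool is_backslash(const std::string& s) {
    static const std::regex re(R"(\\)");
    return std::regex_match(s, re);
}
```

With these helpers, is_digit_escaped("7") and is_digit_raw("7") both return true, and is_backslash("\\") returns true for a string holding a single backslash.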

Quantifiers

characters   times                 effects
*            0 or more             The preceding atom is matched 0 or more times.
+            1 or more             The preceding atom is matched 1 or more times.
?            0 or 1                The preceding atom is optional (matched either 0 times or once).
{int}        int                   The preceding atom is matched exactly int times.
{int,}       int or more           The preceding atom is matched int or more times.
{min,max}    between min and max   The preceding atom is matched at least min times, but not more than max.
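The quantifiers in the table can be tried out with std::regex_match, which succeeds only if the whole string matches. This is a small sketch; full_match is a hypothetical helper introduced here, not a library function.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if the entire string s matches the pattern.
bool full_match(const std::string& s, const std::string& pattern) {
    return std::regex_match(s, std::regex(pattern));
}
```

For example, full_match("", "a*") is true while full_match("", "a+") is false; "colou?r" accepts both "color" and "colour"; "a{3}" accepts only "aaa"; "a{2,}" accepts two or more a's; and "a{2,3}" accepts "aa" and "aaa" but rejects "aaaa".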

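Note: quantifiers are greedy by default, and appending ? makes them lazy. Matching the pattern "(a+).*" against "aardvark" captures "aa" in the group, whereas "(a+?).*" captures only "a".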
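The greedy and lazy behaviour on "aardvark" can be checked with a short sketch like the one below. The helper first_group is hypothetical and exists only for this illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the text captured by group 1 when the whole
// string matches the pattern, or an empty string otherwise.
std::string first_group(const std::string& s, const std::string& pattern) {
    std::smatch m;
    if (std::regex_match(s, m, std::regex(pattern)) && m.size() > 1) {
        return m[1].str();
    }
    return "";
}
```

Here first_group("aardvark", "(a+).*") yields "aa", because a+ takes as many a's as it can, while first_group("aardvark", "(a+?).*") yields "a", because a+? stops as soon as the rest of the pattern can match.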

Groups (used to match a sequence of several characters):

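characters       description     effects
(subpattern)     Group           Creates a backreference.
(?:subpattern)   Passive group   Does not create a backreference.

Note: the first form creates a backreference, which can be used to extract the matched text. The second form does not, so it also avoids the cost of recording it.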
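The difference between the two group forms shows up in the number of submatches a regex records. The sketch below assumes a date format of the form "2014-11-24" purely for illustration; capture_count and year_of are hypothetical helpers.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: number of capturing groups in a pattern.
unsigned capture_count(const std::string& pattern) {
    return std::regex(pattern).mark_count();
}

// Hypothetical helper: extracts the year from a date such as "2014-11-24".
// Only the year is captured; month and day sit in a passive (?:...) group.
std::string year_of(const std::string& date) {
    static const std::regex re(R"((\d{4})(?:-\d{2}){2})");
    std::smatch m;
    return std::regex_match(date, m, re) ? m[1].str() : "";
}
```

capture_count("(ab)(cd)") is 2, capture_count("(?:ab)(cd)") is 1, and year_of("2014-11-24") returns "2014".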

characters   description         condition for match
^            Beginning of line   Either it is the beginning of the target sequence, or follows a line terminator.
$            End of line         Either it is the end of the target sequence, or precedes a line terminator.
|            Separator           Separates two alternative patterns or subpatterns.
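A brief sketch of ^, $ and | using std::regex_search, which looks for the pattern anywhere in the input. The helper found is hypothetical. The inputs are deliberately single-line, since the treatment of line terminators may differ between standard library implementations.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if the pattern is found anywhere in s.
bool found(const std::string& s, const std::string& pattern) {
    return std::regex_search(s, std::regex(pattern));
}
```

On "hello world", the pattern "^hello" is found but "^world" is not, and "world$" is found but "hello$" is not. The pattern "cat|dog" is found in "I like cats" and in "hot dog", but not in "I like birds".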

Individual characters

[abc] matches a, b, or c.
[^xyz] matches any character other than x, y, or z.

Ranges
[a-z] matches any lowercase letter (a, b, c, ..., z).
[abc1-5] matches a, b, c, or a digit from 1 to 5.
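The character classes and ranges above can be explored by counting how many characters of a string they match. This is a sketch only; count_matches is a hypothetical helper built on std::sregex_iterator.

```cpp
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Hypothetical helper: counts the non-overlapping matches of pattern in s.
std::ptrdiff_t count_matches(const std::string& s, const std::string& pattern) {
    const std::regex re(pattern);  // must outlive the iterators below
    return std::distance(std::sregex_iterator(s.begin(), s.end(), re),
                         std::sregex_iterator());
}
```

For instance, count_matches("cabbage", "[abc]") is 5, count_matches("xyz-abc", "[^xyz]") is 4, count_matches("Hello World", "[a-z]") is 8, and count_matches("a1b6c3", "[abc1-5]") is 5, since the digit 6 falls outside the range 1-5.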

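C++ regex also supports POSIX-like character class names, which are written inside a bracket expression (for example [[:alpha:]]):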

class        description                               equivalent (with regex_traits, default locale)
[:alnum:]    alpha-numerical character                 isalnum
[:alpha:]    alphabetic character                      isalpha
[:blank:]    blank character                           isblank
[:cntrl:]    control character                         iscntrl
[:digit:]    decimal digit character                   isdigit
[:graph:]    character with graphical representation   isgraph
[:lower:]    lowercase letter                          islower
[:print:]    printable character                       isprint
[:punct:]    punctuation mark character                ispunct
[:space:]    whitespace character                      isspace
[:upper:]    uppercase letter                          isupper
[:xdigit:]   hexadecimal digit character               isxdigit
[:d:]        decimal digit character                   isdigit
[:w:]        word character                            isalnum (plus the underscore)
[:s:]        whitespace character                      isspace
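The class names in the table are only valid inside a bracket expression, so the brackets of the name come in addition to the enclosing ones, as in [[:digit:]]. A small sketch follows; is_identifier and is_hex are hypothetical helpers chosen for this example.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if s looks like a C-style identifier, i.e. a
// letter or underscore followed by any number of word characters.
bool is_identifier(const std::string& s) {
    static const std::regex re("[[:alpha:]_][[:w:]]*");
    return std::regex_match(s, re);
}

// Hypothetical helper: true if s consists only of hexadecimal digits.
bool is_hex(const std::string& s) {
    static const std::regex re("[[:xdigit:]]+");
    return std::regex_match(s, re);
}
```

is_identifier("_tmp42") is true and is_identifier("1abc") is false; is_hex("DEADbeef") is true and is_hex("xyz") is false.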

References:

http://blog.csdn.net/mycwq/article/details/18838151

http://www.cplusplus.com/reference/regex/