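Master: So far we have covered character classes, negated character classes, shorthand character classes, parentheses and their various uses, quantifiers, and anchors and lookaround. Next comes a very useful regular-expression feature: match modes.
Apprentice: Hey, not bad!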
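Match modes:
· Purpose: change the matching rules of certain constructs
· Forms:
  · I: Case Insensitive. Matching ignores letter case.
  · S: SingleLine (dot all). The dot matches any character, including a newline.
  · M: MultiLine. ^ and $ match at the start and end of every line of text, not only at the start and end of the whole input.
  · X: Comment. Allows comments to be added inside a complex regular expression.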
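The M and X modes are only described in the list above, so here is a minimal sketch of both, assuming Java's Pattern.MULTILINE and Pattern.COMMENTS flags. The class name MultiLineAndCommentsDemo, the helper findAll, and the sample strings are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name and sample strings are illustrative only.
class MultiLineAndCommentsDemo {

    // Collects every match of the regex in the input, compiled with the given flags.
    static List<String> findAll(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        List<String> hits = new ArrayList<>();
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "first line\nsecond line\nthird line";

        // Default mode: ^ matches only at the very start of the input.
        System.out.println(findAll("^\\w+", 0, text));                 // [first]

        // M mode: ^ also matches right after each line terminator.
        System.out.println(findAll("^\\w+", Pattern.MULTILINE, text)); // [first, second, third]

        // X mode: whitespace in the pattern is ignored and # starts a comment
        // that runs to the end of the line.
        String commented =
              "\\d{4}   # year\n"
            + "-        # separator\n"
            + "\\d{2}   # month";
        System.out.println(findAll(commented, Pattern.COMMENTS, "released 2024-05")); // [2024-05]
    }
}
```

Passing 0 as the flags value means "no mode set", which makes it easy to compare the default behaviour with each mode side by side.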
I: Case-insensitive mode
· Purpose: letter case is ignored when matching
Example:
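import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CaseInsensitive {
    public static void main(String[] args) {
        String str = "abc";
        String regex = "ABC";
        // No flags: matching is case-sensitive by default.
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(str);
        if (m.find()) {
            System.out.println(str + " matches the regex: " + regex);
        } else {
            System.out.println(str + " does not match the regex: " + regex);
        }
    }
}
Output:
abc does not match the regex: ABC
Matching is case-sensitive by default, so the match fails.
Now compile the pattern in case-insensitive mode:
        Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
Output:
abc matches the regex: ABC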
int java.util.regex.Pattern.CASE_INSENSITIVE = 2 [0x2]
CASE_INSENSITIVE
public static final int CASE_INSENSITIVE
Enables case-insensitive matching.
By default, case-insensitive matching assumes that only characters in the US-ASCII charset are being matched. Unicode-aware case-insensitive matching can be enabled by specifying the UNICODE_CASE flag in conjunction with this flag.
Case-insensitive matching can also be enabled via the embedded flag expression (?i).
Specifying this flag may impose a slight performance penalty.
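The Javadoc above mentions the embedded flag (?i) and the UNICODE_CASE flag without showing either. The following is a rough sketch of how they behave; the class name EmbeddedCaseFlagDemo, the helper found, and the sample strings are invented for this illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names and sample strings are illustrative only.
class EmbeddedCaseFlagDemo {

    // True if the regex, compiled with the given flags, is found anywhere in the input.
    static boolean found(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        // (?i) at the start of the pattern acts like Pattern.CASE_INSENSITIVE.
        System.out.println(found("(?i)ABC", 0, "abc"));   // true

        // Placed mid-pattern, (?i) only affects the part that follows it.
        System.out.println(found("A(?i)BC", 0, "Abc"));   // true
        System.out.println(found("A(?i)BC", 0, "abc"));   // false

        // Non-ASCII letters need UNICODE_CASE as well. The two strings below are
        // U+00E9 and U+00C9, the lower- and upper-case e with an acute accent.
        String lower = "\u00e9";
        String upper = "\u00c9";
        System.out.println(found(lower, Pattern.CASE_INSENSITIVE, upper)); // false
        System.out.println(found(lower,
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE, upper));  // true
    }
}
```

The embedded form is handy when the regex comes from a configuration file or user input and there is no place to pass a flags argument.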
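S: Single-line mode
· Purpose: changes what the dot (.) matches, so that it also matches newlines
The earlier section on shorthand character classes pointed out that the dot does not match a newline.
With mode S specified, the dot matches newlines as well.
Example: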
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DotMatchAll {
    public static void main(String[] args) {
        // The input contains two newline characters (\n).
        String str = "<a href=www.sina.com.cn>\nSINA\n</a>";
        String regex = "<a href.*</a>";
        // No flags: the dot does not match line terminators.
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(str);
        if (m.find()) {
            System.out.println(str + " matches the regex: " + regex);
        } else {
            System.out.println(str + " does not match the regex: " + regex);
        }
    }
}
Output:
<a href=www.sina.com.cn>
SINA
</a> does not match the regex: <a href.*</a>
The input contains newlines (\nSINA\n), and by default .* cannot cross a newline, so the match fails.
Now change the match mode:
        Pattern p = Pattern.compile(regex, Pattern.DOTALL);
Output:
<a href=www.sina.com.cn>
SINA
</a> matches the regex: <a href.*</a>
Notes:
int java.util.regex.Pattern.DOTALL = 32 [0x20]
DOTALL
public static final int DOTALL
Enables dotall mode.
In dotall mode, the expression . matches any character, including a line terminator. By default this expression does not match line terminators.
Dotall mode can also be enabled via the embedded flag expression (?s). (The s is a mnemonic for "single-line" mode, which is what this is called in Perl.)
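As the Javadoc notes, dot-all mode can also be switched on from inside the pattern with (?s). Here is a small sketch reusing the SINA string from the example above; the class name EmbeddedDotAllDemo and the helper firstMatch are made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name and helper are illustrative only.
class EmbeddedDotAllDemo {

    // Returns the first match of the regex in the input, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String html = "<a href=www.sina.com.cn>\nSINA\n</a>";

        // Without (?s) the dot stops at the first newline, so there is no match.
        System.out.println(firstMatch("<a href.*</a>", html) != null);      // false

        // (?s) turns on dot-all mode from inside the pattern.
        System.out.println(firstMatch("(?s)<a href.*</a>", html) != null);  // true

        // Flags can share one group: (?is) is case-insensitive plus dot-all.
        System.out.println(firstMatch("(?is)<A HREF.*</A>", html) != null); // true
    }
}
```

The same combination can be requested through the API by OR-ing the constants, as in Pattern.CASE_INSENSITIVE | Pattern.DOTALL, since each flag is a separate bit (2 and 32 in the excerpts above).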