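Check whether the given text contains Chinese characters
issue: this method only detects part of the CJK characters (the CJK Unified Ideographs block, U+4E00 to U+9FA5)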
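import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Returns true if str contains at least one character in U+4E00..U+9FA5.
public boolean isChineseWord(String str) {
    Pattern p = Pattern.compile("[\u4e00-\u9fa5]");
    Matcher m = p.matcher(str);
    // find() succeeds on a match anywhere in str; matches() would only
    // succeed when the whole string is exactly one Chinese character.
    return m.find();
}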
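issue: a broader check for Chinese text that classifies each char by its Unicode block (covers Chinese ideographs and punctuation). It is not exact: the punctuation and halfwidth/fullwidth blocks also contain non-Chinese characters.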
// Returns true if c falls in a Unicode block commonly used for Chinese text.
// Note: a single char cannot hold a supplementary code point, so the
// EXTENSION_B comparison never matches here; Character.UnicodeBlock.of(int)
// is needed to classify those characters.
private static boolean isChinese(char c) {
    Character.UnicodeBlock ub = Character.UnicodeBlock.of(c);
    return ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS
        || ub == Character.UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS
        || ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A
        || ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_B
        || ub == Character.UnicodeBlock.CJK_SYMBOLS_AND_PUNCTUATION
        || ub == Character.UnicodeBlock.HALFWIDTH_AND_FULLWIDTH_FORMS
        || ub == Character.UnicodeBlock.GENERAL_PUNCTUATION;
}
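
The per-char check above works on one UTF-16 unit at a time. As a rough sketch of a whole-string variant, the example below walks the text by code point and tests the Unicode script property instead of a list of blocks. This is a swapped-in technique, not the method from the snippets above, and the names ChineseTextCheck, isHan and containsChinese are made up for illustration. It assumes Java 8 or later. Note that the Han script also covers ideographs used in Japanese and Korean text, and that punctuation is deliberately not counted.

```java
// Hypothetical helper class; names are illustrative only.
final class ChineseTextCheck {
    private ChineseTextCheck() {}

    // True if the code point belongs to the Han script
    // (CJK ideographs, including those outside the BMP).
    public static boolean isHan(int codePoint) {
        return Character.UnicodeScript.of(codePoint) == Character.UnicodeScript.HAN;
    }

    // True if any code point in text is a Han ideograph.
    // codePoints() joins surrogate pairs, so supplementary
    // characters are seen as single code points.
    public static boolean containsChinese(String text) {
        return text.codePoints().anyMatch(ChineseTextCheck::isHan);
    }

    public static void main(String[] args) {
        System.out.println(containsChinese("hello"));            // ASCII only
        System.out.println(containsChinese("abc\u4e2d\u6587"));  // "abc" + two ideographs
        System.out.println(containsChinese("\uD840\uDC00"));     // U+20000, outside the BMP
        System.out.println(containsChinese("\uFF0C"));           // fullwidth comma only
    }
}
```

If punctuation should count as Chinese as well, the block comparisons from isChinese could be added to isHan by calling Character.UnicodeBlock.of(int) on the same code point.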