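I need to validate Hebrew text in the body of a letter. Translated from Hebrew, the text reads:
Hello,
A video consultation with the patient John Salivan has been scheduled. The consultation is set for 23 February 2017 at 20:45.
To carry out the consultation, you must enter
But my regular expression does not match the text: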
    public static void findBadLines(String fileName) {
        Pattern regexp = Pattern.compile(".*שלום,.*תואם ייעוץ וידאו עם המטופל John Salivan. .*מועד הייעוץ נקבע לתאריך .* בשעה.*..*לביצוע הייעוץ יש להכנס .*");
        Matcher matcher = regexp.matcher("");
        Path path = Paths.get(fileName);
        //another way of getting all the lines:
        //Files.readAllLines(path, ENCODING);
        try (
            BufferedReader reader = Files.newBufferedReader(path, ENCODING);
            LineNumberReader lineReader = new LineNumberReader(reader);
        ) {
            String line = null;
            while ((line = lineReader.readLine()) != null) {
                matcher.reset(line); //reset the input
                if (!matcher.find()) {
                    String msg = "Line " + lineReader.getLineNumber() + " is bad: " + line;
                    throw new IllegalStateException(msg);
                }
            }
        }
        catch (IOException ex) {
            ex.printStackTrace();
        }
    }
    final static Charset ENCODING = StandardCharsets.UTF_8;
}
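Am I correct that you want to check whether the given input contains any Hebrew text?
If so, use the regex .*[\u0590-\u05ff]+.*
[\u0590-\u05ff]+ matches one or more Hebrew characters, and the .* before and after it match the rest of the input on either side: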
Pattern regexp = Pattern.compile(".*[\u0590-\u05ff]+.*");
//...
matcher.reset(line); //reset the input
if (!matcher.matches()) {
    String msg = "Line " + lineReader.getLineNumber() + " is bad: " + line;
    throw new IllegalStateException(msg);
}
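For anyone who wants to try this without a file, here is a minimal self-contained sketch of the same idea. The class name HebrewCheck and both method names are made up for illustration and are not from the question. It assumes that "contains Hebrew" means at least one character in the range U+0590 to U+05FF, and it uses find() on the bare character class rather than matches() with the surrounding .*, which should give the same answer for a single line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption, not from the question.
final class HebrewCheck {
    // One or more characters in the range U+0590 to U+05FF (the Hebrew block).
    private static final Pattern HEBREW = Pattern.compile("[\\u0590-\\u05FF]+");

    // True if the line contains at least one Hebrew character.
    static boolean containsHebrew(String line) {
        return HEBREW.matcher(line).find();
    }

    // Returns the 1-based numbers of the lines that contain no Hebrew characters.
    static List<Integer> linesWithoutHebrew(List<String> lines) {
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            if (!containsHebrew(lines.get(i))) {
                bad.add(i + 1);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Hebrew is written with Unicode escapes so the source compiles
        // regardless of the file encoding; "\u05E9\u05DC\u05D5\u05DD" is "shalom".
        List<String> lines = Arrays.asList(
                "\u05E9\u05DC\u05D5\u05DD,",
                "John Salivan",
                "\u05D1\u05E9\u05E2\u05D4 20:45");
        System.out.println(linesWithoutHebrew(lines)); // prints [2]
    }
}
```

Collecting the bad line numbers instead of throwing on the first one is only a design choice for the sketch; throwing an IllegalStateException as in the question works just as well.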