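Given the text below, I want to extract the value inside the quotes for a given key, for example "hash". The value associated with hash runs from the opening quote to the closing quote, in this case: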
00000e96c46d15aeaaf9ef6f88a295a8f17207d4cd9ac074d2314680095befc854d5a00600602af2fe03a24b61566ca2d8a6b858b0af840309ae449316833923
My pattern is:
Scanner s = new Scanner(new File(path.toString()));
Pattern pattern = Pattern.compile("\"hash\": \".*\"");
String nextMatch = s.findWithinHorizon(pattern, 0);
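Explanation of the pattern: I look anywhere in the text for a quote, followed by the word hash and another quote. Then comes a ":" followed by one space. After that, any amount of text may appear until another quote shows up.
Sadly, this pattern does not work, and I don't understand why.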
{"hash": "00000e96c46d15aeaaf9ef6f88a295a8f17207d4cd5ac074d231468009500f4856d5a00600602af2fe03a24b61566ca2d8a6b858b0af840309ae449316833923", "block": "{\"type\": \"block\", \"transactions\": [], \"timestamp\": \"2017-09-07T07:09:52.628676\", \"reward\": \"3e16c6d7f08f04f5067dc9a2d0c01015c1af848a1fcd6c64eef039c9c5d8e737c0655a97b6bc876854a34ad94fcd29218524c6c7881bd1ae4a9279edc12f95720d8a010d9a4c7dd19a4415bed2687fb462d95da8436954b5fd82d92b98935650a1fd7fa215ba95e8b20d8594c50cb9a8bc683af32133c007bc0dff3edd36e0c20688385891788de63a5adcbb\", \"difficulty\": \"0\", \"nonce\": \"feec6d57f31d8aee18889026e4e484d96de6b874013a1932018e809c60c45019033389671dcc2e3138a555705cec95e365d79d3e68a909efcf15d0d137770131\", \"parent\": \"00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000\"}", "type": "block_hash"}
My whole code:
public class TryToStream {
    static String url = "SorryICantShowYouThatOne";
    static String charset = "UTF-8";

    public static void main(String[] args) throws IOException, ParseException {
        JSONParser parser = new JSONParser();
        URL getURL = new URL(url + "get?start_at=");
        int counter = 0;
        boolean inputAvail = true;
        //clear textfile
        PrintWriter pw = new PrintWriter("jsonFormatted.txt");
        URL tmpURL = new URL(url + "get?start_at=" + counter);
        URLConnection connection = tmpURL.openConnection();
        InputStream is = connection.getInputStream();
        JSONArray json = (JSONArray) parser.parse(new BufferedReader(new InputStreamReader(is)));
        // FileOutputStream fos = new FileOutputStream(new File("output2.txt"), true);
        BufferedWriter bw = new BufferedWriter(new FileWriter("jsonFormattedStream.txt"));
        bw.write(json.toJSONString());
        bw.close();
        Iterator iter = json.iterator();
        boolean flagForTesting = true;
        BufferedWriter bw2 = new BufferedWriter(new FileWriter("jsonFormatted.txt"));
        Pattern pattern = Pattern.compile("\"hash\": \"(.*?)\"");
        while (iter.hasNext() && flagForTesting) {
            Matcher matcher = pattern.matcher(iter.next().toString());
            matcher.find();
            System.out.println(matcher.group(1));
            flagForTesting = false;
        }
        bw2.close();
        System.out.println("End");
    }
}
If I try to match with the suggested regex, I get no match.
The result of iter.next():
{"block": "{\"type\": \"block\", \"transactions\": [], \"timestamp\": \"2017-09-07T07:09:52.628676\", \"reward\": \"dd19a4415bed2687fb462d95da8436954b5fd82d92b98935650a1fd7fa215ba95e8b20d8594c50cb9a8bc683af32133c007bc0dff3edd36e0c20688385891788de63a5adcbb\", \"difficulty\": \"0\", \"nonce\": \"feec6d57f31d8aee18889026e4e484d96de6b874013a1932018e809c60c45019033389671dcc2e3138a555705cec95e365d79d3e68a909efcf15d0d137770131\", \"parent\": \"00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000\"}", "type": "block_hash", "hash": "00000e96c46d15aeaaf9ef6f88a295a8f17207d4cd9ac074d2314680095befc854d5a00600602af2fe03a24b61566ca2d8a6b858b0af840309ae449316833923"}
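Your regex is almost there!
The problem with your regex is that .* is greedy: it matches as much as it can, so it keeps going until the last quote in the string. In your input that means the match runs all the way through "block_hash". You just need to make the quantifier lazy, so that it stops matching at the first quote it encounters.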
"hash": ".*?" // notice the question mark!
Now the regex matches:
"hash": "00000e96c46d15aeaaf9ef6f88a295a8f17207d4cd9ac074d2314680095befc854d5a00600602af2fe03a24b61566ca2d8a6b858b0af840309ae449316833923"
If you want to capture only what is inside the quotes, I suggest adding a capturing group:
"hash": "(.*?)"
You can use this regex like this:
Pattern pattern = Pattern.compile("\"hash\": \"(.*?)\"");
Matcher matcher = pattern.matcher(yourString);
matcher.find();
System.out.println(matcher.group(1));
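To see the greedy and lazy behaviour side by side, here is a small self-contained sketch. The class name HashRegexDemo, the helper methods and the shortened sample strings are made up for illustration and are not from the question. It also guards against a failed match, because calling matcher.group(1) after find() has returned false throws an IllegalStateException. The tolerant pattern with \s* is only an assumption about what might help if the text you match against has no space after the colon.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HashRegexDemo {
    // Tolerant pattern (an assumption, not from the question): \s* allows any
    // amount of whitespace around the colon, and [^"]* stops at the next quote.
    private static final Pattern HASH =
            Pattern.compile("\"hash\"\\s*:\\s*\"([^\"]*)\"");

    // Returns the whole first match of regex in input, or null if there is none.
    public static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    // Returns the captured hash value, or an empty Optional if nothing matched.
    public static Optional<String> extractHash(String text) {
        Matcher m = HASH.matcher(text);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Shortened, made-up sample in the same shape as the question's data.
        String sample = "{\"hash\": \"00ab\", \"type\": \"block_hash\"}";

        // Greedy: prints "hash": "00ab", "type": "block_hash"
        System.out.println(firstMatch("\"hash\": \".*\"", sample));
        // Lazy: prints "hash": "00ab"
        System.out.println(firstMatch("\"hash\": \".*?\"", sample));

        // Same kind of data, but without the space after the colon.
        String compact = "{\"type\":\"block_hash\",\"hash\":\"00ab\"}";
        // The pattern with a literal space finds nothing here: prints null
        System.out.println(firstMatch("\"hash\": \".*?\"", compact));
        // The tolerant pattern still finds the value: prints 00ab
        System.out.println(extractHash(compact).orElse("no match"));
    }
}
```

If the lazy pattern finds nothing on your real data, comparing the exact whitespace around the colon in the string you pass to the matcher is a reasonable first check.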