I am trying to format the following file.
[30-05-2013 15:45:54] A A
[26-06-2013 14:44:44] B A
[26-06-2013 14:44:44] C A
[26-06-2013 14:43:16] Some lines are so large, they take multiple lines, so explode('\n') won't work because
I need the complete message
[26-06-2013 14:44:44] E A
[26-06-2013 14:44:44] F A
[26-06-2013 14:44:44] G A
Expected output:
Array
(
[0] => [30-05-2013 15:45:54] A A
[1] => [26-06-2013 14:44:44] B A
[2] => [26-06-2013 14:44:44] C A
[3] => [26-06-2013 14:43:16] Some lines are so large, they take multiple lines, so
explode('\n') won't work because
I need the complete message
[4] => [26-06-2013 14:44:44] E A
...
)
The regex I tried:
(?<=\[)(.+)(?<=\])(.+)
It is used in the following PHP code:
#!/usr/bin/env php
<?php
class Chat {
    function __construct() {
        // Read chat file
        $this->f = file_get_contents(__DIR__ . '/testchat.txt');
        // Split on '[\d]'
        $r = "/(?<=\[)(.+)(?<=\])(.+)/";
        $l = preg_split($r, $this->f, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
        var_dump(count($l));
        var_dump($l);
    }
}
$c = new Chat();
This gives me the following output:
array(22) {
[0]=>
string(1) "["
[1]=>
string(20) "30-05-2013 15:45:54]"
[2]=>
string(4) " A A"
[3]=>
string(2) "
["
[4]=>
string(20) "26-06-2013 14:44:44]"
[5]=>
string(4) " B A"
[6]=>
string(2) "
["
[7]=>
string(20) "26-06-2013 14:44:44]"
[8]=>
string(4) " C A"
[9]=>
string(2) "
["
[10]=>
string(20) "26-06-2013 14:43:16]"
[11]=>
string(87) " Some lines are so large, they take multiple lines, so explode('\n') won't work because"
[12]=>
string(30) "
I need the complete message
["
Questions:
Why is the [ ignored?
And PREG_SPLIT_NO_EMPTY?

With preg_split, you can use
'~\R+(?=\[\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}])~'
See the regex demo.
Details
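- \R+ - one or more line breaks
- (?=\[\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}]) - a positive lookahead that requires the following immediately to the right of the current position:
  - \[ - a [ char
  - \d{2}-\d{2}-\d{4} - a date-like pattern: 2 digits, hyphen, 2 digits, hyphen, 4 digits
  - a space
  - \d{2}:\d{2}:\d{2}] - a time-like pattern (2 digits, :, 2 digits, :, 2 digits) followed by a ] char
PHP demo: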
$text = "[30-05-2013 15:45:54] A A
[26-06-2013 14:44:44] B A
[26-06-2013 14:44:44] C A
[26-06-2013 14:43:16] Some lines are so large, they take multiple lines, so explode('\n') won't work because
I need the complete message
[26-06-2013 14:44:44] E A
[26-06-2013 14:44:44] F A
[26-06-2013 14:44:44] G A";
print_r(preg_split('~\R+(?=\[\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}])~', $text));
Output:
Array
(
[0] => [30-05-2013 15:45:54] A A
[1] => [26-06-2013 14:44:44] B A
[2] => [26-06-2013 14:44:44] C A
[3] => [26-06-2013 14:43:16] Some lines are so large, they take multiple lines, so explode('
') won't work because
I need the complete message
[4] => [26-06-2013 14:44:44] E A
[5] => [26-06-2013 14:44:44] F A
[6] => [26-06-2013 14:44:44] G A
)
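When the log comes from a file rather than a string literal, it usually ends with a trailing line break. Here is a minimal sketch of how the split above could be wrapped for that case. The helper name splitChatEntries and the shortened sample text are assumptions for illustration, not part of the original answer.

```php
<?php
// Hypothetical helper (not from the original answer): split a chat log into entries.
function splitChatEntries(string $text): array
{
    // Split on line breaks that are immediately followed by a "[dd-mm-yyyy hh:mm:ss]" stamp.
    $pattern = '~\R+(?=\[\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}])~';
    // rtrim() drops trailing whitespace so the last entry does not end with a line break.
    $parts = preg_split($pattern, rtrim($text), -1, PREG_SPLIT_NO_EMPTY);
    // preg_split() returns false on failure; fall back to an empty list in that case.
    return $parts === false ? [] : $parts;
}

// Shortened sample: the second entry spans two lines.
$sample = "[30-05-2013 15:45:54] A A\n"
        . "[26-06-2013 14:43:16] first line\nsecond line\n"
        . "[26-06-2013 14:44:44] E A\n";

$entries = splitChatEntries($sample);
echo count($entries), PHP_EOL;
echo $entries[1], PHP_EOL;
```

The multi-line message stays in one element because a line break only counts as a separator when a timestamp follows it.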
In case you need more details than just the split pieces, you can use a matching approach:
'~^\[(\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2})]\s*+(.*?)(?=\s*^\[(?1)]|\z)~ms'
See the regex demo, and use it as
preg_match_all('~^\[(\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2})]\s*+(.*?)(?=\s*^\[(?1)]|\z)~ms', $text, $matches)
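A rough sketch of how that call could be used to collect the timestamp and the message of each entry. The sample text is shortened, and the PREG_SET_ORDER flag and the array keys time and message are my own choices here, not something the pattern requires.

```php
<?php
// Shortened sample: the second entry spans two lines.
$text = "[30-05-2013 15:45:54] A A\n"
      . "[26-06-2013 14:43:16] first line\nsecond line\n"
      . "[26-06-2013 14:44:44] E A";

$pattern = '~^\[(\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2})]\s*+(.*?)(?=\s*^\[(?1)]|\z)~ms';

$entries = [];
// PREG_SET_ORDER groups the results per match: $m[1] is the timestamp, $m[2] the message.
if (preg_match_all($pattern, $text, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $m) {
        $entries[] = ['time' => $m[1], 'message' => $m[2]];
    }
}
print_r($entries);
```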
The pattern matches:
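- ^ - start of a line
- \[(\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2})] - the datetime details (captured into Group 1)
- \s*+ - 0+ whitespaces (matched possessively)
- (.*?) - any 0+ chars, as few as possible, up to the first occurrence of
- (?=\s*^\[(?1)]|\z) - a positive lookahead that matches a location immediately followed by:
  - \s* - 0+ whitespaces
  - ^ - start of a line
  - \[(?1)] - a [, the Group 1 pattern, a ]
  - | - or
  - \z - the very end of the string.

Late answer, but you can also use: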
$text = file_get_contents("testchat.txt");
preg_match_all('/(\[.*?\])([^\[]+)/im', $text, $matches, PREG_PATTERN_ORDER);
for ($i = 0; $i < count($matches[0]); $i++) {
    $date = $matches[1][$i];
    $line = $matches[2][$i];
    print("$date $line");
}
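Note that [^\[]+ stops at the first [ it meets, so this pattern assumes a message never contains a [ itself. If the timestamps are needed as real dates, something along these lines might work. The sample text, the array keys when and message, and the use of DateTime::createFromFormat are additions for illustration.

```php
<?php
// Shortened sample: the second entry spans two lines.
$text = "[30-05-2013 15:45:54] A A\n"
      . "[26-06-2013 14:43:16] first line\nsecond line\n";

preg_match_all('/(\[.*?\])([^\[]+)/', $text, $matches, PREG_SET_ORDER);

$entries = [];
foreach ($matches as $m) {
    // Strip the surrounding brackets, then parse "dd-mm-yyyy hh:mm:ss".
    $when = DateTime::createFromFormat('d-m-Y H:i:s', trim($m[1], '[]'));
    if ($when === false) {
        continue; // skip anything in brackets that is not a valid timestamp
    }
    // trim() removes the leading space and the trailing line break of the message.
    $entries[] = ['when' => $when, 'message' => trim($m[2])];
}

foreach ($entries as $e) {
    echo $e['when']->format('Y-m-d H:i:s'), ' | ', $e['message'], PHP_EOL;
}
```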