PHP Perl regex - URL not preceded by an equals sign and an optional single or double quote

Question

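I am trying to create a Perl-compatible regex that matches a URL which is not preceded by an equals sign and an optional single or double quote, ignoring whitespace. The code below gives this error: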

Warning: preg_replace(): Compilation failed: lookbehind assertion is not fixed length at offset 0

I know my URL regex is not perfect, but I am more concerned with how to do the negative lookbehind, or how to express this some other way.

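For example, with the code below, the matches should contain http://www.url1.com/ and http://www.url3.com/, but none of the other URLs. How can I do this? As written, the code emits the warning and does not populate the $matches variable.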

PHP code:

$html = "
http://www.url1.com/
= ' http://www.url2.com/
'http://www.url3.com/
<a href='http://www.url4.com/'>Testing1</a>
<img src='https://url5.com'>Testing2</a>";

$url_pregex = '((http(s)?://)[-a-zA-Z()0-9@:%_+.~#?&;//=]+)';
$pregex = '(?<!\\s*=\\s*[\'"]?\\s*)'.$url_pregex;

preg_match('`'.$pregex.'`i', $html, $matches);

echo "Matches<br><pre>";
var_export($matches);
echo "</pre>";

The Perl regex as written in PHP, using ` instead of / as the delimiter:

'`(?<!\\s*=\\s*[\'"]?\\s*)((http(s)?://)[-a-zA-Z()0-9@:%_+.~#?&;//=]+)`i'

Answer 1

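One way to work around this is to use an alternation, where the first part matches URLs that are preceded by an equals sign and an optional quote, and the second part matches just the URL, which is then captured. This works because the first part of the alternation is always tested first, so only URLs that are not preceded by an equals sign and an optional quote are captured by the second part.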

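For simplicity I have removed the capture groups from your $url_pregex; if you want them, you will need to adjust the group number used on $matches in this code to get the full match.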

$html = "
http://www.url1.com/
= ' http://www.url2.com/
'http://www.url3.com/
<a href='http://www.url4.com/'>Testing1</a>
<img src = 'https://url5.com'>Testing2</a>";

$url_pregex = 'https?://[-a-zA-Z()0-9@:%_+.~#?&;//=]+';
$pregex = "\\s*=\\s*['\"]?\\s*$url_pregex|($url_pregex)";

preg_match_all('`' . $pregex . '`i', $html, $matches);

echo "Matches<br><pre>";
var_export(array_values(array_filter($matches[1])));
echo "</pre>";

Output:

Matches<br><pre>array (
  0 => 'http://www.url1.com/',
  1 => 'http://www.url3.com/',
)</pre>

Demo on 3v4l.org
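The array_filter and array_values calls in the answer's code are easier to follow once the raw capture group is visible. The sketch below reuses the answer's input and pattern unchanged; only the variable names $raw and $urls are introduced here for illustration.

```php
<?php
// Same sample input and pattern as in the answer.
$html = "
http://www.url1.com/
= ' http://www.url2.com/
'http://www.url3.com/
<a href='http://www.url4.com/'>Testing1</a>
<img src = 'https://url5.com'>Testing2</a>";

$url_pregex = 'https?://[-a-zA-Z()0-9@:%_+.~#?&;//=]+';
$pregex = "\\s*=\\s*['\"]?\\s*$url_pregex|($url_pregex)";

preg_match_all('`' . $pregex . '`i', $html, $matches);

// Group 1 has one entry per overall match. It is an empty string whenever
// the first branch ("= [quote] URL") was the one that matched.
$raw = $matches[1];

// array_filter() drops the empty strings but keeps the original keys,
// so array_values() renumbers the result from 0.
$urls = array_values(array_filter($raw));

var_export($raw);
echo "\n";
var_export($urls);
```

With this input, $raw should hold five entries, three of them empty strings for the rejected URLs, and $urls should hold only the two wanted URLs.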

Note that you need to use preg_match_all to get all of the matches in the text.
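As a variation on the same alternation idea (not part of the original answer), PCRE's backtracking control verbs (*SKIP)(*FAIL) can discard the unwanted branch inside the pattern itself, so no capture group or array_filter is needed. This is a minimal sketch that assumes the same sample input and URL sub-pattern; the variable names $url and $pattern are illustrative.

```php
<?php
$html = "
http://www.url1.com/
= ' http://www.url2.com/
'http://www.url3.com/
<a href='http://www.url4.com/'>Testing1</a>
<img src = 'https://url5.com'>Testing2</a>";

// Same URL sub-pattern as in the answer.
$url = 'https?://[-a-zA-Z()0-9@:%_+.~#?&;//=]+';

// Branch 1 consumes "= [quote] URL" and then forces a failure; (*SKIP)
// makes the next match attempt start after the consumed text, so the
// rejected URL is never retried by branch 2.
// Branch 2 matches any URL that remains.
$pattern = '`\s*=\s*[\'"]?\s*' . $url . '(*SKIP)(*FAIL)|' . $url . '`i';

// preg_match_all is still required to collect every match.
preg_match_all($pattern, $html, $matches);

var_export($matches[0]);
```

Here the wanted URLs are the full matches in $matches[0], so group numbering does not change if capture groups are later added back to the URL sub-pattern.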
