How can I remove empty <p> tags from a string using JavaScript or Cheerio?

Question (votes: 0, answers: 2)

I have some HTML as a string:

"<p>This is a slightly longer post about something. Let's see how long this lasts. Okay so this is one paragraph now. </p><p>​</p><p>Let's write another paragraph, and see how it renders when I read this post later. </p><p>​</p><p>This is another short paragraph</p>"

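How can I strip the empty p tags out of this string using Cheerio or plain JS?

I searched Stack Overflow and Google, but found no clear, working solution.

EDIT: Sorry, I just noticed that my string has quite a lot of whitespace between the tags.

Here is an example of what I get when I console.log it in my app: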

<p>This is a slightly longer post about something. Let's see how long this lasts. Okay so this is one paragraph now. </p>
<p>​</p>
<p>Let's write another paragraph, and see how it renders when I read this post later. </p>
<p>​</p>
<p>Let's write another paragraph, and see how it renders when I read this post later. </p>
javascript html tags cheerio
2 Answers
1
vote

I hope this helps:

var test = "<p>This is a slightly longer post about something. Let's see how long this lasts. Okay so this is one paragraph now. </p><p>​</p><p>Let's write another paragraph, and see how it renders when I read this post later. </p><p>​</p><p>This is another short paragraph</p>";

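// The "empty" paragraphs contain an invisible zero-width space (U+200B), written here as an escape.
var str = test.replace(/<p>\u200B<\/p>/gi, '');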

console.log(str);
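The pattern above only matches paragraphs whose sole content is that one invisible character. A more tolerant variant could look like the sketch below. It assumes that "empty" means a paragraph containing nothing but whitespace, `&nbsp;`, or zero-width characters, and that attributes on the tag should be ignored. The helper name `removeEmptyParagraphs` is hypothetical, and a regex is not a full HTML parser, so treat it as a starting point only.

```javascript
// Hypothetical helper; the definition of "empty" used here is an assumption.
function removeEmptyParagraphs(html) {
  // <p ...>, then only whitespace / &nbsp; / zero-width characters, then </p>.
  // \b keeps tags such as <pre> from matching; trailing whitespace is dropped too.
  const emptyParagraph = /<p\b[^>]*>(?:\s|&nbsp;|[\u200B-\u200D\uFEFF])*<\/p>\s*/gi;
  return html.replace(emptyParagraph, '');
}

const sample = '<p>First</p>\n<p>\u200B</p>\n<p>Second</p><p> </p><p class="x"></p>';
const cleaned = removeEmptyParagraphs(sample);
console.log(cleaned);
```

Because the pattern also swallows whitespace after each removed paragraph, it copes with the line breaks between tags mentioned in the question's EDIT.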

1
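vote

You can use .replace("<p></p>", "") if the tags have no attributes (note that a string pattern replaces only the first occurrence). If they do have attributes, there is another way besides using a regex to capture and replace the tags.

A good approach is to use the native DOM functions.

To remove empty elements, you can use the following selector: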

document.querySelectorAll("*:empty").forEach((x)=>{x.remove()});

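In your case, perhaps something like this would work:

var div = document.createElement("div");
// Put your variable containing the HTML here.
div.innerHTML = "<p>hello there</p><p class='empty'></p><p>Not empty</p><p></p>";
div.querySelectorAll("*:empty").forEach((x) => { x.remove(); });
// Output: div.innerHTML == "<p>hello there</p><p>Not empty</p>"
// Then use the remaining innerHTML as you wish.

Note, however, that :empty does not match elements that contain whitespace, such as <p> </p>. Also note that *:empty matches self-closing (void) elements such as <br> and <img>, so those would be removed as well.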


0
votes

You can use the replace method.

const str = "<p>This is some HTML code</p>";
const stripped = str.replace("<p>", "").replace("</p>", "");

console.log(stripped);
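One caveat with string patterns: `String.prototype.replace` called with a string (not a regex) replaces only the first occurrence. The sketch below contrasts it with `replaceAll` (ES2021, so it assumes a reasonably recent runtime) and with a global regex. The variable names are illustrative only.

```javascript
const html = '<p>a</p><p></p><p>b</p><p></p>';

// String pattern: only the first "<p></p>" is removed.
const firstOnly = html.replace('<p></p>', '');

// replaceAll with a string pattern removes every occurrence.
const all = html.replaceAll('<p></p>', '');

// A regex with the g flag does the same and works in older runtimes.
const viaRegex = html.replace(/<p><\/p>/g, '');

console.log(firstOnly); // <p>a</p><p>b</p><p></p>
console.log(all);       // <p>a</p><p>b</p>
console.log(viaRegex);  // <p>a</p><p>b</p>
```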

0
votes

You can simply replace the string "<p></p>" with the empty string "":

var str = "<p>This is a slightly longer post about something. Let's see how long this lasts. Okay so this is one paragraph now. </p><p></p><p>Let's write another paragraph, and see how it renders when I read this post later. </p><p></p><p>This is another short paragraph</p>";

str = str.replace(/<p>\s*<\/p>/ig, '');
str = str.replace(/<p\s*\/>/ig, '');

console.log(str);

0
votes

const regex = /<[^>]*>\s*<\/[^>]*>/;
const str = `<p>This is a slightly longer post about something. Let's see how long this lasts. Okay so this is one paragraph now</p><p></p><p>Let's write another paragraph, and see how it renders when I read this post later. </p><p></p><p>This is another short paragraph</p>`;
let m;

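if ((m = regex.exec(str)) !== null) {
  // The result can be accessed through the `m` variable.
  m.forEach((match, groupIndex) => {
    console.log(`Found match, group ${groupIndex}: ${match}`);
  });
}

Try this. Note that the snippet only finds the first empty tag; to remove all of them, add the g flag to the regex and call str.replace(regex, '').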
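The generic pattern above pairs any opening tag with any closing tag, so it would also match mismatched pairs such as `<br></p>`. A possible refinement, sketched below, uses a backreference so the closing tag must match the opening one, and repeats the replacement until nothing changes so that containers left empty by an earlier pass are removed too. The function name `removeEmptyElements` is hypothetical, and the sketch assumes that whitespace-only elements count as empty.

```javascript
// Hypothetical helper: removes elements whose content is empty or whitespace only.
function removeEmptyElements(html) {
  // \1 forces the closing tag name to equal the opening tag name.
  const emptyElement = /<([a-z][a-z0-9]*)\b[^>]*>\s*<\/\1>/gi;
  let previous;
  do {
    previous = html;
    html = html.replace(emptyElement, '');
  } while (html !== previous); // repeat: removing a child may empty its parent
  return html;
}

console.log(removeEmptyElements('<div><p></p></div><p>kept</p><span> </span>'));
// <p>kept</p>
```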


0
votes

You can try this.

let str = "<p>This is a slightly longer post about something. Let's see how long this lasts. Okay so this is one paragraph now. </p><p></p><p>Let's write another paragraph, and see how it renders when I read this post later. </p><p></p><p>This is another short paragraph</p>";

// Empty <p> elements that have attributes are removed as well.
str = str.replace(/<p(\s+[a-z0-9\-_\'\"=]+)*><\/p>/ig, '');

console.log(str);