In JavaScript, is there a way to iterate over the lexical tokens of a string?


Given this string that I receive from an endpoint:

"\u0000\u0000\u0000\u001A%<some-random-text\fcdtoolHxxx1-34e3-4069-b97c-xxxxxxxxxxx\u001E\n"

I want to iterate over the string and escape every sequence that starts with \u. The resulting string would be:

"\\u0000\\u0000\\u0000\\u001A%<some-random-text\fcdtoolHxxx1-34e3-4069-b97c-xxxxxxxxxxx\\u001E\n"

Note that \f and \n are not escaped. So how can I escape only the \u sequences?

Using a regular expression like this one does not work, because the \f and \n sequences are replaced as well, and they should stay unchanged.

function escapeUnicode(str: string) {
  // Replace every control character (U+0000 to U+001F) with its \uXXXX form.
  return str.replace(/[\u0000-\u001F]/gu, function (chr) {
    return "\\u" + ("0000" + chr.charCodeAt(0).toString(16)).slice(-4);
  });
}
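
To make the problem concrete, here is a small check written as a sketch in plain JavaScript. The name escapeAllControlChars is only illustrative; it mirrors the function above and shows that the form feed and the line feed get rewritten together with the other control characters.

```javascript
// Illustrative copy of the approach above, without the type annotation.
function escapeAllControlChars(str) {
  return str.replace(/[\u0000-\u001F]/g, (chr) =>
    "\\u" + chr.charCodeAt(0).toString(16).padStart(4, "0"));
}

// Three control characters: U+001A, form feed (U+000C), line feed (U+000A).
const input = "\u001A%<x\fy\n";
const output = escapeAllControlChars(input);

console.log(output); // prints \u001a%<x\u000cy\u000a
```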

There is also String.raw, but it only works when the string is passed as a literal. For example, in the code below I can use it on a literal:

let s = String.raw`\u0000\u0000\u0000\u001A%<deployment-deploymentStepStart\fcdtoolHb3dccc41-8cf0-4069`;
var escaped = String.raw``;

for (let i = 0, j = i + 1; i < s.length - 1; i++,j=i+1) {
  let curChar = String.fromCharCode(s.charCodeAt(i));
  let nextChar = String.fromCharCode(s.charCodeAt(j));
  if (curChar === "\\" && nextChar === "u") {
      escaped += String.raw`\\u`;
      i++;
  } else {
     escaped += curChar;
  }
}

escaped += String.fromCharCode(s.charCodeAt(s.length - 1));

console.log(escaped);

But as I mentioned above, the text comes from an endpoint, so if we store it in a variable and then try to run the same for loop, it does not work.

let someVariable = "\u0000\u0000\u0000\u001A%<deployment-deploymentStepStart\fcdtoolHb3dccc41-8cf0-4069"
let s = String.raw({raw: someVariable});
// ... rest of the code above
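
The likely reason is that escape sequences are resolved when the source code is parsed, so once the value sits in a variable there are no backslashes left for the loop to find. A minimal sketch of that behaviour follows; the variable names are illustrative.

```javascript
// Tagged literal: the backslashes survive, so this is 8 characters long.
const fromLiteral = String.raw`\u001A\f`;

// Ordinary string: the escapes are already resolved to 2 control characters.
const fromVariable = "\u001A\f";

// With no substitutions, String.raw just joins the characters back together.
const viaRaw = String.raw({ raw: fromVariable });

console.log(fromLiteral.length);          // 8
console.log(fromVariable.length);         // 2
console.log(viaRaw === fromVariable);     // true
console.log(fromVariable.includes("\\")); // false
```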
Tags: javascript unicode escaping utf-16 rawstring

2 Answers

3 votes

You can use JSON.stringify to achieve this:

var examplestring = `\u0000\u0000\u0000\u001A%<some-random-text\fcdtoolHxxx1-34e3-4069-b97c-xxxxxxxxxxx\u001E\n`
//basic example
console.log(examplestring)
console.log(JSON.stringify(examplestring))
console.log(JSON.stringify(examplestring).replaceAll('\\u','\\\\u'))

//using your example code:
var s = JSON.stringify(examplestring);
var escaped =  String.raw``;

for (let i = 0, j = i + 1; i < s.length - 1; i++, j = i + 1) {
  let curChar = String.fromCharCode(s.charCodeAt(i));
  let nextChar = String.fromCharCode(s.charCodeAt(j));
  if (curChar === "\\" && nextChar === "u") {
    escaped += String.raw`\\u`;
    i++;
  } else {
    escaped += curChar;
  }
}

escaped += String.fromCharCode(s.charCodeAt(s.length - 1));

console.log(escaped);
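
For reference, JSON.stringify writes control characters below U+0020 as \uXXXX except the ones that have a short form (\b, \t, \n, \f, \r), and it also wraps the result in double quotes and escapes backslashes and double quotes. If the goal is to leave \f and \n in the string as real characters and only spell out the remaining control characters, a sketch along these lines may be closer to the output asked for in the question. The names escapeControlChars and SHORT_ESCAPES are illustrative and not part of any library, and including \v in the set is an assumption (JSON.stringify itself would emit \u000b for it).

```javascript
// Control characters that a JavaScript string literal can write with a short escape.
// Assumption: these are the ones that should be left untouched.
const SHORT_ESCAPES = new Set(["\b", "\t", "\n", "\v", "\f", "\r"]);

function escapeControlChars(str) {
  return str.replace(/[\u0000-\u001F]/g, (chr) => {
    if (SHORT_ESCAPES.has(chr)) return chr; // keep \f, \n, etc. as they are
    const hex = chr.charCodeAt(0).toString(16).toUpperCase().padStart(4, "0");
    return "\\u" + hex; // a real backslash followed by uXXXX
  });
}

const received = "\u0000\u0000\u0000\u001A%<some-random-text\fcdtool\u001E\n";
console.log(escapeControlChars(received));
```

Because this works on the characters themselves rather than on backslashes, it can be applied to a value held in a variable, which is the case the question describes.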


0 votes

Here is a simpler regular expression for use with String.raw:

const escapeUnicode = (str) => str.replace(/\\u([\da-fA-F]{4})/g, (match, grp) => `\\\\u${grp}`)

console.log(escapeUnicode(String.raw`\u0000\u0000\u0000\u001A%<some-random-text\fcdtoolHxxx1-34e3-4069-b97c-xxxxxxxxxxx\u001E\n`))
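
Since the title asks about iterating over tokens, the same idea can be sketched as a small tokenizer built on matchAll. This assumes the text still contains literal backslashes (for example a String.raw literal or an unparsed response body); tokenizeEscapes and the token type names are hypothetical and chosen only for this illustration.

```javascript
// Hypothetical helper: split text with literal backslashes into tokens.
function* tokenizeEscapes(text) {
  // Order matters: \uXXXX first, then any other backslash pair,
  // then a run of ordinary characters, then a lone trailing backslash.
  const pattern = /\\u[0-9a-fA-F]{4}|\\[\s\S]|[^\\]+|\\/g;
  for (const match of text.matchAll(pattern)) {
    const value = match[0];
    let type = "text";
    if (/^\\u[0-9a-fA-F]{4}$/.test(value)) type = "unicode";
    else if (value.length === 2 && value[0] === "\\") type = "escape";
    yield { type, value };
  }
}

const raw = String.raw`\u001A%<a\fb\u001E\n`;
const tokens = [...tokenizeEscapes(raw)];

// Double the backslash only on the \uXXXX tokens.
const escapedRaw = tokens
  .map((t) => (t.type === "unicode" ? "\\" + t.value : t.value))
  .join("");

console.log(tokens.map((t) => t.type).join(","));
console.log(escapedRaw);
```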
