I need help splitting a string in JavaScript on spaces (" ") while ignoring spaces inside quoted expressions.
I have this string:
var str = 'Time:"Last 7 Days" Time:"Last 30 Days"';
I want the string to be split into 2 parts:
['Time:"Last 7 Days"', 'Time:"Last 30 Days"']
But my code splits it into 4:
['Time:', '"Last 7 Days"', 'Time:', '"Last 30 Days"']
This is my code:
str.match(/(".*?"|[^"\s]+)(?=\s*|\s*$)/g);
Thanks!
var s = 'Time:"Last 7 Days" Time:"Last 30 Days"';
s.match(/(?:[^\s"]+|"[^"]*")+/g)
// -> ['Time:"Last 7 Days"', 'Time:"Last 30 Days"']
Explanation:
(?: # non-capturing group
[^\s"]+ # anything that's not a space or a double-quote
| # or…
" # opening double-quote
[^"]* # …followed by zero or more characters that are not a double-quote
" # …closing double-quote
)+ # each match is one or more of the things described in the group
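As it turns out, to fix your original expression you only need to add a + after the group:
str.match(/(".*?"|[^"\s]+)+(?=\s*|\s*$)/g)
// the + right after the closing parenthesis of the first group is the only change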
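To make the behavior of the pattern easier to check, here is a minimal sketch that wraps it in a helper. The name `splitOutsideQuotes` is hypothetical and only used for illustration; the regex itself is the one from this answer.

```javascript
// Hypothetical helper wrapping the pattern from this answer.
function splitOutsideQuotes(input) {
  // match() returns null when nothing matches, so fall back to an empty array.
  return input.match(/(?:[^\s"]+|"[^"]*")+/g) || [];
}

console.log(splitOutsideQuotes('Time:"Last 7 Days" Time:"Last 30 Days"'));
// -> [ 'Time:"Last 7 Days"', 'Time:"Last 30 Days"' ]

// Runs of spaces outside quotes do not produce empty entries.
console.log(splitOutsideQuotes('a  "b c"   d'));
// -> [ 'a', '"b c"', 'd' ]

console.log(splitOutsideQuotes(''));
// -> []
```

One caveat worth knowing: if a quote is never closed, the pattern cannot match it, so the stray quote character is silently dropped and the text after it is split on spaces as usual.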
An ES6 solution that also strips the quotes and unescapes backslash-escaped characters:
Code:
str.match(/\\?.|^$/g).reduce((p, c) => {
    if (c === '"') {
        p.quote ^= 1;
    } else if (!p.quote && c === ' ') {
        p.a.push('');
    } else {
        p.a[p.a.length - 1] += c.replace(/\\(.)/, "$1");
    }
    return p;
}, {a: ['']}).a
Output:
[ 'Time:Last 7 Days', 'Time:Last 30 Days' ]
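The output above shows that this approach removes the quotes instead of keeping them. The sketch below wraps the same reducer in a function to show how it treats escaped quotes and repeated spaces. The name `splitAndUnquote` is hypothetical; the body is the code from this answer, unchanged apart from formatting.

```javascript
// Hypothetical wrapper around the reducer from this answer.
function splitAndUnquote(str) {
  // Tokenize into single characters, keeping a backslash together with the
  // character it escapes; the ^$ alternative handles the empty string.
  return str.match(/\\?.|^$/g).reduce((p, c) => {
    if (c === '"') {
      p.quote ^= 1;                 // toggle the "inside quotes" flag
    } else if (!p.quote && c === ' ') {
      p.a.push('');                 // a space outside quotes starts a new item
    } else {
      p.a[p.a.length - 1] += c.replace(/\\(.)/, '$1'); // drop the escaping backslash
    }
    return p;
  }, { a: [''] }).a;
}

console.log(splitAndUnquote('Time:"Last 7 Days" Time:"Last 30 Days"'));
// -> [ 'Time:Last 7 Days', 'Time:Last 30 Days' ]

// A backslash-escaped quote is kept as a literal quote character.
console.log(splitAndUnquote('say "hello \\"big\\" world" now'));
// -> [ 'say', 'hello "big" world', 'now' ]

// Consecutive spaces outside quotes yield empty strings.
console.log(splitAndUnquote('a  b'));
// -> [ 'a', '', 'b' ]
```

If the empty entries are unwanted, filtering the result with `.filter(Boolean)` is one way to drop them, at the cost of also dropping intentionally empty quoted values such as `""`.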
This works for me:
var myString = 'foo bar "sdkgyu sdkjbh zkdjv" baz "qux quux" skduy "zsk"';
console.log(myString.split(/([^\s"]+|"[^"]*")+/g));
Output:
["", "foo", " ", "bar", " ", "\"sdkgyu sdkjbh zkdjv\"", " ", "baz", " ", "\"qux quux\"", " ", "skduy", " ", "\"zsk\"", ""]
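Note that the array above still contains the separating spaces and two empty strings, because `split` with a capturing group interleaves the captured text with whatever lies between the matches. Also, a repeated capturing group only remembers its last iteration, so a token such as `Time:"Last 7 Days"` would lose its `Time:` prefix. The sketch below is one possible adjustment, not part of the original answer: the helper name `splitKeepingTokens` is hypothetical, the repetition is moved inside the capturing group, and the leftovers are filtered out.

```javascript
// Hypothetical helper: split with a capturing group, then drop the leftovers.
function splitKeepingTokens(input) {
  // The outer group captures the whole token; the inner group is
  // non-capturing, so mixed tokens like Time:"Last 7 Days" stay in one piece.
  return input
    .split(/((?:[^\s"]+|"[^"]*")+)/g)
    .filter(part => part.trim() !== ''); // remove separators and empty strings
}

console.log(splitKeepingTokens('foo bar "sdkgyu sdkjbh zkdjv" baz "qux quux" skduy "zsk"'));
// -> [ 'foo', 'bar', '"sdkgyu sdkjbh zkdjv"', 'baz', '"qux quux"', 'skduy', '"zsk"' ]

console.log(splitKeepingTokens('Time:"Last 7 Days" Time:"Last 30 Days"'));
// -> [ 'Time:"Last 7 Days"', 'Time:"Last 30 Days"' ]
```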
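I know the question is about regular expressions, but the more string splitting and substring extraction you do with a regex, the more you notice that the complexity of the regex grows much faster than the complexity of the string. First you have to avoid splitting on the delimiter when it sits inside a quoted zone, so you use some almost random regex that somehow works. Then you need to respect double-quoted zones as well, so you modify the regex and it grows 3-4 times longer, and so on. So although the regex solution looks very ninja and elegant, I ended up abandoning it and switched to a custom function that respects quoted zones: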
interface IZoneToken {
    begin: string;
    end: string;
}
const ZONE_TOKENS: { [key: string]: IZoneToken } = {
    SINGLE_QUOTES: { begin: '\'', end: '\'' },
    DOUBLE_QUOTES: { begin: '"', end: '"' },
    ROUND_BRACKETS: { begin: '(', end: ')' }
};
function splitRespectingZones(input: string, delimiter: string, zone_tokens: IZoneToken[]): string[] {
    let current_substring = '';
    const substrings: string[] = [];
    // Stack of the zones that are currently open.
    const current_zone_tokens: IZoneToken[] = [];
    for (let i = 0; i < input.length; i++) {
        const symbol = input[i];
        if (symbol === delimiter && !current_zone_tokens.length) {
            // Split only when the delimiter is outside every zone.
            substrings.push(current_substring);
            current_substring = '';
        } else {
            const current_zone_token = zone_tokens.find(x => symbol === x.begin || symbol === x.end);
            if (current_zone_token) {
                // Arrays have no built-in last(), so read the top of the stack by index.
                const last_zone_token = current_zone_tokens[current_zone_tokens.length - 1];
                if (current_zone_token === last_zone_token && current_zone_token.end === symbol) {
                    // The symbol closes the innermost open zone.
                    current_zone_tokens.pop();
                } else {
                    // The symbol opens a new (possibly nested) zone.
                    current_zone_tokens.push(current_zone_token);
                }
            }
            current_substring += symbol;
        }
    }
    if (current_substring) {
        substrings.push(current_substring);
    }
    return substrings;
}
As you can see, you can use any number of custom zone tokens, whether quotes, parentheses, or other brackets:
splitRespectingZones(`some 'ran dom' "str ings" (fo r) ("example")`, ' ', [
    { begin: '\'', end: '\'' },
    { begin: '"', end: '"' },
    { begin: '(', end: ')' }
])
// ['some', "'ran dom'", '"str ings"', '(fo r)', '("example")']
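It is verbose, but if you ever need to modify this function, you will appreciate how simple its logic is.
Example for the case in the question: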
splitRespectingZones('Time:"Last 7 Days" Time:"Last 30 Days"', ' ', [ { begin: '\"', end: '\"' } ])
// ['Time:"Last 7 Days"', 'Time:"Last 30 Days"']