How to write a regular expression for the following conditions

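Question  Votes: 1  Answers: 3
  • Must contain at least 4 characters
  • Must start with a character from [a-zA-Z]
  • Must not end with _
  • May contain at most one _ in the whole word
  • May otherwise contain characters from [a-zA-Z0-9]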

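I tried the following regular expression:

^[a-zA-Z][a-zA-Z0-9]*_?[a-zA-Z0-9]*[^_]$

But then I wondered whether it could be made shorter, and I am not sure how to enforce the "at least 4 characters" constraint.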
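To see where the attempt falls short, it can help to probe it with a few inputs. The snippet below is a minimal sketch in Python (an assumption, since the question does not name a language), and the sample strings are made up for illustration.

```python
import re

# The pattern from the question, unchanged.
attempt = re.compile(r"^[a-zA-Z][a-zA-Z0-9]*_?[a-zA-Z0-9]*[^_]$")

# Illustrative inputs (not from the question).
samples = ["ab", "a&", "ab_cd", "abcd_", "a_b_cd"]

results = {s: bool(attempt.match(s)) for s in samples}
for s, ok in results.items():
    print(f"{s!r}: {'match' if ok else 'no match'}")
```

Here 'ab' and 'a&' both match, which exposes the two gaps: nothing enforces a minimum length of 4, and [^_] accepts any character other than _, not just letters and digits.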

regex matching
3 Answers
1
vote

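You can shorten the pattern by dropping the final negated character class [^_], which matches any character other than _ (not only letters and digits), making the last [a-zA-Z0-9] group required with +, and adding a positive lookahead (?=.{4}) that asserts at least 4 characters from the start of the string: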

^(?=.{4})[a-zA-Z][a-zA-Z0-9]*_?[a-zA-Z0-9]+$
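  • ^ Start of string
  • (?=.{4}) Assert that at least 4 characters follow
  • [a-zA-Z] Match a single character a-z or A-Z
  • [a-zA-Z0-9]*_?[a-zA-Z0-9]+ Match an optional _ with characters from the listed class on its left and/or right; the + requires at least one character after it, so the string cannot end with _
  • $ End of string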
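The pattern above can be exercised with a small helper. This is a sketch in Python; the function name `is_valid` and the test strings are illustrative assumptions, not part of the answer.

```python
import re

# Pattern from the answer above.
PATTERN = re.compile(r"^(?=.{4})[a-zA-Z][a-zA-Z0-9]*_?[a-zA-Z0-9]+$")

def is_valid(word: str) -> bool:
    """Return True if `word` satisfies the pattern (hypothetical helper)."""
    return PATTERN.match(word) is not None

# Illustrative cases, roughly one per rule in the question.
cases = {
    "abc": False,     # shorter than 4 characters
    "1abc": False,    # does not start with a letter
    "abcd_": False,   # ends with an underscore
    "a_b_cd": False,  # more than one underscore
    "abc&d": False,   # character outside [a-zA-Z0-9_]
    "ab_c": True,     # exactly 4 characters, one underscore
    "abcd": True,     # no underscore at all
    "a1_b2": True,    # digits on both sides of the underscore
}

for word, expected in cases.items():
    assert is_valid(word) == expected, word
print("all cases behave as expected")
```

In Python, `$` also matches just before a trailing newline, so `re.fullmatch` or `\Z` may be a stricter choice if inputs can end with one.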

Regex demo


0
votes

This does the job:

(?i)^(?=\w{4,})[a-z][^\W_]*_?[^\W_]+$

Explanation:

(?i)                # case insensitive
^                   # beginning of line
  (?=\w{4,})        # positive lookahead, make sure we have 4 or more word characters
  [a-z]             # 1 letter
  [^\W_]*           # 0 or more alphanumerics
  _?                # optional underscore
  [^\W_]+           # 1 or more alphanumerics
$                   # end of line
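For readers using Python, the same idea can be sketched with flags passed to `re.compile` instead of an inline `(?i)`; Python 3.11 and later reject an inline global flag that is not at the very start of the pattern. `re.ASCII` is an added assumption here: without it, `\w` and `[^\W_]` also accept non-ASCII letters and digits. The pattern below allows digits before the underscore, as the question's rules permit.

```python
import re

# Flags replace the inline (?i); re.ASCII keeps \w within [a-zA-Z0-9_].
pattern = re.compile(
    r"^(?=\w{4,})[a-z][^\W_]*_?[^\W_]+$",
    re.IGNORECASE | re.ASCII,
)

# Illustrative inputs (assumptions, not from the answer).
for word in ["Ab_cd", "a1_b2", "abc", "abcd_", "abcé"]:
    print(word, "Match" if pattern.match(word) else "No match")
```

The last input, 'abcé', is rejected only because of `re.ASCII`; whether that is desirable depends on how strictly [a-zA-Z0-9] in the question is meant.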

Demo & explanation


0
votes

Validation can combine positive and negative lookaheads:

import re

tests = [
    'abc',     # too short
    '_bcde',   # starts with wrong character
    'abcd_',   # last character is '_'
    'a_b_cd',  # too many '_'
    'abc&cd',  # illegal character '&'
    'ab_cd',   # OK
]


regex = re.compile(r"""
    ^               # matches start of the line
    (?=.{4})        # positive lookahead: matches any 4 characters (string must be at least 4 characters long)
    (?=[a-zA-Z])    # positive lookahead: next character must be [a-zA-Z]
    (?!.*_$)        # negative lookahead: last character cannot be `_`
    (?!.*_.*_)      # negative lookahead: cannot match more than one `_`
    [a-zA-Z0-9_]+   # one or more letters, digits, or `_`
    $               # matches the end of the string
""", re.X)

for test in tests:
    m = regex.match(test)
    print(test, 'Match' if m else 'No match')

Prints:

abc No match
_bcde No match
abcd_ No match
a_b_cd No match
abc&cd No match
ab_cd Match

See Regex Demo
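One way to gain confidence in a pattern like this is to compare it against the rules written as plain Python, over every short string from a small alphabet. The sketch below does that; the helper name `follows_rules`, the alphabet, and the length limit are arbitrary choices for illustration, and the character class includes digits, which the question allows.

```python
import itertools
import re
import string

# Lookahead-based pattern, with digits allowed in the character class.
regex = re.compile(r"""
    ^
    (?=.{4})        # at least 4 characters
    (?=[a-zA-Z])    # first character is a letter
    (?!.*_$)        # last character is not `_`
    (?!.*_.*_)      # at most one `_`
    [a-zA-Z0-9_]+   # only letters, digits, and `_`
    $
""", re.X)

ALLOWED = set(string.ascii_letters + string.digits + "_")

def follows_rules(word: str) -> bool:
    """The question's five rules, written without a regex (hypothetical helper)."""
    return (
        len(word) >= 4
        and word[0] in string.ascii_letters
        and not word.endswith("_")
        and word.count("_") <= 1
        and all(ch in ALLOWED for ch in word)
    )

# Every string of length 0..6 over a tiny alphabet covering each character kind.
alphabet = "aZ0_&"
mismatches = [
    word
    for n in range(7)
    for word in map("".join, itertools.product(alphabet, repeat=n))
    if bool(regex.match(word)) != follows_rules(word)
]
print("mismatches:", mismatches)
```

An empty list means the regex and the plain-Python rules agree on all 19,531 strings tested. That is evidence rather than proof, since longer strings and other characters are not covered.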
