Regex to capture string inside single or double quotation marks

I am writing a regular expression to parse lines containing labeled fields. The label appears before an equals sign, and the content appears after the equals sign, enclosed in either single or double quotation marks. For most fields, the content is enclosed in single quotation marks. If the content of the field contains single quotation marks, then the field is enclosed in double quotation marks. E.g.:

J=''K='6'2='A'6='&JOBNAM#'P='&USERNAME#'O='1,1'7=''Q='ABC.JCLLIB(TEST1)'a="'D08/APPL'"U='1'S='*ALL'T='0'V='0'R='H'W='H'

My regex works, except with fields enclosed in double quotation marks.

([JK26PO7QaUSTVRW])\=(?:(?:\"([^"])*\")|(?:\'([^']*)\'))

Test in Debuggex

Test in Regexr

For fields like the one labeled a in the example above, a="'D08/APPL'", the a is matched by capture group 1, and the trailing single quotation mark is captured by capture group 2. I want capture group 2 to capture 'D08/APPL' in this case.
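
To make the symptom easy to reproduce, here is a minimal Python sketch. Python's re module is an assumption (the question does not name a regex engine), and the variable names are illustrative only. It runs the pattern above against the double-quoted field and prints what each group holds.

```python
import re

# The pattern from the question, unchanged. Note the quantifier sits
# outside the capture group in the double-quote branch: ([^"])*
broken_re = re.compile(
    r"""([JK26PO7QaUSTVRW])\=(?:(?:\"([^"])*\")|(?:\'([^']*)\'))"""
)

# Just the problematic field from the sample line.
field = "a=\"'D08/APPL'\""

m = broken_re.match(field)
print(m.group(1))  # a
print(m.group(2))  # ' (only the last character matched by the repeated group)
print(m.group(3))  # None (the single-quote branch did not take part)
```

A repeated capture group keeps only its final iteration, which is why group 2 ends up holding the trailing single quotation mark.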

Best answer

Is something like this what you're after?

\w=(["'])((?:(?!\1).)*)\1

It matches, and captures, the opening quote, either a ' or a ". Then it uses a negative lookahead to match any character except the quote captured in the first group. Finally, the matching closing quote is matched via the backreference \1 ;)

Everything between the quotes gets captured in the second group.

See it here at regex101.
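
As a rough illustration, the pattern can be tried against the sample line from the question. This sketch assumes Python's re module; the names line, field_re and fields are made up for the example. Because the pattern does not capture the label, the sketch takes it from the first character of each match.

```python
import re

# Sample line from the question.
line = """J=''K='6'2='A'6='&JOBNAM#'P='&USERNAME#'O='1,1'7=''Q='ABC.JCLLIB(TEST1)'a="'D08/APPL'"U='1'S='*ALL'T='0'V='0'R='H'W='H'"""

# Group 1 is the opening quote, group 2 the content, \1 the same quote again.
field_re = re.compile(r"""\w=(["'])((?:(?!\1).)*)\1""")

# Map each one-character label to its content.
fields = {m.group(0)[0]: m.group(2) for m in field_re.finditer(line)}

print(fields["a"])  # 'D08/APPL'
print(fields["Q"])  # ABC.JCLLIB(TEST1)
print(len(fields))  # 15
```

Note that \w accepts any word character as a label, which is looser than the explicit label list in the question.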

Edit

I checked your own attempt, and the only mistake is that, in the branch for "-quoted fields, you placed the quantifier outside the group's parentheses instead of inside them. As a result, the capture consists only of the last character that is not a ". Try:

([JK26PO7QaUSTVRW])\=(?:(?:\"([^"]*)\")|(?:\'([^']*)\'))
                                  ^ ^
                                 /   \
                             Here     Not here
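
One thing to keep in mind with the corrected pattern is that double-quoted content lands in group 2 while single-quoted content lands in group 3. The following sketch, assuming Python's re module and a hypothetical helper named parse_fields, picks whichever group took part in the match.

```python
import re

# Corrected pattern: the quantifier is now inside the capture group.
fixed_re = re.compile(
    r"""([JK26PO7QaUSTVRW])\=(?:(?:\"([^"]*)\")|(?:\'([^']*)\'))"""
)

def parse_fields(line):
    """Return {label: content} (hypothetical helper, not from the original post)."""
    result = {}
    for m in fixed_re.finditer(line):
        label, double_quoted, single_quoted = m.groups()
        # Exactly one of the two content groups is not None. An empty
        # field yields "", so test against None rather than truthiness.
        result[label] = double_quoted if double_quoted is not None else single_quoted
    return result

parsed = parse_fields("""J=''a="'D08/APPL'"U='1'""")
print(parsed)  # {'J': '', 'a': "'D08/APPL'", 'U': '1'}
```

If a single content group is preferred, the backreference pattern shown earlier in this answer avoids the split.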
