Referencing nested groups in a JavaScript string replace with a regex
Because of the way that jQuery deals with script tags, I've found it necessary to do some HTML manipulation using regular expressions (yes, I know... not the ideal tool for the job). Unfortunately, it seems like my understanding of how captured groups work in JavaScript is flawed, because when I try this:
var scriptTagFormat = /<script .*?(src="(.*?)")?.*?>(.*?)<\/script>/ig;
html = html.replace(
    scriptTagFormat,
    '<span class="script-placeholder" style="display:none;" title="$2">$3</span>');
The script tags get replaced with the spans, but the resulting title attribute is blank. Shouldn't $2 match the content of the src attribute of a script tag?
Nesting of groups is irrelevant; their numbering is determined strictly by the positions of their opening parentheses within the regex. In your case, that means it's group #1 that captures the whole src="value" sequence, and group #2 that captures just the value part.
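To make the numbering rule concrete, here is a minimal sketch that isolates just the nested pair of groups from the question. The sample string and the variable names are made up for illustration.

```javascript
// Groups are numbered by the position of their opening parenthesis,
// counted left to right, regardless of nesting.
const nested = /(src="(.*?)")/;
const m = 'src="app.js"'.match(nested);

console.log(m[1]); // outer group: src="app.js"
console.log(m[2]); // inner group: app.js
```

So the numbering in the question is right; the blank title has a different cause.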
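Try this (the src value is then in $1 and the script content in $2, so the replacement string has to use those numbers):
/<script (?:(?!src).)*(?:src="(.*?)")?.*?>(.*?)<\/script>/ig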
See here: rubular
As stema wrote, the .*? matches too much. With the negative lookahead in (?:(?!src).)* you will match only up to a src attribute.
But actually, in this case you could also just move the .*? into the optional part:
/<script (?:.*?src="(.*?)")?.*?>(.*?)<\/script>/ig
See here: rubular
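As a quick check, the two patterns can be run against a sample tag. This is only a sketch: the html string and the variable names are invented for the example, and the replacement template is the one from the question with the group numbers shifted to $1 and $2.

```javascript
// Hypothetical input: one script tag whose src is not the first attribute.
const html = '<div><script type="text/javascript" src="app.js">init();</script></div>';

// Variant 1: consume characters only while "src" does not start at the current position.
const withLookahead = /<script (?:(?!src).)*(?:src="(.*?)")?.*?>(.*?)<\/script>/ig;
// Variant 2: move the lazy .*? inside the optional group.
const withMovedDot = /<script (?:.*?src="(.*?)")?.*?>(.*?)<\/script>/ig;

// Both variants have only two capturing groups: $1 = src value, $2 = script content.
const template = '<span class="script-placeholder" style="display:none;" title="$1">$2</span>';

const a = html.replace(withLookahead, template);
const b = html.replace(withMovedDot, template);

console.log(a);
// <div><span class="script-placeholder" style="display:none;" title="app.js">init();</span></div>
console.log(a === b); // true
```

Both patterns give the same result on this input. Note that . does not match newlines, so a script body spanning several lines would need a different pattern.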
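The .*? matches too much because the group that follows it is optional. The lazy .*? first matches nothing, the optional group does not find src at that position and so matches nothing either, and then the .*? after it swallows the src attribute on its way to the >. If you remove the ? after your first group, it works for tags that have a src.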
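This can be seen by inspecting the groups of the original pattern directly. A small sketch with made-up sample tags (the g flag is left off so exec can be called without worrying about lastIndex):

```javascript
// Original pattern from the question, without the g flag.
const original = /<script .*?(src="(.*?)")?.*?>(.*?)<\/script>/i;

// src is NOT the first attribute: the optional group is skipped entirely.
const sample = '<script type="text/javascript" src="app.js">init();</script>';
const groups = original.exec(sample);
console.log(groups[1]); // undefined
console.log(groups[2]); // undefined
console.log(groups[3]); // init();

// An unmatched group is substituted as an empty string, hence the blank title.
const replaced = sample.replace(original, '<span title="$2">$3</span>');
console.log(replaced); // <span title="">init();</span>

// src IS the first attribute: the optional group matches right away.
const srcFirst = original.exec('<script src="app.js">init();</script>');
console.log(srcFirst[2]); // app.js
```

So the original pattern only captures the src when it comes directly after "<script ".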
Update: As @morja pointed out, the solution is to move the first .*? into the optional src part.
Just for completeness: /<script (?:.*?(src="(.*?)"))?.*?>(.*?)<\/script>/ig
You can see it here on rubular (corrected my link also)
If you don't need the content of the first capturing group, make it a non-capturing group using (?:...):
/<script (?:.*?(?:src="(.*?)"))?.*?>(.*?)<\/script>/ig
The results you want are then in $1 (the src value) and $2 (the script content).
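To compare the group numbering of the last two patterns side by side, here is a short sketch. The sample tag and variable names are hypothetical, and the g flag is dropped so exec can be used for inspection.

```javascript
// With the inner capturing group kept: three groups.
const capturing = /<script (?:.*?(src="(.*?)"))?.*?>(.*?)<\/script>/i;
// With it turned into a non-capturing group: two groups.
const nonCapturing = /<script (?:.*?(?:src="(.*?)"))?.*?>(.*?)<\/script>/i;

const tag = '<script type="text/javascript" src="app.js">init();</script>';

// slice(1) drops the full match and keeps only the captured groups.
const c = capturing.exec(tag).slice(1);
const n = nonCapturing.exec(tag).slice(1);

console.log(c.length, c); // 3 groups: src="app.js", app.js, init();
console.log(n.length, n); // 2 groups: app.js, init();
```

With the capturing version the original replacement string ($2 and $3) keeps working; with the non-capturing version it has to use $1 and $2.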