match the Nth word of a line containing a specific word
I'm trying to find the correct regex for this task:
Match the Nth word of a line containing a specific word
For example:
Input:
this is the first line - blue
this is the second line - green
this is the third line - red
I want to match the 7th word of the lines containing the word « second »
Desired output:
green
Does anyone know how to do this?
I'm using http://rubular.com/ to test the REGEX.
I already tried this regex without success; it matches into the next line:
(.*second.*)(?<data>.*?\s){7}(.*)
--- UPDATED ---
Example 2
Input:
this is the Foo line - blue
this is the Bar line - green
this is the Test line - red
I want to match the 4th word of the lines containing the word « red »
Desired output:
Test
In other words - the word I want to match can come either before or after the word I use to select the line
You can use this to match a line containing second
and grab the 7th word:
^(?=.*\bsecond\b)(?:\S+ ){6}(\S+)
Make sure that the global and multiline flags are active.
^
matches the beginning of a line.
(?=.*\bsecond\b)
is a positive lookahead to make sure there's the word second
in that particular line.
(?:\S+ ){6}
matches the first 6 words, each followed by a single space.
(\S+)
captures the 7th word in group 1.
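As a quick check outside a regex tester, here is a minimal Python sketch of the same pattern. Python is my choice for illustration only (the question does not name a language); `re.MULTILINE` stands in for the multiline flag and `findall` for the global flag.

```python
import re

# Sample input from the question.
text = (
    "this is the first line - blue\n"
    "this is the second line - green\n"
    "this is the third line - red\n"
)

# re.MULTILINE lets ^ match at the start of every line;
# findall returns every match, like the global flag.
pattern = re.compile(r"^(?=.*\bsecond\b)(?:\S+ ){6}(\S+)", re.MULTILINE)

# With a single capture group, findall returns the captured strings.
matches = pattern.findall(text)
print(matches)
```

Because `.` does not match a newline by default, the lookahead cannot see past the end of the current line, so only the line containing second produces a match.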
regex101 demo
You can apply the same principle with other requirements.
With a line containing red
and getting the 4th word...
^(?=.*\bred\b)(?:\S+ ){3}(\S+)
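Since only the keyword and the repeat count change between the two patterns, they can be built from parameters. The sketch below is in Python, and `nth_word` is a hypothetical helper name, not something from the question; it assumes words are separated by single spaces, as in the sample input.

```python
import re

def nth_word(text, keyword, n):
    """Return the n-th (1-based) word of each line containing keyword as a whole word.

    Hypothetical helper for illustration; assumes single-space separators.
    """
    pattern = re.compile(
        # Lookahead checks for the keyword; {n-1} skips the preceding words.
        r"^(?=.*\b" + re.escape(keyword) + r"\b)(?:\S+ ){" + str(n - 1) + r"}(\S+)",
        re.MULTILINE,
    )
    return pattern.findall(text)

example_1 = (
    "this is the first line - blue\n"
    "this is the second line - green\n"
    "this is the third line - red\n"
)
example_2 = (
    "this is the Foo line - blue\n"
    "this is the Bar line - green\n"
    "this is the Test line - red\n"
)

print(nth_word(example_1, "second", 7))
print(nth_word(example_2, "red", 4))
```

The lookahead only tests the line and consumes nothing, which is why the captured word can sit either before or after the keyword.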
You asked for regex, and you got a very good answer.
Sometimes you need to ask for the solution, and not specify the tool.
Here is the one-liner that I think best suits your need:
awk '/second/ {print $7}' < inputFile.txt
Explanation:
/second/ - for any line that matches this regex (in this case, literal 'second')
print $7 - print the 7th field (by default, fields are separated by runs of spaces or tabs)
I think it is much easier to understand than the regex - and it's more flexible for this kind of processing.
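If awk is not available, the same field-splitting idea can be sketched in a few lines of Python. `nth_field` is a hypothetical name used for illustration. Like `/second/` in awk, the substring test also matches longer words that contain the keyword, and unlike awk this version skips lines with fewer than n fields instead of printing an empty line.

```python
def nth_field(lines, keyword, n):
    """Return the n-th (1-based) whitespace-separated field of lines containing keyword."""
    result = []
    for line in lines:
        fields = line.split()  # splits on runs of whitespace, like awk's default
        if keyword in line and len(fields) >= n:
            result.append(fields[n - 1])
    return result

# Sample input from the question.
sample = (
    "this is the first line - blue\n"
    "this is the second line - green\n"
    "this is the third line - red\n"
)

print(nth_field(sample.splitlines(), "second", 7))
```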