Match the Nth word of a line containing a specific word

I'm trying to write a regex that does the following:

Match the Nth word of a line containing a specific word

For example:

Input:

this is the first line - blue
this is the second line - green
this is the third line - red

I want to match the 7th word of the lines containing the word « second »

Desired output:

green

Does anyone know how to do this?

I'm using http://rubular.com/ to test the REGEX.

I already tried this regex without success; it matches the next line instead:

(.*second.*)(?<data>.*?\s){7}(.*)

--- UPDATED ---

Example 2

Input:

this is the Foo line - blue
this is the Bar line - green
this is the Test line - red

I want to match the 4th word of the lines containing the word « red »

Desired output:

Test

In other words, the word I want to match can come either before or after the word I use to select the line.


You can use this to match a line containing second and grab the 7th word:

^(?=.*\bsecond\b)(?:\S+ ){6}(\S+)

Make sure that the global and multiline flags are active.

^ matches the beginning of a line.

(?=.*\bsecond\b) is a positive lookahead that makes sure the word second appears somewhere in that particular line.

(?:\S+ ){6} matches the first 6 words, each followed by a single space.

(\S+) captures the 7th.
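
To see the pieces working together, here is a minimal sketch in Python (my choice for illustration; the question itself was tested on Rubular). It assumes words are separated by single spaces, as in the sample input. re.MULTILINE corresponds to the multiline flag, and re.findall plays the role of the global flag.

```python
import re

text = """this is the first line - blue
this is the second line - green
this is the third line - red"""

# ^ anchors at each line start because of re.MULTILINE.
# The lookahead checks for the whole word "second" on that line,
# then six "word + space" groups are skipped and the 7th word is captured.
pattern = re.compile(r"^(?=.*\bsecond\b)(?:\S+ ){6}(\S+)", re.MULTILINE)

matches = pattern.findall(text)
print(matches)  # ['green']
```

Because . does not match a newline by default, the lookahead cannot see "second" on a different line.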



You can apply the same principle with other requirements.

For example, to select lines containing red and get the 4th word:

^(?=.*\bred\b)(?:\S+ ){3}(\S+)
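
Since only the keyword and the repeat count change between the two cases, the pattern can be built from parameters. The following is a rough Python sketch; the helper name nth_word is made up for illustration, and it assumes single-space separators and a keyword that starts and ends with a word character (so that \b behaves as expected).

```python
import re

def nth_word(text, keyword, n):
    """Return the n-th (1-based) word of every line that contains
    keyword as a whole word. Hypothetical helper, not from the answer."""
    pattern = re.compile(
        r"^(?=.*\b" + re.escape(keyword) + r"\b)"   # line must contain the keyword
        r"(?:\S+ ){" + str(n - 1) + r"}"            # skip the first n-1 words
        r"(\S+)",                                   # capture the n-th word
        re.MULTILINE,
    )
    return pattern.findall(text)

example_1 = """this is the first line - blue
this is the second line - green
this is the third line - red"""

example_2 = """this is the Foo line - blue
this is the Bar line - green
this is the Test line - red"""

print(nth_word(example_1, "second", 7))  # ['green']
print(nth_word(example_2, "red", 4))     # ['Test']
```

The captured word can sit before or after the keyword, because the lookahead checks the line without consuming any characters.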

You asked for regex, and you got a very good answer.

Sometimes you need to ask for the solution, and not specify the tool.

Here is the one-liner that I think best suits your need:

awk '/second/ {print $7}' < inputFile.txt

Explanation:

/second/     - for any line that matches this regex (in this case, literal 'second')
print $7     - print the 7th field (by default, fields are separated by runs of whitespace)

I think it is much easier to understand than the regex - and it's more flexible for this kind of processing.
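
If awk is not available, the same field-splitting idea can be sketched in Python. This is only an approximation of the one-liner above: nth_field is a hypothetical name, the keyword is matched as a plain substring (as /second/ does for a literal pattern), and lines with fewer than n fields are skipped, whereas awk would print an empty line for them.

```python
def nth_field(lines, keyword, n):
    """Yield the n-th (1-based) whitespace-separated field of each
    line that contains keyword as a substring."""
    for line in lines:
        fields = line.split()          # splits on runs of whitespace, like awk's default
        if keyword in line and len(fields) >= n:
            yield fields[n - 1]

sample = [
    "this is the first line - blue",
    "this is the second line - green",
    "this is the third line - red",
]

print(list(nth_field(sample, "second", 7)))  # ['green']
```

To read from a file instead, pass an open file object as lines; iterating over it yields one line at a time.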
