#! /usr/bin/env and process names: portability at a price?

2018-05-31 20:30:29

There are lots of good reasons to use #! /usr/bin/env. Bottom line: It makes your code more portable. Well, sorta. Check this out....

I have two nearly identical scripts, bintest.py

#! /usr/bin/python
import time
time.sleep(5*60)

and envtest.py

#! /usr/bin/env python
import time
time.sleep(5*60)

Note that they are only different in their shebangs.

bintest.py runs as expected

br@carina:~$ ./bintest.py & ps && killall bintest.py
[1] 15061
  PID TTY          TIME CMD
14625 pts/0    00:00:00 bash
15061 pts/0    00:00:00 bintest.py
15062 pts/0    00:00:00 ps
br@carina:~$ 
[1]+  Terminated              ./bintest.py

but envtest.py does something less-than-optimal

br@carina:~$ ./envtest.py & ps && killall envtest.py
[1] 15066
  PID TTY          TIME CMD
14625 pts/0    00:00:00 bash
15066 pts/0    00:00:00 python
15067 pts/0    00:00:00 ps
envtest.py: no process found
br@carina:~$ killall python
br@carina:~$ 
[1]+  Terminated              ./envtest.py

What we've seen is that using #! /usr/bin/env caused the process to receive the name "python" rather than "envtest.py", thus rendering our killall ineffective. On some level it seems like we've traded one kind of portability for another: we can now swap out Python interpreters easily, but we've lost "kill-ability" on the command line. What's up with that? If there's a best practice here for achieving both, what is it?

"Kill-ability" on the command line can be addressed portably and reliably using the PID of the backgrounded process, obtained from the shell's $! variable.

$ ./bintest.py & bg_pid=$! ; echo bg_pid=$bg_pid ; ps && kill $bg_pid
[1] 2993
bg_pid=2993
  PID TTY          TIME CMD
 2410 pts/0    00:00:00 bash
 2993 pts/0    00:00:00 bintest.py
 2994 pts/0    00:00:00 ps
$ 
[1]+  Terminated              ./bintest.py
$

and envtest.py

$ ./envtest.py & bg_pid=$! ; echo bg_pid=$bg_pid ; ps && kill $bg_pid
[1] 3016
bg_pid=3016
  PID TTY          TIME CMD
 2410 pts/0    00:00:00 bash
 3016 pts/0    00:00:00 python
 3017 pts/0    00:00:00 ps
$ 
[1]+  Terminated              ./envtest.py
$
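If the process is launched from Python rather than from an interactive shell, the same idea carries over: hold on to the PID you were given at launch instead of looking the process up by name. The following is only a minimal sketch for POSIX systems; the helper names start_sleeper and stop_by_pid are made up for illustration, and the child is an inline sleep standing in for bintest.py or envtest.py.

```python
import signal
import subprocess
import sys


def start_sleeper(seconds):
    """Start a child Python process that just sleeps, like the test scripts."""
    code = "import time; time.sleep({})".format(int(seconds))
    return subprocess.Popen([sys.executable, "-c", code])


def stop_by_pid(child):
    """Send SIGTERM to the child via its PID and return its exit status."""
    child.send_signal(signal.SIGTERM)  # same effect as: kill <pid>
    return child.wait(timeout=10)


if __name__ == "__main__":
    child = start_sleeper(300)
    print("bg_pid={}".format(child.pid))
    print("exit status:", stop_by_pid(child))
```

Because the PID comes straight from the launch call, it does not matter what name ps reports for the process.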

As @Adam Bryzak points out, neither script causes the process title to be set on Mac OS X. So, if that feature is a firm requirement, you may need to install and use the Python module setproctitle with your application.

This Stack Overflow post discusses setting the process title in Python.
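A minimal sketch of what that could look like is below. It assumes the third-party setproctitle package has been installed (for example with pip) and falls back to doing nothing when it is missing; the wrapper name set_process_title is hypothetical. How the new title shows up in ps and killall still depends on the platform, so it is worth checking on each system you care about.

```python
def set_process_title(title):
    """Try to rename the current process; return True if the call was made."""
    try:
        import setproctitle  # third-party package, not in the standard library
    except ImportError:
        return False
    setproctitle.setproctitle(title)
    return True


if __name__ == "__main__":
    if set_process_title("envtest.py"):
        print("process title set")
    else:
        print("setproctitle not installed; title unchanged")
```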

I don't think you can rely on killall with the script name working all the time. On Mac OS X I get the following output from ps after running both scripts:

 2108 ttys004    0:00.04 /usr/local/bin/python /Users/adam/bin/bintest.py
 2133 ttys004    0:00.03 python /Users/adam/bin/envtest.py

and running killall bintest.py results in

No matching processes belonging to you were found
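The distinction that matters here is between the short process name and the full command line. On Linux the two can be inspected separately through /proc, which makes it easy to see what a given script is actually running as. This is a Linux-only sketch (the function names are made up for illustration); on systems without /proc, such as Mac OS X, both helpers simply return None.

```python
import os


def process_name(pid):
    """Return the short process name from /proc, or None if unavailable."""
    try:
        with open("/proc/{}/comm".format(pid)) as f:
            return f.read().strip()
    except OSError:
        return None


def process_args(pid):
    """Return the argument vector from /proc as a list, or None if unavailable."""
    try:
        with open("/proc/{}/cmdline".format(pid), "rb") as f:
            raw = f.read()
    except OSError:
        return None
    # Arguments are separated by NUL bytes.
    return [a.decode(errors="replace") for a in raw.split(b"\0") if a]


if __name__ == "__main__":
    pid = os.getpid()
    print("name:", process_name(pid))
    print("args:", process_args(pid))
```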

While I would still like a solution that makes scripting languages both cross-platform and easy-to-monitor from the command line, if you're just looking for an alternative to killall <scriptname> to stop custom services, here's how I solved it:

kill `ps -fC <interpreterName> | sed -n '/<scriptName>/s/^[^0-9]*\([0-9]*\).*$/\1/gp'`

For those not too familiar with ps and regexes: ps's -f modifier has it list out a "full" set of information about each process, including its command-line arguments, and -C tells it to filter the list to only commands that match the next command-line argument. Replace <interpreterName> with python or node or whatever.

sed's -n argument tells it to not print anything by default, so the regex script has to explicitly indicate that you want to print something.

In the regex, the first /<scriptName>/ tells it to filter its results to only lines that contain the interior regex. You can replace <scriptName> with envtest, for example.

The s indicates that a substitution regex will follow, with /^[^0-9]*\([0-9]*\).*$/ being the line-matching portion and /\1/ being the substitution portion. In the line-matching portion, the ^ at the very beginning and the $ at the very end mean that the match must start at the beginning of the line and end at the end of the line, so the entire line being checked is replaced.

The [^0-9]* involves a few things: [] are used to define a set of allowable characters. Within the brackets, the dash - means a range of characters, so 0-9 expands to 0123456789. The ^ here means "not", so together they mean "match any character that is NOT a digit". The asterisk * afterwards means to keep on matching characters in this set until it encounters a non-matching character, in this case a digit.

The \([0-9]*\) has two portions, the \(\) and the [0-9]*. The latter should be easy to follow from the previous explanation: it matches only digits, and grabs as many as it can. The \(\) mean to save the contents of what is matched to a temporary variable. (In other RegEx versions, including Javascript and Perl, a bare () is used instead.)

Finally, the .* means to match every remaining character, as . means any possible character.

The /\1/ portion says to replace the matched portion of the line (which is the whole line in this case) with \1, which is a reference to the saved temporary variable (if there had been two \(\) sections, the first one in the RegEx would be \1 and the second \2).

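The g afterwards means "global": apply the substitution to every match on a line rather than only the first. Since this pattern is anchored to the whole line it can match only once, so the g is harmless but not strictly needed; sed already runs the script on every line it reads. The p means to print any line that has reached this point.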
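To see the substitution in action without touching any real processes, here is the same pattern applied with Python's re module. The sample line is a made-up ps -f row, not captured output, and Python writes the capture group as a bare () as mentioned above.

```python
import re

# Hypothetical `ps -f` output line; the columns are UID, PID, PPID, ...
SAMPLE_LINE = "br       15066 14625  0 20:30 pts/0    00:00:00 python ./envtest.py"

# Same pattern as the sed script: skip non-digits, capture the first run
# of digits, then swallow the rest of the line.
PID_PATTERN = re.compile(r"^[^0-9]*([0-9]*).*$")


def extract_pid(line):
    """Replace the whole line with the first run of digits it contains."""
    return PID_PATTERN.sub(r"\1", line)


if __name__ == "__main__":
    print(extract_pid(SAMPLE_LINE))  # prints 15066
```

Note that this relies on the first number on the line being the PID, which holds for ps -f output as long as the user name in the first column contains no digits.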

Technically, this will blow up if you have multiple copies of your script running. If you want to truly replicate kill*all* functionality, you'd really want the slightly heavier:

ps -fC <interpreterName> | sed -n '/<scriptName>/s/^[^0-9]*\([0-9]*\).*$/kill \1/gp' | bash

but this spawns a separate bash shell for each script you'd like to kill.
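For comparison, the same "find by script name, then kill every match" idea can be sketched in Python. This is an illustration under a few assumptions, not a drop-in killall replacement: it asks ps for just the PID and command-line columns (ps -A -o pid= -o args=), matches the script name as a plain substring, and the function names find_pids and kill_script are invented here.

```python
import os
import signal
import subprocess


def find_pids(ps_output, script_name):
    """Return PIDs whose command line mentions script_name.

    ps_output is text in the two-column form printed by
    `ps -A -o pid= -o args=`: a PID, whitespace, then the command line.
    """
    pids = []
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].isdigit() and script_name in parts[1]:
            pids.append(int(parts[0]))
    return pids


def kill_script(script_name, sig=signal.SIGTERM):
    """Signal every process running script_name, except this one."""
    out = subprocess.run(
        ["ps", "-A", "-o", "pid=", "-o", "args="],
        capture_output=True, text=True, check=True,
    ).stdout
    signalled = []
    for pid in find_pids(out, script_name):
        if pid == os.getpid():
            continue
        try:
            os.kill(pid, sig)
        except OSError:
            continue  # already gone, or not ours to signal
        signalled.append(pid)
    return signalled
```

Calling kill_script("envtest.py") would then signal every running copy. Because the match is a substring test on the whole command line, it will also hit unrelated processes that merely mention the file, such as an editor that has it open, so a stricter match may be worth adding.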
