Signalling all threads in a process

2018-06-17 16:55:12

Without keeping a list of current threads, I'm trying to make sure that a realtime signal gets delivered to every thread in my process. My idea is to go about it like this:

Initially the signal handler is installed and the signal is unblocked in all threads.

When one thread wants to send the 'broadcast' signal, it acquires a mutex and sets a global flag that the broadcast is taking place.

The sender blocks the signal for itself (using pthread_sigmask), then enters a loop that repeatedly calls raise(sig) until sigpending indicates that the signal is pending (that is, no threads remain with the signal unblocked).

As threads receive the signal, they act on it but wait in the signal handler for the broadcast flag to be cleared, so that the signal will remain masked.

The sender finishes the loop by unblocking the signal (in order to get its own delivery).

When the sender handles its own signal, it clears the global flag so that all the other threads can continue with their business.

The problem I'm running into is that pthread_sigmask is not being respected. Everything works right if I run the test program under strace (presumably due to different scheduling timing), but as soon as I run it alone, the sender receives its own signal (despite having blocked it..?) and none of the other threads ever get scheduled.

Any ideas what might be wrong? I've tried using sigqueue instead of raise, probing the signal mask, adding sleep all over the place to make sure the threads are patiently waiting for their signals, etc., and now I'm at a loss.

Edit: Thanks to psmears' answer, I think I understand the problem. Here's a potential solution. Feedback would be great:

At any given time, I can know the number of threads running, and I can prevent all thread creation and exiting during the broadcast signal if I need to.

The thread that wants to do the broadcast signal acquires a lock (so no other thread can do it at the same time), then blocks the signal for itself, and sends num_threads signals to the process, then unblocks the signal for itself.

The signal handler atomically increments a counter, and each instance of the signal handler waits until that counter is equal to num_threads to return.

The thread that did the broadcast also waits for the counter to reach num_threads, then it releases the lock.
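The counter scheme above could be sketched roughly as follows. This is only an illustration under assumptions, not the code from the question: the names broadcast_signal, broadcast_handler, run_broadcast_demo and NUM_WORKERS are made up, the thread count is fixed, and a second "departed" counter is added so the sender does not release the lock while a handler is still running. It uses sigqueue to the whole process rather than raise, and Linux-style delivery of process-directed realtime signals is assumed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <unistd.h>

#define NUM_WORKERS 3                  /* hypothetical, fixed for the demo */

static atomic_int arrived;             /* handlers that have started */
static atomic_int departed;            /* handlers that have finished */
static atomic_int expected;            /* signals sent in this round */
static atomic_int workers_ready, workers_stop;
static pthread_mutex_t broadcast_lock = PTHREAD_MUTEX_INITIALIZER;

static void nap_1ms(void) { poll(NULL, 0, 1); }   /* poll() is async-signal-safe */

static void broadcast_handler(int sig)
{
    int saved_errno = errno;
    (void)sig;
    atomic_fetch_add(&arrived, 1);
    /* Wait inside the handler, where the signal is still masked, so this
       thread cannot also take an instance meant for another thread. */
    while (atomic_load(&arrived) < atomic_load(&expected))
        nap_1ms();
    atomic_fetch_add(&departed, 1);
    errno = saved_errno;
}

/* Deliver sig once to each of num_threads threads, the caller included. */
int broadcast_signal(int sig, int num_threads)
{
    sigset_t only_sig;
    union sigval no_payload = { .sival_int = 0 };
    int sent = 0;

    assert(num_threads > 0);
    sigemptyset(&only_sig);
    sigaddset(&only_sig, sig);

    pthread_mutex_lock(&broadcast_lock);            /* one broadcast at a time */
    atomic_store(&arrived, 0);
    atomic_store(&departed, 0);
    atomic_store(&expected, num_threads);
    pthread_sigmask(SIG_BLOCK, &only_sig, NULL);
    while (sent < num_threads && sigqueue(getpid(), sig, no_payload) == 0)
        sent++;
    atomic_store(&expected, sent);                  /* frees handlers if a send failed */
    pthread_sigmask(SIG_UNBLOCK, &only_sig, NULL);  /* sender takes its own delivery */
    while (atomic_load(&departed) < sent)
        nap_1ms();
    pthread_mutex_unlock(&broadcast_lock);
    return sent == num_threads ? 0 : -1;
}

static void *broadcast_worker(void *arg)
{
    pthread_sigmask(SIG_UNBLOCK, (sigset_t *)arg, NULL);
    atomic_fetch_add(&workers_ready, 1);
    while (!atomic_load(&workers_stop))
        nap_1ms();
    return NULL;
}

/* Demo driver: returns how many handlers ran, or -1 on error. */
int run_broadcast_demo(void)
{
    pthread_t tid[NUM_WORKERS];
    struct sigaction sa;
    sigset_t only_sig;
    int sig = SIGRTMIN;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = broadcast_handler;
    sigemptyset(&sa.sa_mask);
    sigemptyset(&only_sig);
    sigaddset(&only_sig, sig);
    if (sigaction(sig, &sa, NULL) != 0)
        return -1;
    for (int i = 0; i < NUM_WORKERS; i++)
        if (pthread_create(&tid[i], NULL, broadcast_worker, &only_sig) != 0)
            return -1;
    while (atomic_load(&workers_ready) < NUM_WORKERS)
        nap_1ms();
    int rc = broadcast_signal(sig, NUM_WORKERS + 1);
    atomic_store(&workers_stop, 1);
    for (int i = 0; i < NUM_WORKERS; i++)
        pthread_join(tid[i], NULL);
    return rc == 0 ? atomic_load(&arrived) : -1;
}
```

Calling run_broadcast_demo() from an otherwise single-threaded program should report one handler run per thread. Keeping each handler parked until every thread has arrived is what prevents a fast thread from consuming two queued instances.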

One possible concern is that the signals will not get queued if the kernel is out of memory (Linux seems to have that issue). Do you know if sigqueue reliably informs the caller when it's unable to queue the signal (in which case I would loop until it succeeds), or could signals possibly be silently lost?

Edit 2: It seems to be working now. According to the documentation for sigqueue, it fails with EAGAIN (returning -1 and setting errno) if it cannot queue the signal. But for robustness, I decided to just keep calling sigqueue until num_threads-1 signal handlers are running, interleaving calls to sched_yield after I've sent num_threads-1 signals.
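For the EAGAIN case, a retry wrapper might look something like this. It is a minimal sketch: queue_to_self and the max_tries cap are hypothetical names, and it simply yields between attempts rather than watching the handler count as described above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <unistd.h>

/* Queue sig to the calling process, retrying while sigqueue fails with
   EAGAIN. Returns 0 on success; returns -1 with errno set on any other
   error, or with errno == EAGAIN once max_tries attempts are used up. */
int queue_to_self(int sig, int max_tries)
{
    union sigval no_payload = { .sival_int = 0 };

    assert(max_tries > 0);
    for (int attempt = 0; attempt < max_tries; attempt++) {
        if (sigqueue(getpid(), sig, no_payload) == 0)
            return 0;
        if (errno != EAGAIN)
            return -1;          /* e.g. EINVAL: retrying will not help */
        sched_yield();          /* let other threads drain queued signals */
    }
    errno = EAGAIN;
    return -1;
}
```

Passing signal number 0 only performs the error checks without sending anything, which is a convenient way to exercise the wrapper.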

There was a race condition at thread creation time, counting new threads, but I solved it with a strange (ab)use of read-write locks. Thread creation is "reading" and the broadcast signal is "writing", so unless there's a thread trying to broadcast, it doesn't create any contention at thread-creation.
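The read-write lock trick could be sketched like this, with thread creation on the "read" side and the broadcast on the "write" side. All names here (counted_create, counted_exit, with_frozen_thread_count, frozen_count_demo) are hypothetical, and thread exit is simplified: a real version would probably also block the signal in the exiting thread before decrementing the count.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static pthread_rwlock_t thread_set_lock = PTHREAD_RWLOCK_INITIALIZER;
static atomic_int live_threads = 1;        /* starts with the initial thread */
static pthread_mutex_t gate = PTHREAD_MUTEX_INITIALIZER;   /* demo only */

/* "Reader" side: any number of threads may be created concurrently. */
int counted_create(pthread_t *tid, void *(*fn)(void *), void *arg)
{
    pthread_rwlock_rdlock(&thread_set_lock);
    int rc = pthread_create(tid, NULL, fn, arg);
    if (rc == 0)
        atomic_fetch_add(&live_threads, 1);
    pthread_rwlock_unlock(&thread_set_lock);
    return rc;
}

/* Also a "reader": a thread calls this as its last step before returning. */
void counted_exit(void)
{
    pthread_rwlock_rdlock(&thread_set_lock);
    atomic_fetch_sub(&live_threads, 1);
    pthread_rwlock_unlock(&thread_set_lock);
}

/* "Writer" side: run fn (e.g. the broadcast) with the count frozen. */
int with_frozen_thread_count(int (*fn)(int thread_count))
{
    assert(fn != NULL);
    pthread_rwlock_wrlock(&thread_set_lock);
    int rc = fn(atomic_load(&live_threads));
    pthread_rwlock_unlock(&thread_set_lock);
    return rc;
}

static void *gated_thread(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&gate);             /* park until the demo opens the gate */
    pthread_mutex_unlock(&gate);
    counted_exit();
    return NULL;
}

static int report_count(int thread_count) { return thread_count; }

/* Demo: returns the count seen while two extra threads are alive and
   stores the count seen after they have exited. */
int frozen_count_demo(int *count_after_exit)
{
    pthread_t tid[2];

    pthread_mutex_lock(&gate);
    for (int i = 0; i < 2; i++)
        if (counted_create(&tid[i], gated_thread, NULL) != 0)
            return -1;
    int during = with_frozen_thread_count(report_count);
    pthread_mutex_unlock(&gate);
    for (int i = 0; i < 2; i++)
        pthread_join(tid[i], NULL);
    *count_after_exit = with_frozen_thread_count(report_count);
    return during;
}
```

Because creators only take the lock in shared mode, they do not contend with each other; they only wait while a broadcaster holds it exclusively.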

raise() sends the signal to the current thread (only), so other threads won't receive it. I suspect that the fact that strace makes things work is a bug in strace (due to the way it works it ends up intercepting all signals sent to the process and re-raising them, so it may be re-raising them in the wrong way...).

You can probably get round that using kill(getpid(), <signal>) to send the signal to the current process as a whole.

However, another potential issue you might see is that sigpending() can indicate that the signal is pending on the process before all threads have received it - all that means is that there is at least one such signal pending for the process, and no CPU has yet become available to run a thread to deliver it...

Can you describe more details of what you're aiming to achieve? And how portable you want it to be? There's almost certainly a better way of doing it (signals are almost always a major headache, especially when mixed with threads...)

In a multithreaded program, raise(sig) is equivalent to pthread_kill(pthread_self(), sig). Try kill(getpid(), sig) instead.
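The difference between the two calls can be observed with a small experiment along these lines. It is a sketch under assumptions: demo_raise_vs_kill and its helpers are made-up names, the 50 ms wait is arbitrary, and the calling thread keeps the signal blocked so that only the worker can ever run the handler.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

static atomic_int deliveries;              /* handler runs; only the worker can run it */
static atomic_int mask_worker_ready, mask_worker_stop;

static void count_delivery(int sig) { (void)sig; atomic_fetch_add(&deliveries, 1); }

static void *mask_worker(void *arg)
{
    pthread_sigmask(SIG_UNBLOCK, (sigset_t *)arg, NULL);   /* worker accepts the signal */
    atomic_store(&mask_worker_ready, 1);
    while (!atomic_load(&mask_worker_stop))
        poll(NULL, 0, 1);
    return NULL;
}

/* Stores the worker's delivery count after raise() and after kill(). */
int demo_raise_vs_kill(int *after_raise, int *after_kill)
{
    int sig = SIGRTMIN;
    sigset_t only_sig, old;
    struct sigaction sa;
    struct timespec no_wait = { 0, 0 };
    pthread_t tid;

    assert(after_raise != NULL && after_kill != NULL);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = count_delivery;
    sigemptyset(&sa.sa_mask);
    sigemptyset(&only_sig);
    sigaddset(&only_sig, sig);
    if (sigaction(sig, &sa, NULL) != 0)
        return -1;
    pthread_sigmask(SIG_BLOCK, &only_sig, &old);   /* the caller blocks the signal */
    if (pthread_create(&tid, NULL, mask_worker, &only_sig) != 0)
        return -1;
    while (!atomic_load(&mask_worker_ready))
        poll(NULL, 0, 1);

    raise(sig);               /* thread-directed: stays pending on the caller */
    poll(NULL, 0, 50);        /* give the worker time; it should see nothing */
    *after_raise = atomic_load(&deliveries);

    kill(getpid(), sig);      /* process-directed: the worker can take it */
    for (int i = 0; i < 1000 && atomic_load(&deliveries) == 0; i++)
        poll(NULL, 0, 1);
    *after_kill = atomic_load(&deliveries);

    atomic_store(&mask_worker_stop, 1);
    pthread_join(tid, NULL);
    sigtimedwait(&only_sig, NULL, &no_wait);   /* drain what raise() left pending */
    pthread_sigmask(SIG_SETMASK, &old, NULL);
    return 0;
}
```

With raise, the signal sits pending on the blocked caller and the worker's handler never runs; with kill, the worker picks it up.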

Given that you can apparently lock thread creation and destruction, could you not just have the "broadcasting" thread post the required updates to thread-local-state in a per-thread queue, which each thread checks whenever it goes to use the thread-local-state? If there's outstanding update(s), it first applies them.
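A signal-free version of that idea might look roughly like the following. Everything here is an assumption made for illustration: struct thread_state, the fixed-size queue, and the use of integer deltas as "updates" are all hypothetical stand-ins for whatever per-thread state actually needs changing.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_PENDING 8                /* hypothetical queue capacity */

struct thread_state {
    pthread_mutex_t lock;            /* guards the pending queue */
    int pending[MAX_PENDING];        /* queued updates; here, deltas to add */
    int num_pending;
    int value;                       /* state touched only by the owning thread */
};

void thread_state_init(struct thread_state *s)
{
    pthread_mutex_init(&s->lock, NULL);
    s->num_pending = 0;
    s->value = 0;
}

/* Broadcaster: queue one update for every registered thread.
   Returns how many queues were full and had to drop the update. */
int post_update_to_all(struct thread_state *states, int count, int delta)
{
    int dropped = 0;

    assert(states != NULL && count >= 0);
    for (int i = 0; i < count; i++) {
        pthread_mutex_lock(&states[i].lock);
        if (states[i].num_pending < MAX_PENDING)
            states[i].pending[states[i].num_pending++] = delta;
        else
            dropped++;
        pthread_mutex_unlock(&states[i].lock);
    }
    return dropped;
}

/* Owner: apply outstanding updates in order, then use the state. */
int use_state(struct thread_state *s)
{
    pthread_mutex_lock(&s->lock);
    for (int i = 0; i < s->num_pending; i++)
        s->value += s->pending[i];
    s->num_pending = 0;
    pthread_mutex_unlock(&s->lock);
    return s->value;
}
```

Each thread would call use_state on its own entry whenever it is about to touch the state, so updates are applied lazily and no thread is ever interrupted.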
