Piped Python script takes 100% of CPU when reading from a broken pipe

Question:

I have two Python scripts running on an Ubuntu Linux machine. The first one sends all its output to stdout, and the second one reads from stdin. They are connected by a simple pipe, i.e. something like this:

./step1.py <some_args> | ./step2.py <some_other_args>

step2 reads lines of input in an infinite loop and processes them:

while True:
    try:
        l = sys.stdin.readline()
        # processing here

step1 crashes from time to time. When that happens (not sure if always, but at least on several occasions), instead of crashing or stopping, step2 goes crazy and starts taking 100% of the CPU until I kill it manually.

Why is this happening, and how can I make step2 more robust so that it stops when the pipe is broken?

Thanks!

Others have already explained why you end up in an endless loop in certain cases.
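In short, the usual explanation is that `readline()` does not raise at end of file; it returns an empty string. Once step1 exits and the write end of the pipe closes, every call returns `''` immediately, so a `while True` loop that never checks for it spins without blocking. Here is a minimal sketch of the check, using an in-memory `io.StringIO` as a stand-in for the closed pipe (the helper name `read_all_lines` is hypothetical, not from the question):

```python
import io


def read_all_lines(stream):
    """Read lines until EOF; readline() signals EOF by returning ''."""
    lines = []
    while True:
        line = stream.readline()
        if line == "":  # EOF: the writer closed its end of the pipe
            break
        lines.append(line)
    return lines


# io.StringIO stands in for sys.stdin after step1 has exited.
fake_stdin = io.StringIO("first\nsecond\n")
print(read_all_lines(fake_stdin))
print(repr(fake_stdin.readline()))  # further reads keep returning ''
```

Note that a blank input line comes back as `'\n'`, not `''`, so the check does not stop early on empty lines.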

In the second (reading) script, you can use this idiom:

for line in sys.stdin:
    process(line)

This way you will not end up in an endless loop, because the iteration stops at end of input. Also, you did not show which exception you are trying to catch in the second script, but I guess that from time to time you will hit a 'broken pipe' error, which you can and should catch as described here: How to handle a broken pipe (SIGPIPE) in python?

The whole scheme looks like this:

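import errno
import sys

try:
    for line in sys.stdin:
        process(line)  # process() is your own per-line handler
except IOError as e:
    if e.errno == errno.EPIPE:
        # EPIPE error: the other end of the pipe is gone
        pass
    else:
        # Other error: do not swallow it silently
        raise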
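For what it is worth, here is a self-contained sketch of the same scheme that can be exercised without a real pipe. It assumes Python 3.3 or later, where EPIPE surfaces as `BrokenPipeError`; the `run` function and the body of `process` are made up for illustration and are not part of the question.

```python
import io


def process(line):
    """Hypothetical placeholder for the real per-line work."""
    return line.rstrip("\n").upper()


def run(instream, outstream):
    """Process lines until EOF and return an exit status."""
    try:
        for line in instream:  # iteration ends cleanly at EOF
            outstream.write(process(line) + "\n")
        outstream.flush()
    except BrokenPipeError:
        # Raised on write when the downstream reader has gone away (EPIPE).
        return 1
    return 0


# In-memory streams stand in for the two ends of the pipeline.
out = io.StringIO()
status = run(io.StringIO("a\nb\n"), out)
print(status, repr(out.getvalue()))
```

In the real script you would call something like `sys.exit(run(sys.stdin, sys.stdout))`, so the process ends with a non-zero status when its output pipe breaks and simply finishes when its input reaches EOF.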