IPC - Named Pipes

@高效码农  November 26, 2023

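In the previous article, we introduced inter-process communication and its different mechanisms. We will start with the first one: named pipes, or FIFO files!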

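Named pipes are a mechanism built on top of anonymous pipes. Your regular Unix pipes are actually anonymous pipes. To understand named pipes, we first need to understand anonymous pipes.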

Anonymous Pipes

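An anonymous pipe is an in-memory buffer created and maintained by the kernel. The buffer has two file descriptors that refer to it, one for reading and one for writing. You can use the read and write system calls with the appropriate descriptor to read data from and write data to this buffer. Data written to this buffer never appears on disk.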

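Anonymous pipes are unidirectional, which means you can only write to one end and read from the other. You can create an anonymous pipe with the pipe() function. Whenever you use the | symbol in your shell, it calls pipe() to create a buffer that your programs can use (along with some redirection tricks we won't cover here).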

An important fact about anonymous pipes is that data is removed from the buffer once it has been read.
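To make the consume-on-read behaviour concrete, here is a minimal Python sketch using os.pipe(), which wraps the pipe() call. The function name read_consumes_data is made up for this illustration, and the example keeps both ends in one process only to stay short.

```python
import os

def read_consumes_data():
    # The kernel hands back two descriptors for one in-memory buffer.
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"ping")
    os.close(write_fd)  # no writers remain after this

    first = os.read(read_fd, 4)   # takes the four bytes out of the buffer
    second = os.read(read_fd, 4)  # buffer is empty and has no writers: EOF
    os.close(read_fd)
    return first, second
```

The second read returns an empty byte string because the first read already removed the data and the write end has been closed.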

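Imagine you want two processes to communicate without using a shell. How would you do that with anonymous pipes? The simple answer is you can't. When you call pipe(), the kernel creates a buffer that only the calling process and all of its child processes know about. No other application process can refer to that buffer. If you call pipe() in each application process, you get two independent pipe buffers with no connection between them, which defeats the purpose of process communication. You might ask, how does the shell do it? Every process executed in a shell is a child of that shell process. Remember what I said about the buffer created by pipe() being known only to the creating process and all of its children :-). That is how your shell lets separate application processes communicate.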
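The parent and child relationship can be sketched in Python with os.fork(). This is only an illustration of the idea, not how any particular shell is implemented; the function name child_to_parent is hypothetical, and the sketch assumes a Unix-like system where os.fork() is available.

```python
import os

def child_to_parent(message: bytes) -> bytes:
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: it inherited both descriptors from the parent.
        try:
            os.close(read_fd)
            os.write(write_fd, message)
            os.close(write_fd)
        finally:
            os._exit(0)  # never fall through into the parent's code

    # Parent: close the unused write end so a read can reach EOF.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 1024)
        if not chunk:  # empty read: the child closed its write end
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)  # reap the child
    return b"".join(chunks)
```

The child can write to the pipe only because it was forked after pipe() was called. An unrelated process has no way to obtain these descriptors.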

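We have seen that, as cool as anonymous pipes are, they have a limitation. The limitation is one of knowledge, not of sharing. If a process knew about another process's pipe buffer, it could read from and write to that buffer. But that knowledge is hidden by the kernel. This is isolation through obscurity! What if we want other application processes to read our pipe buffer without sharing a process ancestor? That is what named pipes are for!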

Named Pipes

In computing, creating a reference is the first step in allowing access to data. That is what a named pipe is: an anonymous pipe plus a reference. This reference is simply a file name. The file name is stored on disk and will appear as a file in your file explorer or directory listing. You can test this by running these commands in your shell.

    mkfifo example-pipes
    ls -l

The mkfifo command[1] creates a named pipe called example-pipes. Your output should be similar to this:

    prw-r--r--  1 user  group    0 Sep 21 16:36 example-pipes

It’s just a file! That’s neat. You will notice that the first column starts with p. That means its type is a FIFO file.

Because it’s a file, any process can interact with it like every other regular file. That means you can open it, read from it, write to it, unlink it, close it, and so on. A bonus about the reference being a file is that it can have permissions. What this means is you can restrict who has access to it.
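As a rough sketch of the permissions point, the Python snippet below creates a FIFO that only its owner can read and write, then inspects it with os.stat(). The function name make_private_fifo and the temporary directory are assumptions for this example. The explicit os.chmod() is there because the mode passed to os.mkfifo() is filtered through the process umask.

```python
import os
import stat
import tempfile

def make_private_fifo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example-pipes")
        os.mkfifo(path, mode=0o600)
        os.chmod(path, 0o600)  # set the mode explicitly, independent of umask
        info = os.stat(path)
        # S_ISFIFO checks the file type; filemode renders it like `ls -l`.
        return stat.S_ISFIFO(info.st_mode), stat.filemode(info.st_mode)
```

The leading p in the rendered mode is the same FIFO marker that ls -l prints.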

The difference between a regular file and a FIFO file is that the FIFO file itself never contains data. Storing data in it would defeat the purpose of a pipe, which is to act as a communication buffer between processes. The only reason for the file name is to be a reference, nothing more!

One important detail about named pipes is that, by default, opening one for writing blocks until at least one process has opened it for reading, so no bytes are written to the pipe buffer before a reader exists.
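One way to observe this dependence on a reader without hanging your terminal is to open the pipe for writing in non-blocking mode. The sketch below assumes Linux (or another POSIX system), where such an open fails with ENXIO when no process has the pipe open for reading. The function name and the temporary path are made up for the example.

```python
import errno
import os
import tempfile

def open_writer_without_reader() -> str:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "ping")
        os.mkfifo(path)
        try:
            # Without O_NONBLOCK this call would wait for a reader.
            fd = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
        except OSError as exc:
            # Return the symbolic name of the error, e.g. "ENXIO".
            return errno.errorcode[exc.errno]
        os.close(fd)
        return "opened"
```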

Named pipes can be bidirectional. An application process can read from one and write to it with a single file descriptor. Once an application process has the correct permissions, all it has to do is open the named pipe by name in O_RDWR mode (Linux supports this; POSIX leaves the behavior undefined). This is tricky because a process can immediately read back what it has just written, so the written data never becomes visible to other application processes.
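The pitfall is easy to reproduce. The sketch below assumes Linux, where opening a FIFO with O_RDWR succeeds without waiting for another process. The function name talk_to_self and the temporary path are hypothetical.

```python
import os
import tempfile

def talk_to_self() -> bytes:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "ping")
        os.mkfifo(path)
        # On Linux this open does not block: the process is both ends.
        fd = os.open(path, os.O_RDWR)
        try:
            os.write(fd, b"ping")
            # The writer consumes its own message; no other process
            # will ever see these bytes.
            return os.read(fd, 4)
        finally:
            os.close(fd)
```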

Show me the code

Our example uses two Python processes: a server and a client. The client will send a “ping” message to the server, and the server will print it out.

Here’s the client

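    import os

    ROUNDS = 100

    def run():
        pipe_path = '/tmp/ping'
        # Blocks until the server has opened the pipe for reading.
        fd = os.open(pipe_path, os.O_WRONLY)
        try:
            for _ in range(ROUNDS):
                os.write(fd, b'ping')
                print("Client: Sent ping")
            # Tell the server that no more messages will follow.
            os.write(fd, b'end')
        finally:
            os.close(fd)

    if __name__ == '__main__':
        run()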

The client opens the named pipe referenced by the name /tmp/ping in write-only mode. It sends a “ping” message a hundred times to the server using the named pipe, then signals that it is finished by sending “end”. Once it has finished sending, it closes the file. The open call blocks until another process opens the pipe for reading, so the server must be started first because it is the one that creates the pipe.

Note that data is sent as a byte string, not as a regular string.

Here’s the server

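    import os

    def run():
        pipe_path = '/tmp/ping'
        os.mkfifo(pipe_path)
        try:
            # Blocks until a client opens the pipe for writing.
            fd = os.open(pipe_path, os.O_RDONLY)
            try:
                data = os.read(fd, 4).decode()
                # An empty read means every writer has closed the pipe.
                while data and data != 'end':
                    print(f"Server: Received {data}")
                    data = os.read(fd, 4).decode()
            finally:
                os.close(fd)
        finally:
            os.unlink(pipe_path)

    if __name__ == '__main__':
        run()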

Here, the mkfifo() function creates a named pipe called /tmp/ping. After creation, the server opens it in read-only mode. Data is read from the pipe buffer and printed to the console in a loop. The loop ends when the “end” message is received. The file is closed and deleted. Note that the read is blocked until there’s data in the pipe.

Note that data has to be decoded to a utf-8 string because it’s originally received as a byte string.
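If you want to try the exchange without juggling two terminals, the sketch below runs the same ping protocol inside a single script, with the client in a thread. This is only a convenience for experimenting, not the two-process setup described above. The names client and serve and the temporary pipe path are assumptions for this example.

```python
import os
import tempfile
import threading

def client(pipe_path: str, rounds: int) -> None:
    # Blocks until the reader side has opened the pipe.
    fd = os.open(pipe_path, os.O_WRONLY)
    try:
        for _ in range(rounds):
            os.write(fd, b"ping")
        os.write(fd, b"end")
    finally:
        os.close(fd)

def serve(rounds: int) -> list:
    received = []
    with tempfile.TemporaryDirectory() as tmp:
        pipe_path = os.path.join(tmp, "ping")
        os.mkfifo(pipe_path)
        writer = threading.Thread(target=client, args=(pipe_path, rounds))
        writer.start()
        # Blocks until the client thread opens the pipe for writing.
        fd = os.open(pipe_path, os.O_RDONLY)
        try:
            while True:
                data = os.read(fd, 4)
                # Stop on the end marker or when the writer is gone.
                if not data or data == b"end":
                    break
                received.append(data.decode())
        finally:
            os.close(fd)
        writer.join()
    return received
```

Calling serve(3) returns the three decoded "ping" messages. Reading four bytes at a time works here only because every message except the final "end" is exactly four bytes long.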

Performance

Named pipes are very fast. IPC-Bench benchmarked them at 254,880 1 KB messages per second on an Intel(R) Core(TM) i5-4590S CPU @ 3.00GHz running Ubuntu 20.04.1 LTS. That is fast enough for most inter-process communication needs.

Demo Code

You can find my code on GitHub. It demonstrates unidirectional and bidirectional communication using named pipes.

Conclusion

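Named pipes are a simple yet powerful IPC mechanism. As with all powerful tools, you must use them with care. I don't recommend using them for bidirectional communication unless you aren't worried about losing data.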

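This article was about the file system. The next one will be about networking :-). I'm referring to Unix Domain Sockets. Until then, take care of yourself and stay hydrated! ✌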


