RabbitMQ - Is it good practice to create multiple consumers for a single queue in one application process?

Question:

I have just started working on a new project backed by RabbitMQ, and multiple consumer instances listening to the same queue are created when the application starts. However, they all share the same connection, each on a different channel.

The volume of messages in the queue is massive (millions of messages for a single producing operation), so I guess the original author of the code was trying to make consumption faster.

I have tried to find posts discussing this, but I can't find a definitive answer.

What I have gathered so far is:

  1. Each channel has its own dispatch thread.
  2. Commands on the same channel are serialized, even when they are invoked from multiple threads.

So:

  1. Creating multiple consumers, and thus multiple channels, gives multiple dispatch threads, but I don't think this improves message dispatching performance, since a single thread should be more than enough for dispatching.

  2. Acks can be parallelized across different channels, but I am not sure this gives any better performance.

Since more channels consume more system resources, I wonder whether this is good practice.

There seem to be a few things going on here, so let's try to look at this scenario from a holistic perspective.

For starters, it sounds like the original designer of this code understood some basics about RabbitMQ (or learned a few things by trial and error), but may have had trouble putting all the pieces together. Hopefully I can help.

  1. RabbitMQ connections are, in reality, AMQP-over-TCP connections (and thus sit somewhere around the session layer of the OSI model). TCP connections are supposed to be opened and used until some sort of network interruption or application shutdown closes them (and for this reason, AMQP has trouble with firewalls and other smart network devices). Using a single TCP connection for the message processing activities of a single logical process is a good idea, as creating and destroying TCP connections is usually expensive for the computer, which leads to the next point.

  2. RabbitMQ channels are used to multiplex communication streams over the AMQP-over-TCP connection (and are defined in the AMQP protocol specification). All they do is specify an integer value (I can't remember the number of bytes, but it doesn't matter anyway) that prefaces each subsequent command or response on the TCP connection. Most AMQP operations are channel-specific. For the purposes of higher-level operations, channels are treated similarly to connections, as they are application-level constructs.
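To make the multiplexing concrete: as far as I can tell from the AMQP 0-9-1 specification, every frame starts with a 7-byte header made of a 1-byte frame type, a 2-byte channel number, and a 4-byte payload size. The following is only a minimal sketch of that layout using plain `java.nio`; the class and method names (`FrameHeaderSketch`, `encodeHeader`, `channelOf`) are made up for illustration and are not part of any RabbitMQ client library.

```java
import java.nio.ByteBuffer;

// Illustrative only: models the general frame header described in AMQP 0-9-1.
class FrameHeaderSketch {

    // Builds the 7-byte header: type (1 byte), channel (2 bytes), payload size (4 bytes).
    static byte[] encodeHeader(int type, int channel, int payloadSize) {
        if (channel < 0 || channel > 0xFFFF) {
            throw new IllegalArgumentException("channel must fit in 16 bits");
        }
        ByteBuffer buf = ByteBuffer.allocate(7); // ByteBuffer is big-endian by default
        buf.put((byte) type);
        buf.putShort((short) channel);
        buf.putInt(payloadSize);
        return buf.array();
    }

    // Reads the channel number back as an unsigned 16-bit value.
    static int channelOf(byte[] header) {
        return ByteBuffer.wrap(header).getShort(1) & 0xFFFF;
    }

    public static void main(String[] args) {
        // Two frames on the same TCP connection, tagged with different channels.
        byte[] first = encodeHeader(1, 1, 20);
        byte[] second = encodeHeader(1, 2, 20);
        System.out.println("first frame belongs to channel " + channelOf(first));
        System.out.println("second frame belongs to channel " + channelOf(second));
    }
}
```

The point is only that a channel costs a couple of bytes per frame on the wire; the heavier costs are whatever state the client and broker keep per channel.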

Now, where I think the question starts to go off the rails a bit is here:

The volume of messages in the queue is massive (millions of messages for a single producing operation), so I guess the original author of the code was trying to make consumption faster.

A fundamental assumption about a system which uses queues is that messages are consumed at approximately the same rate that they are produced. Queues exist to buffer uneven producing activities. The mathematics and statistics of how queues work are quite interesting, and assuming the production of messages is done in response to some real-world stimulus, your system is virtually guaranteed to behave in a predictable manner. Therefore, your design goal is to ensure that there are enough consumers to process the messages that are produced, and to respond to changing conditions as needed. Your goal should not be to "speed up" the consumers (unless they have some specific issue), but rather to have enough consumers to process the total load.
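As a rough illustration of "enough consumers to process the total load", here is a back-of-the-envelope sizing sketch. It assumes you can measure the average arrival rate and the average processing time per message, and that you pick a target utilization below 100% to leave headroom. The class name, method name, and the numbers in `main` are all hypothetical.

```java
// Illustrative sizing arithmetic; not tied to any RabbitMQ API.
class ConsumerSizing {

    // Smallest consumer count that covers the offered load at the target utilization.
    static int consumersNeeded(double arrivalsPerSecond,
                               double secondsPerMessage,
                               double targetUtilization) {
        if (targetUtilization <= 0 || targetUtilization > 1) {
            throw new IllegalArgumentException("targetUtilization must be in (0, 1]");
        }
        // Offered load: how many consumers are busy on average.
        double offeredLoad = arrivalsPerSecond * secondsPerMessage;
        return (int) Math.ceil(offeredLoad / targetUtilization);
    }

    public static void main(String[] args) {
        // Assumed figures: 2000 msg/s arriving, 5 ms per message, aim for 70% busy.
        int n = consumersNeeded(2000, 0.005, 0.7);
        System.out.println("consumers needed: " + n);
    }
}
```

With these assumed figures the offered load is about 10 busy consumers, so keeping each one roughly 70% busy calls for 15. Making an individual consumer faster only changes `secondsPerMessage`; the design question is still whether the total capacity covers the load.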

Further, the average number of items in the queue at any time should approach zero. It is usually a good idea to have overcapacity so that you don't wind up with an unstable situation where messages start accumulating in the queue (and the queue ends up looking like the Stack Overflow Close Vote Queue).
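To see why overcapacity matters, a textbook single-server queue (M/M/1: random arrivals, random service times, one consumer) is a convenient model. That model is my assumption here, not something a real RabbitMQ workload is guaranteed to follow. In it, the average number of messages waiting is rho^2 / (1 - rho), where rho is the consumer's utilization. The names below are hypothetical.

```java
// Illustrative M/M/1 arithmetic; real workloads will deviate from this model.
class QueueLengthSketch {

    // Mean number of messages waiting in an M/M/1 queue at the given utilization.
    static double meanQueued(double utilization) {
        if (utilization < 0 || utilization >= 1) {
            throw new IllegalArgumentException("utilization must be in [0, 1)");
        }
        return utilization * utilization / (1.0 - utilization);
    }

    public static void main(String[] args) {
        for (double rho : new double[] {0.5, 0.9, 0.99}) {
            System.out.printf("utilization %.2f -> about %.2f messages waiting%n",
                    rho, meanQueued(rho));
        }
    }
}
```

The backlog stays small at moderate utilization and grows without bound as utilization approaches 1, which is the unstable situation described above.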

And that brings us to an attempt to answer your fundamental question, which seems to deal with threading and possibly detailed implementation of the Java client, which I will readily admit I have not used (I'm a .NET guy).

Here are some design guidelines for your software:

  1. Ensure that a single thread uses no more than one channel.
  2. Use one TCP connection per logical consuming process.
  3. Balance the number of logical processes on a single physical machine such that resource contention is not a problem (you don't want to starve your consumers of computer resources).
  4. Try to use BASIC.GET as opposed to a push-based consumer. Use of consumers is difficult in practice, and there is no performance benefit at the protocol level over a BASIC.GET. Note that I do not know whether the Java library implements these differently in a way that causes a performance difference; stranger things have been known to happen.
  5. If you do use consumers, make sure pre-fetch is set to 0 (which in AMQP 0-9-1 means no prefetch limit) and that AutoAck is set to false if reliable processing is important (most applications require reliable processing). Along with this, make sure you acknowledge messages upon completion of processing!
  6. Periodically reboot your consuming threads, channels, and processors - or do a BASIC.Recover. There are degrees of randomness that will result in unacknowledged messages accumulating over time, and this will deal with it.
  7. Again, if you prefer to use consumers, sharing consumers across channels is generally a bad idea. Each consumer should get its own channel.
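Since I have not used the Java client myself, treat the following as a hedged sketch of how several of these guidelines might look with the RabbitMQ Java client (the `com.rabbitmq:amqp-client` 5.x library is assumed to be on the classpath). The broker host, the queue name `work-queue`, the consumer count, and the `process` method are placeholders, and the queue is assumed to exist already.

```java
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.DeliverCallback;

import java.nio.charset.StandardCharsets;

public class PerChannelConsumers {
    // Placeholder values for illustration only.
    private static final String QUEUE = "work-queue";
    private static final int CONSUMER_COUNT = 4;

    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost"); // assumed broker location

        // Guideline 2: one TCP connection for the whole consuming process.
        Connection connection = factory.newConnection();

        for (int i = 0; i < CONSUMER_COUNT; i++) {
            // Guideline 7: each consumer gets its own channel.
            Channel channel = connection.createChannel();

            // Guideline 5: prefetch 0, which in AMQP 0-9-1 means "no limit".
            channel.basicQos(0);

            DeliverCallback onDeliver = (consumerTag, delivery) -> {
                String body = new String(delivery.getBody(), StandardCharsets.UTF_8);
                process(body);
                // Guideline 5: acknowledge only after processing has completed.
                channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
            };

            // autoAck = false, so unacknowledged messages can be redelivered.
            channel.basicConsume(QUEUE, false, onDeliver, consumerTag -> { });
        }
        // The connection is left open on purpose: consumers run until the process stops.
    }

    // Placeholder for the real message handling.
    private static void process(String body) {
        System.out.println("processed: " + body);
    }
}
```

This keeps the shape the question describes (one connection, several channels, one consumer per channel). Whether four consumers, or any other number, is right depends on the total load rather than on dispatch speed.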