
Pub/Sub subscription not sending out all messages at once

Hi all, I have a question about Pub/Sub.

My setup is as follows:

- I have a publisher creating roughly 3000 messages, which are delivered through a single subscription

- That subscription has >3000 subscriber clients open, each with a streaming pull connection set up and ready to receive messages

- Each client has MaxOutstandingMessages set to 1, and its thread count is limited to 1

- I am using exactly-once delivery. Client side, I have the max ack deadline set to 60 minutes, with the min deadline extension set to 9 minutes and the max set to 10 minutes (the job I run for each message is time-intensive). Messages come in bursts. Server side, I have the ack deadline set to 180 seconds

Given the number of streaming pull connections I have open, I would expect my subscription to send out all the messages in one go. Instead, about 99% of my messages get sent immediately, and the remaining 1% seem to be sent after about 10 minutes. I have noticed, on occasion, that this also seems to coincide with ack deadline expiries (which doesn't really make sense to me, given that my deadlines are so high).

My questions:

(1) why are all my messages not being disseminated at once? Is there any configuration setting I should be following?

(2) is my usage of max deadline, min extension / max extension / the server side extension correct?

I'm stumped, and would appreciate any help. Happy to share more info. If it's helpful, I'm using google-cloud-cpp v2.1.0 for my client library.


(1) Why are not all my messages being disseminated at once?

There are a few possible reasons why your messages are not being disseminated at once:

  • Message backlog: Pub/Sub will only deliver a certain number of messages to each subscriber at a time if there is a backlog of messages on the subscription. This is to ensure that subscribers do not consume resources faster than they can handle and to prevent any single subscriber from being overwhelmed.
  • Flow control: Pub/Sub uses flow control to ensure that subscribers do not consume resources faster than they can handle. If a subscriber is not able to keep up with the rate of message delivery, Pub/Sub will throttle the delivery rate.
  • Subscriber processing time: If the subscribers are taking a long time to process messages, the backlog of messages on the subscription will increase. This can lead to Pub/Sub throttling the delivery rate or delivering messages slowly to subscribers.
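
To make the flow control point concrete, here is a toy model of a client that enforces a maximum number of outstanding messages. It is only a sketch: the class and method names (`FlowControlledClient`, `OnReceive`, `Dispatch`, `OnAckOrNack`) are hypothetical, and whether the real client library buffers messages in this way is an assumption, not something confirmed in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Toy model of client-side flow control (hypothetical, not library code).
class FlowControlledClient {
 public:
  explicit FlowControlledClient(std::size_t max_outstanding)
      : max_outstanding_(max_outstanding) {
    assert(max_outstanding > 0);
  }

  // A message arrives on the stream; it waits until a slot is free.
  void OnReceive(std::string message) {
    buffered_.push_back(std::move(message));
  }

  // Hand buffered messages to the application while under the limit.
  std::vector<std::string> Dispatch() {
    std::vector<std::string> started;
    while (outstanding_ < max_outstanding_ && !buffered_.empty()) {
      started.push_back(std::move(buffered_.front()));
      buffered_.pop_front();
      ++outstanding_;
    }
    return started;
  }

  // The application finished (acked or nacked) one message.
  void OnAckOrNack() {
    assert(outstanding_ > 0);
    --outstanding_;
  }

  std::size_t buffered() const { return buffered_.size(); }
  std::size_t outstanding() const { return outstanding_; }

 private:
  std::size_t max_outstanding_;
  std::size_t outstanding_ = 0;
  std::deque<std::string> buffered_;
};
```

With a limit of 1, a second message that reaches the same client is not handed to the application until the first one has been acked or nacked, however many other clients are idle.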

(2) Is my usage of max deadline, min extension / max extension / the server side extension correct?

Yes, your usage of max deadline, min extension / max extension / the server side extension is correct.

In the C++ client library, the max deadline is the total time the library keeps extending a message's lease while your handler works on it. If the message has not been acked or nacked by then, the lease lapses and the message becomes eligible for redelivery. The min extension and max extension bound how far each individual lease extension pushes the deadline out; they do not add time beyond the max deadline.

The server-side ack deadline sets the default time a subscriber has to acknowledge the receipt of a message. While clients can request to extend this time based on their processing needs, this server-side setting acts as an initial threshold.
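
As a rough illustration of how these three values interact, the sketch below computes when successive lease extensions would expire. It is a simplified model, not the client library's actual algorithm: `LeaseSettings`, `ClampExtension`, and `LeaseExpiries` are hypothetical names, and the assumption that the client requests the same extension every time is mine.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <vector>

using std::chrono::minutes;
using std::chrono::seconds;

// Hypothetical settings bundle; field names are illustrative, not library API.
struct LeaseSettings {
  seconds max_deadline;   // total time the lease may be kept alive
  seconds min_extension;  // lower bound on a single extension
  seconds max_extension;  // upper bound on a single extension
};

// Clamp one requested extension into [min_extension, max_extension].
seconds ClampExtension(seconds requested, LeaseSettings const& s) {
  assert(s.min_extension <= s.max_extension);
  return std::max(s.min_extension, std::min(requested, s.max_extension));
}

// Elapsed times at which successive lease extensions would expire, assuming
// the client asks for `requested` each time and never exceeds max_deadline.
std::vector<seconds> LeaseExpiries(seconds requested, LeaseSettings const& s) {
  assert(s.min_extension > seconds(0));
  std::vector<seconds> expiries;
  seconds elapsed{0};
  while (elapsed < s.max_deadline) {
    seconds const step =
        std::min(ClampExtension(requested, s), s.max_deadline - elapsed);
    elapsed += step;
    expiries.push_back(elapsed);
  }
  return expiries;
}
```

With the settings from the question, `LeaseExpiries(minutes(10), LeaseSettings{minutes(60), minutes(9), minutes(10)})` yields six lease periods ending at 10, 20, 30, 40, 50, and 60 minutes. A request equal to the 180-second server-side deadline would be raised to the 9-minute minimum under this model.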

Recommendations

Here are a few recommendations to help you improve the dissemination of messages from your subscription:

  • Reduce the message backlog: You can do this by increasing the number of subscribers, optimizing the processing time of each message on the subscriber side, or using another message queuing system to buffer messages before they are delivered to subscribers.
  • Increase the flow control limits: You can do this by raising the maximum number of outstanding messages for each subscriber. Separately, make sure the acknowledgment deadline is long enough to cover your processing time.
  • Monitor the delivery of messages to your subscribers: You can use Cloud Monitoring and Cloud Logging to track subscription metrics such as the number of undelivered messages, the age of the oldest unacknowledged message, and the count of expired ack deadlines, and to identify subscribers that are having trouble keeping up with the rate of message delivery.

Re: your answer to (1): will Pub/Sub throttle the delivery rate even if I have extra servers/subscribers available for my subscription? I'm sending out bursts of 3000 messages every N minutes (after the prior batch of 3000 has been handled), and I have 3300 streaming pull connections/servers available, each able to handle one message at a time. Shouldn't all the messages be disseminated before Pub/Sub is able to adjust the delivery rate?

Yes, Pub/Sub does manage the rate of message delivery based on several factors, even if extra servers/subscribers are available. Here's a breakdown:

  1. Message backlog: A significant backlog means there are many messages waiting to be delivered. Pub/Sub aims to distribute these messages evenly among subscribers. However, having a backlog doesn't inherently lead to throttling. Instead, if subscribers are slow to process and acknowledge messages, causing the backlog to grow continuously, then throttling may come into play to prevent overwhelming subscribers.

  2. Flow control and subscriber processing time: If subscribers take a long time to process messages or if the flow control settings are restrictive, then the rate of message delivery might be affected. Pub/Sub uses flow control to prevent subscribers from becoming overwhelmed.

  3. Internal factors: While Google Cloud Pub/Sub is designed to scale and handle massive volumes of messages efficiently, occasional internal latencies or optimizations can affect the immediate dissemination of messages.

Given your scenario, where you're sending bursts of 3,000 messages but have 3,300 available streaming pull connections, it would be atypical to see significant delays in message delivery.
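
The capacity reasoning here is plain arithmetic, sketched below. The function names are hypothetical, and the model only counts free slots; it says nothing about how the service actually distributes messages across streams.

```cpp
#include <cassert>
#include <cstdint>

// Free delivery slots across all clients, given how many messages are
// still being processed (in flight) from earlier deliveries.
std::int64_t FreeSlots(std::int64_t clients,
                       std::int64_t max_outstanding_per_client,
                       std::int64_t in_flight) {
  assert(clients >= 0 && max_outstanding_per_client >= 0 && in_flight >= 0);
  std::int64_t const capacity = clients * max_outstanding_per_client;
  return capacity > in_flight ? capacity - in_flight : 0;
}

// True if a burst could, in principle, be handed out in a single pass.
bool BurstFitsInOnePass(std::int64_t burst, std::int64_t clients,
                        std::int64_t max_outstanding_per_client,
                        std::int64_t in_flight) {
  return burst <= FreeSlots(clients, max_outstanding_per_client, in_flight);
}
```

With 3,300 idle clients at one outstanding message each, a burst of 3,000 fits. If, say, 400 handlers were still busy with the previous batch, only 2,900 slots would be free and 100 messages would have to wait.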

Recommendations:

  • Reduce any existing message backlog: Increase the number of subscribers or optimize the message processing time on the subscriber side.

  • Adjust flow control settings: You can modify settings like the maximum number of outstanding messages or the acknowledge deadline to better suit your subscribers' processing capabilities.

  • Monitoring: Use Google Cloud's Monitoring and Logging tools to keep an eye on message states and to identify any potential bottlenecks or issues with subscribers.

I've done all 3 of the recommendations. What else can I do?

Is there any way I can adjust the delivery rate on the subscription side, i.e. force it not to throttle?

If you've already implemented the primary recommendations and continue to experience consistent and significant delays in message delivery, contacting Google Cloud support might be the next logical step.

While you cannot directly prevent Pub/Sub from managing delivery rates, there are steps to optimize message throughput:

  • Adjust Flow Control Limits: Increasing the flow control limits for your subscription allows subscribers to pull more messages. However, ensure that subscribers can process these messages in a timely manner to prevent potential redelivery scenarios.
  • Consider Additional Buffering Mechanisms: Introducing mechanisms to buffer or batch process messages before they reach Pub/Sub subscribers might help manage bursts of high-volume messages. Remember, Pub/Sub is already a robust messaging platform, so this would be for unique scenarios where additional pre-processing or buffering is beneficial.
  • Proactively Monitor Message Delivery: Utilize Google Cloud's Monitoring and Logging tools to keep a close eye on message states. By identifying potential bottlenecks or subscriber issues early, you can take corrective actions before they escalate.

Lastly, it's important to remember that Pub/Sub's management of delivery rates is designed to ensure system stability and prevent subscribers from becoming overwhelmed. Balancing message throughput with subscriber capacity is crucial to prevent issues like excessive message redelivery or potential message loss.

That makes sense. I noticed that the issue stops showing up when I disable exactly-once delivery. Any ideas why exactly-once could be causing problems here?

"Exactly once" delivery ensures that messages are delivered to subscribers precisely one time, providing guarantees against message duplication. This feature, while powerful, introduces additional complexities to the message delivery process.

The overhead from "exactly once" semantics doesn't necessarily cause Pub/Sub to throttle delivery, but it can introduce latencies due to the acknowledgment requirements and potential redelivery of unacknowledged messages. If you observe improved delivery rates upon disabling "exactly once" delivery, it suggests that these inherent latencies might be contributing to the observed behavior.

For scenarios with bursts of messages, it's crucial to ensure not only rapid message delivery but also timely acknowledgment from subscribers. If "exactly once" delivery is essential for your use case, consider:

  • Adjusting flow control settings to better match your subscribers' processing capabilities.
  • Exploring buffering or batching mechanisms to manage high-volume message bursts more efficiently.

Lastly, consulting with Google Cloud Support can provide deeper insights and tailored recommendations to optimize your Pub/Sub configuration.