In an environment with heavy traffic, is there a way to prioritize "POST" requests over "GET" requests? Can this be done in the API management layer, or is it always first come, first served? (ESBs have a way to set priorities.)
This sounds like some sort of quality-of-service (QoS) feature? There are a few complexities here; let's think through some of the questions and consequences first, then look at some practical things you could do today:
Theory:
1. How is "heavy traffic" defined? Is there a specific request rate?
2. What does "prioritized" mean? Are the GET requests dropped, or are they paused until the POST request completes?
3. Related to the above, how long does a POST take to complete?
4. Is there some point where POST requests need to be throttled as well? For example, is it conceivable that you end up with enough POSTs that no GETs can be processed in a given timeframe, or should some proportion of GETs always be processed?
5. Somewhat related to the above, does any POST always beat a GET, or does this only become important when the ratio of POSTs to GETs reaches some threshold?
I'm sure there are other things to consider; I'd be interested in others' thoughts on this.
Practical:
Today there are some ways this could be done, but there are limitations too. The clearest limitation is that the API Proxy doesn't have a quality-of-service policy, so anything we do now involves combining existing policies to get a rough equivalent.
Another limitation is that there is no concept of "pausing" a request, i.e. a temporary buffer that can hold incoming requests. If something like that existed, maintaining a growing number of open connections could quickly become a problem when the request rate is high.
That said, I believe you could implement an approximation of this in the API Proxy today using a Quota or Spike Arrest policy. You set up the policy to apply only to GET requests, so it rejects GETs above some predefined limit. This simplified approach applies the limit at all traffic levels.
You would need to estimate the maximum number of simultaneous POSTs you are likely to receive and combine that with knowledge of how long POST requests take to respond, so that you can work out the maximum GET rate that still leaves capacity to process the POSTs.
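As a rough sketch of what that could look like: a Spike Arrest policy attached with a condition so it only fires for GETs. The 80ps rate below is a made-up number for illustration; for example, if you estimate the backend comfortably handles ~100 requests/second and POSTs peak around 20/second, you might cap GETs near 80/second. The policy definition:

```xml
<!-- SpikeArrest policy: smooths GET traffic to roughly 80 requests/second -->
<SpikeArrest async="false" continueOnError="false" enabled="true" name="SA-Limit-GETs">
  <DisplayName>SA-Limit-GETs</DisplayName>
  <Rate>80ps</Rate>
</SpikeArrest>
```

And the attachment in the proxy flow, where the condition restricts it to GET requests:

```xml
<PreFlow name="PreFlow">
  <Request>
    <Step>
      <Name>SA-Limit-GETs</Name>
      <!-- Only throttle GETs; POSTs pass through untouched -->
      <Condition>request.verb = "GET"</Condition>
    </Step>
  </Request>
</PreFlow>
```

GETs arriving above the configured rate are rejected with an error rather than queued, which is the trade-off mentioned above.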
Making it more dynamic, for example so that the quota/spike arrest only activates when the GET rate exceeds some upper limit, might also be possible, but it would require additional policies to track current traffic rates and set a variable that you then include in the condition on the Spike Arrest/Quota policy.
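One possible shape for that condition, assuming a hypothetical flow variable `traffic.isHigh` that some earlier policy of your own (not shown here) sets based on observed traffic:

```xml
<Step>
  <Name>SA-Limit-GETs</Name>
  <!-- traffic.isHigh is a hypothetical custom variable; it would need to be
       populated by another policy earlier in the flow -->
  <Condition>request.verb = "GET" and traffic.isHigh = "true"</Condition>
</Step>
```

The tracking policy itself is the hard part, which is why this stays an approximation rather than true QoS.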
Also, each policy you add introduces a bit more latency to request/response processing, so that needs to be balanced against the benefit.