Wednesday, 12 June 2013

Limiting outgoing connections by max TPS

An interesting problem showed up. A customer requires that we never exceed an agreed number of TPS against their service (a SOAP WS). On our side we've got a couple of instances generating the traffic. They in fact act as a pipe, processing the incoming requests. It seems a bit tricky to get all the instances to talk to each other quickly enough to exchange the TPS information each of them has (sure, it can be done, and I would use ØMQ for that if needed).
To sum up: we have no influence on the incoming load, which comes from another customer (it's Google, actually ;)). Still, we must not forward all of those requests. We need to start rejecting them above the threshold; queuing them until the load drops would leave us with growing latency and an exhausted HTTP connection pool in front of our app.
So, the idea is to use the (already existing) reverse proxy between the app and the customer's service. xinetd's cps setting comes in handy (from the man page):

    cps

Limits the rate of incoming connections. Takes two arguments. The first argument is the number of connections per second to handle. If the rate of incoming connections is higher than this, the service will be temporarily disabled. The second argument is the number of seconds to wait before re-enabling the service after it has been disabled. The default for this setting is 50 incoming connections and the interval is 10 seconds.

The plan is to set it to 'cps 29 0' (the goal is to never exceed 30 TPS) and see what happens :)

A different setting for the interval would, with constant traffic of, say, 31 TPS, result in losing all 31 requests every other second. Not good.
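
To make that arithmetic concrete, here is a toy model in Java of my reading of the man page. It is only a sketch under stated assumptions: traffic arrives at a constant rate, the limit is checked once per whole second, and exceeding it disables the service for the next waitSeconds seconds. The class CpsModel and its method are hypothetical names, and xinetd's real implementation may well behave differently.

```java
/** Toy model of a "cps <limit> <wait>" style limiter; NOT xinetd's actual algorithm. */
class CpsModel {

    /**
     * Counts accepted connections over a run of whole seconds.
     * Assumption: a second that exceeds the limit accepts only `limit`
     * connections and then disables the service for `waitSeconds` seconds.
     */
    static int accepted(int ratePerSecond, int limit, int waitSeconds, int totalSeconds) {
        int accepted = 0;
        int disabledFor = 0; // remaining seconds during which everything is rejected
        for (int second = 0; second < totalSeconds; second++) {
            if (disabledFor > 0) {
                disabledFor--; // service is off: the whole second is lost
                continue;
            }
            accepted += Math.min(ratePerSecond, limit);
            if (ratePerSecond > limit) {
                disabledFor = waitSeconds;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        // Constant 31 TPS for 10 seconds against a limit of 29.
        System.out.println("wait=1: " + accepted(31, 29, 1, 10) + " of 310 accepted");
        System.out.println("wait=0: " + accepted(31, 29, 0, 10) + " of 310 accepted");
    }
}
```

Under these assumptions a one-second wait lets through 145 of 310 requests, while a zero wait lets through 290, which is why the interval '0' looked like the setting to try.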

OK, let's see if the interval '0' works as expected... I'll update with results.


UPDATE:
In the end, I found xinetd useless for the task. The thing is (I just took a look at the source code) that when the cps condition is reached, the service is stopped. And by this I mean that it is literally, totally stopped... it rejects everything. Actually, with that knowledge, rereading the spec above I now see it states exactly that :) But still, it was surprising: I was rather expecting it to stop accepting more connections, not to drop all ongoing work.

Well... the final solution is the simplest one: the TPS is monitored by the app itself. It runs as 2 separate instances that are not aware of each other. Both are set to a max of 14 TPS. Of course, it won't ever reach the full 30 TPS, because it now all depends on the load balancer in front of the apps... Anyway, it is a conscious decision of the customer, who claims that the traffic will never reach such a high rate anyway, so why should I care more than they do? "Customer is King" :) I did my job and clearly explained all the consequences.
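
The effect of the load balancer can be illustrated with a little arithmetic. The sketch below is an assumption-laden model, not the production code: SplitLimitModel and its methods are hypothetical, traffic is treated as whole requests per second, and each instance simply rejects whatever exceeds its own 14 TPS cap.

```java
/** Toy arithmetic for independent per-instance caps behind a load balancer. */
class SplitLimitModel {

    /** Requests per second that one instance forwards, given its own cap. */
    static int forwarded(int incoming, int perInstanceCap) {
        return Math.min(incoming, perInstanceCap);
    }

    /** Total forwarded per second when the balancer sends `toFirst` of `total` to instance 1. */
    static int totalForwarded(int total, int toFirst, int perInstanceCap) {
        int toSecond = total - toFirst;
        return forwarded(toFirst, perInstanceCap) + forwarded(toSecond, perInstanceCap);
    }

    public static void main(String[] args) {
        int cap = 14;
        // Even split of 28 TPS (14 + 14): nothing is rejected.
        System.out.println(totalForwarded(28, 14, cap)); // 28
        // Uneven split of 20 TPS (15 + 5): one request per second is rejected
        // although the total is well below the agreed 30 TPS.
        System.out.println(totalForwarded(20, 15, cap)); // 19
        // Both instances saturated: the customer never sees more than 28 TPS.
        System.out.println(totalForwarded(100, 50, cap)); // 28
    }
}
```

So the customer is always safe (28 TPS at most), but with an uneven split we start rejecting requests earlier than the agreement would strictly require.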

I simply used com.google.common.util.concurrent.RateLimiter (from Google's Guava library):

    public synchronized static void checkIfBelowMaxTPS(final FRBYGConfig config, final String provider) throws MaxLimitProcessingException {

        RateLimiter rateLimiter = provider2RateLimiterMap.get(provider);

        //--- just for a double-check/logging; approximate, because requests sharing a millisecond collapse into one set entry
        long now = System.currentTimeMillis(); //watch out - different precision depending on OS (e.g. 10ms steps)
        requestsSet.add(now);
        long oneSecBefore = now-1000;
        //headSet() excludes its bound; clearing it evicts old timestamps so the set does not grow without bound
        requestsSet.headSet(oneSecBefore).clear();
        int noReqWithin1Sec = requestsSet.size();
        //---

        log.info("checking if the TPS is not above the threshold (provider="+provider+", limit="+rateLimiter.getRate()+"tps) - the current request rate is: "+noReqWithin1Sec+"tps");
        if (!rateLimiter.tryAcquire()) {
            //reject right away instead of waiting for a permit
            throw new MaxLimitProcessingException(...);
        }
    }
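
For readers without Guava at hand, the reject-instead-of-wait idea behind the tryAcquire() check can be sketched with a plain token bucket. This is a stand-in under stated assumptions, not Guava's algorithm: the class name SimpleTokenBucket, the one-second burst capacity, the full bucket at start-up and the injectable clock are all made up for illustration.

```java
import java.util.function.LongSupplier;

/** Minimal non-blocking token bucket; a stand-in sketch, NOT Guava's RateLimiter algorithm. */
class SimpleTokenBucket {
    private final double permitsPerSecond;
    private final double capacity;        // maximum number of stored permits
    private final LongSupplier nanoClock; // injectable so the behaviour can be tested
    private double stored;
    private long lastRefillNanos;

    SimpleTokenBucket(double permitsPerSecond, LongSupplier nanoClock) {
        this.permitsPerSecond = permitsPerSecond;
        this.capacity = permitsPerSecond; // assumption: at most one second's worth of burst
        this.nanoClock = nanoClock;
        this.stored = capacity;           // assumption: start with a full bucket
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    /** Takes one permit if available; never blocks, returns false when the caller should reject. */
    synchronized boolean tryAcquire() {
        long now = nanoClock.getAsLong();
        double elapsedSeconds = (now - lastRefillNanos) / 1_000_000_000.0;
        if (elapsedSeconds > 0) {
            // Refill in proportion to the elapsed time, but never above capacity.
            stored = Math.min(capacity, stored + elapsedSeconds * permitsPerSecond);
            lastRefillNanos = now;
        }
        if (stored >= 1.0) {
            stored -= 1.0;
            return true;
        }
        return false;
    }
}
```

In real use the clock would be System::nanoTime, e.g. new SimpleTokenBucket(14.0, System::nanoTime), and a false result maps to throwing the rejection exception.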


Also, just for a double check and for logging purposes (RateLimiter doesn't provide runtime stats usable for logging), I added a separate calculation there. It usually agrees with RateLimiter, but it is not 100% accurate. I decided to trust RateLimiter for this task, though.
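
One possible source of the small disagreement is that a set of millisecond timestamps keeps only one entry when two requests land in the same millisecond. A queue-based window does not have that problem. The following is a minimal sketch with the hypothetical class name SlidingWindowCounter, not the code that ran in production:

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Counts requests seen during the last second; meant for logging only. */
class SlidingWindowCounter {
    private static final long WINDOW_MILLIS = 1000;
    private final Deque<Long> timestamps = new ArrayDeque<>();

    /** Records a request at `nowMillis` and returns how many fall within the last second. */
    synchronized int recordAndCount(long nowMillis) {
        timestamps.addLast(nowMillis);
        // Evict entries that are a full second old or older. Duplicates are kept,
        // so two requests in the same millisecond both count. The loop always
        // stops at the entry just added, so the deque is never empty here.
        while (timestamps.peekFirst() <= nowMillis - WINDOW_MILLIS) {
            timestamps.removeFirst();
        }
        return timestamps.size();
    }
}
```

It would be called as counter.recordAndCount(System.currentTimeMillis()) right before the tryAcquire() check; the coarse clock precision mentioned in the code comment still applies.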

