Presence queue?

Anonymous

I noticed that since 4.2.0 each component got an extra PRESENCE queue which has even lower priority than the LOW queue.

What is the rationale behind this change?

Application: 

There can be a great many presence packets in the system. If the average user has a roster of 150 contacts, you have 100k online users, and each of them changes status once every 1000 seconds, then you may need to process something like 15k presence packets per second.
This is quite a lot of traffic to handle. I consider presences the lowest-priority packets: if they are lost, no significant harm is caused. Of course, such loss is possible only when the system is overloaded, all queues are full, and the system cannot keep up with the traffic.
Thus presences have their own, lowest-priority queue.
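
The estimate above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch; the variable names are made up for illustration, and it assumes every status change is broadcast to every contact on the roster.

```python
# Back-of-the-envelope estimate using the figures from the post above.
online_users = 100_000           # users online at the same time
roster_size = 150                # average number of contacts per user
seconds_between_changes = 1000   # each user changes status once per this many seconds

# Status changes generated per second across the whole system.
status_changes_per_second = online_users / seconds_between_changes

# Assumption: each change is sent to every contact on the roster.
presence_packets_per_second = status_changes_per_second * roster_size

print(presence_packets_per_second)
```

With these figures the result is 15,000 presence packets per second, which matches the "15k" quoted above.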

If the system is not overloaded then everything is processed normally. Of course, the lowest-priority packets are processed only when there are no higher-priority packets waiting.

justin

I am having a problem where if I send <presence to="my_component"> followed by <iq to="my_component"...>, the iq might be received by the component first. I am not sure where the fault lies, but I am wondering if it is tigase doing prioritization of stanzas and reordering them?

My component requires you to have sent it presence first, before certain iq requests are allowed.

Yes, this might be the case. Sometimes a presence stanza can be overtaken by other stanzas. Do you expect to receive any kind of response from the component for your presence stanza? Maybe you should wait for it before you send the IQ?

justin

There is no ack for the presence, no. I suppose I could add an ack for it, probably some sort of response presence. Maybe I will do this as a workaround.

But... XMPP is supposed to deliver stanzas in order so I want to say this is wrong behavior by tigase.

justin

By the way: we mix other stanza types too. For example, an iq-set to request lots of data, which is then followed by an iq-result acknowledging the request and then a series of message stanzas carrying the real data. It is important that when the component replies with the iq-result and the message stanzas, that the clients receive iq-result first, otherwise the message stanzas won't have context and would be dropped by the client.

So really, we depend on in-order delivery across all stanza types, not just per-type.

No workarounds please. I have an idea how to solve it properly but it needs some explanation. I am on mobile now so I will describe it tomorrow.

Back to your original problem: stanza reordering. It can in theory happen, but usually it does not, because users don't do things which trigger it.

The thing is that different kinds of packets put a different load on the server, and their importance is different too. For example, users can normally accept that they did not receive a status update from some of their friends, but they won't accept a lost message.
And, because status update packets can generate huge traffic on the server, they can overload the server queues so that packets start to be dropped. This can happen on a large installation with very high traffic. So, to reduce the impact, Tigase has a few different queues for packets, and each queue has a different priority.
All the iq, message and some presence packets have normal priority, and all status update presence packets have lower priority. (There are also other queues for system and cluster packets.) Service disco and statistics have high priority.
Therefore a high volume of status update packets doesn't affect the flow of any other data. It may even happen that the low-priority queues are overloaded and packets are dropped, but all other packets are still delivered very quickly without any delay. Low-priority queues are processed only if all higher-priority queues are empty. All that happens is that some presence updates may be missing or delivered with a delay.
This is, however, better than a total server overload and failure with OOM.

Of course this has the side effect that message and iq packets may overtake presence updates. This sometimes happens on automated systems where code, rather than people, generates the traffic.
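
The side effect can be illustrated with a toy model. This is a minimal sketch, not Tigase's code: the priority table and the `drain` function are hypothetical, and it assumes all three stanzas are already queued before any of them is served, which only happens when there is a backlog.

```python
import heapq
from itertools import count

# Hypothetical priority levels; a lower number is served first.
# The names mirror this thread, not Tigase's actual implementation.
PRIORITY = {"iq": 1, "message": 1, "presence": 2}

def drain(stanzas):
    """Queue stanzas in arrival order, then serve them by (priority, arrival)."""
    arrival = count()
    heap = []
    for kind, body in stanzas:
        heapq.heappush(heap, (PRIORITY[kind], next(arrival), kind, body))
    served = []
    while heap:
        _, _, kind, body = heapq.heappop(heap)
        served.append((kind, body))
    return served

sent = [("presence", "available"), ("iq", "get-data"), ("message", "hello")]
delivered = drain(sent)
print(delivered)
```

Here the presence is sent first but served last, while the iq and the message keep their relative order because they share a priority level.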

So the simplest way to make sure that there is no reordering is to send only packets with the same priority. They all go to the same queue and are all processed sequentially. If for some reason this is not possible, then it might be necessary to add an option to Tigase to not use priority queues at all. However, this is not a good solution for high-traffic installations, for the reasons described above.

justin

Your explanation makes sense. I wonder if there is a workable solution that does not break spec. Stanzas only need to be processed in order between the same to and from JIDs, so you could prioritize message or iq traffic between one to+from pair over the presence of a different to+from pair. But maybe this is really difficult.

Anyway, since priority queues in tigase are not optional, it sounds like I still have to perform a workaround, right?

I will try to find a solution today on the train on my way back from FOSDEM. If this change is possible to code in that time, I will add it to the next release and let you know all the details.

Justin, I have just committed code which allows you to use non-priority queues in the Tigase server.

Hopefully this will solve your problem with packet reordering. Unfortunately I have no easy way to replicate your problem, so it is up to you to make sure that the changes I have made really help.

To activate non-priority queues you have to add the following line to the init.properties file:

--nonpriority-queue=true

Please let me know if that helped. I have run all the automatic functional tests on the code with both types of queueing system and all tests passed. I do not know yet, however, what the performance implications are.

justin

Is this available in the snapshots?

Sorry, I thought you were using Tigase from sources. I have just generated a new snapshot of the Tigase server. Please pick the latest one and the functionality should work.

justin

I am using b2135. It seems --nonpriority-queue=true does not actually work. We are still receiving stanzas out of order. Specifically, if presence is sent to an entity immediately followed by iq, it is often that the iq is received first. I can confirm in the tigase logfile that the stanzas are received in the correct order from the client connection but then delivered in the wrong order to the destination component.

I see "priority=PRESENCE" in the logs next to presence packets, so maybe this is not an algorithm bug but simply that the "--nonpriority-queue" option isn't wired up properly.

Justin, 'priority=PRESENCE' means that this priority has been set for the presence packet. It doesn't say anything about the queue used.
Unfortunately, using non-priority queues does not give you a 100% guarantee that packets arrive in order.

In the session manager every type of stanza is processed by a separate plugin. Each plugin has its own thread pool. Therefore, if IQ processing takes more time than presence processing, the presence packet can overtake an IQ which was sent earlier.
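
The effect described above can be reproduced outside Tigase with two independent thread pools. The sketch below is only an illustration under stated assumptions: the `COST` table, the `handle` function and the one-worker pools are hypothetical stand-ins for plugins, and `time.sleep` stands in for slow work such as a database lookup.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-stanza processing cost in seconds.
COST = {"iq": 0.3, "presence": 0.0}

completed = []            # (arrival number, kind) in the order handlers finish
lock = threading.Lock()

def handle(kind, seq):
    time.sleep(COST[kind])        # simulate the work done by a plugin
    with lock:
        completed.append((seq, kind))

# One pool per "plugin", as described in the post above.
pools = {
    "iq": ThreadPoolExecutor(max_workers=1),
    "presence": ThreadPoolExecutor(max_workers=1),
}

# Stanzas arrive in order: the IQ first, then the presence.
for seq, kind in enumerate(["iq", "presence"]):
    pools[kind].submit(handle, kind, seq)

for pool in pools.values():
    pool.shutdown(wait=True)

print(completed)
```

Although the IQ is submitted first, the presence handler finishes first because its pool is not held up by the slow IQ. No priority queue is involved, so switching priority queues off does not remove this kind of reordering.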

I am afraid there is no simple solution to this. We can't have a thread pool on a per-user basis (instead of a per-plugin basis) because all 'slow' packets would slow down the whole server and would block all threads in the pool.
Imagine authentication packets, for example, which are probably 0.01% of all traffic but take 100 or maybe 1000 times longer to process than most other packets because they need DB access. If we used a per-user thread pool, we could end up in a situation where all threads wait for a DB response while a long line of other packets (messages, presences, etc.) queues up. The current implementation allows 'slow' packets to be processed by one thread pool while all 'fast' packets are processed by another pool.

I know this is sometimes inconvenient and not 100% RFC compliant, but this is the only way we can handle really high traffic services. I don't know your system at all, but maybe you could wait for the IQ 'result' before you send the presence packet?

justin

I'm not sure I buy that this is the "only way". Clearly if each stanza type is delivered in-order among others of the same type, then getting all types to deliver in-order should just be a matter of changing from 3 queues to 1. Or is it mere luck that even stanzas of the same type (such as messages) stay in order?

No, it is not luck that stanzas of the same type stay in order.
I don't really want to get into all the technical details, as this might become a very long post. The thing is that in Tigase there are lots of different queues and thread pools at different phases of packet processing.
For example, every component has its own queues and thread pool, and every plugin has its own queues and thread pool as well.
While you can switch off priority queues, so that all packets are kept in a single queue, they are still processed independently by different plugins, and each plugin has its own threads. Therefore "fast" packets may sometimes overtake "slow" packets.

justin

Is it possible to have multiple plugins for the same stanza type? Or do plugins only match on stanza type, and there is a limit of 1 per type?

I am just concerned that you could have message delivery order problems, if for example there were two plugins processing message stanzas, but matching on different sub-content of the messages.

There is no such limit. You can have many plugins processing the exact same stanza and even matching the exact same sub-content of the stanza. The best example of this is the presence stanza. It is processed by the roster-presence plugin and the off-line messages plugin.
This is why it is so important not to modify the packet itself while it is being processed, because it can be processed in other places at the same time.

justin

A simple (but not necessarily performant) solution would be to have a single plugin then, for processing all stanza types. It could be made optional, and would suffice as a stop-gap.

I am disappointed that there is not an easy way to make this work, nor do you seem much concerned about it. In-order delivery is an important feature of XMPP, and offering this slightly-degraded experience will cause developers to not be able to depend on it, which hurts the protocol.

Since I care about how XMPP (and, by extension, tigase) is viewed, I'm going to work around this issue in my component code so that the problem is no longer perceivable. I'm doing this by collecting stanzas in a queue with the assumption that they are probably out of order, then sort and process them in order. This way there is no change to protocol or client code needed.
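
The post does not say how the collected stanzas are sorted, so the following is only one possible shape of such a workaround. It assumes the component's only ordering rule is "presence before iq and message from the same sender"; the `RANK` table and the `reorder` function are hypothetical.

```python
# Assumed rule for this sketch: presence must be handled before iq and
# message stanzas collected in the same batch. The rank table is hypothetical.
RANK = {"presence": 0, "iq": 1, "message": 1}

def reorder(batch):
    """Return the batch with presence first.

    sorted() is stable, so stanzas of equal rank keep their arrival order.
    """
    return sorted(batch, key=lambda stanza: RANK[stanza[0]])

received = [("iq", "get-data"), ("presence", "available"), ("iq", "get-more")]
print(reorder(received))
```

A real component would also have to decide how long to collect stanzas before sorting them, which trades latency against the chance of missing a late arrival.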

Justin, I really appreciate your comments and your input, even though I do not fully agree with you :-).

  1. Your solution, a single plugin for all stanzas, would not help at all. Each plugin can produce results (stanzas) on its own. Therefore having one central plugin processing all stanzas does not force packets to stay in order.
  2. There is an "easy way" to make it work but it will hurt the server performance. So, actually, there is no easy way (or I just have not found one yet) to make it work and still have a high-performance server under very high load. Any suggestions are very much appreciated.
  3. I am not, indeed, very concerned about it because, in my opinion, this is not a big issue (though it still matters to me). Messages are delivered in order; presence and roster requests are delivered in order. Stanzas of any one type are delivered in order among themselves. In most cases a presence overtaking a message does not matter that much, nor does an IQ packet overtaking a presence. This is especially so because IQ works as a request (get or set) followed by a response (result), so the developer knows when the IQ has been delivered and processed and can synchronise other packets and actions on it. Even though delivering all stanzas in exact order is not my highest priority, I am still looking for a way to implement it. However, it must not hurt the overall performance of the server.
  4. Any ideas or suggestions are really welcome...

justin

If stanzas are guaranteed to be delivered in order, per-plugin (as you say, message stanzas alone are delivered in order just fine), then why is a single plugin for all stanza types not a solution? You say "each plugin can produce results (stanzas) on its own" as a reason that order cannot be maintained when there is one plugin. Okay, but why does this problem not cause stanzas (for example, messages) to be delivered out of order when there are three plugins?

Maybe I misunderstood you about the single plugin approach. If you had only one plugin, which handles all stanzas, then yes, this would keep all stanzas in order. This is possible to do, and it is actually the solution I had in mind when saying "there is an easy way". You could create and load a single plugin which would call all the other plugins to actually process the stanza. I think, however, that this might significantly affect performance.

I thought that maybe you were suggesting adding one extra plugin alongside all the existing plugins, which would handle all stanzas but simply do nothing. In that case it would not prevent stanza reordering.

justin

Okay good, now I don't feel crazy.

Obviously a fix that performs well would be preferred.

Here is something to think about: in-order delivery only needs to be maintained between each send/recv JID pair. This means that it may be possible for Tigase to have even more parallelism than it does now, because it might maintain lots of stanza ordering when it really doesn't need to. For example if I send a message to two different users in a row, it is not important that a message is routed to the first user before it is routed to the second user.
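
One way to picture this idea is to shard work by JID pair, so that each pair is always handled by the same single-threaded worker. This is a minimal sketch under stated assumptions, not a proposal for Tigase's internals: `NUM_WORKERS`, `worker_for`, `route` and `deliver` are hypothetical names, and the JIDs are placeholders.

```python
import threading
import zlib
from concurrent.futures import ThreadPoolExecutor

NUM_WORKERS = 4  # hypothetical number of shards

# Each worker is a single-threaded executor, so tasks submitted to the same
# worker run strictly in submission order.
workers = [ThreadPoolExecutor(max_workers=1) for _ in range(NUM_WORKERS)]

delivered = {}            # (from, to) -> list of bodies in delivery order
lock = threading.Lock()

def worker_for(from_jid, to_jid):
    # A stable hash keeps every stanza of one JID pair on the same worker.
    key = f"{from_jid}|{to_jid}".encode()
    return workers[zlib.crc32(key) % NUM_WORKERS]

def deliver(from_jid, to_jid, body):
    with lock:
        delivered.setdefault((from_jid, to_jid), []).append(body)

def route(from_jid, to_jid, body):
    worker_for(from_jid, to_jid).submit(deliver, from_jid, to_jid, body)

route("a@example.com", "comp.example.com", "presence")
route("a@example.com", "comp.example.com", "iq")
route("b@example.com", "comp.example.com", "message")

for w in workers:
    w.shutdown(wait=True)

print(delivered)
```

Order is preserved within each pair, while different pairs may run in parallel on different workers. A slow stanza still delays everything behind it on the same shard, which is the performance concern raised earlier in the thread.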