Tigase Load Tests again - 500k user connections

For the last couple of weeks I have had the opportunity and pleasure to use Sun's environment, both hardware and software, to run load tests on the Tigase server. Sun also offered me something more: their knowledge and help during the tests allowed me to improve Tigase and fine-tune the system to get the best results. I would like to say a big thank you to Deepak Dhanukodi of ISV Engineering at Sun, who was a huge help.

Below is a description of the tests I ran, the environment, and the results I got.

Summary

I know a summary should come at the end, but I realize that many people are interested in the results without wanting all the details. So here we go.

 

  • The main goal for all these tests was to run the Tigase server under the load of 500k active user connections with an empty user roster. This test was meant to show how well the Tigase server handles a huge number of I/O operations and a huge number of user sessions on a single machine.

    Success! Tigase easily handled the load with CPU usage below 30% and memory consumption at about 50%. The test was so successful that we tried to run another similar test to reach 1 million online users. This however failed because the client machines couldn't generate such a load.

  • The secondary goal was to run comparison tests with different user roster sizes and a user connection count above 100k, to see how the roster size impacts the load and resource consumption.

    This test wasn't about maximum scores, but I still think it was a great success. At a roster size of 40 and above the Tigase server started to behave unstably. Long GC activity impacted overall performance and in some cases led to an unresponsive service. More details below. I learnt not only that the default GC is not a good choice for the Tigase server under high load, but I also found the best GC and GC parameters to get a stable service under an even higher load than I had planned. The CMS GC is the one which should be used to run Tigase.

  • Max connections with a roster of 50 elements was the last test I wanted to run. In most XMPP installations I have helped to set up, the average roster size was just below 50 elements. So the goal for this test was to see how many connections Tigase can handle with such a roster.

    The result, 300k user connections with a roster size of 50, is quite good. CPU usage was below 50% and memory consumption below 60%. We could certainly try to handle more connections. Unfortunately I never expected that the system could handle more than 300k user connections with a 50-element roster, so this is all I had prepared in the database for the test.

 

Testing environment

I had 12 machines to run my tests: one for the Tigase server, a second for the database, and 10 more to generate the client load:

  1. Tigase server: SPARC Enterprise T5220, 32GB RAM, CPU - UltraSPARC-T2 with 8 cores and 8 threads on each core, which gives 64 processing units, CPU clock speed - 1165MHz, 146GB 10k HDD SAS and SCSI.
  2. Database server: Sun Fire X4600, 32GB RAM, CPU - 2xAMD Opteron 854 with 4 cores each, which gives 8 processing units, CPU clock speed - 2.8GHz, 73GB 10k SAS HDD.
  3. Client machines (10x): Sun Fire V20z, 4GB RAM, CPU - AMD Opteron Dual Core 2.1GHz, 36GB 10k SCSI HDD.

Software used:

  1. Tigase XMPP Server 4.1.5 as XMPP (Jabber) server.
  2. Tsung 1.3.0 as the client load generator.
  3. MySQL 5.1.33 Community Server as the database, together with its configuration file.
  4. Solaris 10 Update 6 as the OS on the server, Solaris Express Community Edition snv_110 X86 as the OS on the load generators.

 

Test types

There were 2 main types of tests I ran:

  1. Standard test: the user session was about 20 minutes long and the arrival duration was 60 minutes. This test was mainly meant to compare the server behavior with different user roster sizes. The maximum number of user connections was tuned by adjusting the connections rate. This was however limited by the database, which couldn't handle the load generated by a connections rate faster than one new connection every 0.0045 sec (roughly 222 new connections per second).
  2. Extended test: a script similar to the standard one, but with the user session time extended by putting the script body in a loop. This was done to reach the maximum possible number of user connections in the test and to see how Tigase handles that.
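
As a back-of-the-envelope illustration of how the connections rate, session length and arrival duration together bound the number of simultaneous connections, here is a minimal sketch. It assumes that the "connections rate" is the interval between new users, that users arrive at a constant rate, that every session lasts exactly its nominal time, and that the extended test used the same 60-minute arrival duration. The function name and the model itself are illustrative assumptions, not part of Tsung or Tigase, and real runs can deviate in either direction, for instance when sessions last longer than the nominal time or the load generators saturate.

```python
def peak_connections(interarrival_s: float, session_s: float, arrival_s: float) -> float:
    """Rough estimate of the peak number of simultaneous connections.

    Simplified model (an assumption, not Tsung's actual behavior): users
    arrive at a constant rate for arrival_s seconds and each one stays
    connected for exactly session_s seconds.
    """
    arrivals_per_second = 1.0 / interarrival_s
    # Concurrency stops growing when either the first users start leaving
    # (after session_s) or no new users arrive any more (after arrival_s).
    ramp_up_s = min(session_s, arrival_s)
    return arrivals_per_second * ramp_up_s


# Standard test: 20 min sessions, 60 min arrival phase, one user every 0.0045 s.
standard = peak_connections(0.0045, 20 * 60, 60 * 60)
# Extended test: 80 min sessions, one user every 0.005 s (arrival phase assumed 60 min).
extended = peak_connections(0.005, 80 * 60, 60 * 60)

print(f"standard: ~{standard:,.0f}")
print(f"extended: ~{extended:,.0f}")
```

In this model, a shorter interval between arrivals raises the peak in the standard test, while in the extended test the arrival duration rather than the session length becomes the limiting factor.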

Tigase setup

Here is a complete description of the Tigase installation, which was fine-tuned to get maximum performance during all tests. Please note that I am not a MySQL database expert and I couldn't get it working fast enough to not impact performance. The system was therefore configured to avoid any writes to the database during the test.

The complete JVM parameters for the tests are:

-XX:+UseLargePages -XX:+UseBiasedLocking 
-XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode 
-XX:ParallelCMSThreads=8 -Dfile.encoding=UTF-8 
-Dsun.jnu.encoding=UTF-8 -Djdbc.drivers=com.mysql.jdbc.Driver
-server -d64 -Xms28G -Xmx28G -XX:PermSize=64m 
-XX:MaxPermSize=512m

The Tigase server parameters:

--property-file etc/init.properties --test

The '--test' parameter only excludes the offline messages plugin from the installation and decreases the default logging level. This is done to avoid slowing the server down with unnecessary I/O operations. During the tests Tsung sends lots of messages to online users. In the second phase it happens quite often that a message sent to an online user is processed after the user has already gone, and then it goes to the database for offline storage. This introduced long delays. Heavy logging also introduces significant delays and impacts overall performance, so it is set to the absolute minimum during tests.

The Tigase server configuration properties:

config-type=--gen-config-def
--admins=admin@tigase.test
--virt-hosts = tigase.test
--auth-db=tigase-auth
--user-db=mysql
--user-db-uri=jdbc:mysql://192.168.111.32/tigasedb_20roster?user=tigase_user&password=tigase_passwd
--user-repo-pool-size=12
--comp-name-1=srecv
--comp-class-1=tigase.server.sreceiver.StanzaReceiver
#--debug=server
--monitoring=jmx:10000

A few notes to the parameters:

  1. 'tigase-auth' was used as the authentication connector. It uses stored procedures to perform all user login/logout actions. Normally these procedures also update the last login/logout time. For this test, however, updating the user login/logout time was removed from the stored procedures to minimize database delays.
  2. A different database was used depending on the roster size.
  3. A database connection pool of 12 was used for the user data database. There was only a single database connection for the user authentication connector.
  4. StanzaReceiver was loaded to run Tigase internal monitoring tools which detect system overload, thread deadlocks and other possible problems.
  5. Monitoring via JMX was enabled and the system was also monitored using JConsole.

 

The user roster

The user roster was either empty or had a fixed size, the same for all users. It was built in such a way that exactly half of the buddies were always online and the other half offline when the user was logging in. Later on the rest of the buddies logged in too, so eventually all buddies in the roster were online for the rest of the test.

Tests and test results

Basic tests

| Name | Roster | Session length | Connections rate | Max connections | CPU usage | RAM usage | Tsung reports | Comments |
|------|--------|----------------|------------------|-----------------|-----------|-----------|---------------|----------|
| 500k | empty | 80 min | 0.005 sec | 622k | CPU | Memory | Tsung report | There was also an attempt to get to 1mln connections. This however failed due to limitations of the load generating machines, which were maxing out their resources above 500k connections. |
| 300k* | 50 | 20 min | 0.0045 sec* | 300k | CPU | Memory | Tsung report | The requirement was to keep the user session within 20 min, so to generate more connections the new connections rate had to be changed. Unfortunately the 0.0045 sec rate was the highest the database could handle, so 300k was the test limit or the database limit, not the Tigase server limit. |

* - the database limit.

Other tests

| No | Roster | Session length | Connections rate | Max connections | Tsung reports | Comments |
|----|--------|----------------|------------------|-----------------|---------------|----------|
| 1. | Empty | 20 min | 0.015 sec | >100k | Tsung report | Default GC. |
| 2. | 10 | 20 min | 0.015 sec | >100k | Tsung report | Default GC. |
| 3. | 20 | 20 min | 0.015 sec | >100k | Tsung report | Default GC. |
| 4. | 30 | 20 min | 0.015 sec | >100k | Tsung report | Default GC. |
| 5. | 40 | 20 min | 0.015 sec | >100k | Tsung report | Default GC. |
| 6. | 50 | 20 min | 0.015 sec | >100k | Tsung report | GC settings: -XX:+UseLargePages -XX:+UseBiasedLocking -XX:+UseParallelGC -XX:+UseParallelOldGC -XX:ParallelGCThreads=32. This didn't help much. At a certain load GC delays make Tigase unresponsive. |
| 7. | 50 | 20 min | 0.0045 sec | 299k | Tsung report | GC settings: -XX:+UseLargePages -XX:+UseBiasedLocking -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:ParallelCMSThreads=8. This is the secret formula. CMS GC is the one which works well with Tigase and offers stable service even under a very high load. |

Comments

Why such small rosters?

In real life users often have much, much bigger ones (hundreds or more in some cases). It would be interesting to see how Tigase behaves with those. It would also be useful to understand how Tigase handles a large number of subscriptions to PubSub nodes.

From my experience, in real-life installations the average roster size is about 40, in rare cases between 45 and 50.

I agree there are users who have much bigger rosters, even thousands of elements, but on the same installation most of the users have rosters smaller than 20 elements. As a result the average roster size normally stays below 50 elements.

The plan was to run 10 tests for roster sizes from 10 to 100 elements. Unfortunately time constraints didn't allow me to run all the tests I wanted to run. I promise to include all 10 tests in the next load tests.

As for the PubSub question: of course PubSub tests weren't included in the test, but what do you mean by "understand how Tigase handles large number of subscriptions"? And what do you mean by "large number"? What number of PubSub node subscriptions is large for you?

Well, regarding large rosters, I was mainly wondering how Tigase would behave.

With regard to PubSub, the subscription mechanism is not that different from a contact roster, but we've had issues with having to fetch the last published items and various other PubSub queries that end up as many SQL queries, which usually take a longer time to respond and may time out.

In other words the size and the complexity have an impact on the performance of our server (note that we are not using Tigase but we're reviewing it).

Cheers,

Testing the server with very large rosters would indeed be interesting. I am not sure whether testing 200 new connections per second with a roster of 50 each is any different from testing a single new user per second with a roster of 5k. The test needs to be run to see how it works. If you would like to run such tests I would be more than happy to help you.

There is certainly a limit to how many packets the server can process. The limit is set by memory and CPU. You simply can't load more roster items than fit in memory, and you can only process a certain number of packets per second per CPU (core) on your machine. To handle 100k users with a 1k roster each you need a machine with a lot of RAM, and if the new connections rate is too high the server will not be able to handle the traffic. Look: with 1k rosters, a connection rate of 100 new users per second leaves you with 100k presence packets per second alone.
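
The arithmetic behind that last figure can be sketched as follows. It assumes, as a deliberate simplification, that each login triggers exactly one presence packet per roster entry and ignores probes and replies flowing the other way, so real traffic is likely higher. The function name is illustrative only.

```python
def presence_packets_per_second(logins_per_second: int, roster_size: int) -> int:
    """Presence packets generated per second by new logins alone.

    Simplifying assumption: one presence packet per roster entry per login.
    """
    return logins_per_second * roster_size


# 100 new users per second with a 1k roster each (the example above).
print(presence_packets_per_second(100, 1_000))
# The two scenarios compared earlier in this reply:
print(presence_packets_per_second(200, 50))
print(presence_packets_per_second(1, 5_000))
```

Under this simplification the two scenarios from the earlier paragraph land in the same order of magnitude, which is one reason only an actual test can tell how differently they behave.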

Tigase has some mechanisms to prevent overloading the server if there is more load than it can process. In the worst-case scenario it just drops packets, usually presence packets, which is better than an overloaded and broken server. But this is what load tests are for: you learn how much the server can handle, and you should not put more on it.

There is at least one Tigase deployment where most users have rosters of between 1k and 20k entries. It works well, however the number of concurrent users is not too high.

The subscription concept in PubSub is indeed very similar to the roster concept. There is even XEP-0207 (humorous) describing a roster implementation based on PubSub. In Tigase, however, the PubSub implementation is completely separate and different, and so are the developers. The Tigase PubSub is optimized for a large number of nodes and subscriptions. We are also now in the process of implementing a new DB schema for Tigase PubSub to boost performance even more for large PubSub installations.

If you need any help with testing or evaluating the Tigase server, or if you have any questions please don't hesitate to contact me.

Thanks for the feedback. I'll start reviewing tigase and will keep you posted :)

Are you using several IP addresses? An IP address has only 64k ports, and I don't see another way to get that many active TCP connections on a single machine.

On the server side you use just one port number; you don't need any more. Of course each TCP/IP connection requires one or 2 file descriptors, but that is a different issue.
The problem is of course with generating the load. On the client side you do indeed need many ports to open that many connections to the server. I had 10 machines to generate the load and I could generate up to 50k connections from each client machine.
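
To make the port arithmetic concrete, here is a minimal sketch. A TCP connection is identified by the tuple (source IP, source port, destination IP, destination port), so with a single server address and port each client address can contribute at most one connection per source port. The function name and the choice to count all 65,535 non-zero ports are assumptions for illustration; the usable ephemeral port range on a real OS is configurable and usually smaller.

```python
MAX_TCP_PORT = 65_535  # ports are 16-bit numbers; port 0 is reserved


def max_connections(client_ips: int, ports_per_ip: int = MAX_TCP_PORT) -> int:
    """Upper bound on connections from client_ips addresses to ONE server ip:port."""
    return client_ips * ports_per_ip


# A single client address tops out at about 64k connections to one server port...
print(max_connections(1))
# ...while ten load generators at 50k connections each add up to 500k.
print(max_connections(10, 50_000))
```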

Re: large PubSub capacity: I'm interested in support for more than 500,000 subscribers to any node, and I need presence info for each too. I'm also interested in any option for clustering or federating PubSub nodes to achieve many millions of subscribers to a given node. *Note: for my current project I'm more interested in millions of publishers and a few subscriber listeners. This (and other) cases don't seem to be nicely dealt with by the PubSub spec though. Can a server opt NOT to send notifications to publishers?

Our PubSub can work in 2 modes. The first uses the Tigase server's general-purpose data storage, UserRepository, in which case performance for large PubSub deployments is quite limited. The second uses its own dedicated database schema.
The second mode was introduced just recently for large deployments, with performance in mind. It has been tested for an installation with 1mln nodes and 1k subscriptions for each node, and for 100 nodes each with 1mln subscriptions.
The tests passed successfully and performance was very good.

Our PubSub hasn't been tested yet in the exact conditions you described in your post, but I would be happy to assist you with such tests if you want to run them yourself. It would certainly be helpful if you gave us more details about your system. 'Many millions of subscribers': how many millions? How many nodes with that many subscribers?

Our current clustering code for PubSub is not working very well and we are going to re-implement it very soon.

Does the second mode you mentioned in your last post refer to the code in tigase-pubsub?
For the retrieve subscriptions (XEP-0060:5.6) and retrieve affiliations (XEP-0060:5.7) operations, the implementation seems to have to query all the nodes one by one, as in RetrieveSubscriptionsModule.java and RetrieveAffiliationsModule.java. If that is the case, I am afraid the performance of those two operations might not be very good in the test case with 1mln nodes. Is there any way to optimize that?

Hi Dayu,
The second mode refers only to the DB schema used by the Tigase PubSub; the code and the logic are exactly the same. The second mode uses a DB schema optimized for PubSub data and PubSub queries.
Of course retrieving subscriptions or affiliations for ALL nodes, which is actually retrieving ALL subscriptions or affiliations, may take some time on a large installation, but this is related to the DB volume rather than the implementation. Besides, I do not really see any sense in such a PubSub query if you have, let's say, 100mln subscriptions in total.

Have you used tsung to test the Multi-User-Chat for an XMPP server? I just found your website/xmpp server - wish I would have found it a year or so ago. We are very close to upgrading from jabber14 to ejabberd 2.x.x in a cluster and we are experiencing issues with the MUC testing.
Does Tigase support MUC (as a server, not joining another MUC environment hosted elsewhere)?

Unfortunately I haven't used tsung to test the Tigase MUC implementation, and to be honest Tigase's MUC is not extensively tested, although I know of a few quite big deployments.
Yes, Tigase does support MUC. There is a dedicated MUC implementation for the Tigase server.

Hi kobit,

>>--property-file etc/init.properties --test

Where should this line go? Can you please update the "Tigase Load Tests again - 500k user connections" article regarding the parameters to be used?

Thank you

You can set this line either in etc/tigase.conf (TIGASE_OPTIONS) or (if you start Tigase differently) pass it as a command line parameter.

oh, that has slipped me. Thanks

About "The complete JVM parameters for the tests are:" params - I have been trying various combos of UseCompressedOops, UseLargePages and UseBiasedLocking (some on, some off) but every time I try running tsung against the tsung-prepared tigase instance I run into

# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00002b104900f3fc, pid=18438, tid=1105111360
#
# JRE version: 6.0_25-b06
# Java VM: Java HotSpot(TM) 64-Bit Server VM (20.0-b11 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V [libjvm.so+0x4c93fc] void MarkSweep::adjust_pointer(unsigned*, bool)+0x1c
#

I have already updated java to latest in repo:

[root@XMPP tigase]# java -version
java version "1.6.0_22"
OpenJDK Runtime Environment (IcedTea6 1.10.6) (rhel-1.25.1.10.6.el5_8-x86_64)
OpenJDK 64-Bit Server VM (build 20.0-b11, mixed mode)

Any hints on this one?

PS. The Linux system runs on top of a virtual machine (KVM). Could that be a problem?

The problem is with the MarkSweep garbage collector, as indicated in the dump description. Unfortunately this is a commonly known problem with the JVM. Moreover, it happens on some Linux distributions and not on others. I guess this might be related to the Linux kernel used or to some specific settings.

To address this problem I suggest looking at our documentation related to it: JVM Crash
There are a few options provided to solve the issue.

Also, I would suggest using Sun's (Oracle) JVM, as it proves to be more reliable and offers better performance.

update - have switched from openJDK to latest JRE from SUN's website (1.6.0_31) and still the problem remains :|

allright, I'll do like that, thanks

Sun's JDK does indeed have the same problem.
We have run into this problem using Sun's JDK too, so have a look at the link: JVM Crash, as there are a few hints towards a solution.

Interesting tests. One thing to point out is the average response time shown in the Tsung graphs. In many of your tests, you have a max 10sec mean of several hundred seconds (400 seconds for your test that got several hundred thousand connections).

While I understand that this test is purely for number of connections, at a certain point those connections are useless because the user would have disconnected by this point.

I'm curious how many connections + roster connections you can get before the response time exceeds a "user acceptable value", say 1-10 seconds. I know there are a lot of variables there, but this one seems to be the most valuable in terms of real world usage.

This is a good point. The long response times on the Tsung side were related to the fact that the Tsung machines were overloaded long before we reached the load test target.
That's the main issue with the load tests we run. We would run more, and more comprehensive, tests but we do not have access to enough hardware on a daily basis. The problem is especially with the load generators, that is Tsung, which usually requires much more resources than Tigase.