Clustering with Tigase 4.2

Submitted by sukhi (not verified) on Wed, 2009-10-07 13:09

Hi Kobit,

Hope you are doing well!

I am using Tigase version 4.2 and trying to set up a cluster between two machines. I have a few questions about this and would appreciate it if you could answer them:

1.) In a cluster, is it mandatory to have the session manager on one machine and c2s, s2s on the other machine, or can I have --gen-config-all on both machines?

At the moment I have the same configuration on both machines (mc1 and mc2):
config-type = --gen-config-all
--cluster-mode=true
--cluster-nodes=mc1,mc2
--cluster-connect-all=true
Will it work? I get an exception when mc2 tries to connect to mc1 on port 5277, which I believe is the Cluster Component.

2.) I have also written a custom component. Do I have to do anything special to make it run in a cluster? Session Manager, C2S and S2S are the default ones.

Also, is there any example configuration which I can use to see it working?

Cheers

kobit's picture

Hi Xue, what you did here is

Hi Xue, what you did here is redirect all user packets to the session manager on the other cluster node.
While this certainly works, it is not a proper solution. As you pointed out, it gives you only one working session manager, which is a single point of failure and does not scale.
At least this test proves that both nodes are connected to each other and communicate properly. In that case the network configuration is most likely correct and is not the source of your problems.

In my opinion the most likely cause is that each node connects to a different database. As I can see from the init.properties file you posted some time ago, the database connection string points to 'localhost', so each node connects to a database on its own machine. Unless the database is clustered too, the nodes use separate databases, and this cannot work properly.
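
The 'localhost' problem described above can be spotted mechanically. The following is only a minimal sketch in Python, assuming init.properties uses simple "key = value" lines as in the file posted in this thread. The function names and the host db.internal.net are hypothetical and are not part of Tigase.

```python
from urllib.parse import urlsplit

# Host names that always mean "this machine".
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def read_properties(text):
    """Parse 'key = value' lines (simplified; comments start with '#')."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only: the JDBC URI contains '=' itself.
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def db_host(jdbc_uri):
    """Return the host part of a URI such as jdbc:mysql://host/db?..."""
    if jdbc_uri.startswith("jdbc:"):
        jdbc_uri = jdbc_uri[len("jdbc:"):]
    return urlsplit(jdbc_uri).hostname


def uses_local_database(properties_text):
    """True if --user-db-uri points at the machine the node runs on."""
    uri = read_properties(properties_text).get("--user-db-uri", "")
    return db_host(uri) in LOCAL_HOSTS
```

If this returns True for the init.properties of every node, each node is talking to its own local database unless that database is itself clustered.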

hi, I did connect to same

Hi, both nodes do connect to the same database.

I also added hosts entries to /etc/hosts on both servers, and each server can ping the other by hostname from the console.

I did an experiment with the routing config on the second node:

<node name="routings">
  <map>
    <entry value="sess-man%40host2" type="String" key=".+"/>
    <entry value="true" type="Boolean" key="multi-mode"/>
  </map>
</node>

If I change sess-man%40host2 to sess-man%40host1, it actually works as I want. But is this kind of setup a fully distributed cluster? It still requires a central Session Manager.

kobit's picture

At the moment there is no

At the moment no notification is sent to users when other users go offline because of a cluster node failure.

I am thinking however about a way to implement it.

Hi One more doubt here

Hi

One more question, this time about federation between two Tigase clusters.

Suppose our deployment consists of four Tigase servers/nodes (A, B, C, D). A and B form one cluster (Cluster X) and C and D form another (Cluster Y). Cluster X and Cluster Y are set up to federate with each other, so there is an s2s connection between X and Y.

The deployment contains 4 users: User A connected to A, User B connected to B, User C connected to C and User D connected to D. All the users are in each other's rosters.

Now, if one node from Cluster X, say Node A, goes down, what will happen to the presence of User A as seen by his buddies (User B, User C and User D)? Will he appear offline to User C and User D? Note that Cluster X is still functioning, since Node B is still up.

Thanks

kobit's picture

Hm, I am not sure if your

Hm, I am not sure your question is related to clustering at all. I think it is more about the s2s connection between two servers and what happens if one of the servers goes down. Please note that in some cases it is not even possible to determine that a server went down. Servers don't keep the s2s connection between them open all the time; the connection is dropped if it has not been used for a while. Hence, if one server goes down during that time, the other servers have no way of knowing until they attempt to open a new s2s connection.
What happens then, that is, whether the server that stayed alive notifies all its users about the other server's failure, depends on the implementation. Notifying all its users that some server went down and that buddies from that server are offline might be a very expensive task, and I don't think any implementation does that.

Of course, in a clustered environment the case is a bit different. If one node goes down, there are other nodes which know what happened and could in theory notify other servers that some users got disconnected. Tigase doesn't do that. Again, the task is very expensive in terms of the resources needed, and I think it is not even worth the effort. If one node goes down, other nodes are still alive, so in most cases users just reconnect to another node and are online again. This normally happens within seconds, so from the point of view of users on other servers it doesn't matter.

Hi, I want to understand the

Hi,

I want to understand the HA and load-balancing behavior of Tigase for server-to-server connections.

Suppose there is a multi-node (clustered) setup of Tigase doing XMPP federation.

Let's assume that there are two nodes in the Tigase setup, "A" and "B". Some users (UserA, UserB) are connected to node "A" and some users (UserC, UserD) to node "B".

There are federated connections to the remote server (another Tigase instance in clustering mode) from both node "A" and node "B".

UserA, UserB, UserC and UserD are in the buddy lists of the remote users (users on the other Tigase instance).

If node A goes down, will the buddies on the remote server see UserA and UserB as offline?

Thanks,

kobit's picture

I guess you are missing a

I guess you are missing a few things here.
First you should be aware that the default configuration is most likely not suitable for clustered deployment.
Secondly all your cluster nodes must connect to the same database in order to function properly. Ideally this is a good SQL database like MySQL or PostgreSQL.

Your problems with users not seeing each other may be caused by 2 things:
1. The cluster nodes are not connected, which is usually caused by incorrect network (DNS) configuration.
2. The cluster nodes use different databases.

Please check both points above and let me know whether this solves your problem. If not, I will try to help you further.

Hi, I have followed your

Hi, I have followed your suggestion and set up two nodes across two machines, A and B.

However, if User XXX logs in to A and User YYY logs in to B, XXX cannot see YYY's presence at all, and they cannot send messages to each other.

Also, the default generated config uses Derby, not MySQL, so I had to change it manually. The virtual domain name was not generated properly in tigase-server.xml either, so I had to add it manually.

Am I missing any important steps?

My init.properties is:
--cluster-mode = true
config-type = --gen-config-all
--cluster-nodes = XXX,YYY
--debug = server,xmpp.impl,db,cluster
--user-db = mysql
--admins = admin@XXX
--user-db-uri = jdbc:mysql://localhost/tigasedb?user=tigase&password=tigase12
--virt-hosts = XXX.abc.com
--comp-class-2 = tigase.pubsub.PubSubClusterComponent
--comp-name-2 = pubsub
--comp-class-1 = tigase.muc.MUCComponent
--comp-name-1 = muc
--sm-plugins = +jabber:iq:auth,+urn:ietf:params:xml:ns:xmpp-sasl,+urn:ietf:params:xml:ns:xmpp-bind,+urn:ietf:params:xml:ns:xmpp-session,+jabber:iq:register,+roster-presence,+jabber:iq:privacy,+jabber:iq:version,+http://jabber.org/protocol/stats,+starttls,+msgoffline,+vcard-temp,-http://jabber.org/protocol/commands,+jabber:iq:private,+urn:xmpp:ping,+basic-filter,+domain-filter,+pep,+zlib

kobit's picture

I am afraid I won't be able

I am afraid I won't be able to prepare "in-depth documentation" for Tigase clustering any time soon. You can have a look at 2 presentations I have prepared on Tigase clustering: the Tigase clustering presentation video and the Clustering in the Tigase server PDF. I hope you will find some useful information there. If not, please come back to me with more questions.

Your questions:

ad. 1. At the moment the Tigase server does not load-balance users itself. The simplest and cheapest way to load-balance users among cluster nodes is DNS round-robin. We use it on a few installations and it is quite effective. The more expensive and more complex (not sure if better) way is to use specialised hardware routers. I know that Cisco and a few others make such hardware. If you are interested I can point you to someone who knows more about it. We are also planning to add load-balancing inside the Tigase server at some point, but there is no timeline for that yet.

ad. 2. This depends... From version 4.3.1 Tigase supports pluggable clustering strategies, so what exactly happens depends on the clustering strategy used. The simplest, trivial strategy works as follows: if the XMPP packet cannot be processed on one cluster node (N1) because the user is not connected to that node, it is sent to another node (N2) for processing. If N2 cannot process the packet, it sends it to N3, and so on. Eventually either one of the nodes processes the packet, or the packet is sent back to the first node, where it is processed as a packet to an offline user.
This works quite well for a low number of cluster nodes, up to 3. For more nodes it is not an effective strategy, as it causes a lot of network traffic between nodes. Other strategies are possible, and some of them are dedicated to specific deployments.

ad. 3. If a node goes down, all users connected to that node get disconnected. The node is no longer part of the cluster, and the cluster behaves as if the node had never been part of it. Normally all these users either reconnect to a different cluster node or stay offline. Packets to these users are processed accordingly.

ad. 4. Yes. From Tigase version 4.3.1 there is a dedicated API in the server for pluggable clustering strategies. In fact, a strategy designed for a specific deployment can lead to huge performance improvements. So this is certainly feasible; however, I know from experience that there is no ultimate clustering implementation for all kinds of deployments. For large deployments you have to look at the traffic shape, the distribution of users and a few other parameters, and design the best custom strategy. Please note also that to get the best results you cannot have one universal strategy for all XMPP components (Session Manager, MUC, PubSub, ...). Each component needs dedicated strategy code to achieve optimal performance.
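
To make the difference between the trivial strategy from ad. 2 and the direct lookup asked about in question 4 concrete, here is a small simulation in Python. It is only an illustrative sketch: it does not use the Tigase clustering strategy API, and all function names and the sample users are made up for this example.

```python
def trivial_route(nodes, sessions, start, user):
    """Pass the packet from node to node until one holds the user's session.

    nodes    -- ordered list of node names, e.g. ["N1", "N2", "N3"]
    sessions -- dict mapping node name -> set of users connected there
    Returns (handling_node, hops, delivered).
    """
    first = nodes.index(start)
    hops = 0
    for step in range(len(nodes)):
        node = nodes[(first + step) % len(nodes)]
        if user in sessions.get(node, set()):
            return node, hops, True
        # Not here: forward to the next node. After the last node this
        # hop brings the packet back to the node where it started.
        hops += 1
    # Every node was tried; the first node treats the user as offline.
    return start, hops, False


def build_directory(sessions):
    """Invert node -> users into a user -> node lookup table."""
    return {user: node for node, users in sessions.items() for user in users}


def direct_route(directory, start, user):
    """Send the packet straight to the node recorded for the user."""
    node = directory.get(user)
    if node is None:
        return start, 0, False  # unknown user: handled locally as offline
    return node, (0 if node == start else 1), True
```

With three nodes and a user connected to N3, a packet entering at N1 takes two hops under the trivial strategy and one hop with the lookup table. The price of the lookup table is that every node has to keep the table up to date, which is the kind of trade-off a custom strategy would have to weigh.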

The clustering strategy API is certainly one of the significant additions. Apart from that, the clustering implementation for the session manager has been reworked to make it more robust and reliable when there are communication problems between cluster nodes.

I hope this explanation sheds some light on your questions.

Hi Artur, If you can really

Hi Artur,

Could you explain in detail how Tigase clustering works? I could not find any in-depth documentation about it. A few questions I have in mind right now:

1. How are clients load-balanced across the different nodes?
2. If there are 3 nodes, N1, N2 and N3, what approach is taken when a client connected to N1 sends a message to a client connected to N2?
3. If node N3 in the cluster goes down, how are the messages directed to clients connected to N3 handled?
4. Can we customize clustering to decrease latency and network traffic by implementing some kind of hashtable strategy, so that N1 talks directly to N3? I don't know how feasible this is.

Also, could you shed some light on the clustering changes you have made in Tigase 4.3 and how they improve the performance and robustness of clustering?

kobit's picture

Hi Sukhi, The network setup

Hi Sukhi,

The network setup required for Tigase clustering, and the DNS configuration in particular, is largely independent of the operating system, so it doesn't matter what system you use (although I strongly recommend Unix-like systems over MS Windows for this kind of software).

Anyway, back to your question.

There are 2 types of DNS names used in your XMPP cluster:

  1. Virtual host names: a virtual host name is a hostname (domain) visible to your users, like jabber.org, company.com and so on. Your whole cluster, regardless of how many nodes you have, is visible to users as one server working for this virtual domain. It is defined by the --virt-hosts property in the init.properties file, and you can have as many virtual hosts as you like for your XMPP installation. If you query DNS for your virtual domain, it should return the IP address of one of the cluster nodes. Every time you query DNS it may return a different IP address (the IP address of a different cluster node). This is your example.com.
  2. Real host names: these are names unique to each cluster node. They are not related to the virtual names and can have a form like node1.internal.net, node2.spare.internal.net and so on. These are your mc1 and mc2. They are normally not visible to your users and are only used internally by the Tigase cluster nodes. The important thing is that if you query DNS for the node hostname mc1, it must always return one and the same IP address, that of the proper cluster node. These names are what you put in the --cluster-nodes property in the init.properties file.

The simplest way to check whether your virtual names are configured properly is to ping your example.com and see whether it resolves to the IP of one of the cluster nodes. Ideally, every time you ping it, it should show the IP of a different cluster node.
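
The same check can be scripted instead of pinging by hand. The sketch below uses only Python's standard socket module; the function names, example.com and the 10.0.0.x addresses are placeholders chosen for illustration. It lists every IPv4 address DNS returns for the virtual host and compares that set with the addresses you expect for your cluster nodes.

```python
import socket


def resolve_all(hostname):
    """Return the sorted IPv4 addresses DNS reports for hostname."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET,
                               socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})


def check_virtual_host(virtual_host, node_ips, resolver=resolve_all):
    """Compare the addresses behind the virtual host with the node IPs."""
    resolved = set(resolver(virtual_host))
    expected = set(node_ips)
    return {
        # Addresses DNS returns that are not cluster nodes.
        "unknown": sorted(resolved - expected),
        # Cluster nodes that DNS never returns for the virtual host.
        "missing": sorted(expected - resolved),
        "ok": bool(resolved) and resolved <= expected,
    }
```

For example, check_virtual_host("example.com", ["10.0.0.1", "10.0.0.2"]) would report a node under "missing" if round-robin DNS never hands out its address. The resolver parameter exists only so the function can be tried without real DNS.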

The simplest way to check whether your real names are set correctly is to run the 'hostname' command on each node and see what it returns, and then ping the other node from each node to see whether it can be contacted successfully.

To check what Tigase itself thinks the real hostname is, look inside the tigase.xml file for the string 'def-hostname'. This key holds the default hostname detected on the system. See whether it is your mc1 or mc2.
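
The real-name checks can be sketched in Python as well. This is a rough illustration using the standard socket module, with a hypothetical function name; run it on each node with the same list you put in --cluster-nodes. It does not read tigase.xml, so the def-hostname value still has to be compared by hand.

```python
import socket


def check_cluster_names(cluster_nodes, local_name=None,
                        resolver=socket.gethostbyname):
    """Return a list of problems with the real host names (empty if none)."""
    # What the 'hostname' command would report on this machine.
    local_name = local_name or socket.gethostname()
    problems = []
    if local_name not in cluster_nodes:
        problems.append(
            f"local hostname {local_name!r} is not listed in --cluster-nodes")
    for node in cluster_nodes:
        try:
            resolver(node)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            problems.append(f"cannot resolve {node!r}: {exc}")
    return problems
```

Calling check_cluster_names(["mc1", "mc2"]) on mc1 and on mc2 should return an empty list on both machines. Note that some systems report a fully qualified name from 'hostname', in which case that longer name is what would have to match.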

Thanks for the prompt

Thanks for the prompt reply.

"... that DNS is set correctly your cluster nodes hostnames"

Could you please explain this network configuration a bit more? (Apologies if this is a stupid question, I am a total novice in this area.)

I have two machines, with hostnames mc1 and mc2 respectively. I will specify these hostnames for the --cluster-nodes key in init.properties.

Now I should have a DNS name, say example.com, and when a client connects through example.com, DNS should resolve it to mc1 or mc2. Correct?

So we don't have to configure this DNS on the Tigase side. Do you have any resources I can follow to create such a setup on a Windows XP machine?

I would appreciate your help.

Cheers

kobit's picture

First I would greatly

First, I would strongly recommend using the latest Tigase version, 4.3.1, as clustering has been significantly improved in this version.

ad. 1) Tigase supports full clustering for HA and LB, which means all cluster nodes can have an identical configuration, and when one node goes down, all other nodes carry on providing full functionality. Therefore it is recommended to use --gen-config-def or --gen-config-all on all nodes. Please note that not all Tigase components fully support clustering yet (MUC, PubSub). For such components I suggest using virtual components.

Your configuration is almost correct. You just have to remove the line:
--cluster-connect-all=true
It is not needed in your case.

Also please note that network configuration is critical for a Tigase cluster. You have to make sure that DNS is set up correctly for your cluster nodes' hostnames. You have mc1 and mc2. You have to make sure that the 'hostname' command on each node returns the name you listed for that node in --cluster-nodes, and that both names are resolvable by your DNS server.

ad. 2) Yes, you have to implement clustering for your component yourself, as clustering is component-specific. Each component does clustering in a different way. Alternatively, you can use the virtual components mentioned above to get your system working with components which don't work in cluster mode.

What configuration example are you asking for?
