s2s and load balancing

Anonymous
Hi,

I'm planning to put a TCP load balancer in front of multiple Tigase instances. Will this work with random/weighted balancing? In particular, I wonder how dialback would work if one Tigase instance connects outbound and presents a dialback key, but the inbound dialback verification connection is then routed to a different Tigase instance that is unaware of this key.

It would be good to have some explanation of how Tigase s2s works in a load-balanced setup and what the recommended configuration is. Thanks!

The s2s implementation in Tigase is designed to handle exactly the case you describe.
Tigase cluster nodes can exchange s2s key and session ID information, which allows the outbound s2s connection to be made from a different cluster node than the one that receives the inbound connection.
It has been deployed on many installations and appears to work without problems.

justin

Great, I've set up two nodes behind a load balancer and it seems to work.

I am using Tigase for s2s only, and I have a feature request that might make Tigase even better for my use case: what about being able to configure a dialback secret in Tigase, so that any node can service a dialback verification request without the nodes needing to communicate among themselves? This could be used with cluster mode disabled. Then I could have a limitless array of Tigase s2s nodes, and adding or removing nodes would not require updating the configuration of all the others.

A shared, predefined key would indeed be convenient, but very insecure. The whole dialback concept fails then.
However, I might have good news for you. You do not have to modify the configuration of all nodes when you add a new node. If you want to connect a new node to a running cluster, you need to set the following parameters on it:

--cluster-mode = true
--cluster-nodes = list of cluster nodes
--cluster-connect-all = true

The last parameter is the important one, and you only need it on the new node you are adding to the cluster.

For the next Tigase version we plan to remove the need for the first two parameters altogether. Once Tigase is in cluster mode it will be able to auto-configure itself and find all active cluster nodes.

justin

That's great news.

Regarding a predefined key: this should not be insecure as long as it stays secret. The actual key sent over XMPP could be a hash of the secret key + remote domain.
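
For illustration, a shared-secret scheme of this kind could look like the sketch below. It follows the general recommendation of XEP-0185 (HMAC-SHA256 over the receiving domain, the originating domain and the stream ID, keyed with a hash of the secret). It is not Tigase code; the function names are hypothetical, and using the hex digest of the secret as the HMAC key is an assumption of this sketch.

```python
import hashlib
import hmac


def dialback_key(secret: str, receiving: str, originating: str, stream_id: str) -> str:
    """Derive a dialback key from a shared secret (hypothetical helper)."""
    # Assumption: the HMAC key is the hex SHA-256 digest of the shared secret.
    hashed_secret = hashlib.sha256(secret.encode("utf-8")).hexdigest().encode("ascii")
    # Binding the key to both domains and the stream ID makes it differ per connection.
    message = f"{receiving} {originating} {stream_id}".encode("utf-8")
    return hmac.new(hashed_secret, message, hashlib.sha256).hexdigest()


def verify_dialback_key(secret: str, receiving: str, originating: str,
                        stream_id: str, presented_key: str) -> bool:
    """Recompute the key and compare in constant time; needs no stored state."""
    expected = dialback_key(secret, receiving, originating, stream_id)
    return hmac.compare_digest(expected.encode("utf-8"), presented_key.encode("utf-8"))
```

Any node that knows the secret can recompute the key from the values in the verification request, so no per-connection state has to be shared between nodes.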

Yes, but once an intruder learns the hash, he has no problem pretending to be the service provider for the domain example.com, and Tigase has no other way to validate it.
Anybody from anywhere can then connect claiming to be example.com and present the hash as the key. The other side's server connects back to example.com and of course gets confirmation that the key is valid. Now the attacker can send you packets and you believe they are from example.com.

justin

Indeed, there seems to be some replay attack potential with dialback keys. Isn't this already the case though? How does Tigase today know that an incoming verification request is not replayed? Maybe some expiration time information is encoded? Or the key may only be verified once?

A different key is generated for each connection. As far as I remember, it is based on the session ID, which is newly generated for each connection and is supposed to be unique.
Tigase also keeps keys separately for each domain, so a key generated for one domain cannot be used for a different one.
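
As a rough model of that behaviour (not Tigase's actual implementation; the class and method names are hypothetical, and the random key is a simplification of whatever Tigase derives from the session ID), a per-connection, per-domain key store might look like this:

```python
import secrets


class DialbackKeyStore:
    """Toy in-memory store holding one key per (domain, session ID) pair."""

    def __init__(self) -> None:
        self._keys: dict[tuple[str, str], str] = {}

    def issue(self, domain: str, session_id: str) -> str:
        # Assumption: a fresh random key per connection stands in for the real derivation.
        key = secrets.token_hex(16)
        self._keys[(domain, session_id)] = key
        return key

    def is_valid(self, domain: str, session_id: str, key: str) -> bool:
        # A key is only accepted for the exact domain and session it was issued for.
        expected = self._keys.get((domain, session_id))
        return expected is not None and secrets.compare_digest(expected, key)
```

A key captured from one connection fails validation when it is presented for another domain or another session ID.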

justin

Hi, today I noticed an outbound stanza fail to deliver and this is what I found in the logs:


2012-05-11 19:15:28 Dialback.processDialback() FINER: The key is not available for connection CID: xxx, or the packet CID: xxx maybe it is located on a different node...
2012-05-11 19:15:28 S2SConnection.sendAllControlPackets() FINEST: Sending on connection: CID: xxx, null, type: accept, Socket: TLS: nullSocket[xxx] control packet: from=null, to=null, DATA=, SIZE=107, XMLNS=null, PRIORITY=NORMAL, PERMISSION=NONE, TYPE=invalid

I have two nodes, and I can see that the outbound s2s connection happens on one node, and then the inbound dialback verification connection happens on the other. The log snippet is from the node that got the dialback verify request. Tigase responds with invalid, causing the remote server to then respond with invalid to the first connection, and s2s fails.

Is there anything I should look at?

The only reason I can think of is that the nodes are not connected and/or cannot see each other. Are you sure they are connected and can communicate? Hostname settings are critical for clustering to work.

justin

Yes, I believe they are connected properly. At least, I don't see any messages in the logs about reconnection attempts. Usually I would see these if one of the nodes were down.

Hm, hard to tell from these two log lines. What version of Tigase do you use?
The most recent version is installed in production and is working OK.
If you enable debug logs for the cluster package, you should see entries from S2SConnectionClustered in your logs, in particular CheckDBKey and CheckDBKeyResult. Have you seen anything like this?
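
The failure discussed in this thread can be modelled with a small simulation. This is only a sketch of the idea of asking other cluster nodes for a key, not Tigase's S2SConnectionClustered code; the Node class, its fields and its methods are all hypothetical.

```python
class Node:
    """Toy cluster node that stores dialback keys for its own outbound connections."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.keys: dict[tuple[str, str], str] = {}
        self.peers: list["Node"] = []  # other cluster nodes this node can reach

    def open_outbound(self, domain: str, session_id: str, key: str) -> None:
        # Remember the key this node presented on its outbound connection.
        self.keys[(domain, session_id)] = key

    def verify(self, domain: str, session_id: str, key: str) -> str:
        # Check the local store first, then ask every reachable peer.
        for node in [self, *self.peers]:
            if node.keys.get((domain, session_id)) == key:
                return "valid"
        return "invalid"


node_a, node_b = Node("a"), Node("b")
node_a.open_outbound("example.com", "sess-1", "k3y")

# The load balancer routes the verification to node_b, which cannot see node_a yet.
isolated_result = node_b.verify("example.com", "sess-1", "k3y")

# Once the nodes are connected, node_b can find the key held by node_a.
node_b.peers.append(node_a)
clustered_result = node_b.verify("example.com", "sess-1", "k3y")
```

In this model, a verification request that lands on a node with no working link to the node holding the key is answered with invalid, which matches the symptom in the log excerpt above.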