OpenFire Hazelcast can't handle fast restart of node, but ok if slow

Question asked by Nathan Neulinger on Aug 12, 2016
Latest reply on Nov 2, 2016 by Nathan Neulinger

I'm sharing this in the hope of getting an explanation, and to help other users who see the same behavior.

Setup: Three Openfire 4.0.2 nodes with Hazelcast 2.2.0; the backend is a three-node Percona XtraDB MySQL cluster.

The front-end load balancer is currently an F5 LTM. Each server is configured for unicast with a static list of member nodes.

If I do a fast restart of any one of the nodes (i.e. stop it and immediately start it again), it comes up but is unable to properly join the cluster. In fact, it ends up claiming that clustering is disabled.

If, on the other hand, I do the restart slowly (stop, wait a minute, then start; I _suspect_ the actual cutoff is about 30 seconds), all appears to be good.
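
For anyone scripting the slow restart in the meantime, here is a minimal Python sketch of the stop / wait / start sequence described above. The `systemctl` commands, the `openfire` service name, and the 60-second default are assumptions for illustration only; the thread does not establish why the delay helps or what the real cutoff is.

```python
import subprocess
import time

# Hypothetical service commands: adjust for your own init system and service name.
STOP_CMD = ["systemctl", "stop", "openfire"]
START_CMD = ["systemctl", "start", "openfire"]


def run(cmd):
    """Run a command and raise if it exits with a non-zero status."""
    subprocess.run(cmd, check=True)


def slow_restart(stop, start, wait_seconds=60, sleep=time.sleep):
    """Stop the node, pause, then start it again.

    wait_seconds defaults to the one minute used in the post; the suspected
    30-second cutoff is unconfirmed, so the longer value is the safer guess.
    """
    stop()
    sleep(wait_seconds)
    start()
```

On a node, this would be invoked as `slow_restart(lambda: run(STOP_CMD), lambda: run(START_CMD))`. The stop and start steps are passed in as callables so the wait logic can be tried out without touching a real service.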

Is this just expected behavior, or is it something that is tuned incorrectly on my side?
