Hi,
I need help understanding HornetQ client fail-over (standalone client and standalone server).
I would like a setup with three HornetQ servers in a cluster, without backup servers or replication - just a simple configuration: when one server is dead, the client should stop using it and simply use the other servers in the cluster that are still running.
I've tried an example that comes with HornetQ: jms/client-side-load-balancing. It starts three servers: 0, 1 and 2. All of them are in the same cluster/group.
The test client obtains the ConnectionFactory through a JNDI lookup. The JNDI configuration contains only the address of Server 0: java.naming.provider.url=jnp://localhost:1099.
Messages sent from the test client are load balanced round-robin across all three servers - this is the expected behavior, because the ConnectionFactory is cluster aware.
I've also configured client fail-over without backup servers, using only this configuration:
<discovery-group-ref discovery-group-name="jms-discovery-group" />
<retry-interval>1000</retry-interval>
<retry-interval-multiplier>1.0</retry-interval-multiplier>
<reconnect-attempts>-1</reconnect-attempts>
<failover-on-server-shutdown>true</failover-on-server-shutdown>
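As I understand these settings, the delay before a reconnect attempt starts at retry-interval and is multiplied by retry-interval-multiplier after each attempt, and reconnect-attempts=-1 means the client retries forever. Below is a minimal sketch of that arithmetic only. The class and method names are mine and not part of HornetQ, and it ignores any cap on the interval.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a HornetQ class): models the delays implied by
// retry-interval and retry-interval-multiplier as described above.
public class RetrySchedule {

    // Returns the delay in milliseconds before each of the first `count` attempts.
    static List<Long> delays(long retryIntervalMs, double multiplier, int count) {
        List<Long> result = new ArrayList<>();
        double delay = retryIntervalMs;
        for (int i = 0; i < count; i++) {
            result.add(Math.round(delay));
            delay *= multiplier; // with 1.0 the delay never grows
        }
        return result;
    }

    public static void main(String[] args) {
        // Values from my configuration: 1000 ms interval, multiplier 1.0.
        System.out.println(delays(1000, 1.0, 5));
    }
}
```

With my values this gives a constant 1000 ms between attempts, which matches what I see: the client keeps retrying at a fixed pace and never gives up.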
Now my problem: when I kill Server 1 or Server 2, everything works just fine - the test client does not send messages to those servers.
When I kill Server 0 (configured in JNDI properties), the test client stops sending messages - it tries to reconnect to Server 0. I would expect it to use Server 1 and Server 2.
After some time the test client gets an exception:
Caused by: javax.jms.JMSException: Timed out waiting for response when sending packet 45
at org.hornetq.core.protocol.core.impl.ChannelImpl.sendBlocking(ChannelImpl.java:277)
at org.hornetq.core.client.impl.ClientSessionImpl.queueQuery(ClientSessionImpl.java:350)
at org.hornetq.core.client.impl.DelegatingSession.queueQuery(DelegatingSession.java:436)
at org.hornetq.jms.client.HornetQSession.lookupQueue(HornetQSession.java:1019)
at org.hornetq.jms.client.HornetQSession.createQueue(HornetQSession.java:390)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.springframework.jms.connection.CachingConnectionFactory$CachedSessionInvocationHandler.invoke(CachingConnectionFactory.java:344)
at $Proxy20.createQueue(Unknown Source)
at org.springframework.jms.support.destination.DynamicDestinationResolver.resolveQueue(DynamicDestinationResolver.java:101)
at org.springframework.jms.support.destination.DynamicDestinationResolver.resolveDestinationName(DynamicDestinationResolver.java:66)
at org.springframework.jms.support.destination.JmsDestinationAccessor.resolveDestinationName(JmsDestinationAccessor.java:100)
at org.springframework.jms.core.JmsTemplate.access$2(JmsTemplate.java:1)
at org.springframework.jms.core.JmsTemplate$4.doInJms(JmsTemplate.java:545)
at org.springframework.jms.core.JmsTemplate.execute(JmsTemplate.java:466)
... 31 more
Caused by: HornetQException[errorCode=3 message=Timed out waiting for response when sending packet 45]
The solution I found for this problem is to configure all three servers in the JNDI properties: java.naming.provider.url=jnp://localhost:1099;jnp://localhost:1199;jnp://localhost:1299. With this configuration I can kill any server and the test client uses only the servers that are still running. However, this suggests that I would need HA JNDI to solve my problem, or would have to configure the JNDI properties so that they always contain all cluster members. On the other hand, the ConnectionFactory should be cluster aware and should automatically disable "killed" servers.
Question: why does the client try to reconnect to the server from the JNDI config instead of using the servers in the cluster that are still running?
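For reference, this is roughly how the multi-server JNDI properties can be built in code. It is only a sketch: the class and method names are mine, the separator is the one I used in my properties file above, and the java.naming.factory.initial value is my assumption about the usual JNP context factory, so please check it against your own jndi.properties.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical helper (not part of HornetQ): builds JNDI properties that
// list every cluster member in java.naming.provider.url.
public class ClusterJndiProperties {

    static Properties build(List<String> hostPorts) {
        StringBuilder urls = new StringBuilder();
        for (String hostPort : hostPorts) {
            if (urls.length() > 0) {
                urls.append(';'); // same separator as in my properties file
            }
            urls.append("jnp://").append(hostPort);
        }
        Properties props = new Properties();
        // Assumption: the standard JNP initial context factory.
        props.setProperty("java.naming.factory.initial",
                "org.jnp.interfaces.NamingContextFactory");
        props.setProperty("java.naming.provider.url", urls.toString());
        return props;
    }

    public static void main(String[] args) {
        Properties props = build(Arrays.asList(
                "localhost:1099", "localhost:1199", "localhost:1299"));
        System.out.println(props.getProperty("java.naming.provider.url"));
    }
}
```

The resulting Properties object could then be passed to new InitialContext(props) before looking up the ConnectionFactory. The drawback is the one described above: the list has to be kept in sync with the cluster membership by hand.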