I am trying to set up a resource (delegate) host with Zenoss 5. My master host has been set up for almost a year (upgraded to CC 1.2.1-1; Zenoss 5.2.1). I just built a delegate host using the instructions on Zenoss.com (CC 1.2.2-1), including setting all the SERVICED variables. I can add the delegate host to CC on my master host and the delegate is able to authenticate, but it's not listed as "active".
When I go to move a service over to the delegate host it doesn't work. The health check keeps failing. When viewing journalctl on the delegate there are no logs indicating a problem or the service even attempting to move to the delegate.
Any thoughts on what might be the cause?
I rebuilt my delegate host and tried adding it to Control Center, and I still ran into the same problem.
I rebuilt my master host and added the delegate host to the fresh install and ran into the same problem.
The only log message I see that might pertain to this would be this sequence:
serviced: time="2017-02-13T22:09:27Z" level=info msg="Received new authentication token" expiration=1487027367 location="token.go:55" logger=auth
serviced: time="2017-02-13T22:09:32Z" level=info msg="Determined pool assignment for this delegate" hostid=853e7002 location="daemon.go:814" logger=cli.api master=":4979" poolid=default
serviced: time="2017-02-13T22:09:38Z" level=info msg="Updated master with delegate host information" hostid=853e7002 location="daemon.go:826" logger=cli.api master=":4979" poolid=default
serviced: W0213 22:10:27.935340 11336 connection.go:257] timed out waiting for connection
I'm also uploading a picture of what I'm seeing in Control Center.
Both systems are on the same subnet. That being said, here's a port scan:
PORT STATE SERVICE
22/tcp open ssh
53/tcp open domain
80/tcp open http
111/tcp open rpcbind
443/tcp open https
2049/tcp open nfs
2181/tcp open unknown
4242/tcp open vrml-multi-use
4979/tcp open unknown
5000/tcp open upnp
5042/tcp open unknown
5043/tcp open unknown
8443/tcp open https-alt
20048/tcp open unknown
22250/tcp open unknown
42710/tcp open unknown
50000/tcp open ibm-db2
53651/tcp open unknown
These are built as VMs. I built another delegate host on the same virtual host as the master host, connected to the same vSwitch, and I still have the same problem.
I figured it out on my own.
The documentation is missing a couple of variables that should be set, even in a two-host setup:
I had set SERVICED_MASTER_IP out of common sense. I also had to set SERVICED_ZK for the host to become active. This isn't stated in the documentation.
Trial and error based on educated guesses.
I worked my way to the conclusion that the issue was with serviced, since the host was authenticating and the ports were definitely open. I started going through the serviced variables I thought might contribute to the issue, at first skipping over the ZK variables because I'm not running a separate ZK setup.
Obviously I ended up changing that variable. Its default value references "SERVICED_MASTER_IP", so I gave it a shot.
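For anyone hitting the same symptom, here is a rough sketch of the two settings described above as they might appear in /etc/default/serviced on the delegate. The address 192.0.2.10 is a placeholder for the master host's IP and is not taken from this thread. Port 2181 is the ZooKeeper port that shows as open in the port scan above. Treat this as an illustration of the fix, not as the documented configuration.

```shell
# /etc/default/serviced on the delegate host (sketch)
# 192.0.2.10 is a placeholder; substitute the real IP of the master host.
SERVICED_MASTER_IP=192.0.2.10

# Point the delegate at the ZooKeeper instance on the master.
# 2181 matches the open port seen in the scan of the master.
SERVICED_ZK=192.0.2.10:2181
```

After editing the file, serviced on the delegate has to be restarted (for example with systemctl restart serviced) before the change takes effect.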