Intermittent - Server not found in Kerberos database

  • 1.  Intermittent - Server not found in Kerberos database

    Posted 6 days ago

    Server not found in Kerberos database: Attempted to get ticket for HTTP@SERVERNAME. Ensure reverse DNS is correct.

    We're monitoring 17 Windows Servers right now and seeing this error intermittently. I've stepped through all of the troubleshooting docs and posts I could find, but nothing seems to work. The same server will show that error, but periodically through the day info and events will come up. So I know it's working, it's just not consistent. At any given time I'll see the same error on 3-4 servers, but the others are all reporting fine.

    Also seeing random occurrences of Windows Event Log collection failing. That is also working, because we see events for the server, but we get a lot of this error as well.

    WindowsEventLog: failed collection SERVERNAME

    We are running the latest version of Zenoss Core on a dedicated machine that meets the hardware reqs.
    Any ideas on what we can do to troubleshoot? If it was consistently not working, I'd imagine the config wasn't sound...but the fact that everything works (some of the time) seems to indicate some other type of issue.
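
    To see which servers fail and how often, the error text can be tallied per host. This is only a rough sketch: it assumes the event messages have been exported as plain text, one per string, in the format quoted above. The function and pattern names are made up for illustration.

```python
import re
from collections import Counter

# Matches the host in "Attempted to get ticket for HTTP@<host>." as quoted
# in the error above. The trailing "." ends the sentence, so it is excluded.
TICKET_RE = re.compile(r"Attempted to get ticket for HTTP@(\S+?)\.(?:\s|$)")


def count_kerberos_failures(messages):
    """Count 'Server not found in Kerberos database' errors per host.

    `messages` is any iterable of event message strings (assumed format).
    """
    counts = Counter()
    for message in messages:
        match = TICKET_RE.search(message)
        if match:
            counts[match.group(1).lower()] += 1
    return counts
```

    If the counts cluster on the same few hosts, that points at something specific to those servers (DNS, SPNs) rather than the collector as a whole.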



  • 2.  RE: Intermittent - Server not found in Kerberos database

    Posted 6 days ago
    Edited by Jason Olson 6 days ago
    Hate to ask, but *does* the PTR record for the server exist in DNS? Your post doesn't say, nor do you give the version of Zenoss in use. There were a fair number of bugs in earlier versions, a lot of which have been dealt with in version 6.2.x+
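
    A quick way to confirm the forward and reverse records agree is a check along these lines, run from the collector host. It is a minimal sketch using only the Python standard library; the function name is hypothetical, and it only looks at the primary name returned for the PTR lookup.

```python
import socket


def check_reverse_dns(fqdn, forward=socket.gethostbyname,
                      reverse=socket.gethostbyaddr):
    """Return (ok, detail) for a forward lookup followed by a PTR lookup.

    `forward` and `reverse` default to the real resolver calls; they are
    parameters only so the check can be exercised without a network.
    """
    try:
        ip = forward(fqdn)
    except OSError as exc:
        return False, "forward lookup failed: %s" % exc
    try:
        ptr_name = reverse(ip)[0]  # primary hostname from the PTR record
    except OSError as exc:
        return False, "no PTR record for %s: %s" % (ip, exc)
    # Compare case-insensitively and ignore a trailing root dot.
    if ptr_name.rstrip(".").lower() != fqdn.rstrip(".").lower():
        return False, "PTR for %s is %s, expected %s" % (ip, ptr_name, fqdn)
    return True, "%s <-> %s" % (fqdn, ip)
```

    Running it in a loop over the affected servers several times through the day may show whether the DNS answers themselves are intermittent.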

    ------------------------------
    Jason Olson

    ------------------------------



  • 3.  RE: Intermittent - Server not found in Kerberos database

    Posted 6 days ago
    I have worked through this one as well.  I found the below configuration property in the guide and just upped it to 2 from the default of 1.

    zWinRMKRBErrorThreshold

    Having a poor network connection can cause erroneous kerberos error events to be sent which could cause confusion or false alarms. The default value is 1, which will always send an event on the first occurrence of an error. You can increase this value to send an event only when there have been x amount of occurrences of an error during collection, where x denotes the threshold number.
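
    As I read that description, the threshold behaves roughly like the counter below. This is an illustration of the idea only, not the ZenPack's actual code: the class name is made up, and resetting the count after a successful collection is my assumption.

```python
from collections import defaultdict


class KerberosErrorGate(object):
    """Illustrative model of an error threshold (not the ZenPack's code)."""

    def __init__(self, threshold=1):
        self.threshold = threshold
        self._counts = defaultdict(int)

    def record_error(self, device):
        """Count one error; return True once an event should be sent."""
        self._counts[device] += 1
        return self._counts[device] >= self.threshold

    def record_success(self, device):
        """Assumption: a successful collection clears the error count."""
        self._counts.pop(device, None)
```

    With a threshold of 2, a single blip is swallowed and only a second error raises an event. Note that this only hides the event; it does nothing about the failed collection itself.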


    ------------------------------
    Eric Ward
    Sys Admin
    Restaurant Technologies
    mendota heights MN
    ------------------------------



  • 4.  RE: Intermittent - Server not found in Kerberos database

    Posted 5 days ago
    I thought zWinRMKRBErrorThreshold might be involved as well. I've bumped that all the way to 10 on a few of the affected machines and haven't seen it make a difference. I tried increasing the zWinRMConnectTimeout too, but that didn't work either.


  • 5.  RE: Intermittent - Server not found in Kerberos database

    Posted 5 days ago
    Unfortunately the PTR records do exist.

    We are running the latest version of Core (6.2.1 r218) and I did update the Microsoft.Windows ZenPack to the latest (2.9.2).

    I've also tried every combination of zWinRMKrb5DisableRDNS (at the /Server/Microsoft level), manually defining the zWinRMServerName (FQDN, none, and ${here/titleOrId}), and checking/adding SPNs.

    I'd expect that if something was configured incorrectly that it would either work all of the time or none of the time. I don't understand why it's sporadic. Some servers model cleanly every time. Others are very intermittent.



  • 6.  RE: Intermittent - Server not found in Kerberos database

    Posted 5 days ago
    One other note...restarting the Zenoss server seems to clear a lot of these. Once the server restarts, it typically models most of the servers successfully for a short time.


  • 7.  RE: Intermittent - Server not found in Kerberos database

    Posted 5 days ago
    Edited by Jason Olson 5 days ago
    Another question: do you see any messages in the Event console saying something like "Missing counters in collection for xxx"? If so, that may be why you're missing data. I've found that if Zenoss trips over a failed collection, rather than handling it and carrying on, it throws an exception and halts collection completely and silently for a few hours. Then it does the periodic remodel of the servers, and graphing and event log collection begin again.

    Are any messages like that seen? Also, has the krb5-workstation package been installed on the host? It shouldn't be needed for proper operation, since the Docker images should handle that, but I find that it's required for consistent Windows monitoring.

    ------------------------------
    Jason Olson
    ------------------------------



  • 8.  RE: Intermittent - Server not found in Kerberos database

    Posted 2 days ago
    Nothing in the event console for missing counters. Typically I'll see the "server not found in Kerberos..." error along with a handful of actual events and sometimes the EventLog failed collection.

    I didn't have krb5-workstation loaded. I've just added that. I'll see if it helps.
    It's also confusing to me that some servers never have the issue. A few of my Windows servers have perfect monitoring and never miss a model.






  • 9.  RE: Intermittent - Server not found in Kerberos database

    Posted 2 days ago
    Once installed, you'll need to restart the Zenoss application for any Kerberos changes to take effect. If that doesn't help, can you post the Configuration Properties Windows section (with any IPs, hostnames and user IDs changed to similar but invalid values)?

    ------------------------------
    Jason Olson
    ------------------------------



  • 10.  RE: Intermittent - Server not found in Kerberos database

    Posted 20 hours ago
    Same behavior. I restarted the whole server.
    Here's the Windows section. I've tried different things in zWinRMServerName to no avail as well.




  • 11.  RE: Intermittent - Server not found in Kerberos database

    Posted 19 hours ago
    Try undefining zWinRMKrb5DisableRDNS, zWinRMServerName (which I think is what's causing the issue; that should be defined with a string at the server level, not a variable at the container level), zWinTrustedKDC, and zWinTrustedRealm. Restart Zenoss within Control Center, then give it an hour and see how it behaves.

    If you want to leave the variables set as they are, though, try undefining only zWinRMServerName at the /Server/Microsoft level, then go to one of the servers causing grief, set that variable to the server's fully-qualified domain name in the Configuration Properties of that server, and see what happens after an hour or so.
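
    If you'd rather script that than click through the UI, something like this could be pasted into zendmd. Treat it as a sketch: the helper name and the example device and FQDN are placeholders, and it assumes the usual zendmd environment where dmd and commit() are available.

```python
def pin_server_name(dmd, device_name, fqdn,
                    organizer_path="/Server/Microsoft"):
    """Drop the class-level zWinRMServerName and set it on one device."""
    org = dmd.Devices.getOrganizer(organizer_path)
    # hasProperty is True only when the value is set locally on this class.
    if org.hasProperty("zWinRMServerName"):
        org.deleteZenProperty("zWinRMServerName")
    device = dmd.Devices.findDevice(device_name)
    if device is None:
        raise ValueError("device not found: %s" % device_name)
    device.setZenProperty("zWinRMServerName", fqdn)
    return device

# In zendmd (placeholder names), followed by commit() to save the change:
#   pin_server_name(dmd, "web01", "web01.example.com")
#   commit()
```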

    ------------------------------
    Jason Olson
    ------------------------------