ZenPacks and JSON API

Datasources being disabled - ESXiMonitorPython / Python Collector

  • 1.  Datasources being disabled - ESXiMonitorPython / Python Collector

    Posted 02-05-2018 05:07 AM

    We're using the ZenPacks.community.VMwareESXiMonitorPython 3.0.3 (Jane's update to the perl sdk version of the zenpack) to monitor various esxi hosts.

    We're occasionally seeing the python collector event being raised whereby it disables the datasource as it thinks the datasource is blocking for too long – as described in a couple of old threads:

    https://community.zenoss.com/forum/community-home/digestviewer/viewthread?GroupId=7&MID=1538&CommunityKey=bf8b1900-b44f-44c3-8287-3b60f8023cf4&tab=digestviewer

    and

    https://community.zenoss.com/forum/community-home/digestviewer/viewthread?GroupId=7&MID=1538&CommunityKey=bf8b1900-b44f-44c3-8287-3b60f8023cf4&tab=digestviewer

     

    When we see the event raised, it is usually when hosts are being shut down or started up (although not every time).

    We've been experimenting with either raising blockingtimeout to a large value (100+ seconds) or setting blockingtimeout to 0, which prevents the blocking watchdog from being started in PythonCollector, but we were wondering whether anyone else has noticed the datasources from this ZenPack being blocked, or has any advice on how to avoid it?

    Thanks for any advice!



    ------------------------------
    Pheripheral

    ------------------------------


  • 2.  RE: Datasources being disabled - ESXiMonitorPython / Python Collector

    Posted 02-06-2018 02:53 AM
    Yup - I have also seen this, particularly in the circumstances you describe.  For what it is worth, I have my zenpython.conf with blockingwarning at 3 seconds and blockingtimeout at 10.
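    For anyone who wants to script that change, here is a minimal Python sketch that sets the two options in the text of a conf file. It assumes zenpython.conf holds one whitespace-separated "name value" pair per line with '#' for comments; check that against your own file first. set_conf_option is a hypothetical helper written for this post, not part of Zenoss or PythonCollector, and the sample contents are made up.

```python
def set_conf_option(conf_text, name, value):
    """Return conf_text with option `name` set to `value`.

    Assumption: one "name value" pair per line, whitespace separated,
    and lines starting with '#' are comments.
    """
    new_line = "%s %s" % (name, value)
    out = []
    replaced = False
    for line in conf_text.splitlines():
        stripped = line.strip()
        parts = stripped.split(None, 1)
        if parts and not stripped.startswith("#") and parts[0] == name:
            if not replaced:
                out.append(new_line)
                replaced = True
            # Any later duplicate of the same option is dropped.
            continue
        out.append(line)
    if not replaced:
        # Option was not present, so append it at the end.
        out.append(new_line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Made-up sample contents, not a real zenpython.conf.
    sample = "# sample only\nblockingtimeout 5\n"
    updated = set_conf_option(sample, "blockingwarning", 3)
    updated = set_conf_option(updated, "blockingtimeout", 10)
    print(updated)
```

    The helper only transforms text, so reading and writing the real file (and restarting zenpython afterwards) is left to you.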

    I have a feeling that later versions of the PythonCollector ZenPack may also help - I am on 1.7.3, which is quite old now.

    Cheers,
    Jane

    ------------------------------
    Jane Curry
    Skills 1st United Kingdom
    jane.curry@skills-1st.co.uk
    ------------------------------



  • 3.  RE: Datasources being disabled - ESXiMonitorPython / Python Collector

    Posted 02-07-2018 09:28 AM
    OK, good to know others have seen it.

    We're now on Python Collector 1.10.1 but still seeing it with default timeouts.
    Currently assessing outcome of upping timeouts over a few weeks and seeing if the event reappears.

    Thanks
    Dafydd


  • 4.  RE: Datasources being disabled - ESXiMonitorPython / Python Collector

    Posted 9 days ago
    This one reared its ugly head at me. I added the parameters Jane mentioned, and now I am getting the message "Process set contains 0 running processes: zenpython". Backing out the zenpython.conf changes doesn't seem to help. Any troubleshooting ideas?

    ------------------------------
    Paul Giordano
    Senior Systems Engineer
    Zethcon Corporation
    ------------------------------



  • 5.  RE: Datasources being disabled - ESXiMonitorPython / Python Collector

    Posted 9 days ago
    I think the "Process set contains 0 running processes: zenpython" message is from zenprocess rather than zenpython.  There were certainly some versions of Zenoss that created these erroneously - what version are you at?

    It is basically saying that you have process monitoring configured for a process and that process isn't running - but it's not exactly the most user-friendly event on the planet ;)

    Cheers,
    Jane

    ------------------------------
    Jane Curry
    Skills 1st United Kingdom
    jane.curry@skills-1st.co.uk
    ------------------------------



  • 6.  RE: Datasources being disabled - ESXiMonitorPython / Python Collector

    Posted 9 days ago
    Edited by Paul Giordano 9 days ago
    Thanks Jane. We're running 6.1.2. How would I troubleshoot the original problem that this ticket mentioned? This is strange, it just started happening last night. Been running fine before that.

    ------------------------------
    Paul Giordano
    Senior Systems Engineer
    Zethcon Corporation
    ------------------------------



  • 7.  RE: Datasources being disabled - ESXiMonitorPython / Python Collector

    Posted 7 days ago
    So, the zenpython logs show the datasources disabled. How do I re-enable them?

    ------------------------------
    Paul Giordano
    Senior Systems Engineer
    Zethcon Corporation
    ------------------------------



  • 8.  RE: Datasources being disabled - ESXiMonitorPython / Python Collector

    Posted 7 days ago
    Edited by Paul Giordano 7 days ago
    Answered my own question: go to Advanced -> Monitoring Templates, expand ESXiHost, select /Devices/VMWare/ESXiHost, double-click on the datasource, then disable it and enable it again. This doesn't fix the original problem, but it resets the disabled datasources.

    Interesting, when I restart zenpython I get the following:
    2018-12-07 14:57:53,178 INFO zen.python: plugins disabled by watchdog: ['ZenPacks.community.VMwareESXiMonitorPython.datasources.VMwareDataSource.VMwareDataSourcePlugin']
    2018-12-07 14:57:53,178 INFO zen.python: starting watchdog with 100.0s timeout
    2018-12-07 14:57:53,216 INFO zen.zenpython: Connecting to localhost:8789
    2018-12-07 14:57:53,237 INFO zen.zenpython: Connected to the zenhub/0 instance
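    To see which plugins the watchdog has disabled without reading the log by eye, a small Python sketch like the one below can pull the names out of lines in the format shown above. It assumes the "plugins disabled by watchdog:" message is always followed by a Python-style list on the same line, as in this log; disabled_plugins is a hypothetical helper, not part of zenpython.

```python
import ast
import re

# Matches the list that follows the watchdog message on a log line.
WATCHDOG_RE = re.compile(r"plugins disabled by watchdog: (\[.*\])")


def disabled_plugins(log_lines):
    """Return a sorted list of plugin names reported as disabled."""
    found = set()
    for line in log_lines:
        match = WATCHDOG_RE.search(line)
        if match:
            # The list is written as a Python literal, so parse it safely.
            found.update(ast.literal_eval(match.group(1)))
    return sorted(found)


if __name__ == "__main__":
    sample = [
        "2018-12-07 14:57:53,178 INFO zen.python: plugins disabled by watchdog: "
        "['ZenPacks.community.VMwareESXiMonitorPython.datasources."
        "VMwareDataSource.VMwareDataSourcePlugin']",
        "2018-12-07 14:57:53,178 INFO zen.python: starting watchdog with 100.0s timeout",
    ]
    for name in disabled_plugins(sample):
        print(name)
```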


    Still getting the original messages, even after doing the above. I changed blockingwarning to 30 seconds and blockingtimeout to 100, and am still getting the messages. Any help or pointers appreciated.

    ------------------------------
    Paul Giordano
    Senior Systems Engineer
    Zethcon Corporation
    ------------------------------



  • 9.  RE: Datasources being disabled - ESXiMonitorPython / Python Collector

    Posted 7 days ago
    Paul, I know you mentioned in another thread you were running version 6.1.2; if you can, I'd highly recommend moving to 6.2.1, as version 6.1.x had some problems with memory leaks, exceptions being thrown, zombie threads running at 100% utilization and other weirdness like graphing just....stopping for hours at a time until the next scheduled device remodel. Version 6.2.1 has been pretty solid, with caveats.

    ------------------------------
    Jason Olson
    ------------------------------



  • 10.  RE: Datasources being disabled - ESXiMonitorPython / Python Collector

    Posted 7 days ago
    Sigh. OK, I'll try to upgrade this weekend, but I'm thinking it's a bigger job than just installing the new code... We'll see.

    ------------------------------
    Paul Giordano
    Senior Systems Engineer
    Zethcon Corporation
    ------------------------------



  • 11.  RE: Datasources being disabled - ESXiMonitorPython / Python Collector

    Posted 5 days ago
    Hi,

    In terms of unblocking the blocked datasources, this involves removing the name of the blocked datasource from either /var/zenoss/zenpython.blocked on Zenoss 5, or /opt/zenoss/var/zenpython.blocked on Zenoss 4, and then restarting the zenpython daemon.
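    As a rough sketch, the edit to the blocked file could be scripted in Python as below. It assumes the file lists one plugin name per line, which you should confirm by looking at your own zenpython.blocked before running anything like this. unblock_plugin is a hypothetical helper, not something shipped with PythonCollector, and it does not restart zenpython for you.

```python
def unblock_plugin(blocked_path, plugin_name):
    """Remove plugin_name from the blocked file.

    Assumption: the file holds one plugin name per line.
    Returns True if the name was listed and has been removed.
    """
    with open(blocked_path) as f:
        lines = f.read().splitlines()

    kept = [line for line in lines if line.strip() != plugin_name]
    if len(kept) == len(lines):
        # Name was not in the file, so leave the file untouched.
        return False

    with open(blocked_path, "w") as f:
        f.write("".join(line + "\n" for line in kept))
    return True


if __name__ == "__main__":
    # Path is the Zenoss 5 location quoted above; adjust for Zenoss 4.
    path = "/var/zenoss/zenpython.blocked"
    plugin = ("ZenPacks.community.VMwareESXiMonitorPython.datasources."
              "VMwareDataSource.VMwareDataSourcePlugin")
    print(unblock_plugin(path, plugin))
```

    The daemon still needs restarting afterwards for the change to take effect.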

    Full info / discussion on this on the python collector page in the comments:
    https://www.zenoss.com/product/zenpacks/pythoncollector

    Hope this helps!

    ------------------------------
    Pheripheral Pheripheral
    ------------------------------



  • 12.  RE: Datasources being disabled - ESXiMonitorPython / Python Collector

    Posted 4 days ago
    Removing or editing this file is definitely the way to get the datasources collecting again - but it doesn't get to the root cause of the problem, which is that devices are not responding to zenpython fast enough and are then "blocking".  The real problem (or it certainly was back when I developed this) is that once a datasource blocks for any device, the datasource is blocked for ALL devices, i.e. it is put in the zenpython.blocked file.

    In a perfect world, I need to rewrite some of this code to ensure that it never blocks....

    Absolutely no promises - but how many people are affected by this??  Please report here.

    Cheers,
    Jane

    ------------------------------
    Jane Curry
    Skills 1st United Kingdom
    jane.curry@skills-1st.co.uk
    ------------------------------



  • 13.  RE: Datasources being disabled - ESXiMonitorPython / Python Collector

    Posted 4 days ago

    Hi,

    We're certainly affected by this.

    Pleasingly, we have not seen it for a while, but that is because we have currently set the blocking timeout to 0, i.e. the blocking timeout is not used at all. We can't be in a situation where datasources are blocked and nobody capable of unblocking them has access to Zenoss for some time.

    Thanks
    Dafydd



    ------------------------------
    Pheripheral Pheripheral
    ------------------------------



  • 14.  RE: Datasources being disabled - ESXiMonitorPython / Python Collector

    Posted 2 days ago
    This has been something of a pain for myself, as well - I have a task on my end to try and figure out what is causing us to run long, and I suspect it might have something to do with when we upgrade our ZenPacks and update our remote collectors. I have no conclusive evidence, but it is kind of a pain to have to try and track down the exact cause of it (i.e. the point that took too long to run - perhaps it's there and I'm just missing it in my cursory searching, however).

    ------------------------------
    Austin Culbertson
    NOC Monitoring Engineer
    ------------------------------