Configuration & Administration


"Discarded event - queue overflow" messages in zenpython log

  • 1.  "Discarded event - queue overflow" messages in zenpython log

    Posted 05-22-2020 12:30 AM
    We are running Zenoss 6.2.1 (Community). Recently we started getting a lot of these messages in the zenpython service log:

    WARNING zen.zenpython: Discarded event - queue overflow: .....[text cut]

    The queue length in MetricConsumer is at 0, and the queue length in rediscollector seems low (average of 50?).

    The rabbitmq log has a lot of "shutdown_error" messages:
    SUPERVISOR REPORT==== 22-May-2020::05:01:50 ===
         Supervisor: {<0.8363.23>,rabbit_channel_sup_sup}
         Context:    shutdown_error
         Reason:     shutdown
         Offender:   [{nb_children,1},
                      {name,channel_sup},
                      {mfargs,{rabbit_channel_sup,start_link,[]}},
                      {restart_type,temporary},
                      {shutdown,infinity},
                      {child_type,supervisor}]


    This might be the root cause (or at least related to the problem).

    Any ideas?

    Thanks in advance,



    ------------------------------
    Larry
    ------------------------------


  • 2.  RE: "Discarded event - queue overflow" messages in zenpython log

    Posted 06-02-2020 03:35 PM

    Larry,

    MetricConsumer and CollectorRedis are both used to pass performance metrics from their source collector service back to OpenTSDB and shouldn't have any effect on event data.

    After a collector service (zenpython, in this case) generates an event, it sends it to zenhub for validation and entry into the event processing pipeline (zeneventd, zeneventserver, zenoss_zep database in MariaDB, and zenactiond should a trigger match).  If zenhub is not available to receive an incoming event, the collector service will temporarily cache the event in an internal event queue.  By default, this queue will cache the first 5000 events it receives and will eject events in a FIFO fashion should the queue fill.
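
    As a rough illustration, the queueing behavior is essentially a bounded FIFO buffer: once the cap is hit, the oldest queued event is dropped to make room and a "Discarded event - queue overflow" warning is logged.  (If I remember right, the cap is the maxqueuelen setting in the daemon's .conf file, but verify that against your version before changing anything.)  The sketch below is just a toy model of that idea, not the actual zenpython code:

        import logging
        from collections import deque

        log = logging.getLogger("zen.zenpython")

        class BoundedEventQueue(object):
            """Toy model of a collector's internal event queue (default cap 5000)."""

            def __init__(self, maxlen=5000):
                self.maxlen = maxlen
                self._queue = deque()

            def append(self, event):
                # While zenhub is unreachable, events pile up here; once the cap
                # is reached, the oldest event is ejected (FIFO) and a warning
                # like the one you're seeing is emitted.
                if len(self._queue) >= self.maxlen:
                    discarded = self._queue.popleft()
                    log.warning("Discarded event - queue overflow: %s", discarded)
                self._queue.append(event)

            def drain(self, send):
                # When zenhub becomes reachable again, queued events are flushed.
                while self._queue:
                    send(self._queue.popleft())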

    If you're seeing channel shutdown errors in RabbitMQ, it's likely the result of a Rabbit queue consumer disconnecting.  Whether that's through an error or the result of that service restarting is hard to tell from here.

    I would double-check your zenhub log to see if it has any complaints around the same time as the queue overflow messages.
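
    If it helps, here's a quick-and-dirty way to pull zenhub complaints from the same time window as the overflow warnings.  The paths and the timestamp format (YYYY-MM-DD HH:MM) are just placeholders/assumptions; point them at wherever your zenpython and zenhub logs actually live:

        import re

        # Placeholder paths - substitute your actual zenpython/zenhub log locations.
        ZENPYTHON_LOG = "zenpython.log"
        ZENHUB_LOG = "zenhub.log"

        # Collect the minute-level timestamps of the overflow warnings...
        overflow_minutes = set()
        with open(ZENPYTHON_LOG) as f:
            for line in f:
                if "Discarded event - queue overflow" in line:
                    m = re.match(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2})", line)
                    if m:
                        overflow_minutes.add(m.group(1))

        # ...then print any zenhub WARNING/ERROR lines from those same minutes.
        with open(ZENHUB_LOG) as f:
            for line in f:
                m = re.match(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2})", line)
                if m and m.group(1) in overflow_minutes and ("ERROR" in line or "WARNING" in line):
                    print(line.rstrip())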

    If you'd like to know more about event processing, we've got a handy video that covers the event pipeline here:


    Let us know if that helps.



    ------------------------------
    Michael J. Rogers
    Senior Instructor - Zenoss
    Austin TX
    ------------------------------