Saturday, April 05, 2008

Custom Zenoss graph based on multiple data points

(This tip is also on the Zenoss Wiki.)

If you want to make a custom graph in Zenoss based on more than one data point (such as a ratio or other calculation), you will need to enter a custom graph definition for RRDTool to use. I found some good guides on how to define graphs with RRDTool (such as this tutorial on CDEF and others at that site), but it took me a while to put this together with the available data points and variables in Zenoss so the graph would work.

Edit the performance template to which you wish to add the graph. Click the drop-down arrow next to Graph Definitions, choose "Add Graph...", and give the new graph a name.

Click on the Graph Custom Definition tab and you are presented with a blank slate for your new graph's definition. It may be easiest to start with an example. I entered the following custom graph definition:

DEF:BusyThreads-raw=${here/fullRRDPath}/appThreads_BusyThreads.rrd:ds0:AVERAGE
DEF:RequestsPerSecond-raw=${here/fullRRDPath}/appThreads_RequestsPerSecond.rrd:ds0:AVERAGE
DEF:AppCurrentConnections-raw=${here/fullRRDPath}/currentConnections_appCurrentConnections.rrd:ds0:AVERAGE
CDEF:connectionsToThreads=AppCurrentConnections-raw,1,RequestsPerSecond-raw,BusyThreads-raw,+,+,/
LINE:connectionsToThreads#00cc00:"Connections to Threads/Thread Activity Ratio"
GPRINT:connectionsToThreads:LAST:cur\:%5.2lf%s
GPRINT:connectionsToThreads:AVERAGE:avg\:%5.2lf%s
GPRINT:connectionsToThreads:MAX:max\:%5.2lf%s\j

To break this apart: my ratio involves three data points from two data sources, all of which are part of the same performance template as this graph:

Data source: appThreads
This data source has two data points, BusyThreads and RequestsPerSecond.

Data source: currentConnections
This data source has one data point, appCurrentConnections.

What I want to graph is a ratio based on these data points as follows:

appCurrentConnections / (BusyThreads + RequestsPerSecond + 1)

Basically, I want a measure of the amount of work in the queue (current connections) divided by the amount of work my application is getting through (the busy threads plus the requests per second it is handling, plus one so that the denominator can never be zero).

With that established, we need to define the RRD DEFs (variables) used in the graph, one for each of the variables in the above calculation. Here's the one for the busy threads variable. I chose BusyThreads-raw as the variable name that the later lines of the definition refer to:

DEF:BusyThreads-raw=${here/fullRRDPath}/appThreads_BusyThreads.rrd:ds0:AVERAGE


The key above is the TALES expression that points the RRDTool DEF at the RRD file for a data point in the Zenoss performance template: ${here/fullRRDPath}/dataSourceName_dataPointName.rrd
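
As a rough illustration of that naming pattern, here is a small Python sketch that assembles a DEF line from a data source name and a data point name. The helper name build_def_line, its parameters, and the "-raw" suffix default are made up for this example and are not part of Zenoss or RRDTool. The ${here/fullRRDPath} prefix is left as literal text for Zenoss to expand, and ds0 is simply copied from the DEF lines in this post.

```python
def build_def_line(data_source, data_point, cf="AVERAGE", suffix="-raw"):
    """Return an RRDTool DEF line following the pattern used in this post.

    Hypothetical helper: neither Zenoss nor RRDTool ships this function.
    The ${here/fullRRDPath} TALES expression is kept as literal text so
    that Zenoss can expand it when it renders the graph.
    """
    vname = data_point + suffix  # name that later CDEF lines refer to
    rrd_file = "${here/fullRRDPath}/%s_%s.rrd" % (data_source, data_point)
    return "DEF:%s=%s:ds0:%s" % (vname, rrd_file, cf)


if __name__ == "__main__":
    print(build_def_line("appThreads", "BusyThreads"))
    print(build_def_line("appThreads", "RequestsPerSecond"))
    print(build_def_line("currentConnections", "appCurrentConnections"))
```

The variable name on the left of the equals sign is arbitrary, so it does not have to match the data point name exactly; it only has to match whatever the CDEF line uses.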

Regarding the :AVERAGE at the end: this is the RRD consolidation function. While there are several of them (AVERAGE, MIN, MAX, and LAST), the most common one I've seen used is AVERAGE, which averages the samples that fall into each step at the resolution being graphed. Please consult the RRDTool documentation if you want to go deeper on this.
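
To make the idea of a consolidation function a little more concrete, here is a simplified Python sketch. It is not how RRDTool is implemented: the function name consolidate and the sample numbers are invented for illustration, and it ignores details such as unknown values and partially filled steps.

```python
def consolidate(samples, steps, cf="AVERAGE"):
    """Collapse every `steps` consecutive samples into one value.

    Simplified illustration of a consolidation function. Real RRDTool
    archives also deal with unknown data and other details skipped here.
    """
    functions = {
        "AVERAGE": lambda chunk: sum(chunk) / len(chunk),
        "MIN": min,
        "MAX": max,
        "LAST": lambda chunk: chunk[-1],
    }
    combine = functions[cf]
    return [combine(samples[i:i + steps])
            for i in range(0, len(samples), steps)]


# Made-up sample values, not measurements from a real system.
samples = [2.0, 4.0, 6.0, 8.0, 1.0, 3.0]
averaged = consolidate(samples, 3, "AVERAGE")  # [4.0, 4.0]
peaks = consolidate(samples, 3, "MAX")         # [6.0, 8.0]
```

Note how AVERAGE hides the spike to 8.0 that MAX preserves, which is worth keeping in mind when choosing which function to graph.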

After providing DEF lines for each variable in my calculation, I need a CDEF line (calculated definition) that puts the actual calculation together:

CDEF:connectionsToThreads=AppCurrentConnections-raw,1,RequestsPerSecond-raw,BusyThreads-raw,+,+,/

The calculation uses reverse Polish notation, and the CDEF tutorial above has an excellent guide to understanding it. Basically, you can think of it as a stack: the variables and constants are pushed onto the stack in order from left to right. When the first operator (the leftmost plus sign) is reached, the top two items (in this case, BusyThreads-raw and RequestsPerSecond-raw) are popped off the stack and added together, and the result is pushed back onto the stack. The next plus sign adds this sum to 1. Finally, the division operator divides AppCurrentConnections-raw by the topmost stack item (1 + RequestsPerSecond-raw + BusyThreads-raw).
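
The stack behavior described above can be sketched in a few lines of Python. This is only a toy evaluator for illustration: the function name eval_rpn is my own, it handles just the four arithmetic operators (RRDTool supports many more), and the sample values are invented rather than taken from a real system.

```python
def eval_rpn(expression, variables):
    """Evaluate a comma-separated RPN expression using +, -, * and / only.

    Toy sketch: RRDTool applies the expression at every time step and
    supports many more operators; this handles one set of sample values.
    """
    stack = []
    for token in expression.split(","):
        if token in ("+", "-", "*", "/"):
            right = stack.pop()  # topmost item
            left = stack.pop()   # item beneath it
            if token == "+":
                stack.append(left + right)
            elif token == "-":
                stack.append(left - right)
            elif token == "*":
                stack.append(left * right)
            else:
                stack.append(left / right)
        elif token in variables:
            stack.append(float(variables[token]))
        else:
            stack.append(float(token))  # numeric constant such as 1
    if len(stack) != 1:
        raise ValueError("malformed RPN expression")
    return stack[0]


# Made-up sample values, not measurements from a real system.
sample = {
    "AppCurrentConnections-raw": 30.0,
    "RequestsPerSecond-raw": 6.0,
    "BusyThreads-raw": 3.0,
}
cdef = "AppCurrentConnections-raw,1,RequestsPerSecond-raw,BusyThreads-raw,+,+,/"
ratio = eval_rpn(cdef, sample)  # 30 / (1 + (6 + 3)) = 3.0
```

Because each token is everything between two commas, a name like BusyThreads-raw is treated as one variable and its hyphen is never mistaken for the minus operator.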

Once we have our connectionsToThreads variable, we can graph it. The next line defines the one line on our graph:

LINE:connectionsToThreads#00cc00:"Connections to Threads/Thread Activity Ratio"

It refers to the connectionsToThreads variable, defines a color in hex notation, and defines a label. Finally, we can print some additional information on the graph:

GPRINT:connectionsToThreads:LAST:cur\:%5.2lf%s
GPRINT:connectionsToThreads:AVERAGE:avg\:%5.2lf%s
GPRINT:connectionsToThreads:MAX:max\:%5.2lf%s\j


Here we print the last, average, and maximum values of our graph line on the currently-viewed graph section.
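
For a feel of what those three numbers represent, here is a rough Python sketch that computes them over the values in a graph window and formats them in a similar style. It is an approximation under stated assumptions: the function names summarize and legend are invented, unknown values are assumed to be skipped, Python's %5.2f stands in for the C-style %5.2lf, and the SI unit suffix that RRDTool's %s adds is left out.

```python
import math


def summarize(values):
    """Return LAST, AVERAGE and MAX of the known (non-NaN) values.

    Assumption for this sketch: unknown values are simply skipped.
    """
    known = [v for v in values if not math.isnan(v)]
    if not known:
        nan = float("nan")
        return {"LAST": nan, "AVERAGE": nan, "MAX": nan}
    return {
        "LAST": known[-1],
        "AVERAGE": sum(known) / len(known),
        "MAX": max(known),
    }


def legend(values):
    """Format the summary roughly like the three GPRINT lines would."""
    stats = summarize(values)
    return "cur:%5.2f avg:%5.2f max:%5.2f" % (
        stats["LAST"], stats["AVERAGE"], stats["MAX"])


# Made-up ratio values for one graph window; NaN marks an unknown sample.
window = [1.5, 3.0, float("nan"), 2.25]
text = legend(window)  # "cur: 2.25 avg: 2.25 max: 3.00"
```

The \j at the end of the last GPRINT line is a layout directive (it justifies the legend line) and has no equivalent in this sketch.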

Limitation: Thresholding
One thing I could not get working was a threshold based on the calculated value above. It seems that thresholds can only be applied to the values of the data points themselves, not to a value derived from them in the graph definition.
