One of the focus areas for new development in MQ in recent years has been in the area of High Availability and Disaster Recovery. Technologies such as RDQM and Native HA, and automatically managed logfiles, give a range of possibilities for ensuring your messaging systems continue reliably. Alongside the core function, there are also metrics and status information to show more about what is going on. And so the latest updates to the open source monitor programs add collection of some of these recently-added values. This should simplify monitoring MQ availability.
Collection of the metrics that the queue manager publishes requires that each monitored queue has at least one associated subscription. This post describes an interesting option where collection programs use durable subscriptions to minimise object handle use when running MQ monitoring. It reduces the requirements for configuring the MAXHANDS attribute on the queue manager. It’s also a nice demonstration of how subscriptions could be used in any application.
The mq-metric-samples collectors that send IBM MQ metrics and status data to a range of databases, ready to be viewed in Grafana, have just been enhanced to collect additional information. The Prometheus collector has also been extended so that it can continue providing limited status even when the queue manager is down.
The new metrics have all been suggested by users of the package either directly or via issues raised in the GitHub repository. Many previous articles on here show more about the collectors.
The InfluxDB collector is also refreshed for a new version of the database.
It is odd how often similar questions come at the same time from unrelated places. Because of the projects and articles I’ve written about visualising MQ’s metrics in Grafana, I recently had a couple of people asking about using that same front-end but able to work with log-file data. Specifically would it be possible to use Loki and Grafana together with MQ. My immediate reaction was that I didn’t know what Loki was in this context, but clearly they couldn’t be asking about a Norse god or Marvel character. Instead after a very short search and a few minutes reading, I guessed that it ought to be possible to use it. Which was my reply.
But of course, I’m not likely to leave it there when I learn about a new tool. Especially when I have a reasonable starting point like a working local Grafana setup. And so I have done some very quick experiments to prove that the approach really can work. On the way I’ll also take a brief digression into using logrotate.
This post shows how you can use Grafana to selectively view information about your MQ configuration. Which may sound a little odd. Grafana’s strength is primarily to show statistics and metrics in pretty graphs. So why would we want to use it to look at queue definitions? The answer is that you usually would not! There are many more appropriate tools for displaying and updating the queue manager configuration – even the MQ Explorer or MQ Console are better. But there may be times when a limited set of information may be desirable, so you can link from a graph to a different view, within the same tool.
But another important aspect that I hope this shows is the power of a common data format. The techniques I’ll show here could be used to combine a variety of different tools, and perhaps this will give you some ideas.
In 2016 I wrote about how MQ’s resource statistics can work with a number of time-series databases, including Prometheus. This permits monitoring using the same tools that many customers use for monitoring other products. It allows easy creation of dashboards using tools such as Grafana.
Since that original version, we’ve made a number of enhancements to the packages that underpin that monitoring capability. For example, more database options were added; a JSON formatter appeared. One notable change was when we split the monitoring agent programs into a separate GitHub repository, making it easier to work with just the pieces you needed.
And now, I’ve released some changes that allow Prometheus and generic JSON processors to see some key channel status information. In particular, a Grafana dashboard can easily highlight channels that are not running.
MQ V9 added resource monitoring statistics that you can subscribe to. In this post I’m going to show how you can generate similar statistics from your own applications using the same model. For example, you may want to track how many successful and how many failed messages are being processed.
An earlier blog entry showed how to integrate MQ with the Prometheus database, capturing statistics that can then be shown in a Grafana dashboard. In this article, I’ll show how that initial work has been extended to work with more databases and collection tools.
In a previous blog entry I wrote about using the Go language with MQ. One of the reasons for creating that Go package was to enable the creation of a program that sends MQ statistics to Prometheus and hence to be easily visualised in Grafana. This blog shows how it all fits together.
MQ V9 metrics
MQ V9 (and the MQ appliance) makes many statistics available through a pub/sub interface. One huge benefit of the pub/sub model is that this data can be collected without interfering with any other monitoring programs. An early prototype of the MQ exporter for Prometheus used the RESET QSTATS command just to prove the concept, but that is not a good command to use in general when you have any other tools that may also use it. Publish/subscribe gives easy isolation for monitors.
Prometheus is an open-source monitoring and alerting solution, whose particular strength is the collection of time series data, with the ability to easily query that data. For example, the number of MQPUTs to a queue may be of interest, and this kind of database makes it easy to see how many operations occurred in an interval, or calculate averages. Prometheus works by pulling information from exporters such as this MQ program at configured intervals over an HTTP connection. It provides libraries in several languages to enable products to export data to it, but the most commonly used is probably the Go library – hence the need for an MQ Go package.
What is Grafana
Grafana provides a way to create dashboards and visualise data held in time series databases. It has Prometheus as a built-in data source, making this pair of products a natural fit together.
Getting started with the monitor
Building the monitor
The github repository contains the monitoring program, the ibmmq package that links to the core MQ application interface and other prerequisite components.
should pull down the client code and its dependencies. The README file in the root of that package shows how to compile the code, either locally or within a Docker container.
It is convenient to run the monitor program as a queue manager service, automatically started and stopped along with the queue manager.
The source code directory contains an MQSC script to define such a service. In fact, the service definition points at a simple script (also provided) which sets up any necessary environment and builds the command line parameters for the real monitor program. As the last line of the script calls “exec”, the process id of the script is inherited by the monitor program, and the queue manager can then check on the status, and can drive a suitable STOP SERVICE operation during queue manager shutdown.
Edit the MQSC script and the shell script to point at appropriate directories where the programs exist, and where you want to put stdout/stderr. Ensure that the mqm id running the queue manager has permission to access the programs and output files.
The monitor listens for calls from Prometheus on a TCP port. The default port, reserved for this use in the Prometheus list, is 9157. If you want to use a different number, then use the -ibmmq.httpListenPort command parameter.
The monitor always collects all of the available queue manager-wide metrics. It can also be configured to collect statistics for specific sets of queues. The sets of queues can be given either directly on the command line with the -ibmmq.monitoredQueues flag, or put into a separate file which is also named on the command line, with the -ibmmq.monitoredQueuesFile flag. An example is included in the startup shell script. For example,
starts the monitor to collect the statistics for all queues whose names begin APPA and APPB.
Note on queue patterns
For now, the queue patterns are expanded only at startup of the monitor program. If you want to change the patterns, or new queues are defined that match an existing pattern, the monitor must be restarted with a STOP SERVICE and START SERVICE pair of commands.
The Prometheus server has to know how to contact the MQ monitor. The simplest way is just to add a reference to the monitor in the server’s configuration file. For example, add this block to /etc/prometheus/prometheus.yml with any changes needed for your hostnames and ports.
# Adding a reference to an MQ monitor. All we have to do is
# name the host and port on which the monitor is listening.
# Port 9157 is the reserved default port for the MQ monitor.
- job_name: 'ibmmq'
- targets: ['hostname.example.com:9157']
The Prometheus documentation has information on more complex configuration options, including the ability to pull information on which hosts should be monitored from a variety of discovery tools.
Once the Prometheus server has picked up the MQ configuration, the metrics can be seen under the jobname of ibmmq. The values are labelled with the queue and queue manager names, to assist with selection. This picture shows some of the available information in the selection drop-down:
You can select an item from this panel and see its recent values with the queue and queue manager labels. For example,
However, it is more flexible to work with the graphing and dashboard views from Grafana.
Once the Prometheus system is working, grafana can use it as a datasource – again, only a hostname and portnumber is required when adding this type of datasource. And from there, all of the MQ metrics can be accessed and added to dashboards. As an example, this dashboard is looking at several items including the same queues as above, and CPU and logging information:
This picture shows how the top panel was configured, to select several metrics and show the object name in the legend:
Deployment in Docker containers
All of these components can be configured to run inside Docker containers to simplify deployment. To get started, almost everything in the existing Prometheus and Grafana containers can be left to default, except for the need to add the MQ configuration to prometheus.yml. For example, I have this simple Dockerfile
ADD prometheus.yml /etc/prometheus/prometheus.yml
where I’ve added the ibmmq block shown above to the default yml file.
And then this script gets both the Prometheus and Grafana components running, using local directories under /var/docker to hold their persistent data:
The MQ exporter program and its configuration can also of course be baked into a Docker image. The MQ docker image on Github has information on the configuration of MQ. The service definition, the shell script and the actual monitoring program can all be copied into a new image.
This article has shown how the statistics generated by MQ can easily be used in some of the monitoring packages that are commonly used with various cloud and container-based systems. The MQ data can be integrated with other metrics to give a complete view of your environment.
I would welcome feedback on this tool. Please leave any feedback here, or in the GitHub issue tracker, whether bugs, enhancements, or thoughts on the value of the monitoring.
This post was last updated on November 27th, 2021 at 03:00 pm