Handling MQ logs and events with OpenTelemetry

One recent addition to the plethora of observability tools is OpenTelemetry. It attempts to provide a vendor-agnostic set of common APIs, components, interfaces and protocols that enable interoperability between a range of other tools. It deals with three major pillars of telemetry data, the things you often need to look at when monitoring systems: traces (by which it means application-level data flows), metrics, and logs.

There are already ways of tracking messages through an MQ network and beyond, reporting via OpenTelemetry. And I will soon be talking a lot more about MQ metrics and OpenTelemetry. But as an appetiser, this post shows the third piece of the story: logs.

Earlier articles on event processing

I have written previously about sending MQ event messages to Loki, another “log processing” tool. That article describes essentially the same approach we would use to make events available through OpenTelemetry. The enhancements to the amqsevt sample in MQ 9.2.4 simplify a couple of the intermediate steps, but the core idea is unchanged: use amqsevt to format the events as JSON and write them somewhere that the processor can pick them up.

So this time round, I’ll start by showing how to work with the MQ logs.

MQ Error Logs

First, you need to configure the queue manager to produce the error logs as JSON, by adding a DiagnosticMessages stanza to the queue manager’s qm.ini file:

DiagnosticMessages:
   Service = File
   Name = JSONLogs
   Format = json
   FilePrefix = AMQERR

After a queue manager restart, you can see both the JSON and text versions of these logs in /var/mqm/qmgrs/QM1/errors.
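For example, a minimal check on a Linux system, assuming a queue manager named QM1:

endmqm -i QM1
strmqm QM1
ls /var/mqm/qmgrs/QM1/errors/AMQERR0*.json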

OpenTelemetry Collector

One key component from the OTel project is the Collector. It acts as a proxy – receiving data from a monitored system, potentially doing some work on that data, and then exporting the results on to other tools. It provides a framework for configuring workflows, with a variety of plugins that connect the OTel model to other external systems or process the data.

This proxy is what we will use to read and parse the MQ error logs. One of the input paths (known as a receiver) to the Collector is the filelogreceiver. There is more documentation on this component in the opentelemetry-collector-contrib repository, but we can get things working with some very basic configuration.

We have to declare that a particular receiver is going to be used, what its configuration is, and where it fits in a workflow. In the Collector’s config.yaml file we have separate blocks for this. First, the declaration and configuration of the receiver:

receivers:
  filelog:
    include:
    - /var/mqm/qmgrs/QM1/errors/AMQERR01.json
    operators:
      - type: json_parser
        timestamp:
          parse_from: attributes.ibm_datetime
          layout: "%Y-%m-%dT%H:%M:%S.%LZ"
        severity:
          parse_from: attributes.loglevel

We only need to name a single input file here, although if you are monitoring multiple queue managers with the same Collector, you could replace the queue manager name with a wildcard (see the sketch below). The receiver handles renaming and truncation of the log file, so it should never need to look at AMQERR02 or AMQERR03. The timestamp option defines how to parse the field in the MQ logs that holds the time of the entry; similarly, the severity option pulls the severity from a standard field in the entry.
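As a sketch of that wildcard variant, watching the logs of every queue manager under the default /var/mqm directory tree:

receivers:
  filelog:
    include:
    - /var/mqm/qmgrs/*/errors/AMQERR01.json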

One interesting option to the receiver that I’ve not used here is the ability to stash the current file offset, so that if the Collector has to restart, it does not process the same log entries twice.
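A minimal sketch of that checkpointing, assuming the file_storage extension from the collector-contrib build (the storage directory is just an example):

extensions:
  file_storage:
    directory: /var/lib/otelcol/storage  # must exist and be writable by the Collector

receivers:
  filelog:
    include:
    - /var/mqm/qmgrs/QM1/errors/AMQERR01.json
    storage: file_storage  # checkpoint read offsets in the extension

# The extension also has to be enabled in the service section:
service:
  extensions: [file_storage]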

We then need to put this receiver into a workflow:

service:
  pipelines:
    logs:
      receivers: [filelog]
      exporters: [debug]

You would probably want to add more exporters to the list here, depending on where you want to send the log data, but this is enough to make the Collector display the parsed entries.
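For instance, a sketch that also forwards the records over OTLP/gRPC to a backend of your choice (the endpoint is an assumption for your environment):

exporters:
  debug:
  otlp:
    endpoint: localhost:4317
    tls:
      insecure: true  # example only; use real TLS in production

service:
  pipelines:
    logs:
      receivers: [filelog]
      exporters: [debug, otlp]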

If I wanted to have multiple instances of the filelogreceiver, with different configuration options, all in the same Collector, then the syntax is to give each instance a unique qualified name. For example:

receivers:
  filelog/config1:
    include: ....
  filelog/config2:
    include: ....

service:
  pipelines:
    logs:
      receivers: [filelog/config1, filelog/config2]

For configurations where you cannot directly access the MQ log files, look at the other available receivers; the syslogreceiver, for example, might be helpful.
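As a sketch, assuming the queue managers’ logs are forwarded to syslog over TCP (the port is arbitrary):

receivers:
  syslog:
    tcp:
      listen_address: "0.0.0.0:54526"
    protocol: rfc5424

service:
  pipelines:
    logs:
      receivers: [syslog]
      exporters: [debug]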

Collector Output

Assuming that the Collector has access to the generated log files, it processes them as soon as it starts. My configuration simply printed the entries to stderr, but that was sufficient to prove that the logs were being read and parsed successfully.

I got this output (showing just the first reported entry here):

2024-02-09T16:08:27.620Z        info    fileconsumer/file.go:268        Started watching file   {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "path": "/var/mqm/qmgrs/QM1/errors/AMQERR01.json"}
2024-02-09T16:08:27.631Z        info    ResourceLog #0
Resource SchemaURL:
ScopeLogs #0
ScopeLogs SchemaURL:
InstrumentationScope
LogRecord #0
ObservedTimestamp: 2024-02-09 16:08:27.620529523 +0000 UTC
Timestamp: 2024-01-24 07:44:33.428 +0000 UTC
SeverityText: INFO
SeverityNumber: Info(9)
Body: Str({"ibm_messageId":"AMQ6287I","ibm_arithInsert1":0,"ibm_arithInsert2":0,"ibm_commentInsert1":"Linux 6.6.8-100.fc38.x86_64 (MQ Linux (x86-64 platform) 64-bit)","ibm_commentInsert2":"/opt/mqm (Installation1)","ibm_commentInsert3":"9.3.4.0 (p934-L230925.1)","ibm_datetime":"2024-01-24T07:44:33.428Z","ibm_serverName":"QM1","type":"mq_log","host":"localhost","loglevel":"INFO","module":"amqxeida.c:6863","ibm_sequence":"1706082273_428121152","ibm_processId":"3635139","ibm_threadId":"1","ibm_version":"9.3.4.0","ibm_processName":"strmqm","ibm_userName":"metaylor","ibm_installationName":"Installation1","ibm_installationDir":"/opt/mqm","message":"AMQ6287I: IBM MQ V9.3.4.0 (p934-30925.1)."})
Attributes:
     -> ibm_commentInsert2: Str(/opt/mqm (Installation1))
     -> ibm_processName: Str(strmqm)
     -> ibm_installationDir: Str(/opt/mqm)
     -> ibm_datetime: Str(2024-01-24T07:44:33.428Z)
     -> ibm_serverName: Str(QM1)
     -> host: Str(localhost)
     -> ibm_sequence: Str(1706082273_428121152)
     -> ibm_processId: Str(3635139)
     -> ibm_threadId: Str(1)
     -> ibm_version: Str(9.3.4.0)
     -> ibm_commentInsert1: Str(Linux 6.6.8-100.fc38.x86_64 (MQ Linux (x86-64 platform) 64-bit))
     -> log.file.name: Str(AMQERR01.json)
     -> type: Str(mq_log)
     -> message: Str(AMQ6287I: IBM MQ V9.3.4.0 (p934-L230925.1).)
     -> ibm_arithInsert2: Double(0)
     -> ibm_userName: Str(metaylor)
     -> ibm_installationName: Str(Installation1)
     -> loglevel: Str(INFO)
     -> ibm_messageId: Str(AMQ6287I)
     -> ibm_arithInsert1: Double(0)
     -> ibm_commentInsert3: Str(9.3.4.0 (p934-L230925.1))
     -> module: Str(amqxeida.c:6863)
Trace ID:
Span ID:
Flags: 0

The Timestamp field shows that the error log field is successfully parsed from the JSON; the ObservedTimestamp value shows when the Collector actually read the logfile.

Event Processing

Since amqsevt can format MQ events in JSON format, we can use much the same approach here.

One change I would recommend, compared both with the file-based processing path in the previous article and with how I’ve processed the log files here, is to use the OpenTelemetry Collector’s namedpipe receiver (available from version 0.94.0 of the collector-contrib build), as that simplifies handling the output from amqsevt. It removes any need to deal with log rotation, and the data goes directly into the Collector. Use the mkfifo command to create a named pipe, and then redirect the amqsevt output into it. The newer -o json_compact option, which writes each event as a single line of JSON, helps the processing too.
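A sketch of that flow, assuming a pipe at /tmp/mq.events and the performance event queue seen in the output below (check the namedpipe receiver’s documentation for its exact options):

# create the pipe and feed it with compact JSON events
mkfifo /tmp/mq.events
amqsevt -m QM1 -q SYSTEM.ADMIN.PERFM.EVENT -o json_compact > /tmp/mq.events &

with a matching receiver configuration along these lines:

receivers:
  namedpipe:
    path: /tmp/mq.events
    operators:
      - type: json_parser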

Here’s an event message, dumped in the same way.

LogRecord #83
ObservedTimestamp: 2024-02-11 09:25:08.003819814 +0000 UTC
Timestamp: 2024-02-11 07:56:50 +0000 UTC
SeverityText: 
SeverityNumber: Unspecified(0)
Body: Str({ "eventSource" : { "objectName": "SYSTEM.ADMIN.PERFM.EVENT",                   "objectType" : "Queue",                   "queueMgr" : "QM1"}, "eventType" : {     "name" : "Perfm Event",     "value" : 45   }, "eventReason" : {     "name" : "Queue Depth Low",     "value" : 2225   }, "eventCreation" : {     "timeStamp"  : "2024-02-11T07:56:50Z",     "epoch"      : 1707638210   }, "eventData" : {   "queueMgrName" : "QM1",   "baseObjectName" : "APP.0",   "timeSinceReset" : 11,   "highQueueDepth" : 198,   "msgEnqCount" : 38,   "msgDeqCount" : 158 } })
Attributes:
     -> eventSource: Map({"objectName":"SYSTEM.ADMIN.PERFM.EVENT","objectType":"Queue","queueMgr":"QM1"})
     -> eventType: Map({"name":"Perfm Event","value":45})
     -> eventReason: Map({"name":"Queue Depth Low","value":2225})
     -> eventCreation: Map({"epoch":1707638210,"timeStamp":"2024-02-11T07:56:50Z"})
     -> eventData: Map({"baseObjectName":"APP.0","highQueueDepth":198,"msgDeqCount":158,"msgEnqCount":38,"queueMgrName":"QM1","timeSinceReset":11})
Trace ID: 
Span ID: 
Flags: 0
	{"kind": "exporter", "data_type": "logs", "name": "debug"}

Summary

For the error log information, no additional coding was needed: just configuration files, and ensuring that the OpenTelemetry Collector program had access to the error logs. This is exactly how OTel is meant to work, consuming common formats and exporting them onwards to a variety of backends.

I hope that helps you understand how one of the OpenTelemetry pillars can be integrated with MQ. Join me again soon, when we’ll talk more about Metrics.
