Warning: is it an error?

In several independent exchanges recently, I found myself discussing programming with IBM MQ and how to deal with MQI errors that might not be errors. This post is a short discussion of the not-quite-failed status of a warning. Is it an error? What is the difference between MQI errors and warnings?

Introduction

The procedural MQI is the interface that you are likely to be using for MQ applications written in languages like C, C++ or COBOL. There is a common pattern for all the verbs in the MQI, which always return two values – the CompletionCode and the Reason. Between them, these values tell you whether the MQI verb (MQPUT, MQSET etc) ran successfully.

The idea is that the CompletionCode gives a simple check on the outcome. The Reason then provides an extra level of detail, hopefully telling you what really went wrong. This is a very common pattern across many library interfaces. Unix system calls, for example, typically return -1 on error. To get more information about the precise failure, you look at the errno value.
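For comparison, the Unix pattern mentioned above can be sketched in a few lines of C. This is only an illustration of the "simple check first, detail second" idea; the try_open helper is a hypothetical name, not part of any library.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 on success, or the errno value
   that explains why open() failed. */
static int try_open(const char *path) {
  int fd = open(path, O_RDONLY);
  if (fd == -1) {
    /* The -1 only says "it failed"; errno carries the detail. */
    int saved = errno;
    fprintf(stderr, "open(%s) failed: %s\n", path, strerror(saved));
    return saved;
  }
  close(fd);
  return 0;
}
```

Here the return value plays the role of the CompletionCode and errno plays the role of the Reason, but there are only two outcomes to check, not three.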

But in MQ, the original designers chose to give some “helpful” classifications. The CompletionCode has three possible values – OK, Warning and Failed. Intuitively you can usually guess whether a particular Reason should be considered a failure or a warning, and the documentation can confirm or correct your guess. Running man mqget, for example, shows them quickly on a Unix installation. But it’s not always obvious. And it led to thoughts about how you may need to write programs for reliability.

Testing for problems

Just about all MQ C programs that I have seen tend to treat the CompletionCode as a binary indicator. You often see patterns like these:

if (compCode == MQCC_OK) {
  /* continue work ... */
}

or

if (compCode == MQCC_FAILED) {
  /* do some error reporting; exit */
}

But too often the third possibility is ignored. That might be fine for the logic of example fragments, especially when they don’t attempt retries or recovery, but it does not necessarily set a good example to be copied elsewhere. That is part of the reason I’m writing this article. The fragments above might miss situations where your real application needs to do something special. A fuller approach would consider each of the options individually. For example:

switch (compCode) {
case MQCC_OK:
  carryOn = TRUE;   /* "continue" is a C keyword, so use another name */
  break;
case MQCC_WARNING:
  /* decide whether to carry on or report an error, based on the specific MQRC value */
  break;
case MQCC_FAILED:
  /* do some error reporting */
  carryOn = FALSE;
  break;
}
if (carryOn) {
  /* ... */
}

But I don’t think that is a common way of writing applications, even though some of the sample programs, such as amqsput0.c, do demonstrate something like it for some of the verbs (not precisely the same code, but logically equivalent). What tends to get written instead is code that starts with a combined view of warning and failure and then gets specific. For example:

if (compCode != MQCC_OK) {
  // Look for a very specific circumstance - the MQCC FAILED/WARNING is
  // actually irrelevant for this MQRC value
  if (reason == MQRC_TRUNCATED_MSG_FAILED) {
    // do something special here
  }
}

One of the discussions I had recently concerned a situation where MQCONNX returned a warning. The developer had written his testcase to consider only the failure completion code. He was then wondering why the changes he’d made to the queue manager code were not affecting the outcome of the test in the way he’d expected.
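A test that wants to notice this kind of surprise can compare the full outcome against what it expects, instead of looking only for MQCC_FAILED. The following is a minimal sketch, not the developer's actual testcase. The constants are local stand-ins with the same values as cmqc.h so that the fragment compiles without an MQ installation; the Completion type and both helper names are hypothetical. MQRC_ALREADY_CONNECTED appears only as an example of a warning that a connect call can return, not necessarily the one from that discussion.

```c
#include <stdbool.h>

/* Local stand-ins with the same values as cmqc.h. In a real program,
   remove these and include <cmqc.h> instead. */
#define MQCC_OK                0
#define MQCC_WARNING           1
#define MQCC_FAILED            2
#define MQRC_NONE              0
#define MQRC_ALREADY_CONNECTED 2002
#define MQRC_NOT_AUTHORIZED    2035

/* Hypothetical pair holding the two values an MQI verb hands back. */
typedef struct {
  int compCode;
  int reason;
} Completion;

/* The binary check: a warning slips through and counts as a pass. */
static bool passed_binary(Completion actual) {
  return actual.compCode != MQCC_FAILED;
}

/* Stricter check: both values must match what the test expects. */
static bool passed_exact(Completion actual, Completion expected) {
  return actual.compCode == expected.compCode &&
         actual.reason == expected.reason;
}
```

With the exact check, a test that expects a clean connect fails visibly when a warning comes back, where the binary check would have passed silently.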

Has a message been removed?

While many of the MQI verbs can return a warning, the most interesting case is perhaps with MQGET. This is the one I found least intuitive. It was related to code I wrote that I then discussed more in this post.

The simplest expectation of MQGET is that OK should mean you have a message and FAILED should mean that you don’t have any message.

But it’s not quite that simple! In some circumstances, a warning might mean that a message has been permanently removed from the queue. In other circumstances, a message has NOT been removed from the queue. This will usually be to do with truncation, where your application has not provided a large enough buffer for the full message data to be returned.

Both MQRC_TRUNCATED_MSG_ACCEPTED and MQRC_TRUNCATED_MSG_FAILED are considered warnings, though the first one has taken the message from the queue while the second has not. And it is still a warning despite the reason containing the word “FAILED”. I wrote quite a lot about message truncation in my API Exits article, where a lot more care may need to be taken. But it’s still something to consider in normal application programming, not just exits.

And I’ve never been totally convinced that MQRC_NO_MSG_AVAILABLE (2033) should really be a failure, though given the binary nature of how most application programs deal with the CompletionCode, that’s probably the safest approach. Of course, that is never going to change now despite my misgivings.
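The MQGET cases discussed above can be gathered into one helper, so that the rest of the application asks "do I have a message, and is it still on the queue?" instead of re-testing codes everywhere. This is only a sketch under stated assumptions: the constants are local stand-ins with the same values as cmqc.h, the GetOutcome enum and classify_get function are hypothetical names, the MQGET is assumed to be destructive (not a browse), and warnings other than the two truncation reasons are deliberately left as "go and check the documentation" and not guessed at.

```c
/* Local stand-ins with the same values as cmqc.h. In a real program,
   remove these and include <cmqc.h> instead. */
#define MQCC_OK                     0
#define MQCC_WARNING                1
#define MQCC_FAILED                 2
#define MQRC_NONE                   0
#define MQRC_NO_MSG_AVAILABLE       2033
#define MQRC_TRUNCATED_MSG_ACCEPTED 2079
#define MQRC_TRUNCATED_MSG_FAILED   2080

/* Hypothetical classification of what a destructive MQGET did. */
typedef enum {
  GET_MSG_COMPLETE,        /* full message returned and removed       */
  GET_MSG_TRUNCATED_TAKEN, /* partial data returned, message removed  */
  GET_MSG_TRUNCATED_LEFT,  /* partial data returned, message remains  */
  GET_NO_MSG,              /* nothing matched: often not an error     */
  GET_OTHER_WARNING,       /* look up this MQRC before deciding       */
  GET_ERROR                /* a real failure                          */
} GetOutcome;

static GetOutcome classify_get(int compCode, int reason) {
  switch (compCode) {
  case MQCC_OK:
    return GET_MSG_COMPLETE;
  case MQCC_WARNING:
    switch (reason) {
    case MQRC_TRUNCATED_MSG_ACCEPTED:
      return GET_MSG_TRUNCATED_TAKEN;
    case MQRC_TRUNCATED_MSG_FAILED:
      return GET_MSG_TRUNCATED_LEFT;  /* a warning despite the name */
    default:
      return GET_OTHER_WARNING;
    }
  case MQCC_FAILED:
  default:
    /* 2033 is reported as a failure, but usually just means "empty". */
    return reason == MQRC_NO_MSG_AVAILABLE ? GET_NO_MSG : GET_ERROR;
  }
}
```

A caller might retry with a larger buffer on GET_MSG_TRUNCATED_LEFT, log the data loss on GET_MSG_TRUNCATED_TAKEN, and simply end its loop on GET_NO_MSG.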

Other languages

Outside of the shipped procedural MQI bindings for languages such as C and COBOL, the error handling is often managed differently. In JMS and XMS, most of the warning cases are not exposed to the application program. It may even be impossible to write the application in such a way as to cause most of the warnings. Exceptions are used when there are errors to report. Though don’t forget that as well as the standards-imposed exception types, more detail is accessible through methods like getLinkedException. Those methods expose the underlying MQRC value.

Other language interfaces usually try to follow the common error-handling patterns for that language if one exists so that the MQI can be used in a natural fashion.

For example, the convention in Go is that the LAST return value from a function that can fail is an error object (a Go function can return multiple values). This leads to patterns such as

value, err := MQVERB(parms)
if err != nil {
  // do some error reporting
}

Similarly, the JavaScript pattern for Node.js has an object holding any error details as the FIRST parameter to the callback functions, invoked after a verb completes:

mq.Verb(parms,function(err, value) {
  if (err != null) {
    ...
  }
});

When I designed both the Go and Node.js language bindings for MQ, I chose to combine the warning and failure completions with the reason code. Only the MQCC_OK value does not create an error object. That gives a non-null object for both failures and warnings, matching the normal pattern for the language. You can then break out both the MQCC and MQRC values from those objects if you need to. The Go version looks a bit ugly, but it is reasonably idiomatic:

mqret := err.(*ibmmq.MQReturn)
if mqret.MQRC == ibmmq.MQRC_NO_MSG_AVAILABLE { ...

The Python bindings behave similarly, raising exceptions for warnings as well as failures. And you can interrogate the exception for both the MQCC and MQRC values.

Conclusion

There is a lot of power and flexibility in the MQ APIs, but that flexibility means that you need to code for yes/no/maybe responses – not just yes/no – to create reliable and supportable applications.

I hope this has been useful.

This post was last updated on November 15th, 2021 at 10:00 am
