Fatal Error: Whose Fault Is It, Anyway?

28 June 2024

Fatal fax protocol communication errors, also known as “fatal errors”, occur when fax communication between two endpoints on an established telephone call breaks down to the point where one or both of those endpoints decide to disconnect and terminate the call rather than continue with recovery measures.

Although it is technically possible for these fatal errors to occur because of blatant flaws or bugs in the implementation of fax protocol (ITU T.30) by one or both of the endpoints, or by a T.38 gateway between the two endpoints, that is generally a rare and unlikely situation. Fatal errors are much more likely to be caused by poor audio quality on, or disruptions in, the established telephone call, which lead one or both of the endpoints or a T.38 gateway between them to become confused, cease recovery measures, and terminate the call.

However, another fax session may succeed under identical conditions, depending on the resiliency of the fax protocol implementation and the design of its recovery measures. The bulk of that responsibility lies with the receiving end and any T.38 gateways operating on its behalf.

In a typical fax communication the vast majority of the signaling goes from the sender to the receiver. After the initial handshaking period known as “Phase B”, where the receiver indicates its capabilities and the sender identifies its selection of those capabilities, the only signals sent from the receiver to the sender are very short and concern receipt confirmation. The sender will transmit page image data that could last 30 seconds or more, after which the receiver will signal receipt confirmation that should last less than one second.

In an ideal situation where no T.38 gateways are involved and the audio between a sender and a receiver is flawless and undisturbed, chances are extremely high that a sender could transmit an unlimited number of faxes with an unlimited number of pages to a receiver indefinitely without a fatal error ever occurring. Such ideal situations are by no means impossible, but in most fax calls of any significant duration there is likely to be at least some disturbance or interruption of the fax audio as it travels along the telephone network from the sender to the receiver. Neither the sender nor the receiver can control or influence that. However, only the endpoint receiving the signals is capable of coping with or recovering from disturbances and interruptions in those signals. Since the vast majority of the signals are going from the sender to the receiver, it is of great importance for the receiver to employ robust and resilient recovery measures.

This is not to say that the receiver should be able to cope with anything and everything that the sender transmits and the telephone call confuses. Certainly the call audio quality can be so poor that no amount or quality of recovery measures by the receiver could compensate for it. Furthermore, although the signals from the receiver to the sender are brief, the sender must also be resilient in coping with interruptions and disturbances in those signals, as well as in responding properly to them. And certainly the sender should always employ error correction mode (ECM) whenever possible, as it makes the receiver’s job of being resilient much easier. However, no matter how perfectly the sender performs its task, and no matter how limited the interruption and disturbance of the audio may be, the receiver must still do its part. If the communication is likened to a game of catch where the thrower tosses a ball perfectly into the glove of the catcher, the catcher must still close the glove as the ball arrives, or the ball will be dropped.

So, it is reasonable to discuss “success rates” or “failure rates” when referring to fax reception. If 5 out of 100 fax calls do not result in successful fax receptions, it is reasonable to consider that the resulting 95% “success rate” or 5% “failure rate” reflects meaningfully on the attributes of the receiving fax equipment.

However, the same approach cannot reasonably be applied to fax sending. Consider a case where a fax is sent successfully to 98 different receivers on the first attempt, but to one other receiver it succeeds only on the fifth attempt, and to yet another receiver it fails on each of ten attempts. Does this represent a 12% “failure rate” (the 14 failed attempts divided by the 113 total attempts), a 2% “failure rate” (the sum of each receiver's ratio of failed attempts divided by the total number of receivers, which is to say the mean per-receiver failure rate), a 0% “failure rate” (the median per-receiver failure rate), a 1% “failure rate” (the number of receivers that never successfully received the fax job divided by the number of intended receivers), or some other value? What do any of these values actually tell a fax systems administrator, anyway? How does one meaningfully discern the quality of the sending fax equipment from these values when they are overwhelmingly determined by the attributes of the pool of targeted receiving fax equipment and (probably to a lesser extent) by the occurrence of audio quality problems on the lines used?
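The arithmetic behind those competing figures can be checked with a short Python sketch. The data layout (one list of attempt outcomes per receiver) and the function names are assumptions made for illustration, not part of any fax product. Note that the 0% figure corresponds to the median of the per-receiver failure ratios; their mean is about 2%.

```python
from statistics import mean, median

def build_example():
    """Attempt outcomes per receiver for the scenario in the text (True = success)."""
    receivers = [[True] for _ in range(98)]    # 98 receivers: success on first attempt
    receivers.append([False] * 4 + [True])     # 1 receiver: success on the 5th attempt
    receivers.append([False] * 10)             # 1 receiver: all 10 attempts failed
    return receivers

def failure_metrics(receivers):
    """Compute four different 'failure rates' from the same sending history."""
    total_attempts = sum(len(r) for r in receivers)
    total_failures = sum(r.count(False) for r in receivers)
    # Fraction of failed attempts for each individual receiver.
    per_receiver = [r.count(False) / len(r) for r in receivers]
    never_delivered = sum(1 for r in receivers if not any(r))
    return {
        "per_attempt": total_failures / total_attempts,     # 14 / 113
        "mean_per_receiver": mean(per_receiver),            # (0.8 + 1.0) / 100
        "median_per_receiver": median(per_receiver),        # most receivers had no failures
        "never_delivered": never_delivered / len(receivers) # 1 / 100
    }

if __name__ == "__main__":
    for name, value in failure_metrics(build_example()).items():
        print(f"{name}: {value:.1%}")
```

All four numbers come from the same 113 attempts, which is why a single sender “failure rate” says little without also stating how it was computed.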

Again, none of this is to excuse the sender from strictly adhering to the standards of fax protocol and implementing adjustments in order to achieve compatibility with the receiver. However, there is only so much of this that the sender can do. At some point it’s really up to the receiver to cope with the seemingly random bits of interference, jitter, and confusion that occur due to commonplace line audio quality issues. That point falls much more in the receiver’s corner than it does in the sender’s.

This situation leads to considerable frustration on the part of fax systems administrators. It is the sender who will be more keenly aware of the fatal error condition, because the sender sees its transmissions failing. Unless a receiver is paying close attention, however, the receiver may not even notice a problem. If the sender alerts the receiver to the problems in sending them a fax, it is not unlikely that the receiver will insist that their equipment is not at fault because they are receiving some faxes and are not hearing about problems from anyone else. Of course, that receiver likely didn’t check how many attempts it took for the faxes they did receive to succeed. That data is likely hidden from their typical views.

Rather than retrying faxes endlessly until they go through successfully, fax system administrators should be proactive about reaching out to receivers through another method of communication in order to alert them to the problems they have in sending them faxes. And fax system administrators should be of the mindset to accept responsibility by default and escalate the matter to their fax system developers if they are on the receiving end of that communication. Fax system developers should be at the ready to continually improve the resiliency and utility of their recovery measures employed in their fax operations as Mainpine does.
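On the sending side, one way to avoid endless retries is to cap attempts per destination and flag the destination for contact through another channel once the cap is reached. The following is only a minimal sketch of that idea; the `Destination` record, the `record_result` helper, and the cap of three attempts are all hypothetical and would need to be adapted to a real fax system's job queue.

```python
from dataclasses import dataclass

MAX_ATTEMPTS = 3  # hypothetical cap; choose a value that suits the deployment

@dataclass
class Destination:
    """Hypothetical per-destination record kept by the sending system."""
    number: str
    failed_attempts: int = 0
    needs_outreach: bool = False  # True once an administrator should contact the receiver

def record_result(dest: Destination, succeeded: bool,
                  max_attempts: int = MAX_ATTEMPTS) -> bool:
    """Record one send attempt and return True if the job should be retried."""
    if succeeded:
        return False  # delivered, nothing more to send
    dest.failed_attempts += 1
    if dest.failed_attempts >= max_attempts:
        # Stop retrying and hand the problem to a human instead.
        dest.needs_outreach = True
        return False
    return True
```

A destination flagged with `needs_outreach` would then appear on whatever report the administrator uses to follow up with receivers by phone or email.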

For a good idea on where to start for fax system developers see here.