FIX Session Protocol over TCP/IP

 

Alexander Liss

 

02/03/2006

 

 

 

FIX Session Protocol Specs

Administrative Messages

Connection Health Check

Heartbeat Protocol

Test Request Protocol

Specifics of Implementation

Gap Filling Protocol

Notes

Specifics of Implementation

Logout Protocol

Specifics of Implementation

Logon Protocol

Specifics of Implementation

Forced Resetting of Sequence Numbers

Notes

Sequence Numbers

Storing Sequence Numbers

Amnesiac

Regular Resetting of Sequence Numbers

External Protocol: Resetting Sequence Numbers

Buffered Data

Resend Request with High Sequence Number

 

 

 

FIX Session Protocol Specs

 

Administrative Messages

 

            The following messages are used to manage a FIX session: Logon, Logout, Heartbeat, TestRequest, ResendRequest, SequenceReset, and Reject.

 

 

Connection Health Check

 

            Heartbeat and TestRequest messages are used to validate a connection. Their use is governed by the time interval HeartBtInt, which is set during logon.

 

Heartbeat Protocol

 

A Heartbeat is sent by an application that has had nothing to send for a period of HeartBtInt.

 

Test Request Protocol

 

A TestRequest is sent when no messages have been received for a period of HeartBtInt. It carries a TestReqID. The party receiving a TestRequest replies with a Heartbeat that contains the same TestReqID.

            The initiator of the Test Request Protocol has to detect this reply in the incoming message stream. If the reply does not arrive within a HeartBtInt period, the initiator has to terminate the connection, even without a Logout message.
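The timing rules of the Heartbeat and Test Request Protocols can be sketched as a small polling object. This is only an illustration under stated assumptions: the class name HealthMonitor, its method names and the returned action strings are hypothetical, time is passed in as plain seconds, and the matching of the TestReqID in the reply is reduced to a boolean flag.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HealthMonitor:
    """Tracks idle time in each direction of one FIX session (hypothetical helper)."""
    heart_bt_int: float                   # HeartBtInt in seconds, agreed at logon
    last_sent: float = 0.0                # time of the last outgoing message
    last_received: float = 0.0            # time of the last incoming message
    test_sent_at: Optional[float] = None  # set while a TestRequest is unanswered

    def on_sent(self, now: float) -> None:
        self.last_sent = now

    def on_received(self, now: float, is_test_reply: bool = False) -> None:
        # is_test_reply stands for "a Heartbeat carrying our TestReqID".
        self.last_received = now
        if is_test_reply:
            self.test_sent_at = None

    def poll(self, now: float) -> str:
        """Return the action that is due at time `now`."""
        if self.test_sent_at is not None:
            # A TestRequest is pending: give up after one more HeartBtInt.
            if now - self.test_sent_at >= self.heart_bt_int:
                return "disconnect"
        elif now - self.last_received >= self.heart_bt_int:
            # Nothing received for HeartBtInt: probe the other side.
            self.test_sent_at = now
            self.last_sent = now
            return "send_test_request"
        if now - self.last_sent >= self.heart_bt_int:
            # Nothing sent for HeartBtInt: keep the link alive.
            self.last_sent = now
            return "send_heartbeat"
        return "idle"
```

The caller is expected to invoke poll periodically and to perform the returned action itself; the object only keeps the timers.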

 

Specifics of Implementation

 

The FIX Protocol cannot tie the Connection Health Check of the incoming channel to that of the outgoing channel, because the two could be unrelated. However, when the communication protocol is TCP/IP over a single socket, one common timer can be used, reset at any message transmission (incoming or outgoing); this is how it is often implemented.

            To avoid excessive message exchange during quiet periods, Heartbeats should be sent more frequently than TestRequests, that is, the idle interval that triggers a Heartbeat should be shorter than the one that triggers a TestRequest. The FIX Protocol does not require this, but it can be done without any compatibility problems.

            These protocols require cooperation between threads in a two-thread implementation.

           

Gap Filling Protocol

 

            A received message with a sequence number smaller than expected should cause disconnection and handling by a protocol external to FIX (manual).

            A received message with a sequence number larger than expected causes initiation of a Gap Filling Protocol.

            The Gap Filling Protocol starts with a ResendRequest message that specifies the range of messages to be resent (BeginSeqNo and EndSeqNo).
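The decision made on every incoming sequence number, and the fields of the resulting ResendRequest, can be sketched as follows. The function names and the action strings are hypothetical; messages are shown as plain tag-to-value dictionaries rather than encoded FIX, using the standard tags 35 (MsgType), 7 (BeginSeqNo) and 16 (EndSeqNo).

```python
def classify(received_seq: int, expected_seq: int) -> str:
    """Decide what to do with an incoming MsgSeqNum (hypothetical helper)."""
    if received_seq == expected_seq:
        return "process"
    if received_seq > expected_seq:
        return "resend_request"   # a gap: start the Gap Filling Protocol
    return "disconnect"           # lower than expected: manual handling


def resend_request_fields(begin_seq: int, end_seq: int = 0) -> dict:
    """Fields of a ResendRequest; end_seq=0 asks for everything after begin_seq."""
    return {35: "2", 7: str(begin_seq), 16: str(end_seq)}
```

For example, when message 8 arrives while 5 is expected, classify(8, 5) asks for a resend and resend_request_fields(5) describes the request for message 5 and everything after it.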

            The other party resends the requested messages with PossDupFlag=Y, or sends a SequenceReset with GapFillFlag=Y as a stand-in for a range of messages that should be ignored at this stage of processing (administrative messages, expired orders, etc.).
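The resending side can be sketched as a function that walks the requested range and collapses each run of skipped messages into one SequenceReset with GapFillFlag=Y. This is a minimal sketch under assumptions: build_resend and ADMIN_TYPES are hypothetical names, every message in the range is assumed to be available in the stored dictionary, and only administrative messages are skipped (expired orders would need an additional application-level test). Tags used: 34 MsgSeqNum, 35 MsgType, 36 NewSeqNo, 43 PossDupFlag, 123 GapFillFlag.

```python
# Administrative MsgType values that are not resent. Reject ("3") is left out
# of the set on purpose, because it is resent.
ADMIN_TYPES = {"0", "1", "2", "4", "5", "A"}


def build_resend(stored: dict, begin: int, end: int) -> list:
    """Return the messages to send in answer to a ResendRequest for begin..end."""
    out = []
    gap_start = None                      # first sequence number of the current skipped run
    for seq in range(begin, end + 1):
        msg = stored[seq]
        if msg[35] in ADMIN_TYPES:
            if gap_start is None:
                gap_start = seq
            continue
        if gap_start is not None:
            # One gap fill replaces the whole skipped run; NewSeqNo points at
            # the next message that is really resent.
            out.append({35: "4", 34: gap_start, 43: "Y", 123: "Y", 36: seq})
            gap_start = None
        out.append({**msg, 34: seq, 43: "Y"})
    if gap_start is not None:
        out.append({35: "4", 34: gap_start, 43: "Y", 123: "Y", 36: end + 1})
    return out
```

With a Logon at 1, orders at 2 and 5, Heartbeats at 3 and 4 and a TestRequest at 6, a request for 1 to 6 yields five messages: a gap fill to 2, the first order, a gap fill to 5, the second order, and a gap fill to 7.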

            If a ResendRequest message is received, then the Gap Filling Protocol it starts should be executed first, and only after that can the receiving party send its own ResendRequest.

           

Notes

 

            Two ResendRequest messages crossing each other at the beginning of a session is quite possible. The FIX Session Protocol Specs do not define what to do when this happens.

            The FIX Session Protocol Specs recommend sending a ResendRequest with the upper sequence number set to zero (a stand-in for infinity).

            Reject is the only administrative message that can be resent. No action is taken on other resent administrative messages, including a ResendRequest arriving as a retransmitted message.

            Theoretically, messages could arrive out of sequence during the Gap Filling Protocol and trigger another round of Gap Filling, but this is not a danger with TCP/IP.

 

Specifics of Implementation

 

            An initiator of the Gap Filling Protocol could hold received messages with sequence numbers larger than that of the first missing message in a queue. However, if the acceptor sends a large number of new messages before it starts processing the (received) ResendRequest message, the initiator is forced to keep a large temporary queue of pending messages.
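Such a temporary queue can be sketched as follows. The class InboundSequencer and its method are hypothetical, messages are reduced to strings, and a message below the expected number is assumed to be a harmless duplicate produced by a resend (a real session would disconnect on a low number that is not marked as a possible duplicate).

```python
class InboundSequencer:
    """Holds out-of-sequence messages until the gap before them is filled."""

    def __init__(self, expected: int = 1):
        self.expected = expected   # next sequence number to deliver
        self.pending = {}          # seq -> message waiting behind a gap

    def receive(self, seq: int, msg: str) -> list:
        """Return the messages that can now be delivered, in order."""
        if seq < self.expected:
            return []              # already delivered; assumed resend duplicate
        self.pending.setdefault(seq, msg)
        ready = []
        while self.expected in self.pending:
            ready.append(self.pending.pop(self.expected))
            self.expected += 1
        return ready
```

The pending dictionary grows with every message received beyond the gap, which is exactly the memory cost described above.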

            When the communication protocol is TCP/IP, which guarantees delivery of the stream of data, the Gap Filling Protocol can be initiated only after a Logon, and it could potentially extend into a Logout if the Logout quickly follows the Logon.

            This protocol requires cooperation between threads in a two-thread implementation.

           

 

Logout Protocol

 

            Either party could initiate the Logout Protocol. It is recommended to start it with the Test Request Protocol. After the Heartbeat message is received, a Logout message is sent. The responding party either replies immediately with its own Logout message or finishes the Gap Filling Protocol before it sends its Logout message.

            The FIX Protocol requires the acceptor to initiate the Gap Filling Protocol immediately after receiving the Logout message, if it is needed.

Hence, the initiator of the Logout Protocol should wait for a Logout response before shutting down the connection.

            Unfortunately, this protocol is often violated, either because an application closed the connection as a result of a crash (or an improper shutdown procedure), or because the initiator closed the connection immediately after sending the Logout message. Such a protocol violation could cause problems during the subsequent Logon Protocol (Gap Filling is one of them).

Specifics of Implementation

 

            This protocol requires cooperation between threads in a two-thread implementation.

 

Logon Protocol

 

            Numerous problems could arise during execution of a Logon Protocol. They are associated with

The Logon Protocol is executed at initial session establishment (with sequence numbers set to out=1 and in=1) and at reconnect (using stored sequence numbers).

The initiator sends a Logon message with HeartBtInt specified in it. The acceptor replies with its own Logon message carrying a copy of HeartBtInt as an acknowledgement. The initiator should wait for the Logon reply message.

If the sequence number in a Logon message is smaller than expected, the FIX Protocol requires termination of the session. It is recommended to report the reason for the termination to the other party before actually closing the connection. Such a report could be carried in the initiator's Logout message (reply with Logout and close the connection) or in the acceptor's Logon message (reply with Logon and close the connection). Although these messages consume a sequence number, they can be very helpful when the event is investigated.

If a Logon message carries a larger sequence number than expected, the receiving party initiates the Gap Filling Protocol. Two Gap Filling Protocols could be executed simultaneously, one initiated by the initiator of the Logon Protocol and one by its acceptor.

To mitigate this complexity, it is recommended to inject waits immediately after the exchange of Logon messages, before sending application messages.

For example:

  1. wait for a while
  2. if there are no incoming messages, initiate the Test Request Protocol
  3. wait until the Test Request Protocol completes
  4. start sending own messages.

 

Specifics of Implementation

 

            Measures should be taken to prevent two ResendRequest messages from crossing each other, because this case could be poorly implemented on at least one side of a session. The implementation should contain configuration parameters that allow adjustment to the existing protocol implementation on the other side.

As an additional precaution, it is recommended to implement the Gap Filling Protocol so that messages received out of sequence are stored in a temporary queue instead of being discarded. This avoids requesting them again after the Gap Filling Protocol is executed.

This protocol requires cooperation between threads in a two-thread implementation.

 

 

Forced Resetting of Sequence Numbers

 

A SequenceReset message with GapFillFlag=N forces the receiver to ignore the sequence number with which the message arrived and to reset its stored sequence number to the new value carried in the message.
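The difference between the two modes of SequenceReset can be sketched on the receiving side. The function apply_sequence_reset is hypothetical. The refusal to move the sequence backwards reflects the FIX rule that SequenceReset may only increase the sequence number; treating an out-of-sequence gap fill as an error is a simplification of this sketch, since a real implementation would run normal gap detection instead.

```python
def apply_sequence_reset(expected: int, msg_seq: int, new_seq: int, gap_fill: bool) -> int:
    """Return the next expected incoming sequence number after a SequenceReset."""
    if gap_fill and msg_seq != expected:
        # GapFillFlag=Y: the message itself is subject to sequence checking.
        raise ValueError("gap fill arrived out of sequence")
    # GapFillFlag=N: msg_seq is ignored entirely.
    if new_seq < expected:
        raise ValueError("NewSeqNo may not move the sequence backwards")
    return new_seq
```

For instance, apply_sequence_reset(10, 57, 20, gap_fill=False) returns 20 although the message arrived with sequence number 57.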

 

Notes

 

This could be used for disaster recovery or for testing and debugging.

This should not be a part of normal operation, because messages could be lost with it.

This cannot be used instead of Logon message; hence it cannot help with low sequence number in a Logon message.

An initiator of this protocol could reset sequence numbers only in traffic, which it is sending.

 

 

Sequence Numbers

 

 

            Sequence numbers are maintained separately for each direction of message traffic, and they persist across disconnections of the underlying communication (TCP/IP). Guaranteed message delivery is built on them, and any forced resetting of them opens the possibility of message loss. There is no automatic resynchronization of sequence numbers when the number sent is lower than expected. If it is higher than expected, resynchronization goes via the Gap Filling Protocol.

 

Storing Sequence Numbers

 

            Because missed messages can be recovered, it is logical to store a sequence number before the message is sent. This way, if the other party does not receive the message, it can request it by its sequence number. The same logic works at a reconnect: in the worst case, the Logon message has a larger sequence number than the one expected by the acceptor.

Note that storing sequence numbers this way leads to an important conclusion: when the Logout Protocol was not properly executed (crash, improper shutdown, etc.), the Gap Filling Protocol could be initiated even when no application messages were missed.
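The store-before-send order can be sketched with a small file-backed counter. The names OutboundSeqStore, reserve_next and send are hypothetical, and a single text file is only one possible way to persist the number; the point of the sketch is the order of operations, not the storage format.

```python
import os


class OutboundSeqStore:
    """Persists the last outgoing sequence number in a small text file."""

    def __init__(self, path: str):
        self.path = path

    def last_used(self) -> int:
        try:
            with open(self.path) as f:
                return int(f.read().strip() or 0)
        except FileNotFoundError:
            return 0

    def reserve_next(self) -> int:
        """Store the next number durably, then return it."""
        seq = self.last_used() + 1
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            f.write(str(seq))
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.path)   # atomic swap of the stored value
        return seq


def send(store: OutboundSeqStore, transmit, payload):
    seq = store.reserve_next()       # stored before anything goes on the wire
    transmit(seq, payload)           # may fail; the number stays consumed
    return seq
```

If transmit fails after the number was stored, the next Logon carries a number higher than the other party expects, which leads to the Gap Filling Protocol rather than to a too-low sequence number.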

 

 

Amnesiac

 

            After a crash, an application could have no stored messages to resend in response to a ResendRequest. After reconciliation via a protocol external to the FIX Session Protocol (manual), it could use SequenceReset with GapFillFlag=N to synchronize sequence numbers in that traffic.

 

Regular Resetting of Sequence Numbers

 

            The FIX Session Protocol recommends resetting sequence numbers in a 24-hour cycle. This procedure is governed by a protocol external to it.

It is possible to reset sequence numbers by sending a Logon message with ResetSeqNumFlag set. The FIX Protocol describes its execution:

 

When using the ResetSeqNumFlag to maintain 24 hour connectivity and establish a new set of sequence numbers, the process should be as follows.  Both sides should agree on a reset time and the party that will be the initiator of the process.  Note that the initiator of the ResetSeqNum process may be different than the initiator of the Logon process. One side will initiate the process by sending a TestRequest and wait for a Heartbeat in response to ensure of no sequence number gaps.  Once the Heartbeat has been received, the initiator should send a Logon with ResetSeqNumFlag set to Y and with MsgSeqNum of 1.  The acceptor should respond with a Logon with ResetSeqNumFlag set to Y and with MsgSeqNum of 1.  At this point new messages from either side should continue with MsgSeqNum of 2.  It should be noted that once the initiator sends the Logon with the ResetSeqNumFlag set, the acceptor must obey this request and the message with the last sequence number transmitted “yesterday” may no longer be available.  The connection should be shutdown and manual intervention taken, if this process is initiated but not followed properly.

           

However, it is better to combine the resetting of sequence numbers with a refresh of sessions, which also accomplishes other tasks, such as picking up new network and security configuration, rearranging connections to facilitate load balancing, etc.
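The Logon message used in the reset procedure quoted above can be sketched in the tag=value encoding. The function names encode and reset_logon, the choice of FIX.4.2 as BeginString and EncryptMethod 0 are assumptions made for illustration. Tags used: 8 BeginString, 9 BodyLength, 10 CheckSum, 34 MsgSeqNum, 35 MsgType, 49 SenderCompID, 52 SendingTime, 56 TargetCompID, 98 EncryptMethod, 108 HeartBtInt, 141 ResetSeqNumFlag.

```python
SOH = "\x01"   # FIX field delimiter


def encode(begin_string: str, fields: list) -> str:
    """Wrap (tag, value) pairs with BeginString, BodyLength and CheckSum."""
    body = "".join(f"{tag}={value}{SOH}" for tag, value in fields)
    head = f"8={begin_string}{SOH}9={len(body)}{SOH}"
    # CheckSum: sum of all bytes before the CheckSum field, modulo 256.
    checksum = sum((head + body).encode("ascii")) % 256
    return f"{head}{body}10={checksum:03d}{SOH}"


def reset_logon(sender: str, target: str, heart_bt_int: int, sending_time: str) -> str:
    """Logon with ResetSeqNumFlag=Y and MsgSeqNum=1 (assumed FIX.4.2 header)."""
    return encode("FIX.4.2", [
        (35, "A"), (34, 1), (49, sender), (56, target), (52, sending_time),
        (98, 0), (108, heart_bt_int), (141, "Y"),
    ])
```

The acceptor's reply has the same shape with SenderCompID and TargetCompID swapped; after the exchange both sides continue with MsgSeqNum 2.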

 

External Protocol: Resetting Sequence Numbers

 

            There are cases when resetting sequence numbers is safe, for example, when it is known that there were no application messages for an extended period of time. In this case, the sequence numbers can be reset on both sides of the communication simultaneously.

            Sometimes this has to be done because the underlying communication session was interrupted (socket closed, crash, etc.) during such a period with no application FIX messages, and after it is restored there is a mismatch in sequence numbers. The mismatch does indicate a problem (crash, improper handling of shutdown, bug, etc.), but it does not indicate that an application message could have been lost in the FIX session.

            This is done via:

 

1.      Blocking of connection attempts

2.      Logout (if needed)

3.      Termination of communication

4.      Simultaneous setting of sequence numbers to 1,1 by both parties

5.      Restart of communication

6.      Logon

 

            This protocol is often implemented in communicating applications, and it is initiated manually through an administration interface.

 

Buffered Data

 

            Messages could be buffered either in an application or in some intermediate relay. This could create a problem for the Gap Filling Protocol. An implementation that does not store messages with high sequence numbers should ignore all buffered data after a gap fill is requested.

 

 

Resend Request with High Sequence Number

 

            A ResendRequest is granted even when it arrives with a sequence number higher than expected. This simplifies handling of the difficult case in which both parties request a resend at logon.
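The handling of a ResendRequest that itself arrives with a high sequence number can be sketched as an ordered list of actions. The function on_resend_request and the action tuples are hypothetical; the sketch only shows the order: the received request is served first, and the receiver's own ResendRequest follows.

```python
def on_resend_request(msg_seq: int, expected_in: int,
                      begin: int, end: int, last_sent: int) -> list:
    """Return the ordered actions for an incoming ResendRequest(begin, end)."""
    upper = last_sent if end == 0 else min(end, last_sent)   # 0 means "to infinity"
    actions = [("resend", begin, upper)]                     # always granted first
    if msg_seq > expected_in:
        # The request itself revealed a gap in the incoming traffic.
        actions.append(("send_resend_request", expected_in, 0))
    return actions
```

For example, on_resend_request(12, 10, 5, 0, 8) first resends messages 5 to 8 and only then asks the other party to resend from 10 onwards.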