Netrek UDP Enhancements - General Information
By Andy McFadden (fadden@uts.amdahl.com)
Dual channel model, version 1.0
Updated: 06-Apr-92


(this document assumes you are somewhat familiar with the UDP client from a
player's standpoint, or have at least read the client documentation.)

Goals
-----

- A protocol which:
    guarantees the reliability of certain packets;
    allows switching on demand from TCP-only to UDP/TCP and back;
    won't hang or cause abnormal termination if a UDP packet is lost.

- A simple user interface.

- Compatibility with existing (i.e. non-UDP) clients & servers.


Implementation overview
-----------------------

The client will have a new menu with UDP features.  The client interface
is described in a separate document.

The server will have a new .sysdef option:

UDP=0		disallow UDP connections
UDP=1		allow UDP connections
UDP=2		allow connections, print debugging info
UDP=3		allow connections, print VERBOSE debugging info

netrek/socket.c and ntserv/socket.c have been modified extensively (netrek/
socket.c is 840 lines longer, and ntserv/socket.c is 1050 lines longer).
Packets travelling across the TCP and UDP connections will be treated equally
once they have been read; after that point the code makes no distinction
based on which channel a packet arrived on.

When UDP mode is enabled, all transmissions (defined as a single gwrite()
call) will include a sequence packet at the start.  If the sequence # is too
low, then the rest of the transmission on that channel will be dropped until
the next sequence packet is seen.  The big exception is that TCP packets are
NEVER ignored, since they are assumed to include critical information.

Because of the possibility of varying rates on TCP and UDP channels, I have
chosen to ignore the sequence number in the TCP packet (in fact, I don't
even send one).  The two kinds of packets arrive on different schedules, so
trying to maintain synchronization is impossible.  Note that the sequence
numbers are reset to zero each time the UDP connection is established (this
way closing & reopening the connection will fix problems that arise if the
client & server become hopelessly out of sync).
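
The receive-side sequence check described above can be sketched roughly as
follows.  This is a minimal illustration, not the code in socket.c: the
struct and function names are hypothetical, "too low" is assumed to mean
"less than or equal to the last accepted number", and sequence number
wraparound is not handled.

```c
#include <assert.h>

/* Hypothetical names; the real client keeps this state in socket.c. */
enum channel { CHAN_TCP, CHAN_UDP };

struct seq_state {
    long last_seq;   /* highest sequence number accepted so far */
    int  dropping;   /* nonzero: discard UDP packets until the next sequence packet */
};

/* Called each time the UDP connection is (re)established. */
void seq_reset(struct seq_state *s)
{
    s->last_seq = -1;   /* assumption: every real sequence number is >= 0 */
    s->dropping = 0;
}

/* Called for the sequence packet that starts each UDP transmission. */
void seq_on_sequence_packet(struct seq_state *s, long seq)
{
    assert(seq >= 0);
    if (seq <= s->last_seq) {
        s->dropping = 1;        /* stale or duplicate transmission */
    } else {
        s->last_seq = seq;
        s->dropping = 0;
    }
}

/* Returns 1 if a packet read from the given channel should be processed. */
int seq_accept(const struct seq_state *s, enum channel ch)
{
    if (ch == CHAN_TCP)
        return 1;               /* TCP packets are never ignored */
    return !s->dropping;
}
```

Calling seq_reset() whenever the UDP channel is opened gives the "close and
reopen to resynchronize" behaviour mentioned above.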

TCP packets will continue to be placed into a buffer roughly 16K in size.
UDP transmissions will be cut off at 960 bytes (this is adjustable in the
server).

If the player decides that the loss of incoming packets is acceptable but
the outgoing packet loss is intolerable, TCP send mode can be selected (which
simply directs the client to send all outgoing packets down the TCP channel,
regardless of packet type).  Receiving with TCP while sending with UDP doesn't
make much sense, but it is included for completeness.


Extended send & receive
-----------------------

For increased reliability, the client has extended send and receive options.
In the client, commands which don't appear to have been accepted can be
repeated automatically.  This is called "forced UDP".  In the server,
semi-critical information can be resent automatically in a separate packet or
at the end of a short packet (which increases traffic and CPU load but
increases the probability that the information will get through).
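
As a rough illustration of the "forced UDP" idea, the client could remember
what it asked for and compare it against what the server reports on later
updates.  Everything in this sketch is an assumption made for illustration:
the struct, the function names, and the resend interval are hypothetical
and are not taken from the client source.

```c
#include <assert.h>

struct forced_cmd {
    int active;   /* nonzero while waiting for the server to reflect the command */
    int wanted;   /* value that was requested, e.g. a speed */
    int waited;   /* updates seen since the command was last sent */
};

/* Record that a command was just sent. */
void forced_issue(struct forced_cmd *f, int wanted)
{
    f->active = 1;
    f->wanted = wanted;
    f->waited = 0;
}

/*
 * Call once per server update with the value the server currently reports.
 * Returns 1 if the command should be sent again, 0 otherwise.
 */
int forced_on_update(struct forced_cmd *f, int reported, int resend_after)
{
    assert(resend_after > 0);
    if (!f->active)
        return 0;
    if (reported == f->wanted) {
        f->active = 0;          /* the server accepted it; stop watching */
        return 0;
    }
    if (++f->waited < resend_after)
        return 0;               /* give the original packet time to arrive */
    f->waited = 0;
    return 1;
}
```

This only works for commands whose effect is visible in later updates, which
is why not every client packet is a candidate for it.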

For "double UDP", the resend works by sending just the semi-critical packets
from the previous transmission, giving it a different sequence packet with
the same sequence number.  If the original transmission is received, then the
sequencing mechanism will automatically drop the semi-critical transmission.
If not, then the second transmission will be processed normally.

Because this protocol may do more harm than good, I have left it as a
compile-time option.  I did this rather than make it a sysdef switch because
it was easier and because I have a suspicion that it will never be used again
anyway...  I doubt most servers will want to support it.

Another extended receive option is "fat UDP", in which packets under 500
bytes are padded with an extra 90-100 bytes of semi-critical state info.
The info will be added on a "least recently sent" basis, adjusted by how many
times that particular packet has been sent since it was last modified.  The
extra load on the network and CPU should be minimal.  See the section on
"Customizing Fat" later on.

When all else fails, an "update all" request is provided.  Issuing this
request will cause all semi-critical and some non-critical data to be sent as
part of the next update.  This also causes fat UDP to reset itself.


Critical/non-critical packets
-----------------------------

There are three levels:
    0 - critical, must get through
    1 - semi-critical, will confuse player if it doesn't arrive
    2 - non-critical

For the server, two criteria were used:
- will bad things happen if the packet doesn't get through?
- is it something which ought to be reliable and doesn't happen very often?
    (ex: SP_MOTD)

For the client, the semi-critical packets were chosen based on which could
be detected or at least repeated without undue side-effects.  For example,
"det torps" is impossible to detect, and "fire torp" would often result in
firing more torpedos than was desired.  Some options (change speed, change
direction) are simply repeated twice for good measure because there's no
harm in doing so (besides sending more packets than necessary, that is).

Server critical:
 1 SP_MESSAGE
 2 SP_PLAYER_INFO
10 SP_WARNING
11 SP_MOTD
13 SP_QUEUE
16 SP_PICKOK
17 SP_LOGIN
19 SP_MASK
20 SP_PSTATUS
21 SP_BADVERSION
24 SP_PL_LOGIN
25 SP_RESERVED
26 SP_PLANET_LOC

Server semi-critical:
 3 SP_KILLS		<-- not important, but semi-confusing if this is lost
 5 SP_TORP_INFO
 7 SP_PHASER
 8 SP_PLASMA_INFO
12 SP_YOU		<-- sometimes this is non-critical or critical
14 SP_STATUS
15 SP_PLANET
18 SP_FLAGS
22 SP_HOSTILE

Server non-critical:
 4 SP_PLAYER
 6 SP_TORP
 9 SP_PLASMA
23 SP_STATS
27 SP_SCAN	(Amdahl scanning beams)
28 SP_UDP_REPLY	(UDP packet; only kind sent in UDP mode is VERIFY packet)
29 SP_SEQUENCE

Client critical:
 1 CP_MESSAGE
 8 CP_LOGIN
 9 CP_OUTFIT
10 CP_WAR
22 CP_COPILOT
27 CP_SOCKET
28 CP_OPTIONS
29 CP_BYE
31 CP_UPDATES
32 CP_RESETSTATS
33 CP_RESERVED
35 CP_UDP_REQ	(uses a special case to send the VERIFY packet through UDP)

Client semi-critical (forced mode):
 2 CP_SPEED
 3 CP_DIRECTION
 4 CP_PHASER
 5 CP_PLASMA
12 CP_SHIELD
13 CP_REPAIR
14 CP_ORBIT
15 CP_PLANLOCK
16 CP_PLAYLOCK
17 CP_BOMB
18 CP_BEAM
19 CP_CLOAK
23 CP_REFIT
24 CP_TRACTOR
25 CP_REPRESS
30 CP_DOCKPERM

Client non-critical:
 6 CP_TORP
 7 CP_QUIT
11 CP_PRACTR
20 CP_DET_TORPS
21 CP_DET_MYTORP
26 CP_COUP
34 CP_SCAN	(Amdahl scanning beams)
36 CP_SEQUENCE	(not used)
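
One way to turn the client lists above into code is a lookup table indexed
by packet type, plus a function that picks the outgoing channel.  This is a
sketch under stated assumptions: the table contents are copied from the
lists above, but the names cp_level and cp_channel are hypothetical, the
rule "critical goes over TCP, everything else over UDP" is inferred from the
overview, and the special case that sends the VERIFY request over UDP is
left out.

```c
#include <assert.h>

enum level   { CRITICAL = 0, SEMI_CRITICAL = 1, NON_CRITICAL = 2 };
enum channel { CHAN_TCP, CHAN_UDP };

#define CP_MAX 36

/* Level of client packet types 1..36, from the lists above (index 0 unused). */
static const unsigned char cp_level[CP_MAX + 1] = {
    0,
    0, 1, 1, 1, 1, 2, 2, 0, 0, 0,   /*  1..10 */
    2, 1, 1, 1, 1, 1, 1, 1, 1, 2,   /* 11..20 */
    2, 0, 1, 1, 1, 2, 0, 0, 0, 1,   /* 21..30 */
    0, 0, 0, 2, 0, 2                /* 31..36 */
};

/*
 * Choose the channel for an outgoing client packet.
 *   udp_open      - nonzero once the UDP link is established
 *   tcp_send_mode - nonzero if the player selected "TCP send" mode
 */
enum channel cp_channel(int type, int udp_open, int tcp_send_mode)
{
    assert(type >= 1 && type <= CP_MAX);
    if (!udp_open || tcp_send_mode)
        return CHAN_TCP;
    return cp_level[type] == CRITICAL ? CHAN_TCP : CHAN_UDP;
}
```

For example, cp_channel(6, 1, 0) sends CP_TORP over UDP, while
cp_channel(1, 1, 0) keeps CP_MESSAGE on TCP.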


Customizing fat
---------------

Initially, every semi-critical packet has a corresponding "fat" node; the
nodes are held in arrays parallel to the ones that hold the packets.  When a
packet is sent, its fat node is added to "temporary" queue 0.  When the
entire transmission is sent, the fat nodes in the temp queue are added to
the tail of the "real" fat queue.

At the end of the next transmission, the nodes in the fat queues are added
to the transmission right before it's sent.  If the fat queue is empty, the
server will check the next higher queue (queue 1, queue 2, etc).  When a
packet is used for fattening, it's moved to the next higher temporary queue.
(The reason for the temporary queues is so that the same packet doesn't get
sent several times in the same transmission).  Eventually, the packet will
either be sent again in a normal update (and be moved back down to queue 0)
or be used for fattening several times and fall off the edge of the highest
queue.

So, packets that were just sent will be used for fattening immediately.
During slow periods, packets from progressively higher queues will be resent.
Eventually the packet will fall off the highest queue and never be resent.
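
The movement of a fat node between queues can be sketched as below.  This is
a simplified model, not the server code: the names are hypothetical, the
temporary queues are represented by a "moved" flag plus a pending level, and
FIFO order within a queue is ignored (ties go to the lowest array index).

```c
#include <assert.h>
#include <stddef.h>

#define NUM_FAT_QUEUES 5      /* number of fat queues given in the text */
#define NOT_QUEUED     (-1)

struct fat_node {
    int level;       /* "real" queue this node sits in, or NOT_QUEUED */
    int next_level;  /* temporary-queue slot: where it goes after this transmission */
    int moved;       /* nonzero if already handled in the current transmission */
};

/* The packet went out as part of a normal update: head for queue 0. */
void fat_sent_normally(struct fat_node *n)
{
    n->next_level = 0;
    n->moved = 1;
}

/* The packet was used for fattening: next higher queue, or forgotten. */
void fat_used_for_fattening(struct fat_node *n)
{
    assert(n->level != NOT_QUEUED);
    n->next_level = (n->level + 1 < NUM_FAT_QUEUES) ? n->level + 1 : NOT_QUEUED;
    n->moved = 1;
}

/* Pick a node from the lowest non-empty queue, skipping nodes already moved. */
struct fat_node *fat_pick(struct fat_node *nodes, size_t count)
{
    struct fat_node *best = NULL;
    for (size_t i = 0; i < count; i++) {
        struct fat_node *n = &nodes[i];
        if (n->level == NOT_QUEUED || n->moved)
            continue;
        if (best == NULL || n->level < best->level)
            best = n;
    }
    return best;
}

/* End of transmission: merge the temporary queues into the real ones. */
void fat_end_transmission(struct fat_node *nodes, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (nodes[i].moved) {
            nodes[i].level = nodes[i].next_level;
            nodes[i].moved = 0;
        }
    }
}
```

With five queues, a node that is never sent normally again is used for
fattening five times (once from each queue) and is then forgotten.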

You can tweak several things, including:
- the size of UDP packets (currently 960; could go up to 1400 or so)
- the maximum packet size eligible for fattening (currently 500)
- the maximum # of bytes to add (currently 100; don't drop below 60)
- the minimum # of bytes to add to the transmission (currently 90).  This
  exists so that the server doesn't waste a bunch of time scanning through
  lists trying to find a 3-byte packet.
- the number of fat queues (now 5).  This determines how many times a packet
  will be resent before it is forgotten.  Optionally, you can have packets
  remain on the highest queue indefinitely, where they will be resent
  perpetually in FIFO order.  However, this should be unnecessary for most
  connections.
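
The size limits in the list above interact roughly as in the following
sketch.  The constant and function names are hypothetical, and the exact
boundary behaviour (for example, whether a transmission of exactly 500 bytes
is eligible) is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define FAT_ELIGIBLE_MAX 500  /* larger transmissions are not fattened */
#define FAT_ADD_MAX      100  /* never add more than this many bytes */
#define FAT_ADD_MIN       90  /* stop scanning once this much has been added */

/*
 * sizes[] holds candidate packet sizes in the order the fat queues would
 * offer them.  chosen[i] is set to 1 for each candidate that gets added.
 * Returns the total number of bytes added to the transmission.
 */
size_t fat_fill(size_t xmit_len, const size_t *sizes, size_t count, int *chosen)
{
    size_t added = 0;

    assert(count == 0 || (sizes != NULL && chosen != NULL));
    for (size_t i = 0; i < count; i++)
        chosen[i] = 0;
    if (xmit_len > FAT_ELIGIBLE_MAX)
        return 0;

    /* Keep scanning until the minimum is reached or candidates run out. */
    for (size_t i = 0; i < count && added < FAT_ADD_MIN; i++) {
        if (added + sizes[i] <= FAT_ADD_MAX) {
            chosen[i] = 1;
            added += sizes[i];
        }
    }
    return added;
}
```

The gap between the minimum and the maximum is what lets the scan stop
early: once 90 bytes have been added, at most 10 bytes of room remain, and
hunting for a packet that small is not worth the CPU time.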

Note that only semi-critical packets are eligible for fattening.  Note also
that a packet only gets fat-queued if it has been sent when fat is enabled,
so switching to fat will not automatically clear up a display.  This is why
a full update is sent when fat UDP is enabled (to keep the users calm...)
Note also that requesting a full update will cause all packets to be moved
to queue zero.

This method is not without minor problems.  There were glitches involving
torps that exploded every time the next fat update arrived, or that took
a very long time to go off because the timer kept getting reset.  These
were fixed with some if statements in the client.


UDP performance and side effects
--------------------------------
It should be noted that setting the update rate to 5 frames/second does NOT
mean that the server only sends five transmissions per second.  It specifies
a MINIMUM transmission frequency; the server will also send whenever its
buffer fills, regardless of the update rate.  Under TCP the buffer is about
16K, so even at 1 update/second you are unlikely to receive any information
early.

Under UDP the buffer size is 960 bytes, so you may receive smaller packets
more often.  It is quite possible to have the frame rate appear to increase
when you move into the thick of a battle.  This does NOT mean that the server
is sending more data; it's merely spreading the same data over several
packets.  It also does not mean that your display will suddenly speed up; in
fact, most people used to 5 updates/second say that the display appears to
slow down at 9 updates/second.  The smoother display which (accidentally)
results will most likely be beneficial to most people.
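
The way a single update can turn into several UDP transmissions can be
sketched like this.  The names are hypothetical and the flush routine only
counts transmissions; in the server the flush would end in a gwrite() call.

```c
#include <assert.h>
#include <string.h>

#define UDP_BUFSIZE 960   /* cut-off from the text; adjustable in the server */

struct udp_buf {
    unsigned char data[UDP_BUFSIZE];
    size_t        len;       /* bytes currently buffered */
    int           flushes;   /* transmissions sent so far (for illustration) */
};

/* Stand-in for the real write: count the transmission and empty the buffer. */
void udp_flush(struct udp_buf *b)
{
    if (b->len == 0)
        return;
    b->flushes++;
    b->len = 0;
}

/* Append one packet, sending what is buffered first if it would not fit. */
void udp_append(struct udp_buf *b, const void *pkt, size_t n)
{
    assert(n <= UDP_BUFSIZE);
    if (b->len + n > UDP_BUFSIZE)
        udp_flush(b);
    memcpy(b->data + b->len, pkt, n);
    b->len += n;
}
```

An update that produces 1200 bytes of packets therefore goes out as two
transmissions: one when the buffer fills, and one when the update ends and
the remainder is flushed.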

Some quick tests showed that, even on a busy system, packets rarely exceed
700 bytes.

(in case I've managed to hopelessly confuse you, think of it this way: the
server will still be sending updates at the same rate, but some will arrive
a few milliseconds earlier than the others.  Your client will get some data,
redraw, get more data, redraw, ...  In most cases, I wouldn't expect to
receive more than two packets per update, so it shouldn't be a major issue.
That remains to be seen, however.)


Socket details
--------------

A typical way to connect is:

- server opens a port with bind(), and passes the number to the client via TCP
- client opens the UDP port, does a connect(), and sends some data via UDP
- server does a recvfrom(), which supplies the client address and port
- server does a connect()

A two-way UDP connection is now established.
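
The four steps above can be tried out in a single process over the loopback
interface.  This is only a demonstration of the socket calls involved: the
function name is hypothetical, both "client" and "server" live in the same
program, and the port number that would normally travel over the TCP
connection is simply shared in memory.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Runs both ends of the handshake over loopback; returns 0 on success. */
int udp_handshake_demo(void)
{
    struct sockaddr_in srv, from;
    socklen_t len = sizeof srv;
    struct timeval tv = { 2, 0 };   /* do not block forever on a lost datagram */
    char buf[16];
    int rc = -1;
    int s = socket(AF_INET, SOCK_DGRAM, 0);   /* "server" end */
    int c = socket(AF_INET, SOCK_DGRAM, 0);   /* "client" end */

    if (s < 0 || c < 0)
        goto done;
    setsockopt(s, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);
    setsockopt(c, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);

    /* Server binds a port; the number would be passed to the client via TCP. */
    memset(&srv, 0, sizeof srv);
    srv.sin_family = AF_INET;
    srv.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    srv.sin_port = 0;                          /* let the kernel choose */
    if (bind(s, (struct sockaddr *)&srv, sizeof srv) < 0)
        goto done;
    if (getsockname(s, (struct sockaddr *)&srv, &len) < 0)
        goto done;

    /* Client connects to that port and sends some data. */
    if (connect(c, (struct sockaddr *)&srv, sizeof srv) < 0)
        goto done;
    if (send(c, "hello", 5, 0) != 5)
        goto done;

    /* Server learns the client's address and port from recvfrom()... */
    memset(&from, 0, sizeof from);
    len = sizeof from;
    if (recvfrom(s, buf, sizeof buf, 0, (struct sockaddr *)&from, &len) != 5)
        goto done;
    assert(from.sin_family == AF_INET);        /* "from" must have been filled in */

    /* ...and connects back, which makes the link two-way. */
    if (connect(s, (struct sockaddr *)&from, len) < 0)
        goto done;
    if (send(s, "ok", 2, 0) != 2)
        goto done;
    if (recv(c, buf, sizeof buf, 0) != 2)
        goto done;
    rc = 0;
done:
    if (s >= 0)
        close(s);
    if (c >= 0)
        close(c);
    return rc;
}
```

The whole scheme depends on recvfrom() filling in the "from" address, which
is exactly the step checked by the assert.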

Under UTS 2.1, however, recvfrom() is broken, and doesn't always fill in the
"from" parameter (the whole reason for the client sending data to the server
before the server connects is so that the server can identify the client's
host and port).  Since I'm writing and debugging this under UTS, the client
packet must contain the client's UDP port number.  (Note that UTS product
support has been made aware of the problem, and is working to fix it.)

So, the connection actually works like this:
- client opens a port with bind(), and passes the port number to the server
- the server does a connect(), and passes its port number back to the client
- the client does a connect() to the server's port
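
For the "port passing" method, each side needs to open a UDP port and learn
its number so that it can be sent to the other side.  A minimal sketch,
assuming it is acceptable to let the kernel pick the port (the function name
is hypothetical, and the actual client may search for a free port in some
other way):

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Opens a UDP socket bound to a kernel-chosen port.
 * Returns the socket descriptor, or -1 on failure.  On success *port holds
 * the port number in host byte order, ready to be placed in a request packet.
 */
int open_udp_port(unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int fd;

    assert(port != NULL);
    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = 0;                      /* 0 = any free port */

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}
```

Once the other side's port number arrives over TCP, a connect() on this
socket completes the link without either side ever calling recvfrom().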

This sort of thing is a partial justification for the complicated UDP
connection process (explained below).

However, this makes it very complicated for people running a client through
a gateway machine to get a UDP connection.  So, I have left the recvfrom()
code in the client (#ifdefed out), and added some comments and #defines for
gateway stuff.  The client will specify which kind of response is desired
when it sends the initial UDP request packet, so the decision to use or not
use recvfrom() is strictly within the client.

(The reasons for leaving the recvfrom() code #ifdefed out are: (1) you don't
gain anything unless a gateway is involved, (2) with the "port passing"
method there's absolutely no interruption in game play while the protocol
switches, and (3) the client will hang briefly in recvfrom() if the server
doesn't support UDP.  If you prefer recvfrom(), go ahead and edit the
defines.)

Incidentally, if you want to see the connection process in slow motion, set
your update rate to one per second and open the UDP channel.


Source code modifications
-------------------------

Most of the changes were restricted to socket.c, packets.h, defs.h, and
data.c/data.h in both client and server.  Some minor fixes had to be added
to deal with ghostbusting correctly (we can assume that the UDP link will
be severed, so both sides will revert to TCP automatically), and the UDP
socket had to be added to the select() call in netrek/input.c and ntserv/
input.c.  And, of course, a new netrek/udpopt.c file had to be added to
support the UDP options menu.

On the whole, integrating the UDP code into an existing server should not
prove to be a major chore.


Performance
-----------
The impact of UDP transmissions on the network and CPU is not known.  I
suspect the load is lower than that incurred by TCP, but I have no supporting
evidence.

One thing to be wary of is the increasing number of users who will be able
to play at nine (or ten) updates/second.  Because of this, the network and
CPU load can increase dramatically...  It may be necessary to limit the
maximum # of updates/second to ensure decent performance for all players
(see handleUpdatesReq() in ntserv/socket.c).
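
A cap of this kind could be as simple as the following sketch.  It is not
the contents of handleUpdatesReq(); the function name and the idea of
passing the limit as a parameter are assumptions made for illustration.

```c
#include <assert.h>

/*
 * Returns the update rate (updates/second) the server will actually grant,
 * given what the client requested and the server's configured maximum.
 */
int clamp_update_rate(int requested, int max_rate)
{
    assert(max_rate >= 1);
    if (requested < 1)
        return 1;               /* always send at least one update/second */
    if (requested > max_rate)
        return max_rate;
    return requested;
}
```

With max_rate set to 5, a client asking for 9 updates/second would be
granted 5, while a client asking for 2 would be left alone.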


Explanation of UDP connection process
-------------------------------------

This is somewhat involved, because there are some nasty failure states:
- ntserv using UDP, client ignoring updates
- client using UDP, server ignoring messages
- ntserv and client both decide to use UDP, but server can't connect
- client stutters and sends a second request, causing server to reroute data
  ...

Most failures end up with server and client ignoring each other, forcing the
player to disconnect.  Might be worthwhile to add a "force reset" to make
the server switch back to the TCP line.  If this protocol does its job, then
that shouldn't be necessary.

(A state diagram would be better, but harder to interpret.  This is woefully
incomplete, but should impart a reasonable understanding of what's going on.)

COMM_XXX indicates where the client or server expects to send/receive data
STAT_XXX indicates the "state" of the client


Initial state:
Client:	COMM_TCP, STAT_CONNECTED
Server:	COMM_TCP

C finds a free port and bind()s it
C sends a CP_UDP_REQ(COMM_UDP, client_port)

Client: COMM_TCP, STAT_SWITCH_UDP
Server: COMM_TCP

(if the server doesn't send a response within 25 updates, figure the server
doesn't know about UDP and reset to STAT_CONNECTED.)

S checks its current mode:
	if already UDP, drop current connection and proceed below
if S isn't allowing UDP connections, it sends SP_UDP_REPLY(SWITCH_UDP_DENIED,0)
S opens a UDP socket, connect()s to the client and sends a
  REPLY(SWITCH_UDP_OK, server_port) over TCP
  - or -
  REPLY(SWITCH_UDP_OK, 0) over UDP          <-- if recvfrom() is used

Client: COMM_TCP, STAT_SWITCH_UDP
Server: COMM_TCP

C tries to connect() to UDP port (from message or result of recvfrom())
	if it fails, it resets state and sends REQ(COMM_TCP,0)
C sends REQ(COMM_VERIFY, 0) through UDP connection
	if it times out, C will reset state and send REQ(COMM_TCP,0)

Client: COMM_UDP, STAT_VERIFY_UDP
Server: COMM_TCP

S gets the verification message and begins sending data down the UDP link
First packet is a REPLY(SWITCH_VERIFY,0) (to guarantee that something does
  in fact get sent across the UDP connection).  If the reply gets lost, and
  nothing else gets sent via UDP for a while, the client will time out and
  reset.  If recvfrom() is used, then this VERIFY is simply ignored.

Client: COMM_UDP, STAT_VERIFY_UDP
Server: COMM_UDP

C gets an update on the UDP line

Client: COMM_UDP, STAT_CONNECTED
Server: COMM_UDP

[C can now begin sending data on the UDP line]

--- now switch back to TCP ---

C sends a REQ(COMM_TCP,0) [and stops sending data on the UDP line]

Client: COMM_UDP, STAT_SWITCH_TCP
Server: COMM_UDP

S checks its current mode:
	if it's already TCP, send REPLY(SWITCH_TCP_OK,0) and do nothing further
S closes its UDP socket, resets its mode to COMM_TCP, and sends a
  REPLY(SWITCH_TCP_OK, 0)

Client: COMM_UDP, STAT_SWITCH_TCP
Server: COMM_TCP

C closes its UDP socket, and changes mode

Client: COMM_TCP, STAT_CONNECTED
Server: COMM_TCP
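
The client side of the walkthrough above can be condensed into a small state
machine.  The COMM_XXX and STAT_XXX names come from the walkthrough; the
event names, the struct, and the function are hypothetical, and a timeout is
assumed to put the client straight back into COMM_TCP, STAT_CONNECTED.

```c
#include <assert.h>
#include <stddef.h>

enum comm { COMM_TCP, COMM_UDP };
enum stat { STAT_CONNECTED, STAT_SWITCH_UDP, STAT_VERIFY_UDP, STAT_SWITCH_TCP };

enum event {
    EV_USER_WANTS_UDP,    /* C has sent REQ(COMM_UDP, client_port) */
    EV_REPLY_UDP_OK,      /* REPLY(SWITCH_UDP_OK) seen, connect() worked, VERIFY sent */
    EV_REPLY_UDP_DENIED,  /* REPLY(SWITCH_UDP_DENIED) seen */
    EV_UDP_DATA,          /* an update arrived on the UDP line */
    EV_TIMEOUT,           /* no response within the allowed number of updates */
    EV_USER_WANTS_TCP,    /* C has sent REQ(COMM_TCP, 0) */
    EV_REPLY_TCP_OK       /* REPLY(SWITCH_TCP_OK) seen, UDP socket closed */
};

struct client {
    enum comm comm;
    enum stat stat;
};

/* Advance the client state; events that make no sense in a state are ignored. */
void client_step(struct client *c, enum event ev)
{
    assert(c != NULL);
    switch (c->stat) {
    case STAT_CONNECTED:
        if (ev == EV_USER_WANTS_UDP && c->comm == COMM_TCP)
            c->stat = STAT_SWITCH_UDP;
        else if (ev == EV_USER_WANTS_TCP && c->comm == COMM_UDP)
            c->stat = STAT_SWITCH_TCP;
        break;
    case STAT_SWITCH_UDP:
        if (ev == EV_REPLY_UDP_OK) {
            c->comm = COMM_UDP;
            c->stat = STAT_VERIFY_UDP;
        } else if (ev == EV_REPLY_UDP_DENIED || ev == EV_TIMEOUT) {
            c->stat = STAT_CONNECTED;       /* stay on TCP */
        }
        break;
    case STAT_VERIFY_UDP:
        if (ev == EV_UDP_DATA) {
            c->stat = STAT_CONNECTED;       /* UDP link confirmed */
        } else if (ev == EV_TIMEOUT) {
            c->comm = COMM_TCP;             /* give up and fall back */
            c->stat = STAT_CONNECTED;
        }
        break;
    case STAT_SWITCH_TCP:
        if (ev == EV_REPLY_TCP_OK) {
            c->comm = COMM_TCP;
            c->stat = STAT_CONNECTED;
        }
        break;
    }
}
```

Ignoring unexpected events is one way to guard against the "client stutters
and sends a second request" failure listed at the top of this section, at
least on the client side.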


Future Direction
----------------

Possible future enhancements (for someone else to do):
- add defaults to .xtrekrc file
- add auto-attempt-UDP
- have fat UDP adjust itself based on the packet loss rate


This design was built as a prototype for a UDP client and server.  There
are several major flaws with it, among them:

- continued dependence on TCP
- unreliability of semi-critical packets
- wasted bandwidth for unnecessary duplicate transmissions (double UDP,
  enforced UDP send, fat UDP)
- lots of kludges to make it seem like a reliable connection

A better scheme is to throw away the existing socket.c and packets.h and
start all over, designing a UDP-only protocol which clearly delineates
which packets are critical and which aren't, and attempts to minimize the
amount of data which is considered critical.

Such a scheme could be integrated with the existing client and server by
having separate code compiled in.  When the client wants to switch, instead
of sending the same kinds of packets over a different channel, a completely
different set of packets would be sent solely over the UDP channel.  This
would allow the same sort of user interface as the client and server I have
designed, but would be more efficient and reliable.

I haven't the time or the desire to pursue this course, however, and so
I will continue to maintain the dual-channel design until something better
comes along.  Interested parties should take a look at what Ray Jones is
planning to do.

One small warning: this scheme is not without its pitfalls.  It may yet prove
to be unworkable.


Credits
-------

Tedd Hadley brought up the first non-Amdahl server with UDP.  Terence Chang
added UDP to his server on bronco, giving me a horrible network connection to
play with.  Both of them deserve a great deal of credit for adding the UDP
code to their clients and servers several times as the code was being
developed, and for finding a healthy pile of bugs.


I'd like to thank Kevin Smith for providing some sample UDP code, for
providing feedback on the initial design, and for suggesting fat UDP.  Brian
Paulsen suggested the "auto-command-resend" stuff.  Steven Janowsky suggested
having "semi-critical" server packets, though my initial attempt at
implementing it wasn't too great (double UDP).

And, of course, a few dozen alt.games.xtrek readers acted as guinea pigs
and provided feedback.


Closing Notes
-------------

I think it should be mentioned at some point that "UDP" is just an
assumption.  We are using *datagram* sockets, not UDP sockets; UDP just
happens to be the protocol that everybody on the planet is using for their
datagram sockets.  It is, however, easier to write "UDP" 500 times, so I
chose to stick with it.

That's all, folks...