A Simple Network Throughput Model
The network design function typically involves such activities as planning the physical connectivity, determining the performance and capability of the network, sizing the links and nodes, and calculating installation and operational costs. Some of these tasks are straightforward and some, such as performance determination and link sizing, involve mysterious calculations and guess-work. Determining the time it takes to transmit a file between two computers across a network, the amount of bandwidth needed to connect two sites, or the number of T1s required to handle VOIP (Voice Over IP) traffic to the carrier's network are problems that can be solved if one knows some basic network throughput concepts. This newsletter shows you a concrete approach to network performance determination and describes a simple throughput estimation model that has many applications.
The transfer of information over a data communications link involves the use of several categories of rules or protocols that ultimately result in the creation of a frame of data that is moved across a link between two intelligent machines – ones that understand the protocols. So the frame becomes an important parameter whose size and composition are dependent upon the protocols used. For example, an Ethernet frame is about 1500 characters long at its maximum and carries data as well as control characters generated by the protocols in use – Ethernet, TCP (Transmission Control Protocol), IP (Internet Protocol), etc.
Building a Simple Model
The number of frames sent will depend upon the frame’s data carrying capacity and the total amount of data to be transferred for the event being considered, the size of the file or screen image as an example. I’ll use the term Data Unit (DU) to mean the actual “file” data or event data being transferred. You may hear the term Protocol Data Unit (PDU) used in a similar context but this may also contain some protocol control characters. A frame will take time to move across the link, the amount of which will mostly be dependent upon the total frame size and the speed of the link with extra delay incurred at times for protocol interactions and intermediate machine handling.
A very simple model says: The transfer time of the file (T-file) is equal to the transfer time for one frame (T-frame) multiplied by the number of frames required to move the entire file – the number of characters in the file divided by the number of characters in the frame’s data unit (DU). As we are going to see later, there are more considerations that can be included in this calculation, which will improve the accuracy of this small model.
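The model is simple enough to express in a few lines of code. Here is a Python sketch; the function name and parameter labels are my own, not standard terms, and the .05-second frame time in the example call is purely hypothetical:

```python
import math

# T_file = T_frame * (number of frames needed to carry the file)
def file_transfer_time(file_chars, du_chars, t_frame_s):
    # A partially filled last frame still costs a full frame time.
    frames = math.ceil(file_chars / du_chars)
    return frames * t_frame_s

# A 100,000-character file in 500-character data units at a
# hypothetical .05 seconds per frame: 200 frames * .05 = 10 seconds.
print(file_transfer_time(100_000, 500, 0.05))  # -> 10.0
```
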
Adding Detail to the Model
Let’s now take a more detailed look at the components of a simple file transfer example in order to determine the time it takes to transfer it. In the diagram below, the laptop sends a large file to the server across a network link, and for simplicity’s sake it will be a slow, point-to-point wide area network link using a basic, half-duplex frame level protocol. Half duplex means that the sender (laptop) waits to get the previous frame’s acknowledgement (small protocol-only frame) from the receiver (server) before sending another frame to the server.
The time flow would involve four time events per transmitted frame:
1. The laptop transmits the data frame across the link.
2. The server receives and processes the frame.
3. The server transmits the acknowledgement (ACK) frame back across the link.
4. The laptop receives and processes the ACK, then prepares the next frame.
Next, we need to put numbers to these time events. The size of the frame to be transmitted will depend upon the protocols used on the link between the computers and the data unit. The data unit itself may not always be filled to its maximum defined value. In this example, the 50 overhead characters carried in each frame represent the protocols’ control characters. The maximum data unit value will be 500 characters and, because this is a file transfer application, it is assumed that the data unit will always be filled. This means that a complete data frame would be 550 characters in length.
Also in this example, the ACK frame, which carries only protocol overhead, will be 50 characters in total. Finally, the size of the file to be transmitted is 100,000 characters, the code set is 7-bit ASCII, the laptop processing delay per frame is .02 seconds (a guess), the server processing time per frame is .01 seconds (another guess) and the line speed under consideration will be 128K bps. Now, let's crunch some numbers.
I like to convert everything into bits, although one could do this exercise using characters just as easily. Therefore, 550 characters, as an example, would convert to 3,850 bits (550 characters multiplied by 7 bits per character, assuming the use of ASCII-7 code).
How long does it take to deliver a frame? This would be the sum of time events 1-4 around the transmission cycle, as follows:
1. Data frame transmission time: 3,850 bits divided by 128,000 bps = .0301 seconds
2. Server processing time: .01 seconds
3. ACK frame transmission time: 350 bits divided by 128,000 bps = .0027 seconds
4. Laptop processing time: .02 seconds
The sum of these four events is .0628 seconds per frame.
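For those who prefer to let the computer do the arithmetic, here is a Python sketch of the per-frame cycle; the constants and names are my own labels for the example's numbers:

```python
# Per-frame delivery time for the example: 7-bit characters on a
# 128,000 bps line; the two processing delays are the article's guesses.
LINE_BPS = 128_000
BITS_PER_CHAR = 7

def frame_cycle_time_s(data_frame_chars, ack_frame_chars,
                       sender_delay_s, receiver_delay_s):
    t_data = data_frame_chars * BITS_PER_CHAR / LINE_BPS  # data frame on the wire
    t_ack = ack_frame_chars * BITS_PER_CHAR / LINE_BPS    # ACK frame on the wire
    return t_data + receiver_delay_s + t_ack + sender_delay_s

t_frame = frame_cycle_time_s(550, 50, 0.02, 0.01)
print(round(t_frame, 4))  # -> 0.0628
```
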
The next step is to determine the time it takes to transfer the entire file, the size of which is 100,000 characters. As each frame can hold 500 characters in its data unit, the total number of frames required would be 200 frames (100,000 divided by 500). As each frame takes .0628 seconds, the entire file will be transmitted in 12.56 seconds (200 frames times .0628 seconds per frame).
Throughput is a general performance term and in networking is used to define the real data transfer capability of a system. A starting formula to express throughput would be: the useful information that needs to be delivered between end-points divided by the time it takes to deliver it. For example, the “Useful Information Delivered” would be an entire file, and the “Time It Takes To Deliver It”, as in this example, is 12.56 seconds, resulting in a file throughput of one file in 12.56 seconds.
We could also use throughput to define the number of files that can be transferred per hour (in this case 286), the number of packets per minute, the number of transactions per hour or the number of airline tickets per day; the latter would involve adding agent-processing time to screen transfer time per ticket. Throughput is a good metric if understood and used properly. We will see later how the term bits-per-second (bps) used to describe the throughput of a telecommunications circuit is different from the bps operating rate of the circuit.
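The files-per-hour conversion is a one-liner in Python, using the per-frame cycle time of .0628 seconds (the sum of the four time events in the example):

```python
# Throughput expressed as events per unit time: files per hour for the
# example transfer (200 frames at .0628 seconds each).
T_FILE_S = 200 * 0.0628125   # seconds to move one 100,000-character file
files_per_hour = 3600 / T_FILE_S
print(int(files_per_hour))  # -> 286
```
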
This appears to be a lot of work to determine the transfer time of the file and some would think that a more direct approach would be to simply divide the total bits in the file by the line speed. Doing this would yield an answer of 5.5 seconds (100,000 file characters times 7-bits per character divided by 128,000 bps line speed). Why is this answer less than half the time we calculated? Our throughput figure takes into account time delays and overhead characters, which in this case more than doubles the amount of time it takes to transfer the useful data.
Determining Line Throughput
Another way to look at this problem is to calculate the line throughput, or the effective throughput of the transmission line, which isn't 128K bps. An easy way to come at this is to go back to the single frame time calculation. It took .0628 seconds to deliver 500 useful characters, or 3,500 bits, for a line throughput of about 56K bps (3,500 bits divided by .0628 seconds, or the useful data divided by the time it takes to deliver it, equals roughly 55,700 bps). Again, the line throughput is less than half the line's operating rate. This number can also be used to determine the transfer time of the entire file as follows: 100,000 characters times 7 bits per character divided by the 56K bps line throughput, or 12.56 seconds to transfer the file.
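Here is the same line-throughput calculation as a short Python check; only the useful bits count in the numerator, but the full frame cycle is charged against them:

```python
# Line throughput: useful bits delivered per second of full cycle time.
USEFUL_BITS_PER_FRAME = 500 * 7   # data unit only; overhead bits excluded
T_FRAME_S = 0.0628125             # full half-duplex cycle from the example

line_throughput_bps = USEFUL_BITS_PER_FRAME / T_FRAME_S
print(round(line_throughput_bps))                # roughly 56K bps
print(round(700_000 / line_throughput_bps, 2))   # whole file -> 12.56 seconds
```
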
Increasing the Model's Accuracy
To be even more accurate, there are other concepts to take into consideration – error rates and line propagation delay. Both of these factors will influence the file throughput in a negative way.
Error rate means that a percentage of frames will become corrupted or dropped during the transmission process, with the actual rate being dependent on the link conditions and type. Noise on the line or congested routers are primary factors that produce corrupted or dropped frames. When a frame is corrupted or dropped, the data communications protocols in the receiver, at some level, will recognize the error condition and notify the sender via the acknowledgement process. This causes the sender to retransmit one or more previous frames, depending upon the protocols' structure. Additional frame transmissions simply add to the overall time it takes to complete the entire file transfer, with each frame retransmission adding .0628 seconds in our example. An occasional error is not a big problem, but repeated errors caused by high noise levels or congestion can add up, increasing the file transfer time and lowering throughput.
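A rough way to fold error rate into the model is to charge each delivered frame an expected number of transmissions. This sketch assumes losses are independent, which is my simplification; real links often see bursty errors:

```python
# With frame loss rate p and independent losses, each delivered frame
# costs an expected 1 / (1 - p) transmissions, each a full frame cycle.
def transfer_time_with_errors(frames, t_frame_s, loss_rate):
    expected_transmissions = frames / (1.0 - loss_rate)
    return expected_transmissions * t_frame_s

print(round(transfer_time_with_errors(200, 0.0628125, 0.00), 2))  # error-free
print(round(transfer_time_with_errors(200, 0.0628125, 0.02), 2))  # 2% loss
```
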
Propagation delay is another factor that can lower throughput. Electrical propagation delay is simply the time it takes for the electrical signals, which physically represent the logical bits, to travel over the wire from the sending unit to the receiving unit. This factor is primarily distance dependent and can add .030 to .075 seconds to a terrestrial coast-to-coast transmission and .250 to .400 seconds to a satellite transmission, one-way. In our half-duplex example above, these times would be added to both the data frame and acknowledgement transmission times. Using the low end of the terrestrial range, .030 seconds added to each leg, the line throughput in this example would be reduced to roughly 28K bps (3,500 bits divided by .1228 seconds).
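The propagation adjustment is easy to add to the model, since the one-way delay hits both legs of the half-duplex cycle:

```python
# Propagation delay is added to both legs: data frame out, ACK back.
T_FRAME_S = 0.0628125   # base cycle from the example
USEFUL_BITS = 3500      # 500 characters * 7 bits

def throughput_with_propagation(one_way_delay_s):
    cycle = T_FRAME_S + 2 * one_way_delay_s
    return USEFUL_BITS / cycle

print(round(throughput_with_propagation(0.030)))  # terrestrial, low end
print(round(throughput_with_propagation(0.250)))  # satellite, low end
```
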
In addition to raw electrical propagation delays, there may be intermediate nodes, such as routers or data switches, in the transmission path that also add frame handling delay if the “link” is other than a point-to-point line. For example, one could use this model to determine the file throughput between the laptop and the server connected over the Internet. In this case, the data frame and the acknowledgement transmission times, time events 1 and 3, would comprise the time to traverse the Internet. A good application of this forecasting technique would be to compare the effect on throughput using the Internet versus using a point-to-point line. To do this, one would first gather some test data that measures time delays across the Internet between planned points, taken during peak usage periods in order to determine the worst case transmission delays. A cost-throughput analysis can then be performed using these delays in the model calculation.
This simple model can also be used to size the link by changing the link speed component in the model and recalculating the throughput. Usually, the faster the link speed the more it costs and a cost-throughput analysis can then be performed.
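A link-sizing sweep is one way to use the model this way. This sketch holds the example's frame structure and processing-delay guesses constant and varies only the line speed; notice that once the wire time shrinks, the fixed processing delays dominate, so doubling the line speed does not halve the transfer time:

```python
# Link-sizing sweep: vary line speed, recompute file transfer time.
BITS_PER_CHAR = 7
FILE_CHARS, DU_CHARS = 100_000, 500
FRAME_CHARS, ACK_CHARS = 550, 50
PROC_DELAY_S = 0.03   # laptop (.02) plus server (.01) guesses, per frame

def file_time_s(line_bps):
    wire_time = (FRAME_CHARS + ACK_CHARS) * BITS_PER_CHAR / line_bps
    return (FILE_CHARS // DU_CHARS) * (wire_time + PROC_DELAY_S)

for bps in (64_000, 128_000, 256_000, 1_544_000):   # 1,544,000 bps = one T1
    print(f"{bps:>9} bps: {file_time_s(bps):6.2f} seconds")
```
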
The impact of the processing delays at the laptop and server ends can also be studied. In the above example, a “guess” was made and values were plugged into the model. Real values will depend upon the computers used, how the file transfer applications and transmission protocols are structured and how busy these machines are during the file transmission process. For more on the impact of busyness, see the newsletter “Applying Queuing Theory to Network Design”.
Other Model Uses
Finally, one could investigate the performance behavior of changing the frame-level or other protocols. The simple model we have been using depicts the activities of a half-duplex frame-level protocol, which requires the sender to wait for each positive or negative acknowledgement before sending the next frame. Full-duplex frame-level protocols, such as PPP (Point-to-Point Protocol) or HDLC (High-Level Data Link Control), do not require waiting for the acknowledgement of every frame, as one acknowledgement can cover many successful frames, in some cases up to 128. The retransmission of corrupted or lost frames is performed by interspersing them, out of order, with other outgoing frames. The receiver's protocol software reorders the frames before they are used by the application. This capability results in a more efficient flow, and a higher throughput is achieved as the acknowledgement leg is drastically reduced and the sender waits less between outgoing frames.
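The full-duplex gain can be approximated with a small change to the model. My simplifying assumption here: once the window of outstanding frames is large enough to cover the acknowledgement wait, the line stays busy and only the data frame's wire time is charged per frame:

```python
# Sketch of the windowed (full-duplex) improvement over stop-and-wait.
T_DATA_S = 0.0301   # data frame on the wire
T_WAIT_S = 0.0327   # server processing + ACK wire time + laptop processing

def time_per_frame(window):
    if window == 1:  # half duplex: pay the full cycle on every frame
        return T_DATA_S + T_WAIT_S
    return max(T_DATA_S, (T_DATA_S + T_WAIT_S) / window)  # amortized wait

print(round(200 * time_per_frame(1), 2))  # stop-and-wait file time
print(round(200 * time_per_frame(8), 2))  # window of 8 frames
```
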
With a little thought, the impact of more complex protocol arrangements can be studied by changing or adding to the model's sequences. Frame-level protocols, such as PPP and HDLC, operate between two machines, computer to computer or computer to router for example. Higher-level protocols, such as TCP, operate end to end between devices over an IP network (with many routers in between). TCP invokes end-to-end packet (frame-like) flow control and a slightly different end-to-end error control mechanism, which operate in addition to the machine-to-machine frame-level protocols occurring between IP routers. Using this model, one could assume a constant transmission delay through the router-based link across the IP network and investigate the impact of TCP flow control and error control procedures under various circumstances on file throughput. By eliminating the effects of the end-to-end TCP acknowledgements, VOIP traffic can be modeled as well. VOIP uses UDP (User Datagram Protocol), which does not perform end-to-end error checking and does not send end-to-end acknowledgements.
The Benefits of a Simple Model
A basic understanding of data communications protocols and frame transmission procedures, organized around a simple time model, can provide insights into a variety of network performance issues. Often, this is all that is needed in order to roughly size link speeds or answer high-level planning concerns. Through this exercise, you can now see how a network designer, by using efficient full-duplex protocols, large data units in the frame, fast transmission links, fast processors, and few intermediate frame-handling devices, can achieve line throughput levels of 75% to 80% of the line's operational bps rate.
These types of tips and insights can be found in all training classes provided by McGuire Consulting. The high quality nature of these courses is based on the many years of work and training experience of the author, Jay D. McGuire.