General Information
===================

bsdproxy is a generic, event-driven proxy designed specifically for the
BSD platform.  It uses the kqueue()/kevent() system calls to determine
when to relay data from one side of the connection to the other.  It also
uses GLib (http://www.gtk.org) data structures and memory management
functions to improve steady-state performance by minimizing unnecessary
memory allocation and deallocation.

bsdproxy has been used to proxy HTTP, HTTPS, telnet, and MySQL without
any problems.  It should be able to serve as a transparent proxy for
any protocol carried over a TCP/IP connection.

There are three callbacks which may be executed in the main event loop:
accept_client(), read_data(), and write_data().  Pending connections are
handled by accept_client():  the client is connected to the proxy, and
the proxy makes a connection to the server on behalf of the client.
If both connections succeed, read_data() is executed whenever the client
sends data to the proxy.  Data sent from the client to the proxy is queued
in a singly linked list of character buffers.  Each time a buffer is
appended to the queue, an event is triggered which results in write_data()
being called a single time as soon as data can be written to the server.
write_data() then sends the head entry in the buffer queue to the server.
read_data() is also executed when the server sends data back to the
proxy.  It triggers write_data() to the client in the same manner that
read_data() from the client triggers write_data() to the server.
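
The buffer queue described above can be sketched roughly as follows.
This is only an illustration in plain C:  bsdproxy itself uses GLib data
structures, and the names buf_node, buf_queue, queue_append(),
queue_pop_head(), and buf_node_free() are hypothetical, not taken from
the bsdproxy source.  The reading side appends at the tail, and the
writing side detaches one entry from the head per call.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One queued chunk of data read from a peer (hypothetical layout). */
struct buf_node {
    struct buf_node *next;
    size_t len;
    char *data;
};

/* FIFO queue: the reader appends at the tail, the writer sends the head. */
struct buf_queue {
    struct buf_node *head;
    struct buf_node *tail;
};

/* Copy len bytes into a new node and append it.
 * Returns 0 on success, -1 if memory could not be allocated. */
int queue_append(struct buf_queue *q, const char *data, size_t len)
{
    assert(q != NULL && data != NULL);
    struct buf_node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->data = malloc(len ? len : 1);
    if (n->data == NULL) {
        free(n);
        return -1;
    }
    memcpy(n->data, data, len);
    n->len = len;
    n->next = NULL;
    if (q->tail != NULL)
        q->tail->next = n;
    else
        q->head = n;
    q->tail = n;
    return 0;
}

/* Detach and return the head entry, or NULL if the queue is empty.
 * The caller releases it with buf_node_free() once it has been sent. */
struct buf_node *queue_pop_head(struct buf_queue *q)
{
    assert(q != NULL);
    struct buf_node *n = q->head;
    if (n == NULL)
        return NULL;
    q->head = n->next;
    if (q->head == NULL)
        q->tail = NULL;
    n->next = NULL;
    return n;
}

/* Free a node previously returned by queue_pop_head(). */
void buf_node_free(struct buf_node *n)
{
    if (n != NULL) {
        free(n->data);
        free(n);
    }
}
```

In this sketch a read callback would call queue_append() for each chunk
it receives, and a write callback would call queue_pop_head(), send the
entry, and free it.  Keeping a tail pointer makes appending constant-time
even when a slow peer lets the queue grow long.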

All I/O is done in nonblocking mode, so a callback never stalls the event
loop waiting on a slow peer.  The amount of time spent in each callback is
also kept small, since relatively little data is read or written in each
function call.
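
As a minimal sketch of what nonblocking mode means in practice, the
following uses the standard POSIX fcntl() and read() calls.  The helper
names set_nonblocking() and read_some(), and the -2 return convention,
are assumptions made for this example and are not taken from the
bsdproxy source.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Put fd into nonblocking mode.  Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Read at most cap bytes without ever blocking.
 * Returns the number of bytes read (> 0), 0 on end-of-file,
 * -1 on a real error, or -2 when no data is available yet
 * (hypothetical convention for this sketch).
 */
ssize_t read_some(int fd, char *buf, size_t cap)
{
    assert(buf != NULL && cap > 0);
    ssize_t n;
    do {
        n = read(fd, buf, cap);
    } while (n == -1 && errno == EINTR);   /* retry if interrupted */
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;                          /* nothing to read right now */
    return n;
}
```

With a descriptor in this mode, a callback that finds nothing to read
returns immediately to the event loop instead of waiting, and it is
called again only when the kernel reports that more data has arrived.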

The input and output queues allow the proxy to support a large number of
concurrent connections, and are especially useful for things like 
accelerating Apache.  Apache installations frequently have many extra
modules (such as mod_ssl and mod_perl) built in, so each httpd process is
rather large (sometimes up to 20 MB).  During peak
traffic times, many httpd processes get tied up waiting for data to be 
sent to the clients, which live on the other side of a narrowband
connection.  This is costly because the httpd count grows until the server
starts to eat into swap space, and performance degrades drastically.  With
an accelerating proxy such as this, the task of managing I/O with the clients
is offloaded to the proxy, which takes up very little memory and does its 
job with only one process.  It is assumed, of course, that the proxy's
connection with the server is fast, so the server can fill up the proxy's
client output queues quickly.  The proxy can then empty the output queues
to the clients at whatever speed their connection with the proxy dictates.

As a test of this principle, we set up a bsdproxy to accelerate a webserver
via a 100 Mbit/s internal connection to the webserver machine.  A 1.5 MB 
file was downloaded five times in succession over a 1.5 Mbit/s DSL connection.
Some statistics about the session were recorded and are reproduced below:

[2001-03-22 15:24:16] Available file descriptors:    4136
[2001-03-22 15:24:16] Main event loop iterations:    6398
[2001-03-22 15:24:16] Events received in main loop:  9024
[2001-03-22 15:24:16] Event errors recorded:         0
[2001-03-22 15:24:16] Client connections accepted:   5
[2001-03-22 15:24:16] Bytes read from clients:       470
[2001-03-22 15:24:16] Bytes written to clients:      7449185
[2001-03-22 15:24:16] Bytes read from server:        7449185
[2001-03-22 15:24:16] Bytes written to server:       470
[2001-03-22 15:24:16] Maximum concurrent clients:    1
[2001-03-22 15:24:16] Time reading from clients:     81.344
[2001-03-22 15:24:16] Time reading from server:      0.633
[2001-03-22 15:24:16] Time writing to clients:       81.333
[2001-03-22 15:24:16] Time writing to server:        0.638
[2001-03-22 15:24:16] Acceleration factor:           128.475
[2001-03-22 15:24:16] ===> bsdproxy version 0.01 finished

The relevant number is the "Acceleration factor" of 128.475.  What this
means is that the webserver was able to write the data to the proxy about
128 times faster than it could have written it directly to the client.
As a result, the Apache processes running on the webserver were free
to respond to other requests sooner than they otherwise would have been.
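
This README does not say how the acceleration factor is computed.  The
logged figures are consistent with the ratio of "Time writing to clients"
to "Time reading from server", since 81.333 / 0.633 is roughly 128.5, and
the small difference from 128.475 could be explained by rounding in the
printed times.  The sketch below assumes that definition;  the function
name acceleration_factor() is hypothetical.

```c
#include <assert.h>

/*
 * Assumed definition (not confirmed by the bsdproxy source):
 * time spent writing to clients divided by time spent reading
 * from the server.  Both arguments must use the same time unit.
 */
double acceleration_factor(double client_write_time, double server_read_time)
{
    assert(server_read_time > 0.0);
    return client_write_time / server_read_time;
}
```

Calling acceleration_factor(81.333, 0.633) with the values from the log
above gives a result between 128 and 129.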

This does NOT mean that one's website will run 128 times faster from behind
a bsdproxy.  It simply means that in certain cases, bsdproxy can help take
some of the load off the Apache processes (which frequently use a lot of
memory).  In those situations, under the same amount of website traffic,
there will typically be fewer Apache processes running if Apache is placed
behind a bsdproxy.  To some extent, this depends on the nature of the data
being served and the connection speed of the clients.  If large files are
being served to clients over narrowband connections, the effect of a 
bsdproxy should be significant.  If small files are being served to clients
over broadband connections, the effect will not be as great.


Todo List
=========

A number of changes would make bsdproxy a more interesting and useful 
program.

* Multiple proxy connections would be nice (e.g. for simultaneous proxying
  of HTTP and HTTPS).

* On-the-fly reconfiguration (e.g. starting and stopping connections to
  various servers through various ports).

* More useful statistics and monitoring information, made available over
  a secure socket to authorized users.

* Priority event queueing (i.e. pending connections from clients are 
  handled at a higher priority than data transmission to clients, so 
  that the proxied server still responds quickly under heavy load).

* Dynamically configurable constraints on client connections, such as 
  total connection time, maximum idle time before disconnect, maximum
  data that can be uploaded by a client, maximum bandwidth allocated 
  to each client, and so forth.

