Distributed Load Balancing
We've been focusing on local load balancing, but the Web is a distributed
system with clients all over the world. We should also be concerned with a
globally dispersed marketplace and audience.
All the best local load balancing and failover capabilities won't do you
a darn bit of good if the data center hosting your server farm suffers from
a facilities outage.
It is common to think of the global load balancing problem as a geography
problem. There has even been discussion and perhaps an RFC or two about
geographically contextual DNS. But the problem is more than one of geography;
it is one of network proximity. We care most about giving the
end user the most appropriate server as measured by the actual throughput they'll
enjoy (or tolerate, as the case may be) from where they are in relation to
where our servers are on the network.
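
As a minimal sketch of choosing by network proximity rather than geography,
assume we keep a table of observed throughput from client networks to each
of our server sites. The site names, address prefixes, throughput figures,
and the best_site helper below are all hypothetical, made up only to
illustrate the idea.

```python
import ipaddress

# Hypothetical observed throughput in Mbit/s, keyed by client network prefix
# and then by server site. None of these figures are real measurements.
OBSERVED_THROUGHPUT = {
    "198.51.100.0/24": {"us-east": 40.0, "eu-west": 12.0},
    "203.0.113.0/24": {"us-east": 8.0, "eu-west": 35.0},
}

DEFAULT_SITE = "us-east"  # assumed fallback when we have no measurements


def best_site(client_ip: str) -> str:
    """Return the site with the highest observed throughput for client_ip."""
    addr = ipaddress.ip_address(client_ip)
    for prefix, by_site in OBSERVED_THROUGHPUT.items():
        if addr in ipaddress.ip_network(prefix):
            # Pick the site whose measured throughput is largest.
            return max(by_site, key=by_site.get)
    return DEFAULT_SITE
```

Note that nothing here consults a map: a client is sent wherever the
measurements say it will be served fastest, which may or may not be the
geographically nearest site.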
One can marry what we know of routing tables or observed throughput
characteristics to our DNS server, and this is the approach taken by many of
the distributed load balancing vendors. But doing so gives the end user the
most appropriate server for that user's DNS server, which is not necessarily
close to the end user on the network either!
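
The mismatch can be sketched in a few lines. This assumes, as described
above, that the DNS-based balancer decides using only the address of the
resolver that sent the query. The prefixes, site names, and the site_for and
dns_answer helpers are hypothetical.

```python
import ipaddress

# Hypothetical mapping from network prefix to the site that performs best
# for addresses inside that prefix.
BEST_SITE_BY_PREFIX = {
    "198.51.100.0/24": "us-east",
    "203.0.113.0/24": "eu-west",
}


def site_for(ip: str, default: str = "us-east") -> str:
    """Return the best site for an address, or a default if it is unknown."""
    addr = ipaddress.ip_address(ip)
    for prefix, site in BEST_SITE_BY_PREFIX.items():
        if addr in ipaddress.ip_network(prefix):
            return site
    return default


def dns_answer(resolver_ip: str) -> str:
    """What a DNS-based balancer can decide: it sees only the resolver."""
    return site_for(resolver_ip)


# An end user in 203.0.113.0/24 whose DNS server sits in 198.51.100.0/24.
user_ip, resolver_ip = "203.0.113.9", "198.51.100.53"
chosen = dns_answer(resolver_ip)  # decided from the resolver's address
ideal = site_for(user_ip)         # what the user's own address would suggest
```

In this made-up case the balancer picks the site that suits the resolver,
while the user's own address would have pointed to the other one.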