Distributed Internet Servers by Ian Kallen

Contents
Introduction
As the Internet's growth charges ahead at a phenomenal pace and networks opt to share traffic only with private peering agreements in place, distributing an Internet server's load across multiple machines becomes increasingly imperative. It is my hope that this resource can be developed into one that is comprehensive in its provision of information on the technology and applications related to these needs.

Distributing content in a way that minimizes the network latency that end users must endure is highly desirable for content providers. While distribution based on network topology, such as Cisco's Distributed Director and IBM's Interactive Network Dispatcher, sounds attractive, a large portion of Internet end user traffic comes from large networks such as AOL (via ans.net), Netcom, att.net, uu.net and CompuServe, whose network performance is subject to router backhauling and NAP configurations. The benefits of content distribution under such a regime are unclear at this time.

Articles that discuss these issues:
Load Balancing

Round Robin Load Balancing With BIND

This is excerpted from comp.protocols.tcp-ip.domains FAQ
Date: Wed Mar  1 11:04:43 EST 1995
Subject: Q4.10 - Distributing load using named

Q: If you attempt to distribute the load on a system using named, won't
   the first response be cached, and then later queries use the cached
   value? (This would be for requests that come through the same
   server.)

A: Yes.  So it can be useful to use a lower TTL on records where this is
   important.  You can use values like 300 or 500 seconds.

   If your local caching server has ROUND_ROBIN, it does not matter
   what the authoritative servers have -- every response from the cache
   is rotated.

   But if it doesn't, and the authoritative server site is depending on
   this feature (or the old "shuffle-A") to do load balancing, then if
   one doesn't use small TTLs, one could conceivably end up with a
   really nasty situation, e.g., hundreds of workstations at a branch
   campus pounding on the same front end at the authoritative server's
   site during class registration.

   Not nice.

A: Paul Vixie has an example of the ROUND_ROBIN code in action.  Here is
   something that he wrote regarding his example:

     >I want users to be distributed evenly among those 3 hosts.

     Believe it or not :-), BIND offers an ugly way to do this.  I offer
     for your collective amusement the following snippet from the
     ugly.vix.com zone file:

       hydra           cname        hydra1
                       cname        hydra2
                       cname        hydra3
       hydra1          a            10.1.0.1
                       a            10.1.0.2
                       a            10.1.0.3
       hydra2          a            10.2.0.1
                       a            10.2.0.2
                       a            10.2.0.3
       hydra3          a            10.3.0.1
                       a            10.3.0.2
                       a            10.3.0.3

      Note that having multiple CNAME RR's at a given name is
      meaningless according to the DNS RFCs but BIND doesn't mind (in
      fact it doesn't even complain).  If you call
      gethostbyname("hydra.ugly.vix.com") (try it!) you will get
      results like the following.  Note that there are two round robin
      rotations going on: one at ("hydra",CNAME) and one at each
      ("hydra1",A) et al.  I used a layer of CNAME's above the layer of
      A's to keep the response size down.  If you don't have nine
      addresses you probably don't care and would just use a pile of
      CNAME's pointing directly at real host names.

      {hydra.ugly.vix.com}
      name: hydra2.ugly.vix.com
      aliases: hydra.ugly.vix.com
      addresses: 10.2.0.2 10.2.0.3 10.2.0.1

      {hydra.ugly.vix.com}
      name: hydra3.ugly.vix.com
      aliases: hydra.ugly.vix.com
      addresses: 10.3.0.2 10.3.0.3 10.3.0.1

      {hydra.ugly.vix.com}
      name: hydra1.ugly.vix.com
      aliases: hydra.ugly.vix.com
      addresses: 10.1.0.2 10.1.0.3 10.1.0.1

      {hydra.ugly.vix.com}
      name: hydra2.ugly.vix.com
      aliases: hydra.ugly.vix.com
      addresses: 10.2.0.3 10.2.0.1 10.2.0.2

      {hydra.ugly.vix.com}
      name: hydra3.ugly.vix.com
      aliases: hydra.ugly.vix.com
      addresses: 10.3.0.3 10.3.0.1 10.3.0.2
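
The two nested rotations described in the excerpt can be modelled with a small sketch. This is a toy simulation, not BIND's code: the class RotatingRRSet and the function make_resolver are hypothetical names, and the rule that a record set rotates by one position each time it is answered is an assumption chosen because it reproduces the ordering shown in the sample output above.

```python
from collections import deque


class RotatingRRSet:
    """A record set that rotates by one position each time it is answered."""

    def __init__(self, records):
        self._records = deque(records)

    def answer(self):
        # Assumed rotation rule: move the first record to the end, then answer.
        self._records.rotate(-1)
        return list(self._records)


def make_resolver():
    """Build a resolver for "hydra" over an in-memory copy of the zone data."""
    cnames = RotatingRRSet(["hydra1", "hydra2", "hydra3"])
    addresses = {
        f"hydra{n}": RotatingRRSet([f"10.{n}.0.{h}" for h in (1, 2, 3)])
        for n in (1, 2, 3)
    }

    def resolve():
        # First rotation: which CNAME target is followed.
        target = cnames.answer()[0]
        # Second rotation: the order of that target's A records.
        return target, addresses[target].answer()

    return resolve


if __name__ == "__main__":
    resolve = make_resolver()
    for _ in range(5):
        name, addrs = resolve()
        print(name, " ".join(addrs))
```

A client that always connects to the first address returned would, under this model, be spread across all nine addresses over successive uncached lookups. Cached answers are not modelled here, which is why the low TTL advice above still matters.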

SQL Driven DNS Manipulation

Dynamic "System Burden" Based Load Balancing

IBM's Interactive Network Dispatcher

Network Topologically Based Load Balancing

Cisco Systems' Distributed Director

Genuity's Hopscotch

HydraWEB

metainfo

ISI Network's ShortCut

See "IBM's Interactive Network Dispatcher" above, as it also has a component that is network topology aware.

Layer 4 Switching

These switches operate at a higher level than conventional switches: layer 4 (the transport layer) of the OSI model, hence 'layer 4 switching.' By being aware of the burden that connected devices are under, layer 4 switches can intelligently decide which machine is best suited to fulfill a request.
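
The selection step can be sketched as follows. This is only an illustration of the idea, not any vendor's algorithm: the Backend record, the pick_backend function, and the use of active connection count as the measure of burden are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """Hypothetical view of one server as the switch sees it."""
    address: str
    active_connections: int = 0
    healthy: bool = True


def pick_backend(backends):
    """Return the healthy backend with the fewest active connections."""
    candidates = [b for b in backends if b.healthy]
    if not candidates:
        raise RuntimeError("no healthy backends available")
    return min(candidates, key=lambda b: b.active_connections)


if __name__ == "__main__":
    pool = [
        Backend("10.0.0.1", active_connections=12),
        Backend("10.0.0.2", active_connections=3),
        Backend("10.0.0.3", active_connections=1, healthy=False),
    ]
    chosen = pick_backend(pool)
    chosen.active_connections += 1  # the new connection now counts against it
    print(chosen.address)
```

Other measures of burden, such as response time or CPU load reported by an agent on each server, could be substituted for the connection count without changing the shape of the decision.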

Alteon Networks

Holontech Corporation

Geographically Based Load Balancing

Proxy caching front-end servers


Content Synchronization and Distribution

Many databases have built-in facilities for replication and synchronization; most file systems do not. For most high availability and high performance Internet servers, serving files from a common NFS mount offers little if any benefit. Most system load is due to disk subsystem I/O, so distributing the load across more disk spindles and more disk controllers will offer better performance than centralizing these functions. Many implementations of NFS also suffer from performance and security problems. A possible solution to NFS's drawbacks is cachefs, but it is supported on only a few platforms (I know of Solaris and IRIX; perhaps there are others I haven't heard of yet).

Some of the technologies that come to mind for content synchronization include integrity checkers such as tripwire and source control/distributors such as CVSup, but adapting these systems to your Internet server's content may be cumbersome.
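
For a sense of what a home-grown alternative might involve, here is a minimal sketch that compares checksums of a master copy and a replica to decide which files need attention. It does not reflect how tripwire or CVSup work internally; the function names build_manifest and diff_manifests and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def diff_manifests(master, replica):
    """Return (paths to copy to the replica, paths to delete from the replica)."""
    to_copy = sorted(p for p, digest in master.items() if replica.get(p) != digest)
    to_delete = sorted(p for p in replica if p not in master)
    return to_copy, to_delete
```

Each replica could build its own manifest and send only that to the master, so that file contents cross the network only for the paths that actually differ. Copying the files, and doing so atomically while the server is live, is left out of the sketch.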

Performance Measurement

Inverse Network Technology
Keynote Systems

rsync

rdist

tripwire

CVSup

mirror

StarBurst Communications Corporation's Multicast

Windows NT Directory Replication

Remote Management

Bay Technical Associates
Remote power control, remote consoles & alarm reporting
dataprobe
Remote power control and alarm reporting
Rose Electronics
Remote control devices
Cybex Computer Products
Remote control and power control devices
Western Telematic
Remote control and power control devices

System Failover

© Ian Kallen
Direct any additions or corrections to the address above