ISC DHCP Failover Configuration

As a follow-up to the Install ISC DHCP on macOS Catalina post, I wanted to create a quick how-to on setting up ISC DHCP failover…because having at least two DHCP servers is good practice. It WILL save your behind…at some point you are going to fat-finger something in the DHCP configuration, or you will want to do maintenance/updates on the servers, and dhcpd will stop working. Having DHCP failover set up will keep things running, and your users blissfully unaware, while you find and fix your typo or perform maintenance/updates.

Requirements

ISC DHCP server running on two servers. ISC highly recommends that both servers run the same version of DHCP. If not, you may run into issues and unexpected behavior. It’s not hard to have the same version running on both, so just do that…you’ll have fewer headaches in the long run.

Configure Failover

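Creating failover ISC DHCP servers is pretty simple, as there are really only two things that need to be in place: a failover peer declaration on each server, and a reference to that peer in every pool that should fail over.

Edit the /etc/dhcpd.conf file on both DHCP servers, to let them know they are failover peers for each other.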

Start with the primary DHCP server, open dhcpd.conf, and add this bit of configuration…

# DHCP Failover Configuration - PRIMARY
failover peer "dhcp-failover" {
     primary;                    # Declared Primary DHCP Server
     address 10.0.1.10;          # Primary DHCP Server IP
     port 847;
     peer address 10.0.1.11;     # Secondary DHCP Server IP
     peer port 647;
     max-response-delay 60;
     max-unacked-updates 10;
     mclt 3600;
     split 255;
     load balance max seconds 5;
 }

Next, go to the secondary DHCP server, open dhcpd.conf, and add this bit of configuration…

# DHCP Failover Configuration - SECONDARY
failover peer "dhcp-failover" {
     secondary;                  # Declared Secondary DHCP Server
     address 10.0.1.11;          # Secondary DHCP Server IP
     port 647;
     peer address 10.0.1.10;     # Primary DHCP Server IP
     peer port 847;
     max-response-delay 60;
     max-unacked-updates 10;
     load balance max seconds 5;
 }
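The two blocks have to mirror each other: each server's address and port must show up as the other server's peer address and peer port. Here is a minimal Python sketch of that check, assuming the simple one-statement-per-line layout used above. The helper names (parse_failover, mirrored) are made up for illustration and are not part of ISC DHCP.

```python
import re

def parse_failover(block: str) -> dict:
    """Collect the address/port statements from a failover peer block."""
    settings = {}
    for line in block.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        match = re.fullmatch(r"(peer address|peer port|address|port)\s+(\S+);", line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings

def mirrored(primary: dict, secondary: dict) -> bool:
    """True when each side's address/port is the other side's peer address/port."""
    return (primary["address"] == secondary["peer address"]
            and primary["port"] == secondary["peer port"]
            and primary["peer address"] == secondary["address"]
            and primary["peer port"] == secondary["port"])

PRIMARY = """
failover peer "dhcp-failover" {
     primary;
     address 10.0.1.10;          # Primary DHCP Server IP
     port 847;
     peer address 10.0.1.11;     # Secondary DHCP Server IP
     peer port 647;
}
"""

SECONDARY = """
failover peer "dhcp-failover" {
     secondary;
     address 10.0.1.11;          # Secondary DHCP Server IP
     port 647;
     peer address 10.0.1.10;     # Primary DHCP Server IP
     peer port 847;
}
"""

print(mirrored(parse_failover(PRIMARY), parse_failover(SECONDARY)))
```

If you swap a port or mistype an IP on one side, the check returns False, which is a lot cheaper to find here than in the dhcpd logs.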

A few things to note…

mclt: This should be defined only on the primary DHCP server. Also known as the maximum client lead time (hence mclt), this is the length of time, in seconds, that either peer may extend a lease without contacting the other. Simply, a shorter time lets the surviving server reclaim its peer’s addresses sooner after a failure, but puts more load on the servers while they cannot talk to each other. Conversely, a longer time lowers that load, but makes recovery slower (thus more noticeable). The trick is finding the balance between DHCP lease renewal time, failover time, and server load. For reference, on a ~6000 node network, I have this set at 1800 seconds (30 minutes); the example above uses 3600 (one hour).

split: Again, this should be defined only on the primary DHCP server. This defines the load balancing split between the two peers, and can be a value between 0 and 256. A value of 256 means no load balancing: the primary server handles all DHCP requests, and the secondary only handles them if the primary becomes unavailable. A setting of 128 means a 50/50 split between the primary and secondary DHCP servers, with each picking up the slack if the other becomes unavailable. The example above uses 255, which leaves nearly all requests with the primary. For reference, I have my DHCP servers set to 128 for 50/50 load balancing between the two servers.
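To make the split arithmetic concrete, here is a rough Python sketch. It assumes the behavior described in the dhcpd.conf man page as I read it: dhcpd hashes each client into one of 256 buckets, and the first "split" buckets belong to the primary. The hash itself is computed inside dhcpd and is not reproduced here, and the function names are made up for illustration.

```python
def responsible_peer(bucket: int, split: int) -> str:
    """Which peer answers a client that hashes to this bucket (0-255)."""
    if not 0 <= bucket <= 255:
        raise ValueError("bucket must be between 0 and 255")
    if not 0 <= split <= 256:
        raise ValueError("split must be between 0 and 256")
    # The first `split` buckets go to the primary, the rest to the secondary.
    return "primary" if bucket < split else "secondary"

def primary_share(split: int) -> float:
    """Fraction of the 256 buckets served by the primary."""
    if not 0 <= split <= 256:
        raise ValueError("split must be between 0 and 256")
    return split / 256

for split in (128, 255, 256):
    print(f"split {split}: primary handles {primary_share(split):.1%} of buckets")
```

With 128, half of the buckets land on each server. With 256, the secondary gets none and only steps in when the primary is down.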

At this point, if you restart dhcpd, it will log a bunch of error messages, as the servers know about each other, but do not know which subnets they should do failover for. To fix that, we need to edit the subnet declarations in /etc/dhcpd.conf.

Setup Failover on Subnets

Now that the DHCP servers know about each other, we need to tell them which subnets they will be failover for. To do that, go to the subnet declarations in the /etc/dhcpd.conf file on both servers. If a subnet is handing out IP addresses from a pool, add the following bit of configuration to the pool section…

failover peer "dhcp-failover";

Note: The name "dhcp-failover" must match the failover peer name declared earlier in the dhcpd.conf configuration.

To see this in action, see the subnets in the configuration below. Subnet01 is not handing out a range of IP addresses from a pool (these are statically or manually assigned). However, Subnet02 is dynamically assigning a range of IP addresses from a pool (10.0.2.1 to 10.0.3.250), so we want to have DHCP failover on this subnet.

################## [ SUBNETS ] ##################
# Subnet01 - A set of IP addresses that are manually or statically assigned
subnet 10.0.1.0 netmask 255.255.255.0 {
     option broadcast-address 10.0.1.255;
     option subnet-mask 255.255.255.0;
     option routers 10.0.1.254;
}
# Subnet02 - A set of IP addresses that are dynamically or statically assigned
subnet 10.0.2.0 netmask 255.255.254.0 {
     option broadcast-address 10.0.3.255;
     option subnet-mask 255.255.254.0;
     option routers 10.0.3.254;
     pool { 
          failover peer "dhcp-failover";
          range 10.0.2.1 10.0.3.250; 
     }
}

Note: The DHCP subnets you want to have failover on NEED to be declared on both servers. If not, dhcpd will either fail to start, or repeatedly log errors.
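Since Subnet02 uses a 255.255.254.0 netmask, it spans both 10.0.2.x and 10.0.3.x, which is why the pool can run from 10.0.2.1 to 10.0.3.250. If you want to double-check that kind of arithmetic before editing a pool, here is a small sketch using Python's standard ipaddress module. The function name range_in_subnet is made up for illustration.

```python
import ipaddress

def range_in_subnet(subnet: str, netmask: str, first: str, last: str) -> bool:
    """True if first..last is an ordered range of host addresses inside the subnet."""
    network = ipaddress.ip_network(f"{subnet}/{netmask}")
    low = ipaddress.ip_address(first)
    high = ipaddress.ip_address(last)
    inside = low <= high and low in network and high in network
    # Exclude the network and broadcast addresses themselves.
    return (inside
            and low != network.network_address
            and high != network.broadcast_address)

subnet02 = ipaddress.ip_network("10.0.2.0/255.255.254.0")
print(subnet02.broadcast_address)
print(range_in_subnet("10.0.2.0", "255.255.254.0", "10.0.2.1", "10.0.3.250"))
```

The same call with a range that spills past 10.0.3.255 returns False, which is the sort of typo that otherwise shows up as a dhcpd startup error.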

Restart dhcpd and Test

Now that both ISC DHCP servers know about each other, and we have declared they will do failover for Subnet02, restart dhcpd.
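Before restarting, it can be worth having dhcpd check the configuration syntax first. dhcpd accepts -t to test the configuration and -cf to name the file, so the plain command is dhcpd -t -cf /etc/dhcpd.conf. Below is a small Python wrapper around that command, as a sketch only: it assumes dhcpd is on the PATH and exits non-zero when the file has errors, and the helper names are made up for illustration.

```python
import shutil
import subprocess

def build_check_command(config_path: str = "/etc/dhcpd.conf") -> list:
    """Command line that asks dhcpd to test a config file without serving leases."""
    return ["dhcpd", "-t", "-cf", config_path]

def check_config(config_path: str = "/etc/dhcpd.conf"):
    """Return True if dhcpd accepts the file, False if not, None if dhcpd is not on PATH."""
    if shutil.which("dhcpd") is None:
        return None
    result = subprocess.run(build_check_command(config_path),
                            capture_output=True, text=True)
    if result.returncode != 0:
        # dhcpd reports the offending line on stderr.
        print(result.stderr)
    return result.returncode == 0
```

Run check_config() on both servers, and only restart dhcpd once it returns True on each.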

On ISC DHCP 4.4.2, you will see log messages indicating when the servers are in failover communication with each other. If communications fail, or are re-established, that will be logged as well. You can test failover by disabling DHCP, or unplugging/disabling the network connection, on one of the DHCP servers, and verifying you can still get a DHCP address. If you configured things correctly, things will just continue working.

That’s it…ISC DHCP should now be set for failover.

Other Documentation

ISC also has a nice KB article on this topic, A Basic Guide to Configuring DHCP Failover. Though it’s light on specifics, it will help you get DHCP failover up and running.