Running your own failover DNS setup

wlanboy

Content Contributer
It is easy to run your website on different servers.

It is easy to run databases across different servers.

Because virtual servers have become quite cheap.

But it is not that easy to ensure that your visitors are visiting a server that is still running.

You can set up more than one A record, but a lot of browsers do not support DNS load balancing.

You can set up a load balancer, but then you have just moved the problem to another place. Your load balancer is now the single point of failure.

For me DNS looks like a good solution to point visitors to the right server.

During the next weeks I will add more complex scenarios on how to handle DNS failover.

But I want to start with a low effort and simple solution.

Afterwards more tools and servers will be added to the setup.

So everyone can decide how much effort they want to put into their own DNS failover system.

So let's start with the first step into DNS failover.

1. Create a DNS server account that supports dynamic DNS updates
For me HE.NET offers a cheap ($0.00) and reliable DNS service.

If you add an A record you can select that this record can be dynamically updated through a script:

dynamicdns1.jpg

The TTL (time to live) for this record can be set as low as 5 minutes.
That is quite a short time for a free service.

dynamicdns2.jpg

After the creation you have to click on the arrows on the right side to add your access key.

dynamicdns3.jpg

This will be your password to update the A record. [The values are not real - so don't try them.]

Best of all, it works for AAAA (IPv6!) records too:

dynamicdns4.jpg

The update of the ip is simple:


curl "https://dyn.dns.he.net/nic/update?hostname=dyn.example.com&password=password&myip=192.168.0.1"
curl "https://dyn.dns.he.net/nic/update?hostname=dyn.example.com&password=password&myip=2001:db8:beef:cafe::1"

Just use curl to call a url.
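
The two calls above can be wrapped in small bash functions. This is only a minimal sketch: the function names `he_update_url` and `update_ok` are made up for illustration, and the check for `good` / `nochg` assumes that HE.NET answers in the usual dyndns2 style, so compare it with the responses you actually get.

```shell
#!/bin/bash
# Build the HE.NET update URL (same parameters as the curl calls above).
he_update_url() {
  local hostname="$1" password="$2" ip="$3"
  printf 'https://dyn.dns.he.net/nic/update?hostname=%s&password=%s&myip=%s\n' \
    "$hostname" "$password" "$ip"
}

# Assumption: a successful answer starts with "good" or "nochg".
update_ok() {
  case "$1" in
    good*|nochg*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (needs network access and a real access key):
#   response=$(curl -s "$(he_update_url dyn.example.com password 192.168.0.1)")
#   if update_ok "$response"; then echo "record updated"; else echo "failed: $response"; fi
```

Keeping the URL building separate from the curl call makes it easy to print the URL first and check it before anything is sent.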

2. Write a short bash script that manages everything

So what do we need?

  • A textfile containing ip addresses of the web servers
  • A way to check which servers are online
  • A call to HE.NET to update the DNS A record

I am using just bash, curl and dig.

dig is part of the dnsutils package and can be installed with the following command:


sudo apt-get install dnsutils

After that we can create the file containing the ips:


nano ~/ips

Content:


127.0.0.1
186.0.0.1
10.1.1.1

So one ip per line. I am using the order to prioritize the servers because the script takes the first usable ip to update the DNS record.

Now we can create the bash file:


nano ~/dnsupdate && chmod +x ~/dnsupdate

Content:


#!/bin/bash
IFS=$'\n' read -d '' -r -a ips < ~/ips
statusweb=()
index=0
echo "=================================="
echo "check following ips"
echo "${ips[@]}"

for i in "${ips[@]}"
do
    echo "=================================="
    echo "checking $i"
    # Ask for the HTTP headers only and give up after 5 seconds
    if curl -m 5 -s -k --head --request GET "$i" | grep "200 OK" > /dev/null
    then
        statusweb[index]=true
        echo "================="
        echo "web ip is up"
        echo "================="
    else
        statusweb[index]=false
        echo "================="
        echo "web ip is down"
        echo "================="
    fi
    # Increment after storing, so statusweb uses the same indexes as ips
    let index=index+1
done

echo "=================================="
echo " "
echo "update dns"
echo " "
index=0
for statuswebval in "${statusweb[@]}"
do
    if $statuswebval
    then
        echo "=================================="
        echo "Changing web DNS..."
        oldip=$(dig +short test.domain.com)
        echo "current ip: ${oldip}"
        echo "new ip: ${ips[$index]}"
        echo "=================================="
        # [ needs spaces around the operator; without them the test is always true
        if [ "${ips[$index]}" = "$oldip" ]
        then
            echo "================="
            echo "update not needed"
            echo "================="
        else
            curl "https://dyn.dns.he.net/nic/update?hostname=test.domain.com&password=astromgpassword&myip=${ips[$index]}"
            echo "================="
            echo "update done"
            echo "================="
        fi
        # Stop at the first ip that is up
        break
    else
        echo "================="
        echo "Skipping ${ips[$index]}"
        echo "================="
    fi
    let index=index+1
done

echo "=================================="
echo "end of script"
echo "=================================="

So what is this script doing?

  • read the list of ips into an array (ips)
  • create an empty array (statusweb) and a counter (index)
  • echo the list of ips
  • loop through the ips
    for each ip do
    add 1 to index (let is cool)
  • curl the http header of the webservice running on the ip and check if it is 200
    Timeout is set to 5 seconds to ensure the script does not hang.
  • the curl | grep pipeline returns true or false so it can be part of an if statement
  • we store the status of the ip with a true or false value in the array statusweb
  • loop through the status values
  • for each status do
    dig the DNS record you want to update
    +short ensures that only the ip is returned
  • compare the ip of the DNS record with the first ip which is working
  • update the DNS record or skip it
  • done

Last step is creating a cron job calling this script every 5 minutes:


crontab -e

Add line:


*/5 * * * * /bin/bash ~/dnsupdate

I think this is the bare minimum setup to check webservers and update DNS records.
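
The core of the script above (take the first ip that answers, and only touch DNS when the record differs) can be sketched as two small functions. The names `first_healthy_ip`, `needs_update` and `http_ok` are hypothetical; the health check is passed in as a command so the logic can be tried without any network access.

```shell
#!/bin/bash
# Print the first ip for which the given check command succeeds.
first_healthy_ip() {
  local check="$1" ip
  shift
  for ip in "$@"; do
    if "$check" "$ip"; then
      printf '%s\n' "$ip"
      return 0
    fi
  done
  return 1   # no server answered
}

# True if the DNS record has to be changed.
# Note the spaces around "!=": without them [ ] sees a single
# non-empty word and the test is always true.
needs_update() {
  [ "$1" != "$2" ]
}

# Example check, equivalent to the curl call used in the script:
http_ok() {
  curl -m 5 -s -k --head --request GET "$1" | grep "200 OK" > /dev/null
}
# Usage: first_healthy_ip http_ok 127.0.0.1 186.0.0.1 10.1.1.1
```

Because the list order is the priority, the function simply returns on the first success and never looks at the remaining ips.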

So let's talk about some disadvantages:

  • Webservers are only checked from one internet connection
    So you cannot be sure if the server is offline for all visitors or only for you
  • There is no history record for the reliability of one ip
    So you might select an ip that is currently up but only has an uptime of 80%
    You can try to manage that by sorting the list of ips but you have to keep the records for that too.
  • If you use more than one vps to run this script it might happen that the different scripts overwrite each other's results.
    So if you have a network split or more than one routing issue the DNS record keeps flipping around.
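
Keeping the records mentioned in the second point does not need much. The following is only a sketch under assumptions: the history file format (one "ip state" pair per line) and the function names `record_result` and `uptime_percent` are invented for this example and are not part of the script above.

```shell
#!/bin/bash
# Append one check result ("up" or "down") for an ip to a history file.
record_result() {
  local file="$1" ip="$2" state="$3"
  printf '%s %s\n' "$ip" "$state" >> "$file"
}

# Print the uptime of an ip in percent (integer), based on the history file.
uptime_percent() {
  local file="$1" ip="$2" total=0 up=0 rec_ip state
  while read -r rec_ip state; do
    if [ "$rec_ip" = "$ip" ]; then
      total=$((total + 1))
      if [ "$state" = "up" ]; then
        up=$((up + 1))
      fi
    fi
  done < "$file"
  if [ "$total" -eq 0 ]; then
    return 1   # no records for this ip yet
  fi
  echo $((up * 100 / total))
}

# Usage inside the check loop (hypothetical file name):
#   record_result ~/iphistory "$i" up
#   uptime_percent ~/iphistory "$i"
```

With numbers like these the list of ips could be sorted by reliability instead of by hand.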

As always I am looking for feedback, improvements and other solutions.

Next step is to add CloudFlare support.

First thing you need is the API key, which can be found at the bottom of your account information page.

cloudflare1.jpg

The API itself is easy, but is using JSON.

So we need some Ruby magic to get this done.


nano dnsupdate.rb

Content:


require 'json'
domain = ARGV[0]
ip = ARGV[1]
id = ""
listResponse = `curl [parameters 1]`
puts listResponse

domains = JSON.parse(listResponse)
domains['response']['recs']['objs'].each do | domainrecord |
puts domainrecord
if (domain == domainrecord['name'])
id = domainrecord['rec_id']
break
end
end

updateResponse = `curl [parameters 2]`
status = JSON.parse(updateResponse)
puts status

if status['result'] == 'success'
puts "update done"
else
puts "error during update of #{domain}"
end

So what are we doing?

  • save the two parameters to the vars domain and ip
  • Call curl to get the list of domains and DNS records
  • Parse the JSON response to find the correct record for the given domain
    In this example I want to update the A record for the domain itself
  • If the record is found save the id of the record (needed for update)
  • Send the update request via curl
  • Check the status to ensure that the update is done

We now take a look at the two curl calls:

  • list domains and records

    curl https://www.cloudflare.com/api_json.html \
    -d 'a=rec_load_all' \
    -d 'tkn=[Your API_TOKEN]' \
    -d 'email=[Your CloudFlare login]' \
    -d 'z=[domain to update]'

  • update domain
    curl https://www.cloudflare.com/api_json.html \
      -d 'a=rec_edit' \
      -d 'tkn=[Your API_TOKEN]' \
      -d 'id=[DB ID of the record you want to update]' \
      -d 'email=[Your CloudFlare login]' \
      -d 'z=[Domain of record]' \
      -d 'type=A' \
      -d 'name=[Name of record to update]' \
      -d 'content=[new ip address]' \
      -d 'service_mode=1' \
      -d 'ttl=1'
    TTL is the time to live of the record in seconds. 1 is the value for the "Automatic" setting.

If you have a nodeping account you can use the provided results as the source of ping numbers.

You have to set the results to "public access" to enable the script to download the ping results without an API key.

This is my Ruby script that fetches the ping results from nodeping, checks the number of network failures, and sets the ip which a) is currently online and b) has the lowest number of failures as the new value of the A record of the given domain:


require 'json'
require 'date'

class NodePingResult
attr_accessor :ip, :isup, :numberOfBadResults

def to_s
"#{@ip} #{@isup} #{@numberOfBadResults}"
end
end

nodepingReports = []
nodepingIPs = []
nodePingResults = []

recordId = ""
ip = ""

#######################################################
#Please change your cloudflare and nodeping information
#######################################################
domain = 'mydomain'
cloudflaretoken='QWERTZUIOP'
cloudflarelogin='[Your CloudFlare login]'
##################################
nodepingReports << 'https://nodeping.com/reports/results/[reportid]/50?format=json'
nodepingIPs << '127.0.0.1'
nodepingReports << 'https://nodeping.com/reports/results/[reportid]/50?format=json'
nodepingIPs << '127.0.0.1'
#######################################################

nodepingIPs = nodepingIPs.reverse

counter = 0
nodepingReports.reverse_each do | report |

res = NodePingResult.new
res.ip = nodepingIPs[counter]
res.numberOfBadResults = 0

reportResult = `curl #{report}`
results = JSON.parse(reportResult)
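# Results are listed newest first; walk them oldest to newest so isup ends up as the latest status
results.reverse_each do | result |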
if ('Success' == result['m'])
res.isup = true
else
res.isup = false
res.numberOfBadResults += 1
end
end
counter += 1
nodePingResults << res
end

nodePingResults.sort! { |a,b| a.numberOfBadResults <=> b.numberOfBadResults }
nodePingResults.each do | newip |
if (newip.isup == true)
ip = newip.ip
break
end
end
puts "selected ip: #{ip}"

parameterDomainList = "-d 'tkn=#{cloudflaretoken}' -d 'email=#{cloudflarelogin}' -d 'z=#{domain}'"
listResponse = `curl https://www.cloudflare.com/api_json.html -d 'a=rec_load_all' #{parameterDomainList}`
#puts listResponse

domains = JSON.parse(listResponse)
domains['response']['recs']['objs'].each do | domainrecord |
puts domainrecord
if (domain == domainrecord['name'])
recordId = domainrecord['rec_id']
break
end
end
puts recordId

parameterDomainUpdate = "-d 'tkn=#{cloudflaretoken}' -d 'id=#{recordId}' -d 'email=#{cloudflarelogin}' -d 'z=#{domain}' -d 'type=A' -d 'name=#{domain}' -d 'content=#{ip}' -d 'service_mode=1' -d 'ttl=1'"
updateResponse = `curl https://www.cloudflare.com/api_json.html -d 'a=rec_edit' #{parameterDomainUpdate}`
status = JSON.parse(updateResponse)
#puts status

if status['result'] == 'success'
puts "update done: #{domain} now pointing to #{ip}"
else
puts "error - check last response: #{status['msg']}"
end

The script has to loop through the ping results in reverse order because the list starts with the newest entry first.

Due to the lack of API access you have to enter the ip address of each nodeping test.

And for the people who don't want to use Ruby - the bash only version of the script:

1. Create list of ips to check


nano ~/ips

Content:


127.0.0.1
186.0.0.1
10.1.1.1

2. Create list of node ping tests for the given ips (same order)


nano ~/results

Content:


https://nodeping.com/reports/results/[id of test]/100?format=json
https://nodeping.com/reports/results/[id of test]/100?format=json
https://nodeping.com/reports/results/[id of test]/100?format=json

3. Install libs:


sudo apt-get install dnsutils curl jq

If jq is not in the repos you can install it yourself:


#32bit version
wget http://stedolan.github.io/jq/download/linux32/jq && chmod +x jq && cp jq /usr/bin
#64bit version
wget http://stedolan.github.io/jq/download/linux64/jq && chmod +x jq && cp jq /usr/bin

4. Create bash script:


nano dnsupdate && chmod +x dnsupdate

Content:


#!/bin/bash
###########################################
#Configuration
###########################################
domain="domain.com"
recordId="#"
cloudflarelogin="cloudflare-login"
cloudflaretoken="cloudflare-token"
###########################################
#Files
###########################################
IFS=$'\n' read -d '' -r -a iplist < ~/ips
IFS=$'\n' read -d '' -r -a results < ~/results
###########################################
statusweb=()
statuspoints=()
index=0
echo "=================================="
echo "check following ip list with nodeping"
echo "${iplist[@]}"

for i in ${iplist[@]}
do
echo "=================================="
echo "checking $i"
curl -m 5 -s "${results[index]}" -o "./res${i}"
# Number of "Success" entries in the report = quality score of this ip
statuspoints[index]=$(grep -Po '"m":.*?[^\\]",' "./res${i}" | grep -c Success)
# The newest entry comes first and tells us the current status
resultstring=$(grep -Po '"m":.*?[^\\]",' "./res${i}" | head -n 1)
if [ "${resultstring}" = "\"m\":\"Success\"," ]
then
statusweb[index]=1
echo "================="
echo "web ip is up"
echo "================="
else
statusweb[index]=0
echo "================="
echo "web ip is down"
echo "================="
fi
echo " "
echo "status: ${statusweb[index]}"
echo "status: ${statuspoints[index]}"
echo " "
let index=index+1
done

max=0
counter=0
indexselectedip=0

for point in ${statuspoints[@]}; do
if (( point > max && statusweb[counter] == 1 )); then
max=$point
indexselectedip=$counter
fi
let counter=counter+1
done

parameterDomainList="-d a=rec_load_all -d tkn=${cloudflaretoken} -d email=${cloudflarelogin} -d z=${domain}"
domaindata=$(curl https://www.cloudflare.com/api_json.html ${parameterDomainList} -o domainlist)
# Let jq pick the rec_id of the first record whose name matches the domain
recordId=$(jq -r --arg name "${domain}" '.response.recs.objs[] | select(.name == $name) | .rec_id' domainlist | head -n 1)
echo "record id for ${domain}: ${recordId}"

echo " "
echo "=================================="
echo " "
echo "update dns"
echo " "
echo "=================================="
echo "Changing web DNS..."
oldip=$(dig +short ${domain} | head -n 1)
sleep 2
echo "current ip: ${oldip}"
echo "new ip: ${iplist[$indexselectedip]} (points: ${statuspoints[$indexselectedip]})"
tester=${iplist[indexselectedip]}
echo "=================================="
if [[ "${tester}" == "${oldip}" ]]
then
echo "================="
echo "update not needed"
echo "================="
else
parameterDomainUpdate="-d act=rec_edit -d a=rec_edit -d tkn=${cloudflaretoken} -d id=${recordId} -d email=${cloudflarelogin} -d z=${domain} -d type=A -d name=${domain} -d content=${iplist[indexselectedip]} -d service_mode=1 -d ttl=1"
cloudflareresponse=$(curl https://www.cloudflare.com/api_json.html ${parameterDomainUpdate})
echo "response: ${cloudflareresponse}"
echo "================="
echo "update done"
echo "================="
fi
echo "=================================="
echo "end of script"
echo "=================================="

So what are we doing here?

  • Load list of ips and nodeping tests
  • Check results for each ip
    count number of good results to check quality of host
  • check current status of ip
  • Sort ips by status and uptime
  • Load list of DNS records for domain from cloudflare
  • Search for domain record we want to update
  • dig domain to get current ip
  • compare current ip with the one with the best uptime and score
  • update dns record
  • or do nothing if the record is already pointing to the best ip
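
The selection step in the list above (only consider ips that are currently up, then take the one with the most good results) can be sketched on its own. The function name `pick_best` and the "ip,status,points" argument format are assumptions made for this example; the script itself keeps these values in separate arrays.

```shell
#!/bin/bash
# Each argument has the form "ip,status,points":
#   status is 1 (up) or 0 (down), points is the number of good results.
# Prints the ip that is up and has the most points; on a tie the first one wins.
pick_best() {
  local best_ip="" best_points=-1 entry ip up points
  for entry in "$@"; do
    IFS=, read -r ip up points <<< "$entry"
    if [ "$up" = "1" ] && [ "$points" -gt "$best_points" ]; then
      best_ip="$ip"
      best_points="$points"
    fi
  done
  if [ -z "$best_ip" ]; then
    return 1   # nothing is up, so there is nothing to switch to
  fi
  printf '%s\n' "$best_ip"
}

# Usage: pick_best "127.0.0.1,1,90" "186.0.0.1,0,100" "10.1.1.1,1,97"
```

Returning an error when no ip is up lets the caller skip the DNS update instead of pointing the record at a dead server.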
 

5n1p

New Member
I have been looking for something like this, and also use dns.he.net, but have not thought of 'dyn', great stuff thank you :)
 

Jack

Active Member
@wlanboy you interested in doing a guide for Cloudflare? Also maybe integrating nodeping (Can provide sub-account) for the failover bit?
 

wlanboy

Content Contributer
Jack said: "@wlanboy you interested in doing a guide for Cloudflare? Also maybe integrating nodeping (Can provide sub-account) for the failover bit?"
Yup - cloudflare is interesting and nodeping too.

Write a PM with the nodeping login and I will test the setup with one of my free .ml domains.
 

wlanboy

Content Contributer
"Surely this can be used with Cloudflare as well?"
I have updated the tutorial to include the Cloudflare API.

I wanted to add the NodePing API too, but:

A token is required to use the API and must be included with each API request.


The token for your account is listed in the Account Settings section of of your account.

SubAccounts do not have token access.
 

peterw

New Member
Great script. I was looking for that solution and now I can do it on my own. And I want, like Jack, to use nodeping as an uptime source too.
 

wlanboy

Content Contributer
Jack said: "@wlanboy you interested in doing a guide for Cloudflare? Also maybe integrating nodeping (Can provide sub-account) for the failover bit?"
peterw said: "Great script. I was looking for that solution and now I can do it on my own. And I want, like Jack, to use nodeping as a uptime source too."
Added the cloudflare + nodeping solution at the end of the first post.
 

peterw

New Member
Great. Thank you for providing a solution. It is running fine on my vps. I noticed one difference: you reverse both arrays, so I have to put my preferred ip at the end of the list. Can someone translate this to php?
 

wlanboy

Content Contributer
found this little gem on how to parse JSON results in bash.
Thank you for pointing me to jq.

It was a pain to parse json with awk.

So my bash (+jq) only version of the "check nodeping status and then update cloudflare dns record" script is at the end of my first post.
 

mikho

Not to be taken seriously, ever!
wlanboy said: "Thank you for pointing me to jq. It was a pain to parse json with awk. So my bash (+jq) only version of the "check nodeping status and then update cloudflare dns record" script is at the end of my first post."
Before I read the end of your first post I will say that that part will probably be what I use for my failover LES boxes.


Thanks!
 

wlanboy

Content Contributer
The first post has become too long to find the last solution easily, so here is the latest one:

And for the people who don't want to use Ruby - the bash only version of the script:

1. Create list of ips to check


nano ~/ips

Content:


127.0.0.1
186.0.0.1
10.1.1.1

2. Create list of node ping tests for the given ips (same order as ips)


nano ~/results

Content:


https://nodeping.com/reports/results/[id of test]/100?format=json
https://nodeping.com/reports/results/[id of test]/100?format=json
https://nodeping.com/reports/results/[id of test]/100?format=json

3. Install libs:


sudo apt-get install dnsutils curl jq

If jq is not in the repos you can install it yourself:


#32bit version
wget http://stedolan.github.io/jq/download/linux32/jq && chmod +x jq && cp jq /usr/bin
#64bit version
wget http://stedolan.github.io/jq/download/linux64/jq && chmod +x jq && cp jq /usr/bin

4. Create bash script:


nano dnsupdate && chmod +x dnsupdate

Content:

#!/bin/bash
###########################################
#Configuration
###########################################
domain="domain.com"
recordId="#"
cloudflarelogin="cloudflare-login"
cloudflaretoken="cloudflare-token"
###########################################
#Files
###########################################
IFS=$'\n' read -d '' -r -a iplist < ~/ips
IFS=$'\n' read -d '' -r -a results < ~/results
###########################################
statusweb=()
statuspoints=()
index=0
echo "=================================="
echo "check following ip list with nodeping"
echo "${iplist[@]}"

for i in ${iplist[@]}
do
    echo "=================================="
	echo "checking $i" 
	curl -m 5 -s "${results[index]}" -o "./res${i}"
	# Number of "Success" entries in the report = quality score of this ip
	statuspoints[index]=$(grep -Po '"m":.*?[^\\]",' "./res${i}" | grep -c Success)
	# The newest entry comes first and tells us the current status
	resultstring=$(grep -Po '"m":.*?[^\\]",' "./res${i}" | head -n 1)
	if [ "${resultstring}" = "\"m\":\"Success\"," ]
	then
		statusweb[index]=1
		  echo "================="
		  echo "web ip is up"
		  echo "================="		
	else
	    statusweb[index]=0
		  echo "================="
		  echo "web ip is down"
		  echo "================="	
	fi
	echo " "
	echo "status: ${statusweb[index]}"
	echo "status: ${statuspoints[index]}"
	echo " "
	let index=index+1
done

max=0
counter=0
indexselectedip=0

for point in ${statuspoints[@]}; do
    if (( point > max && statusweb[counter] == 1 )); then 
		max=$point
		indexselectedip=$counter
	fi
	let counter=counter+1
done

parameterDomainList="-d a=rec_load_all -d tkn=${cloudflaretoken} -d email=${cloudflarelogin} -d z=${domain}"
domaindata=$(curl https://www.cloudflare.com/api_json.html ${parameterDomainList} -o domainlist)
# Let jq pick the rec_id of the first record whose name matches the domain
recordId=$(jq -r --arg name "${domain}" '.response.recs.objs[] | select(.name == $name) | .rec_id' domainlist | head -n 1)
echo "record id for ${domain}: ${recordId}"

echo " "
echo "=================================="
echo " "
echo "update dns"
echo " "
		echo "=================================="
		echo "Changing web DNS..."
		oldip=$(dig +short ${domain} | head -n 1)
		sleep 2
		echo "current ip: ${oldip}"
		echo "new     ip: ${iplist[$indexselectedip]} (points: ${statuspoints[$indexselectedip]})"
		tester=${iplist[indexselectedip]}
		echo "=================================="
		if [[ "${tester}" == "${oldip}" ]]
		then
		  echo "================="
		  echo "update not needed"
		  echo "================="
		else
		  parameterDomainUpdate="-d act=rec_edit -d a=rec_edit -d tkn=${cloudflaretoken} -d id=${recordId} -d email=${cloudflarelogin} -d z=${domain} -d type=A -d name=${domain} -d content=${iplist[indexselectedip]} -d service_mode=1 -d ttl=1"
		  cloudflareresponse=$(curl https://www.cloudflare.com/api_json.html ${parameterDomainUpdate})
		  echo "response: ${cloudflareresponse}"
		  echo "================="
		  echo "update done"
		  echo "================="		  
		fi
echo "=================================="
echo "end of script"
echo "=================================="
 

bdtech

New Member
Nice work! You're going to want to run the curl checks at least every minute via a cron job. Also keep in mind your dig results can be cached, so it can get a bit more complex. I do things slightly differently in bash and only call an API when action is needed (i.e. a server goes down).


Cloudflare has many advantages:


CF for DNS only (grey cloud): You can set a low 2min TTL


CF for DNS + CF Reverse Proxy (orange cloud): Don't have to worry about TTLs expiring and DNS servers not respecting low TTLs. Rather instant propagation internally through the CF network


You can also effectively load balance behind CF


Anycast which HE doesn't support


Disadvantage: the API is down on occasion for usually short periods of time.


CF is the new SPOF for the web


Tip: you can use StatusCake's Ping URL to trigger failovers, or Uptime Robot's RSS feed, or even better Route 53's health checks
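
One way around the cached dig answers mentioned in this post is to remember the last ip that was pushed and only call the API when the selected ip changes. This is a minimal sketch, assuming a hypothetical state file and the made-up function name `ip_changed`; it is not taken from any of the scripts in this thread.

```shell
#!/bin/bash
# Return 0 (and remember the new ip) only if it differs from the ip
# stored by the previous run; return 1 if nothing changed.
ip_changed() {
  local state_file="$1" new_ip="$2" last_ip=""
  if [ -f "$state_file" ]; then
    last_ip=$(cat "$state_file")
  fi
  if [ "$new_ip" = "$last_ip" ]; then
    return 1
  fi
  printf '%s\n' "$new_ip" > "$state_file"
  return 0
}

# Usage (hypothetical file name):
#   if ip_changed ~/.dnsupdate_last_ip "$selected_ip"; then
#     echo "call the DNS API here"
#   fi
```

Since the state file lives on the machine running the check, this only works as long as a single host is responsible for the updates.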
 