# Internet routing table breaks 512,000 routes



## Francisco (Aug 13, 2014)

Looks like with all the mass IP grabs as well as subletting of space, things have finally hit the 512K mark.

Anyone on the raw end of an upstream router that ran out of route space?

How do most routers handle route exhaustion? Do they keep operating, just with less than a full table? Do they crap out? Or do they end up doing a lot of the routing in software? I honestly haven't worried about it too much, since we always pick DCs based on their in-house networks, not how carrier neutral they are.

Francisco


----------



## sundaymouse (Aug 13, 2014)

http://www.reddit.com/r/talesfromtechsupport/comments/2deb04/today_was_a_bad_day_and_every_isp_knew_it_was/


----------



## SGC-Hosting (Aug 13, 2014)

We do IT work on Long Island (New York), near where we're based. Last week we had set up a wireless network for a company, and they have their RADIUS server in another country. Yesterday morning I got calls about it not working... They were able to reach the RADIUS server... but the RADIUS server wasn't able to reach the domain controllers for authentication! Went nuts trying to figure it out. Ended up setting up a separate network with temporary credentials so everyone could get online.


----------



## SkylarM (Aug 13, 2014)

"Don't fix it till it's broke"

Anyone still think the IPv6 transition will happen anytime soon? Because this is just another prime example that it WON'T happen anytime soon


----------



## concerto49 (Aug 13, 2014)

No issue here. Didn't want to take the risk so got a more expensive router with more routes in case.


----------



## wlanboy (Aug 13, 2014)

sundaymouse said:


> http://www.reddit.com/r/talesfromtechsupport/comments/2deb04/today_was_a_bad_day_and_every_isp_knew_it_was/


For the lazy ones:



> Bit more technical than usual, and not directly about my job, so consider this an informational tale first and foremost. If you wonder why your internet has been acting up in North America in the last 24 hours, here's a graphical explanation.
> 
> The Border Gateway Protocol entries in the Forwarding Information Base finally hit 512K. Yes, 512K... That's tiny, and the fact this caused major network issues in north America shows the fragility of the network caused by overuse of legacy hardware and software too many people never bothered to upgrade or maintain, even when the issues they would cause could be predicted a decade ahead of time. In short, it was an artificial problem because a bunch of people waited till it was broke to preempt the problem.
> 
> ...


----------



## Francisco (Aug 13, 2014)

concerto49 said:


> No issue here. Didn't want to take the risk so got a more expensive router with more routes in case.


Makes the most sense.

I'm assuming people with 512K TCAM's will likely just do partial routes + default.

Francisco


----------



## Mun (Aug 13, 2014)

I think it happened around 8:06 PST


----------



## concerto49 (Aug 13, 2014)

Mun said:


> I think it happened around 8:06 PST


Think we had some abuse with the Fliphost server that's been taken care of. Definitely wasn't related. Not sure about the other stuff there.


----------



## nunim (Aug 13, 2014)

Personally I think this is awesome, companies are going to have to replace some of their dated equipment which means increased IPv6 support (hopefully..)!


----------



## concerto49 (Aug 13, 2014)

nunim said:


> Personally I think this is awesome, companies are going to have to replace some of their dated equipment which means increased IPv6 support (hopefully..)!


Most likely routes will be consolidated / filtered.


----------



## Deleted (Aug 13, 2014)

Most Ciscos will flap because memory is exhausted. The only solution is to stop accepting /24's, which is how it should have been all along.


----------



## Wintereise (Aug 13, 2014)

Francisco said:


> Makes the most sense.
> 
> 
> I'm assuming people with 512K TCAM's will likely just do partial routes + default.
> ...


Filter at a /22 and you'll live another few years. TCAMs can also be adjusted on a few models that are 'virtually' limited to 512k.
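A minimal sketch of what filtering on prefix length means, using Python's standard `ipaddress` module. The prefix list is a hypothetical sample made up for illustration, and `filter_by_length` is an illustrative helper, not any router's actual filter syntax.

```python
import ipaddress

# Hypothetical sample of received prefixes (illustrative only, not real announcements).
received = [
    "198.51.100.0/24",
    "203.0.113.0/24",
    "192.0.2.0/24",
    "10.0.0.0/8",
    "172.16.0.0/12",
    "192.168.0.0/22",
]

def filter_by_length(prefixes, max_len):
    """Split prefixes into those kept (length <= max_len) and those dropped."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    kept = [n for n in nets if n.prefixlen <= max_len]
    dropped = [n for n in nets if n.prefixlen > max_len]
    return kept, dropped

kept, dropped = filter_by_length(received, 22)
print(len(kept), "installed;", len(dropped), "left to a default or covering route")
```

Traffic for the dropped prefixes still needs somewhere to go, which is why this kind of filter is normally paired with a default route or a covering aggregate.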

But for hardware that genuinely can't address more than 512k, yeah -- bad time to be a netops guy.


----------



## dcdan (Aug 13, 2014)

Got this yesterday:



> An emergency maintenance window on the Phoenix NAP network will occur on Wednesday, August 13, 2014, from 2:00 a.m. to 4:00 a.m. Mountain Standard Time (MST).  During this maintenance window, the networking team will upgrade our core devices to allow for more routes. This change will require a reload of each core device that may cause intermittent network communication of roughly 5-10 minutes during this time window.
> 
> Our Network Engineering Team will be taking all necessary precautions to mitigate any prolonged connectivity issues during the maintenance window. If you have any questions, please contact NOC Services.
> 
> Thank you for your understanding and patience as we continue to work toward providing you with the best possible service.


 
... a few hours later:



> As we continue to see problems in our IP services network we will be pushing
> this maintenance to 11:00PM MST. Our network engineers are making the final
> preparations to take on this maintenance and minimize any potential downtime. If
> you have any questions or concerns please feel free to contact us.


----------



## splitice (Aug 13, 2014)

nginx.com and nginx.net had routing issues yesterday to "some US subnets". Hosted by M5 Computer Security. Probably related.


----------



## VPSCorey (Aug 13, 2014)

FRH stayed up 

Most routers with these limits crapped their pants until the TCAM was adjusted and rebooted.


----------



## Wintereise (Aug 13, 2014)

In all honesty, the problem isn't really 'fixable' by upgrading the limit as you go.

You can get away with filtering at /24s now and not care -- but that isn't going to be the case pretty soon, since partial allocations seem to *really* be on the horizon. What happens when (god forbid) we're forced to accept announces for /29s?

Vendors would probably have done better to invent something that could grow with the network, rather than plastering bandaid after bandaid over an already bleeding wound.


----------



## concerto49 (Aug 14, 2014)

Wintereise said:


> In all honesty, the problem isn't really 'fixable' by upgrading the limit as you go.
> 
> 
> You can get away with filtering at /24s now and not care -- but that isn't going to be the case pretty soon, since partial allocations seem to *really* be on the horizon. What happens when (god forbid) we're forced to accept announces for /29s?
> ...


/29? How many LoAs would I have to write? Evil.


----------



## Francisco (Aug 14, 2014)

concerto49 said:


> /29? How many LoAs would I have to write? Evil.


There's a vote at ARIN about it 

Francisco


----------



## splitice (Aug 14, 2014)

How would routes of size /24 be routed if routers without support were to filter them? Do they just get a default route to one upstream carrier and go from there?


----------



## trewq (Aug 14, 2014)

splitice said:


> How would routes of size /24 be routed if routers without support were to filter them? Do they just get a default route to one upstream carrier and go from there?


I'm guessing it follows the route of the larger block until it gets to a hop with the correct route and goes from there.


----------



## VPSCorey (Aug 14, 2014)

No vote required. Nobody will be allowed to announce anything smaller than a /24, period. Some upstreams will accept /25's, but only local to the AS; it gets filtered upstream.

As it stands, there are too many people deaggregating to /24's as space gets more fragmented for various reasons.

The DFZ (Default-Free Zone), aka the internet, would have just about 288k routes in the table if properly aggregated, and a lot of us would be happy with that. Algorithms have been run to see which providers are the worst deaggregation offenders; I think the top one is in Brazil, announcing over 3000 prefixes when it would be about 128 if aggregated.

You can see stats about these things @ http://www.cidr-report.org/2.0
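To make the aggregation point concrete, here is a small sketch using `ipaddress.collapse_addresses` from the Python standard library. The sixteen /24s are a hypothetical example chosen so that they happen to be contiguous; this is not the method the CIDR report uses, and real aggregation also depends on origin AS and routing policy.

```python
import ipaddress

# Hypothetical deaggregated announcements from a single origin (illustrative only).
announced = [ipaddress.ip_network(f"198.18.{i}.0/24") for i in range(16)]

# collapse_addresses merges adjacent and overlapping networks into the fewest prefixes.
aggregated = list(ipaddress.collapse_addresses(announced))

print(len(announced), "announced ->", len(aggregated), "after aggregation:", aggregated)
```

Sixteen table entries become one, which is the same kind of saving, at a much smaller scale, as the 3000-versus-128 case mentioned above.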

As has been discussed on NANOG and other mailing lists, if you were in the market today you would want a router that supported 2M routes, upgradable to 4M for the future, because if the growth curve continues we're looking at 1.25M routes in about 4 years.
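As a rough check on that projection, the sketch below works out the compound annual growth rate implied by going from 512,000 to 1.25M routes in 4 years. Constant compound growth is an assumption made here for illustration; the post does not say which growth model the mailing-list estimate used.

```python
current_routes = 512_000      # table size from the thread title
projected_routes = 1_250_000  # figure quoted in the post
years = 4

# Compound annual growth rate implied by the two figures.
annual_growth = (projected_routes / current_routes) ** (1 / years) - 1
print(f"implied growth: {annual_growth:.1%} per year")

# Year-by-year table size under that constant rate.
routes = current_routes
for year in range(1, years + 1):
    routes *= 1 + annual_growth
    print(year, round(routes))
```

Under those assumptions the implied rate works out to about 25% per year.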

Most providers can get away with TCAM adjustments to 768K routes because they need to leave room for IPv6 growth, but that is only expected to last a year or so.


----------



## Wintereise (Aug 14, 2014)

splitice said:


> How would routes of size /24 be routed if routers without support were to filter them? Do they just get a default route to one upstream carrier and go from there?


People usually set a defroute to one of their carriers to find 'stray prefixes' and route them through to that carrier.

What to do with that traffic then becomes the carrier's responsibility, but thankfully most core routers should have no issue dealing with any of this.

> No vote required. Nobody will be allowed to announce anything smaller than a /24, period. Some upstreams will accept /25's, but only local to the AS; it gets filtered upstream.

This is the ideal situation *now*, I'm really curious what happens when obtaining further allocations by any means whatsoever becomes impossible (So, say, 2-3 years down the road, I guess for all RIRs?)

As Francisco mentioned, there's a vote on it on ARIN PPML and desperate times are known to call for desperate methods.


----------



## Francisco (Aug 14, 2014)

splitice said:


> How would routes of size /24 be routed if routers without support were to filter them? Do they just get a default route to one upstream carrier and go from there?


I'm assuming that if their equipment supports multi-path, they'd get a default route from each upstream and then it'd round-robin between them. If not, yes, they'd favor one path for a default route.

Interesting article:

http://www.bgpmon.net/what-caused-todays-internet-hiccup/

From the looks of it, it isn't like everyone went and filtered their routes or upgraded and we're over that hump. It looks like we're due for another episode like this once things grow to 512k naturally.

I'm surprised to see that liquidweb got pantsed by this too.

Francisco


----------



## Deleted (Aug 14, 2014)

Filtering /24's via a prefix list would cause that specific route not to be installed, so a more general route (a default route or a larger route) will be used. Remember that the more specific route usually wins; if it isn't there, the larger allocation or the default route is used.

Considering these 2 situations:

Customer A is announcing 1.1.1.0/24 

Customer A's ISP has 1.1.0.0/18 (or whatever)

Since Customer A is announcing 1.1.1.0/24 that is a more specific match and it will be used for transit.

But if you filter /24's via prefix list, the following will happen:

1.1.1.0/24 <-- filtered, so we go to the next 'larger' allocation.

1.1.0.0/18 <-- this will be the only one shown in the routing table, and it will be used for transit.
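The example above can be sketched in a few lines of Python with the standard `ipaddress` module. `lookup` is a hypothetical helper that does a naive longest-prefix match over a list; the default route entry and the test addresses are assumptions added for illustration, and real routers use TCAM or tries rather than a linear scan.

```python
import ipaddress

def lookup(table, destination):
    """Return the most specific prefix in table that covers destination, or None."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in table if addr in net]
    return max(matches, key=lambda net: net.prefixlen, default=None)

customer = ipaddress.ip_network("1.1.1.0/24")   # Customer A's announcement
isp = ipaddress.ip_network("1.1.0.0/18")        # the ISP's covering block
default = ipaddress.ip_network("0.0.0.0/0")     # assumed default route

full_table = [customer, isp, default]
# Same table after a filter that rejects /24 and longer.
filtered_table = [net for net in full_table if net.prefixlen < 24]

print(lookup(full_table, "1.1.1.10"))         # the /24 wins while it is installed
print(lookup(filtered_table, "1.1.1.10"))     # falls back to the covering /18
print(lookup(filtered_table, "203.0.113.5"))  # nothing specific, so the default is used
```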

EDIT: It amazes me there are providers out there that use a /24 and are single homed. They should use static routes or iBGP and not announce their prefixes out to the internet. Use a private AS so you can manipulate your bgp communities or whatever.


----------



## FHN-Eric (Aug 18, 2014)

Only issue is, many ISPs don't support ipv6 yet. All of them should have ipv6 support by now, but not all of them do.


----------



## Francisco (Aug 19, 2014)

FHN-Eric said:


> Only issue is, many ISPs don't support ipv6 yet. All of them should have ipv6 support by now, but not all of them do.


Given many users on the Cisco 6500's are likely going to chop away at how much IPV6 table space they have just to make way for V4 space, don't get your hopes up.
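A back-of-the-envelope sketch of that trade-off. Every figure here is an assumption for illustration and not taken from a vendor datasheet quoted in this thread: a shared TCAM of 1,024K entries, one entry per IPv4 route, and two entries per IPv6 route.

```python
TOTAL_ENTRIES = 1024 * 1024  # assumption: 1,024K shared TCAM entries
V4_COST = 1                  # assumption: one entry per IPv4 route
V6_COST = 2                  # assumption: two entries per IPv6 route

def v6_capacity(v4_routes):
    """IPv6 routes that still fit after reserving space for v4_routes IPv4 routes."""
    remaining = TOTAL_ENTRIES - v4_routes * V4_COST
    return max(remaining, 0) // V6_COST

for v4_k in (512, 768, 1000):
    v4_routes = v4_k * 1024
    print(f"{v4_k}K IPv4 reserved -> room for {v6_capacity(v4_routes) // 1024}K IPv6")
```

Under these assumptions, every block of space handed to IPv4 removes half as many IPv6 routes, so the IPv6 side shrinks quickly as the IPv4 reservation grows.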

I just think a lot of people in a position of "we must upgrade", will simply go the default route option or route filtering instead of upgrading. A lot of those people could very well be on equipment that either doesn't support IPV6 or does IPV6 in software making it useless.

Francisco


----------



## Kris (Aug 23, 2014)

FHN-Eric said:


> Only issue is, many ISPs don't support ipv6 yet. All of them should have ipv6 support by now, but not all of them do.


+1 on what Francisco said.

With older 6500 series and ISPs re-allocating TCAM to keep up with v4, hard limiting v6 routes = no v6.

Especially once you can't fit the full v6 routing table once it's widely deployed.


----------



## Francisco (Aug 23, 2014)

Kris said:


> +1 on what Francisco said.
> 
> With older 6500 series and ISPs re-allocating TCAM to keep up with v4, hard limiting v6 routes = no v6.
> 
> Especially once you can't fit the full v6 routing table once it's widely deployed.


The 6500's are work horses so yeah, I don't think people are going to rush to replace them unless they really want a full IPV6 + IPV4 table.

It's the usual catch-22. Companies aren't bothering to set up IPV6 because there's no demand. There's not a lot of demand because companies aren't bothering to set it up.

While ARIN & co are predicting 50% of all wired devices to be IPV6 capable by *2018* (https://twitter.com/TeamARIN/status/479358717126000641), that doesn't mean networks will be wired for it.

There's way too many cases on WHT lately of people getting their only /22's from RIPE and then instantly trying to sell them for a little side cash. As I and many others have been saying, expect the routing table to get huuuuge in the coming few years as more and more blocks are carved up.

Francisco


----------



## Magiobiwan (Aug 23, 2014)

As I recall, IPv6 requires significantly less TCAM space for its routes than IPv4 did. 6500's with slightly less IPv6 TCAM can still hold out for some time as IPv6 begins to grow in adoption. But yes, it is a band-aid solution for the issue at hand.


----------

