Site Downtime November 20, 2013

drmike · Nov 20, 2013

Looks like filtering from CNServers to BuyVM in Vegas went down or is having issues.

What's the official word?

MannDude · Nov 20, 2013

No idea. Waiting to hear back from BuyVM. The site was never down for -me-, so I wasn't impacted. Seems to be regional or networking related for their filtered IPs.

Anyhow, I switched filtering to x4b so it appears it's working.

Damian · Nov 20, 2013

Is the site going to be faster!?

HalfEatenPie · Nov 20, 2013

Damian said:
Is the site going to be faster!?

Not for you! We placed a special rule in our filters for you!

I'm kidding. I wish I knew what was going on for you!

MannDude · Nov 20, 2013

Damian said:
Is the site going to be faster!?

Was it slow before? Is it faster now?

Aldryic C'boas · Nov 20, 2013

I'm still waiting to hear back myself. Overnight tech contacted CNServers, they haven't replied yet. I just put in my own ticket and unzipped.

The filtering itself seems to be working fine. Routing is the issue. From what I'm seeing/hearing on IRC, anyone coming in over nLayer is affected. I'll update again as soon as I know more.

MannDude · Nov 20, 2013

Let us know what you find out Aldryic.

InertiaNetworks-John · Nov 20, 2013

It works at my house (att), but I find myself having to VPN to my house from work to get on here (time warner).

drmike · Nov 20, 2013

Some debugging showed OpenDNS was still showing old DNS provider and pushing queries there.

Yeah, the DNS provider changed with the outage this morning as well. So the records with the old provider (Rage4) have been updated too.

Amitz · Nov 20, 2013

The BuyVM.net website is completely down for me...

and: "Hey, it's not just you! The URL-address http://buyvm.net looks down from here."

drmike · Nov 20, 2013

Amitz said:
The BuyVM.net website is completely down for me.

Yes sir, their site is down. They are debugging the issue with their provider for the filtering.

Amitz · Nov 20, 2013

At the same time, it is fully reachable through my Prometeus VPN in Milan.

Aldryic C'boas · Nov 20, 2013

Well, that was fun.

Turns out, CNServers (and our tunnel setup) were just fine. The issue was (of course) HE.

Fran noticed that when he shoved outbound traffic back through CNServers, everything started working fine again (though this couldn't be a permanent solution, as it put a ton of strain on CNS). So we got ahold of FiberHub, and were informed of the following:

HE.net enabled RPF on our port last night due to a large attack originating from our network using spoofed IP's that I wasn't able to track down - I didn't realize it would impact you. If you can send me the prefixes that you are sending over CNServers, I'll have HE.net add exceptions for them while we sort out the rest of this mess.

So, tl;dr - HE screwed up our routing. FiberHub contacted them directly with the ranges we need exempted from their BS, and at this point we're just waiting on HE to get that in place so we'll be back to normal again.
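
For anyone wondering why RPF (reverse path filtering) would break this setup, here is a minimal sketch of the idea, not HE's actual implementation. It assumes a simplified strict check: a packet arriving on a port is accepted only if its source address falls inside a prefix that is routed back out that same port, or inside an explicitly exempted prefix. The function name `rpf_accepts` and all prefixes are hypothetical (documentation ranges), chosen purely for illustration.

```python
import ipaddress


def rpf_accepts(src_ip, port_prefixes, exceptions=()):
    """Simplified strict RPF check for one port (illustrative only).

    src_ip        -- source address of a packet arriving on the port
    port_prefixes -- prefixes the upstream routes back via this port
    exceptions    -- extra prefixes exempted from the check
    """
    src = ipaddress.ip_address(src_ip)
    allowed = [ipaddress.ip_network(p) for p in (*port_prefixes, *exceptions)]
    # Accept only if the source lies inside one of the allowed prefixes.
    return any(src in net for net in allowed)


# Hypothetical prefixes: 192.0.2.0/24 stands in for ranges the upstream
# routes to the port directly; 198.51.100.0/24 stands in for a filtered
# range that is announced elsewhere and tunneled in.
direct = ["192.0.2.0/24"]
filtered = ["198.51.100.0/24"]

print(rpf_accepts("192.0.2.10", direct))              # normal traffic
print(rpf_accepts("198.51.100.7", direct))            # filtered-range source
print(rpf_accepts("198.51.100.7", direct, filtered))  # with an exception added
```

In this sketch, replies sourced from the filtered range leave via a port whose upstream does not route that range back the same way, so the check treats them as spoofed and drops them until an exception is added.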

Aldryic C'boas · Nov 20, 2013

It's also worth mentioning that due to the filtering fiasco, Stallion is currently unable to contact the Jersey nodes. So anyone with service at our Choopa deployment will be stuck at 'Getting Status...' on their VM page. The nodes and VMs are fine, no worries there; everything will return to normal once this gets sorted out.

Francisco · Nov 20, 2013

The issue is HE related:

Hello,

HE.net enabled RPF on our port last night due to a large attack originating from our network using spoofed IP's that I wasn't able to track down - I didn't realize it would impact you. If you can send me the prefixes that you are sending over CNServers, I'll have HE.net add exceptions for them while we sort out the rest of this mess.

--

Rob Tyree

Fiberhub Colocation & Internet Services

And a quote from IRC to cut the tension:

[08:52] <DaIRC42327> welp, waiting on HE at this point

[08:52] <DaIRC42327> should be fast i hope

[08:52] <&Aldryic> HE? Fast?

[08:53] <&Aldryic> You're being optimistic again, boss.

[08:53] <lbft_> ((((((((

[08:53] <DaIRC42327> i told Rob to offer them a pound of weed

[08:53] <DaIRC42327> in exchange for a fast turn around

[08:53] <lbft_> if we're relying on HE we're all doomed

[08:53] <DaIRC42327> being HE they'll hacky sack that shit into action

[08:53] <&Aldryic> hah

[08:53] <DaIRC42327> Aldryic you missed out man

[08:53] <DaIRC42327> every single HE worker is straight hippy

[08:53] <DaIRC42327> 'dude...like..ipv6 has so many addresses'

[08:53] <The_Hatta> how -- how would that not affect you >_>

[08:54] <DaIRC42327> 'like, 1 for every atom in the world'

[08:54] <lbft_> free love and free ipv6 tunnels

[08:54] <DaIRC42327> anyways this explains things

[08:54] <DaIRC42327> i set a source route

[08:54] <&Aldryic> Yeah... probably for the best that I never meet those folks <_<

[08:54] <&Aldryic> It would not end well.

[08:54] <DaIRC42327> forcing everything back over CN

[08:54] <DaIRC42327> but CN hates when we do that

[08:54] <&Aldryic> lol

[08:54] <DaIRC42327> that's why there was the big burst of working traffic

[08:54] <DaIRC42327> then it exploded into a big flaming ball of fran

[08:55] <The_Hatta> quote of the day

Francisco

drmike · Nov 20, 2013

Aldryic C said:
So, tl;dr - HE screwed up our routing. FiberHub contacted them directly with the ranges we need exempted from their BS, and at this point we're just waiting on HE to get that in place so we'll be back to normal again.

So how doesn't something like this happen in other datacenters?

I'll raise my hand again for recommending BuyVM at least moves their website + other critical operations stuff outside of the Vegas facility.

Francisco · Nov 20, 2013

drmike said:
So how doesn't something like this happen in other datacenters?

I'll raise my hand again for recommending BuyVM at least moves their website + other critical operations stuff outside of the Vegas facility.

That wouldn't change much. The problem is because we don't force outbound traffic over CN so we 'spoof' the traffic. It's really the only option given how much transit we push over the filtering ranges.

Francisco

Aldryic C'boas · Nov 20, 2013

drmike said:
So how doesn't something like this happen in other datacenters?

I'll raise my hand again for recommending BuyVM at least moves their website + other critical operations stuff outside of the Vegas facility.

We would need filtering wherever we put it. Which means this situation could have just as easily been replicated somewhere else as it was at FH.

There are also other points to consider... for starters, we would _never_ offload our panels to another host. That simply won't happen. We also learned the hard way (with CC) what happens when you cannot trust your own DC. I cannot think of anyplace offhand I would trust our hardware in more than FiberHub; and I sure as hell won't risk our clients’ info in someone else’s hands.

drmike · Nov 20, 2013

Aldryic C said:
We would need filtering wherever we put it. Which means this situation could have just as easily been replicated somewhere else as it was at FH.

There are also other points to consider... for starters, we would _never_ offload our panels to another host. That simply won't happen. We also learned the hard way (with CC) what happens when you cannot trust your own DC. I cannot think of anyplace offhand I would trust our hardware in more than FiberHub; and I sure as hell won't risk our clients’ info in someone else’s hands.

I am sympathetic, truly.

If you don't want to move things outside of the Vegas network, then perhaps redundancy for it over in Jersey?

When these issues happen, regardless of cause, the panel goes offline, along with the website and the other reference resources people go to check (those of us who haven't dutifully bookmarked all the shortcuts and ripped all the info to a local collection).

splitice · Nov 20, 2013

Francisco said:
That wouldn't change much. The problem is because we don't force outbound traffic over CN so we 'spoof' the traffic. It's really the only option given how much transit we push over the filtering ranges.

Francisco

Sorry boss.
