amuck-landowner

OVH Canada Offline

drmike

100% Tier-1 Gogent
OVH is offline in Canada right now.


Someone miles away from the DC cut fiber somehow.


What is alarming here is that the fiber goes to Newark, NJ, which suggests that is the only bandwidth OVH has out of the DC.  No local / regional peering.  Quite strange.


6 strands are down.  OVH CEO said 4 hours.  


It's dark fiber, and I have no clue how big the bundle they are in is. There might be a ton of fibers to cut, prep, polish, test, etc. before they get things patched up.
 

drmike

100% Tier-1 Gogent
It appears OVH was doing a network upgrade also...  new bigger fiber was about to be brought online soon.   Think that's what the 100G lines were about.


They've gone quiet on updates.


Someone implicated a nearby bridge project as maybe being the source of the outage.   The bridge has a slew of demolition and other work going on. A zoo, like any infra project.  If the fiber was cut on that bridge, good luck: working over water, they might need to work suspended off the deck or from a boom arm below.  


I am blah about fiber hauls over bridges. Very common, but a lot of it is crumbling ancient infra that gets pounded.  If it was the bridge, someone's ass is grass today, because those are documented runs for certain and usually really visible to workers (they aren't buried in soil).


No way this is a 4-hour outage.  They've been down / having blips for 3 hours already.  They will probably be back to normal overnight.  Any sooner and those are some badass workers.
 

HH-Jake

New Member
Comment by OVH - Monday, 02 November 2015, 21:30PM


We still don't have an ETA regarding the repairs on the existing fiber.

However, it's been confirmed that 2 out of the 3 new pairs of the Eastern route (which were being set up) are now usable. We are working with our providers in order to get a full connection as quickly as possible: the only cross-connect linking OVH to our provider (in Montreal) is currently missing.
 

drmike

100% Tier-1 Gogent
Smells like they got caught with a scheduled event on the bridge or related construction plus lighting new fiber... 


From that update, it sounds like they are saying to hell with the (assumed) bridge cut and moving to the new fiber instead.  


Packets are still getting to OVH, but like 40-50% loss.
 

drmike

100% Tier-1 Gogent
Issue with their network is far from resolved I'd guess.  Incidents are still open on that portal.


A couple of spots I checked show packets traveling to OVH now through TorIX (Toronto Internet Exchange). Strange routes... I'd expect OVH to have fiber lit directly to TorIX; I surely would:


 6  eth1-2.edge1.tor1.ca.as5580.net (78.152.34.199)  85.899 ms  67.502 ms  84.160 ms
 7  ovh.ip4.torontointernetxchange.net (206.108.34.189)  70.690 ms  72.053 ms  72.130 ms
 8  nwk-5-a9.nj.us (178.32.135.234)  84.419 ms  84.979 ms  85.306 ms
 9  bhs-g2-a9.qc.ca (192.99.146.101)  93.669 ms  93.327 ms  96.771 ms
10  bhs-3a-a9.qc.ca (198.27.73.94)  98.484 ms  98.150 ms  98.766 ms
 


So that goes into Toronto, into TorIX, then OVH rides it to Newark and all the way up to BHS.   Strange frickin design....
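
For anyone who wants to compare paths hop by hop, here is a minimal sketch that parses traceroute lines in the format pasted above and averages the RTTs per hop. The regexes and the `parse_hops` name are just illustrative assumptions, not part of any tool, and the sample is copied from the trace in this post.

```python
import re

# Sample hops copied from the trace above.
SAMPLE = """\
 6  eth1-2.edge1.tor1.ca.as5580.net (78.152.34.199)  85.899 ms  67.502 ms  84.160 ms
 7  ovh.ip4.torontointernetxchange.net (206.108.34.189)  70.690 ms  72.053 ms  72.130 ms
 8  nwk-5-a9.nj.us (178.32.135.234)  84.419 ms  84.979 ms  85.306 ms
 9  bhs-g2-a9.qc.ca (192.99.146.101)  93.669 ms  93.327 ms  96.771 ms
10  bhs-3a-a9.qc.ca (198.27.73.94)  98.484 ms  98.150 ms  98.766 ms
"""

# Hop number, optional "*" timeouts, hostname, then the IPv4 address in parentheses.
HOP_RE = re.compile(r"^\s*(\d+)\s+(?:\*\s+)*(\S+)\s+\((\d+\.\d+\.\d+\.\d+)\)")
# Every "<number> ms" reading on the line.
RTT_RE = re.compile(r"([\d.]+)\s+ms")


def parse_hops(text):
    """Return one dict per hop line; lines that do not look like hops are skipped."""
    hops = []
    for line in text.splitlines():
        match = HOP_RE.match(line)
        if not match:
            continue
        rtts = [float(value) for value in RTT_RE.findall(line)]
        hops.append({
            "hop": int(match.group(1)),
            "host": match.group(2),
            "ip": match.group(3),
            "avg_ms": sum(rtts) / len(rtts) if rtts else None,
        })
    return hops


if __name__ == "__main__":
    for hop in parse_hops(SAMPLE):
        avg = "n/a" if hop["avg_ms"] is None else f"{hop['avg_ms']:.1f} ms"
        print(f"{hop['hop']:>2}  {hop['host']:<40} {avg}")
```

Run against the sample, the Newark hop (nwk-5-a9.nj.us) sits between the TorIX hop and the two BHS hops, which is the detour described above.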
 

HostHoney

New Member
That is lovely. I am truly surprised they do not have more outages. I have not had any issues with the service in their Canadian facility. I currently have a server in France and have no problems, although I am looking to upgrade as it can sometimes lag.
 

OSTKCabal

Active Member
Verified Provider
You'd think such a large provider, no matter how budget-oriented, would have a higher level of redundancy on primary fiber runs.


I subscribe to the school of thought that a DC with over 800Gbps of capacity and hundreds of thousands of servers shouldn't be brought down by something like this. Diverse fiber paths are fairly important when you're promising a 99.95% network uptime SLA.
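
To put that 99.95% figure in perspective, here is a rough sketch of the downtime budget it implies. The 30-day month and the function name are my assumptions; OVH's actual SLA terms may measure the period and count outages differently.

```python
def allowed_downtime_minutes(sla_percent, period_days=30):
    """Minutes of downtime a given uptime percentage allows over the period.

    Assumes a flat period of `period_days` days (30 is an illustrative
    choice, not taken from any provider's SLA wording).
    """
    total_minutes = period_days * 24 * 60
    return total_minutes * (100.0 - sla_percent) / 100.0


if __name__ == "__main__":
    budget = allowed_downtime_minutes(99.95)
    print(f"99.95% over 30 days allows about {budget:.1f} minutes of downtime")
```

Under those assumptions the budget works out to roughly 21.6 minutes a month, so an outage measured in hours goes well past it.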
 

HN-Matt

New Member
Verified Provider
Immense packet loss to the point of unusability for a few hours yesterday. Could not reach ca.ovh.com from certain connections while able to load it slowly (excruciatingly so) from others.


According to StatusCake, no downtime at all for any of my servers there except a single brief moment lasting '00:07:07'.
 

drmike

100% Tier-1 Gogent
I feel the same way about the fiber path issue...  Really unclear how much was and wasn't down.   I saw uptime on something there, but the packet loss and latency were so high that it was unusable.  Technically online, but in the real world it wasn't.
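
The gap between "up" and "usable" can be sketched like this: instead of a plain up/down check, classify a batch of probe results by packet loss and average latency. The thresholds (5% loss, 150 ms) and the function names are purely illustrative assumptions, not anything a monitoring service is known to use.

```python
def summarize_probes(rtts_ms):
    """Return (loss_fraction, average_rtt_ms) for a list of probe results.

    Each entry is an RTT in milliseconds, or None for a lost probe.
    """
    sent = len(rtts_ms)
    received = [rtt for rtt in rtts_ms if rtt is not None]
    loss = 1.0 - len(received) / sent if sent else 1.0
    average = sum(received) / len(received) if received else None
    return loss, average


def classify(rtts_ms, max_loss=0.05, max_avg_ms=150.0):
    """Label a host "ok", "degraded" or "down" (thresholds are assumptions)."""
    loss, average = summarize_probes(rtts_ms)
    if average is None:
        return "down"       # nothing came back at all
    if loss > max_loss or average > max_avg_ms:
        return "degraded"   # answers, but too lossy or too slow to be usable
    return "ok"


if __name__ == "__main__":
    # Half the probes lost: a plain up/down check would still call this "up".
    print(classify([80.0, None, 85.0, None, 90.0, None, None, 88.0, None, 82.0]))
```

A host dropping half its packets still answers some probes, so an up/down monitor can report near-perfect uptime while the box is unusable in practice.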


With OVH's SLA in place, expect claims and nose bleeds.  I await the Act of God exception to claims.  Can't see them going out of pocket for this blooper.
 

willie

Active Member
They claim it's resolved now (ticket is closed) and they are working on having more fiber routes out of BHS.  They had something like that happen before and they did honor SLA claims.  I've been resisting temptation all day to buy one of their SSD VPS since I don't need it.
 

drmike

100% Tier-1 Gogent
Unsure about the matter being closed. Still showing open over here: http://status.ovh.com/?do=details&id=11304


Technically, it says In Progress, same as it has since it was opened :)


Routing has changed since earlier (mind you this is testing from Toronto proper):


 3  ovh.ip4.torontointernetxchange.net (206.108.34.189)  77.229 ms  77.542 ms  77.872 ms
 4  bhs-g2-a9.qc.ca (178.32.135.71)  78.173 ms  78.809 ms  79.113 ms
 5  bhs-3a-a9.qc.ca (198.27.73.94)  79.425 ms  79.745 ms  80.505 ms

Awaiting TorIX to catch fire now :)   I can see a lot of traffic routing better via Toronto as opposed to Newark.


This is from Chicago:


 5  * * eth1-7.core1.chi1.us.as5580.net (78.152.45.135)  62.422 ms
 6  eth1-2.edge1.tor1.ca.as5580.net (78.152.34.199)  85.348 ms  76.062 ms  75.424 ms
 7  ovh.ip4.torontointernetxchange.net (206.108.34.189)  223.170 ms  223.166 ms  223.198 ms
 8  bhs-g2-a9.qc.ca (178.32.135.71)  83.890 ms  86.239 ms  86.600 ms
 9  bhs-3a-a9.qc.ca (198.27.73.94)  86.963 ms  94.852 ms  95.160 ms


If lots of Chicago traffic gets siphoned via Toronto, there could be an uptick in volume at TorIX.  Pretty decent BW growth there since August:


torix-year.png
 

OSTKCabal

Active Member
Verified Provider
It's not like this is the first instance... a couple months ago, a car struck a telephone pole outside the DC and brought the entire network down.
 