# OVH Canada Offline



## drmike (Nov 2, 2015)

OVH is offline in Canada right now.


Someone miles away from the DC cut fiber somehow.


What is alarming here is that fiber goes to Newark, NJ, meaning it appears OVH only has that bandwidth.  No local / regional peering.  Quite strange.


6 strands are down.  OVH CEO said 4 hours.  


It's dark fiber, no clue how big the whole bundle they are in is - might be a ton of fibers to cut, prep, polish, test, etc. before they get things patched up.


----------



## HH-Jake (Nov 2, 2015)

Some information can be found here, http://status.ovh.com/?do=details&id=11304


----------



## drmike (Nov 2, 2015)

Little fact correction on strands down... 26.


6x100G + 20x10G
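A quick sanity check on those figures, as a minimal sketch. It assumes each entry reads as "count x line rate in Gbps", which is an interpretation of the post rather than anything OVH published in this form.

```python
# Link counts and line rates as quoted in the post above.
# Assumption: "6x100G" means six 100 Gbps links, "20x10G" means twenty 10 Gbps links.
links = {100: 6, 10: 20}  # line rate in Gbps -> number of links

total_links = sum(links.values())
total_gbps = sum(rate * count for rate, count in links.items())

print(f"{total_links} links, {total_gbps} Gbps aggregate")
```

Under that reading the count comes to 26 and the aggregate to 800 Gbps.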


----------



## Robert (Nov 2, 2015)

Congrats to Oles for entrepreneur of the year


----------



## drmike (Nov 2, 2015)

It appears OVH was doing a network upgrade also...  new bigger fiber was about to be brought online soon.   Think that's what the 100G lines were about.


They've gone quiet on updates.


Someone implicated a nearby bridge project as maybe being the source of the outage.   The bridge has a slew of demo and other stuff going on.. Zoo like any infra project is.  If fiber was cut on that bridge, good luck, working over water, might need to work suspended off the deck or boom arm from below.  


I am blah about fiber hauls over bridges..  Very common, but much crumbling ancient infra that gets pounded.  If bridge, someone's ass is grass today.   Cause those are documented runs for certain and usually really visible to workers (aren't buried in soil).


No way this is a 4 hour outage.  They've been down / blips for 3 hours.  They will probably be back to normal overnight.  Sooner would take some bad ass workers.


----------



## HH-Jake (Nov 2, 2015)

Comment by OVH - Monday, 02 November 2015, 21:30PM


We still don't have an ETA regarding the repairs on the existing fiber.

However, it's been confirmed that 2 out of the 3 new pairs of the Eastern route (which were being set up) are now usable. We are working with our providers in order to get a full connection as quickly as possible: the only cross-connect linking OVH to our provider (in Montreal) is currently missing.


----------



## drmike (Nov 2, 2015)

Smells like they got caught with a scheduled event on the bridge or related construction plus lighting new fiber... 


Sounds like from that update, they are saying to hell with the bridge (assumed) cut.  


Packets are still getting to OVH, but like 40-50% loss.


----------



## HH-Jake (Nov 2, 2015)

Looks like my websites are finally back up!


----------



## drmike (Nov 2, 2015)

Issue with their network is far from resolved I'd guess.  Incidents are still open on that portal.


Couple of spots I checked show packets traveling to OVH now through TorIX (Toronto Internet Exchange).. Strange routes... I expect OVH to have fiber lit to TorIX, I surely would, directly:


 6  eth1-2.edge1.tor1.ca.as5580.net (78.152.34.199)  85.899 ms  67.502 ms  84.160 ms
 7  ovh.ip4.torontointernetxchange.net (206.108.34.189)  70.690 ms  72.053 ms  72.130 ms
 8  nwk-5-a9.nj.us (178.32.135.234)  84.419 ms  84.979 ms  85.306 ms
 9  bhs-g2-a9.qc.ca (192.99.146.101)  93.669 ms  93.327 ms  96.771 ms
10  bhs-3a-a9.qc.ca (198.27.73.94)  98.484 ms  98.150 ms  98.766 ms
 


So that goes into Toronto, into TorIX, then OVH rides it to Newark and all the way up to BHS.   Strange frickin design....
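For anyone who wants to check the path from their own vantage point, here is a rough sketch that parses traceroute output like the above and flags the Newark detour. The `nwk-` hostname prefix as a marker for Newark is an assumption taken from the hop names in this trace, and `parse_hops` is a hypothetical helper, not part of any tool.

```python
import re
from statistics import mean

# Trace copied from the post above.
TRACE = """\
 6  eth1-2.edge1.tor1.ca.as5580.net (78.152.34.199)  85.899 ms  67.502 ms  84.160 ms
 7  ovh.ip4.torontointernetxchange.net (206.108.34.189)  70.690 ms  72.053 ms  72.130 ms
 8  nwk-5-a9.nj.us (178.32.135.234)  84.419 ms  84.979 ms  85.306 ms
 9  bhs-g2-a9.qc.ca (192.99.146.101)  93.669 ms  93.327 ms  96.771 ms
10  bhs-3a-a9.qc.ca (198.27.73.94)  98.484 ms  98.150 ms  98.766 ms
"""

HOP_RE = re.compile(r"^\s*(\d+)\s+(\S+)\s+\(([\d.]+)\)(.*)$")
RTT_RE = re.compile(r"([\d.]+)\s+ms")


def parse_hops(text):
    """Return one dict per hop line: hop number, hostname, IP, average RTT."""
    hops = []
    for line in text.splitlines():
        m = HOP_RE.match(line)
        if not m:
            continue  # skip blank lines and hops that start with timeouts
        rtts = [float(x) for x in RTT_RE.findall(m.group(4))]
        if not rtts:
            continue
        hops.append({
            "hop": int(m.group(1)),
            "host": m.group(2),
            "ip": m.group(3),
            "avg_ms": mean(rtts),
        })
    return hops


hops = parse_hops(TRACE)
for h in hops:
    print(f'{h["hop"]:>2}  {h["host"]:<40} {h["avg_ms"]:.1f} ms')

# Assumption: a hostname starting with "nwk-" is a Newark router.
via_newark = any(h["host"].startswith("nwk-") for h in hops)
print("detours via Newark:", via_newark)
```

Running the same parser on a later trace makes it easy to see whether the Newark hop has dropped out of the path.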


----------



## ModyDev (Nov 2, 2015)

Another OVH failure. My friends are suffering and posting about it on FB lol


----------



## HostHoney (Nov 2, 2015)

That is lovely. I am truly surprised they do not have more outages. I have not had any issues with the service in their Canadian facility. I currently have a server in France and have no problems, although I am looking to upgrade as it can sometimes lag.


----------



## OSTKCabal (Nov 2, 2015)

You'd think such a large provider, no matter how budget-oriented, would have a higher level of redundancy on primary fiber runs.


I subscribe to the school of thought that a DC with over 800Gbps of capacity and hundreds of thousands of servers shouldn't be brought down by something like this. Diverse fiber paths are fairly important when you're promising a 99.95% network uptime SLA.
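To put the 99.95% figure in perspective, a minimal sketch of the downtime budget it implies. The 30-day measurement window is an assumption; OVH's actual SLA terms are not quoted in this thread. The 3-hour figure is the "down / blips" estimate given earlier in the thread.

```python
def allowed_downtime_minutes(sla_percent, days=30):
    """Minutes of downtime permitted per window at a given uptime percentage."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


budget = allowed_downtime_minutes(99.95)
outage_minutes = 3 * 60  # rough estimate reported earlier in the thread

print(f"budget: {budget:.1f} min per 30 days")
print(f"outage: {outage_minutes} min, over budget: {outage_minutes > budget}")
```

Under those assumptions the monthly budget is roughly 21.6 minutes, so a multi-hour event blows through it several times over.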


----------



## HN-Matt (Nov 3, 2015)

Immense packet loss to the point of unusability for a few hours yesterday. Could not reach ca.ovh.com from certain connections while able to load it slowly (excruciatingly so) from others.


According to StatusCake, no downtime at all for any of my servers there except a single brief moment lasting '00:07:07'.


----------



## drmike (Nov 3, 2015)

OSTKCabal said:


> You'd think such a large provider, no matter how budget-oriented, would have a higher level of redundancy on primary fiber runs.
> 
> 
> I subscribe to the school of thought that a DC with over 800Gbps of capacity and hundreds of thousands of servers shouldn't be brought down by something like this. Diverse fiber paths are fairly important when you're promising a 99.95% network uptime SLA.



I feel the same way about the fiber path issue...  Really unclear how much was and wasn't down.   I saw uptime on something there, but the packet loss and latency were so high that it was unusable.  Technically online, but real world it wasn't.
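The "technically online, but unusable" distinction can be sketched as a three-state check instead of a plain up/down one. The thresholds below (5% loss, 250 ms average RTT) and the `classify` helper are hypothetical values picked for illustration, not anything a monitoring service in this thread is known to use.

```python
def classify(sent, received, rtts_ms, max_loss=0.05, max_rtt_ms=250.0):
    """Label one probe window as 'down', 'degraded', or 'ok'.

    sent / received are packet counts, rtts_ms holds the RTTs of the replies.
    Thresholds are illustrative assumptions.
    """
    if sent <= 0:
        raise ValueError("sent must be positive")
    if received == 0 or not rtts_ms:
        return "down"
    loss = 1 - received / sent
    avg_rtt = sum(rtts_ms) / len(rtts_ms)
    if loss > max_loss or avg_rtt > max_rtt_ms:
        return "degraded"
    return "ok"


# 45% loss, in the range reported earlier in the thread: replies arrive,
# so a simple ping check says "up", but the link is not really usable.
print(classify(100, 55, [90.0] * 55))
print(classify(100, 100, [80.0] * 100))
print(classify(100, 0, []))
```

A monitor that only reports the "down" state would show near-100% uptime for a window like the first one, which lines up with the uptime numbers people are seeing here.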


With OVH's SLA in place, expect claims and nose bleeds.  I await the Act of God exception to claims.  Can't see them going out of pocket for this blooper.


----------



## willie (Nov 3, 2015)

They claim it's resolved now (ticket is closed) and they are working on having more fiber routes out of BHS.  They had something like that happen before and they did honor SLA claims.  I've been resisting temptation all day to buy one of their SSD VPS since I don't need it.


----------



## drmike (Nov 3, 2015)

willie said:


> They claim it's resolved now (ticket is closed) and they are working on having more fiber routes out of BHS.  They had something like that happen before and they did honor SLA claims.  I've been resisting temptation all day to buy one of their SSD VPS since I don't need it.



Unsure about matter closed.. Still showing open over here: http://status.ovh.com/?do=details&id=11304


Technically, it says In Progress, same as it has since it was opened 


Routing has changed since earlier (mind you this is testing from Toronto proper):


 3  ovh.ip4.torontointernetxchange.net (206.108.34.189)  77.229 ms  77.542 ms  77.872 ms
 4  bhs-g2-a9.qc.ca (178.32.135.71)  78.173 ms  78.809 ms  79.113 ms
 5  bhs-3a-a9.qc.ca (198.27.73.94)  79.425 ms  79.745 ms  80.505 ms

Awaiting TorIX to catch fire now.  Can see a lot of traffic better routing to Toronto as opposed to Newark.


This is from Chicago:


 5  * * eth1-7.core1.chi1.us.as5580.net (78.152.45.135)  62.422 ms
 6  eth1-2.edge1.tor1.ca.as5580.net (78.152.34.199)  85.348 ms  76.062 ms  75.424 ms
 7  ovh.ip4.torontointernetxchange.net (206.108.34.189)  223.170 ms  223.166 ms  223.198 ms
 8  bhs-g2-a9.qc.ca (178.32.135.71)  83.890 ms  86.239 ms  86.600 ms
 9  bhs-3a-a9.qc.ca (198.27.73.94)  86.963 ms  94.852 ms  95.160 ms


If lots of Chicago traffic gets siphoned via Toronto, could be an uptick in mass at TorIX.  Pretty decent BW growth there since August.


----------



## willie (Nov 3, 2015)

The closed ticket is this one:


http://travaux.ovh.net/?do=details&id=15238


It's in French and was updated more often and had more info than the English one.


----------



## DomainBop (Nov 3, 2015)

> They claim it's resolved now (ticket is closed)



The control panel for their public cloud is at BHS and was inaccessible for almost 16 hours.  Just started working about 1 hour ago (issue http://status.ovh.net/?do=details&id=11310 ).


----------



## OSTKCabal (Nov 3, 2015)

It's not like this is the first instance... a couple months ago, a car struck a telephone pole outside the DC and brought the entire network down.


----------



## drmike (Nov 3, 2015)

OSTKCabal said:


> It's not like this is the first instance... a couple months ago, a car struck a telephone pole outside the DC and brought the entire network down.



Bahaha? Really, this happened?  Link if you have one...


----------



## HN-Matt (Nov 3, 2015)

OSTKCabal said:


> It's not like this is the first instance... a couple months ago, a car struck a telephone pole outside the DC and brought the entire network down.



For as long as I've been with OVH (more than a couple months) there hasn't been any downtime at all until yesterday. Guess the car accident didn't happen in Canada.


----------



## DomainBop (Nov 3, 2015)

drmike said:


> Bahaha? Really, this happened?  Link if you have one...



May, partial outage, 2 of 3 fibers cut when car hit a pole (the initial reports of a rodent doing it proved to be untrue... ): http://status.ovh.net/?do=details&id=9603


When that fiber cut hit they said the new fiber capacity would be online by September




There is currently a very delicate situation at our datacentre in BHS. We have 4 optical fiber cables that connect the datacenter to the world.

1) BHS to Montreal via the North. The link is up and we have 3 pairs of optical fiber cables: to Montreal, New York (600G capacity) and to level 3 (which provides 200G capacity to New York)
2) BHS to Montreal via the South. The link is not yet up. Construction of the connection is delayed in the Indian reserve. We will have 3 pairs of optical fiber cables by July 2015
3) BHS to New York via the South. This link is not yet up either. It takes a particularly long time to get approval to perform work on the Canada/USA border: 3 years. The go-live date: September 2015
4) A backup of the MTL/BHS backup is UP, but with low capacity (10G).


----------



## OSTKCabal (Nov 3, 2015)

DomainBop said:


> May, partial outage, 2 of 3 fibers cut when car hit a pole (the initial reports of a rodent doing it proved to be untrue... ): http://status.ovh.net/?do=details&id=9603
> 
> 
> When that fiber cut hit they said the new fiber capacity would be online by September
> ...



Partial is a relative term. All of the OVH customers I talked to at the time (50+-member Skype chats, forums, IRC) were offline (or at least very highly unstable) for the vast majority of the day.


----------



## HN-Matt (Nov 3, 2015)

I guess it didn't affect every server in BHS, then? I've been there since January of this year with no downtime until yesterday, and apparently even that wasn't very severe. Uptime monitoring shows zero downtime throughout the day except for the aforementioned few minutes, which were the result of an 'http/https' test to a specific site. Continuous pings to the host nodes from 7 different confirmation servers claim 100% uptime. I slept through most of it, but didn't wake up to any complaints. I'm guessing the packet loss, once again, only affected certain/specific locations around the world.



> Thursday, 28 May 2015, 03:30AM
> 
> 
> It's an incident with the cable but not a clear cut. Several fibers are affected but not the whole cable. The problem could be caused by a rodent.



Should have consulted the https://en.wikipedia.org/wiki/Hermitage_cats


----------



## DomainBop (Nov 4, 2015)

RFO email, SLA credits to be announced within 10 days


http://mj.ovh.com/nl2/4umv/xi.html


----------

