
Nodewatch Settings

Tyler

Active Member
What settings are you currently using for Nodewatch? What do you do to minimize the chance of false VPS suspensions?

I'm also thinking about writing a module for Nodewatch that will email a customer (using their SolusVM registered client email) when their VPS has been suspended, and it will also cite the reason for the suspension. If we choose to go through with it, we'll open source it too :)
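A rough sketch of what the notification side of such a module might look like, in Python for illustration. Everything here is an assumption: the function name, the sender address, and the idea that the client email and suspension reason have already been looked up elsewhere. It only composes the message and does not touch SolusVM or Nodewatch.

```python
from email.message import EmailMessage


def build_suspension_notice(client_email, vps_label, reason,
                            sender="support@example.com"):
    """Compose (but do not send) a suspension notice for one VPS.

    All parameters are hypothetical inputs: the caller is assumed to have
    fetched the client's registered email and the suspension reason already.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = client_email
    msg["Subject"] = f"Your VPS {vps_label} has been suspended"
    msg.set_content(
        "Hello,\n\n"
        f"Your VPS {vps_label} was suspended automatically.\n"
        f"Reason: {reason}\n\n"
        "If you think this is a mistake, reply to this email.\n"
    )
    return msg


if __name__ == "__main__":
    notice = build_suspension_notice(
        "client@example.com",
        "vps101",
        "over 20000 unreplied conntrack sessions",
    )
    print(notice)
```

Sending could then be handed to `smtplib.SMTP(...).send_message(notice)` or whatever mail path the host node already uses.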
 

Bruce

New Member
Verified Provider
good question. there's a fine line between using a VPS for DDoS and running a legitimate benchmark, installing software, updates, etc.

for me, the fun one is

UNREPLIED conntrack sessions (DoS attack/OUT)

the default setting is

Suspend VPS if it spawns over 20000 unreplied conntrack sessions

since more than 60,000 is easy to reach, the default is far too low. I either whitelist the VPS for conntrack, or set the threshold to 75,000, though I'm not convinced that's the right number. right now I'm monitoring alert emails and then manually suspending if needed. that seems better than auto-suspend, and a personally written ticket asking why seems friendlier than an auto-generated one.
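For anyone who wants to see the numbers behind that threshold, here is a minimal Python sketch that counts UNREPLIED entries per source IP from conntrack table text (the format of /proc/net/nf_conntrack). It is not how Nodewatch does it, since Nodewatch is encoded. The function names, the 75,000 default, and matching a VPS by its source IP are all assumptions for illustration.

```python
from collections import Counter


def count_unreplied_by_source(lines):
    """Count [UNREPLIED] conntrack entries per original-direction source IP."""
    counts = Counter()
    for line in lines:
        if "[UNREPLIED]" not in line:
            continue
        for field in line.split():
            if field.startswith("src="):
                counts[field[len("src="):]] += 1
                break  # the first src= is the original direction; skip the reply tuple
    return counts


def over_threshold(counts, threshold=75000, whitelist=()):
    """Return source IPs above the threshold, skipping whitelisted ones."""
    return sorted(
        ip for ip, n in counts.items()
        if n > threshold and ip not in whitelist
    )


if __name__ == "__main__":
    # Assumes the table is readable at this path on the host node.
    with open("/proc/net/nf_conntrack") as table:
        counts = count_unreplied_by_source(table)
    for ip in over_threshold(counts):
        print(ip, counts[ip])
```

This only reports. What to do with the list (alert email, manual review, suspend) is left out on purpose.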

nodewatch is useful, but not ideal. if it were unencoded you could tweak the PHP. has anyone experimented with changing the cron interval from 5 min to, say, 10 or 15? software updates, installs, and benchmarks usually don't last long.
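A related idea, sketched in Python: instead of lengthening the cron interval, only flag a VPS once it has been over the limit for several checks in a row, so a short burst from an update or benchmark does not count. This is a different technique from changing the cron schedule and is not a Nodewatch feature as far as this thread shows. The class name and parameters are made up for illustration.

```python
from collections import defaultdict


class SustainedBreachTracker:
    """Flag a VPS only after N consecutive over-threshold samples."""

    def __init__(self, threshold, required_runs):
        self.threshold = threshold          # e.g. 75000 unreplied sessions
        self.required_runs = required_runs  # e.g. 3 checks at 5 min = 15 min
        self._streaks = defaultdict(int)

    def observe(self, vps_id, count):
        """Record one sample; return True if the breach is sustained."""
        if count > self.threshold:
            self._streaks[vps_id] += 1
        else:
            self._streaks[vps_id] = 0  # any quiet sample resets the streak
        return self._streaks[vps_id] >= self.required_runs


if __name__ == "__main__":
    tracker = SustainedBreachTracker(threshold=75000, required_runs=3)
    for sample in (80000, 90000, 76000):
        print(tracker.observe("vps101", sample))
```

With a 5 minute check and `required_runs=3`, a VPS has to stay above the limit for roughly 15 minutes before it is flagged.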
 

Tyler

Active Member
Thanks for the reply and details. The conntrack session limit was also causing us some issues, so I'm thinking about setting it to 75,000 and seeing what happens.

Most suspensions are for conntrack counts anyway, so I think our limit is likely too low. It's also not a big issue for us, since our DCs will rate limit or null-route the IP if there's an outgoing DDoS (not saying that's the best way, but it is a second layer of protection).
 

HN-Matt

New Member
Verified Provider
I've had to whitelist a VPS on more than one occasion because NW would suspend it for excessive conntrack sessions before OVH's DDoS protection could kick in, defeating the purpose of that protection.

On the other hand, its 'limit_smtp_suspend' setting may be useful to any VPS provider with the misfortune of living on planet Earth.
 

Husky

Verified Dog
Verified Provider
I kicked nodewatch to the kerb and wrote my own scripts. I found nodewatch would continually crash and then sit at 100% CPU for a php process. Screw that.
 

Bruce

New Member
Verified Provider
I don't suppose it's open source? or available for a price?
 

dcdan

New Member
Verified Provider
I do not remember seeing any support tickets about Nodewatch crashing and sitting at 100% CPU, although I can see how this might happen on a misconfigured system. On nodes with 500 containers, Nodewatch uses under 10% of a single core when everything is configured properly.
 

KuJoe

Well-Known Member
Verified Provider
I can attest that nodewatch has never been an issue resource-wise. We have an in-house script that monitors and reports resource usage along with process names, and never once have nodewatch's processes been at the top of the list, except on an idle server.
 

Husky

Verified Dog
Verified Provider
We ran it across several servers over the course of a few months. Even hooked it into the web panel thing that was provided.

Every node would randomly stop reporting, and nodewatch wouldn't die without a kill -9 and would be sitting at 100% of a core.

Even with stock configuration. Don't know what to tell you other than that. Couldn't be bothered to look into fixing it and wrote my own solution for what I needed.

EDIT: When it's operating properly, it works fine. Low resource usage etc. It was just every once in a while it's almost as if it got stuck in an infinite loop somewhere and just hung with full core usage.
 

Geek

Technolojesus
Verified Provider
I had that happen once. Dunno if it's related or not, but it happened to be the one machine I had running with "options nf_conntrack ip_conntrack_disable_ve0=1", and once I changed that to re-enable conntrack on the node it was fine. Doesn't seem like the two would be related in any way but meh... it worked that time. I can also attest to the occasional server seeing more resource utilization than I'd prefer, but those were on a couple of E3s I used to have at work.
 