# How to find the site that crashes the server?



## Greg (Apr 17, 2015)

Probably 1 out of 100 sites crashes the VPS every few days.
Apache connections go up to a few hundred and MySQL requests to a few million. That pins the CPU at 100%+ and makes the server completely unusable.
A restart fixes the situation for a few days.
How do I find the evil one?

I've got mod_status with ExtendedStatus On, but it only shows the most recent requests. I'm not sure how to use that to narrow it down among 100 sites.

The rest of the time the server is using 20-30% of the resources so it's not overloaded with sites.


----------



## Jack (Apr 17, 2015)

Is it a cPanel box?


----------



## Greg (Apr 17, 2015)

nope, vestaCP

it does nginx+apache


----------



## zzrok (Apr 17, 2015)

When the server "crashes", what does mod_status say?  Which pages are in the list at that time?  My guess is you will see one script referenced by almost all of the processes.  That is likely the culprit.

Is the server swapping when it "crashes"?  You can usually tell by the amount of time the CPU is in a wait state (%wa in top) and high swap usage.  Unless you have a very large server, I doubt you have the memory to handle several hundred apache connections.
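
A quick way to check both from a shell, as a rough sketch. The helper name `swap_used_mb` and its optional file argument are made up for illustration; the figures come from `/proc/meminfo`, and `vmstat` reports the same wait state that top shows as %wa.

```shell
#!/bin/sh
# Print swap usage in MB, read from /proc/meminfo (or a file in the same format).
# swap_used_mb is a hypothetical helper name, not a standard tool.
swap_used_mb() {
    awk '/^SwapTotal:/ {total = $2}
         /^SwapFree:/  {free = $2}
         END {printf "swap used: %d MB of %d MB\n", (total - free) / 1024, total / 1024}' \
        "${1:-/proc/meminfo}"
}

# Usage during a slowdown:
#   swap_used_mb
#   vmstat 5 3     # the "wa" column is the % of CPU time spent waiting on I/O
```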


----------



## mojeda (Apr 17, 2015)

I would try seeing if one of the domains has an unusually large error/access log


----------



## Greg (Apr 17, 2015)

zzrok said:


> When the server "crashes", what does mod_status say?  Which pages are in the list at that time?  My guess is you will see one script referenced by almost all of the processes.  That is likely the culprit.
> 
> Is the server swapping when it "crashes"?  You can usually tell by the amount of time the CPU is in a wait state (%wa in top) and high swap usage.  Unless you have a very large server, I doubt you have the memory to handle several hundred apache connections.


Great suggestions. It still hasn't crashed since I enabled mod_status extended. However, how could I see the last processes when the server is down?

Yes, it swaps a lot when it does that. I even extended the swap to about 5-6 GB; it uses all of it, and it's SSD swap.
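
One possible way to have the mod_status output available after the fact is to save a copy of the page on a schedule, so the last snapshot before the crash can be read once the server is back. This is only a sketch: the status URL, the snapshot directory, and the function name are all assumptions, and behind nginx the Apache status page may be on a different port.

```shell
#!/bin/sh
# Save a timestamped copy of the mod_status page so it can be read after a crash.
# STATUS_URL and SNAP_DIR are assumptions; adjust them to where mod_status
# is actually exposed and where the snapshots should be kept.
STATUS_URL="${STATUS_URL:-http://127.0.0.1/server-status}"
SNAP_DIR="${SNAP_DIR:-/var/tmp/apache-status}"

snapshot_status() {
    mkdir -p "$SNAP_DIR" || return 1
    out="$SNAP_DIR/status-$(date +%Y%m%d-%H%M%S).html"
    # --max-time keeps the job from hanging when Apache is already overloaded
    curl --silent --max-time 10 --output "$out" "$STATUS_URL" || rm -f "$out"
}

# To use it, add a final line that calls snapshot_status, save the file as a
# script, and run it from cron once a minute (the script path is hypothetical):
#   * * * * * /usr/local/bin/status-snapshot.sh
```

The snapshots pile up at one file per run, so old ones need to be cleaned out from time to time.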



mojeda said:


> I would try seeing if one of the domains has an unusually large error/access log


hm, any smart way to see them all at once? checking 100 error/access logs 1 by 1 isn't exactly fun way to spend Friday evening.  Not that I have a life but still....


----------



## mojeda (Apr 17, 2015)

Greg said:


> hm, any smart way to see them all at once? checking 100 error/access logs 1 by 1 isn't exactly fun way to spend Friday evening.  Not that I have a life but still....


find /var/log/apache2/domains -type f -iname "*.log" -printf '%s %p\n'| sort -nr | head -10

That will look in the default log location for vestacp domains. It'll then list the top 10 largest files. It's set to only list files that end with .log (to exclude the rotated logs).

You can change the amount it lists by changing the number at the end *head -10* or just remove *| head -10* at the end to list all files, sorted from largest to smallest.

Source: http://www.cyberciti.biz/faq/how-do-i-find-the-largest-filesdirectories-on-a-linuxunixbsd-filesystem/


----------



## iClickAndHost (Apr 19, 2015)

OP, don't forget to post the solution if you manage to solve this. 

Did you notice high memory load? Can you find a pattern in the mem log?
Additionally, you can run cat /proc/meminfo and see if you notice anything unusual.


----------



## Greg (Apr 20, 2015)

mojeda said:


> find /var/log/apache2/domains -type f -iname "*.log" -printf '%s %p\n'| sort -nr | head -10
> 
> That will look in the default log location for vestacp domains. It'll then list the top 10 largest files. It's set to only list files that end with .log (to exclude the rotated logs).
> 
> ...


This command works like magic! Thank you so much for it. That was exactly the smart way I needed! 



iClickAndHost said:


> OP, don't forget to post the solution if you manage to solve this.
> 
> Did you notice high memory load ? Can you find a pattern in the mem log ?
> 
> ...


Yes, I just confirmed that I've fixed it.

Well, it wasn't just one site, it was all of them! The xmlrpc.php file of all the WP sites was causing the load. A request would take an Apache worker and hang on to it for minutes, if not hours.

I was seeing up to 150-200 Apache processes, and 90% of them were occupied by xmlrpc.php of different WP sites.
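
To see how requests like these are spread across the sites, counting matching lines per log file is one option. A rough sketch, assuming the same VestaCP log directory used in the find command earlier in the thread; `count_hits` is a made-up helper name.

```shell
#!/bin/sh
# Count lines containing a fixed string in each *.log file of a directory,
# busiest file first. count_hits is a hypothetical helper name.
count_hits() {
    pattern=$1
    dir=$2
    for f in "$dir"/*.log; do
        [ -f "$f" ] || continue
        # -F treats the pattern as a literal string, -c prints the match count
        printf '%s %s\n' "$(grep -cF -- "$pattern" "$f")" "$f"
    done | sort -nr | head -10
}

# Usage, with the log directory from the find command earlier in the thread:
#   count_hits 'xmlrpc.php' /var/log/apache2/domains
```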

Probably some kind of attack but I'm still reading on that.

One possibility is that hackers use it to pingback-DDoS other sites or something.

The solution was to block it altogether in nginx, and for over 24 hours now the load has been fine and nothing has crashed.
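
For anyone finding this later, a block like that could look roughly as follows. This is a generic sketch, not the exact rule used here; where it goes depends on how the nginx config or templates are laid out on the box.

```nginx
# Deny every request for xmlrpc.php, in any directory.
# Goes inside a server { } block; which file that is depends on the setup.
location ~* /xmlrpc\.php$ {
    deny all;
}
```

Keep in mind this also cuts off anything legitimate that talks to WordPress over XML-RPC, such as pingbacks and remote publishing clients.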

Thanks for this wonderful community! If it wasn't for you I would still be struggling!


----------



## libro22 (Apr 20, 2015)

Have you tried checking mysql processes? There may be sleeping queries, most probably caused by locks. This happens a lot with poorly coded scripts - lotsa sql injections.

For WordPress sites, check out wp-login, xmlrpc and admin-ajax (can't remember the right filename) and wp-cron bruteforce attacks. Most of the time, mod_security can handle them. With a cPanel server, Unixy Varnish ratelimit throttling does a good job too.

I don't know if it's compatible, but I do recommend cloudlinux for shared environment like this (only RHEL-based OS though).


----------



## Greg (Apr 21, 2015)

libro22 said:


> Have you tried checking mysql processes? There may be sleeping queries, most probably caused by locks. This happens a lot with poorly coded scripts - lotsa sql injections.
> 
> For wordpress sites, checkout wp-login, xmlrpc and admin-ajax (can't remember the right filename) and wp-cron bruteforce attacks. Most of the time, mod_security can handle them. With a cpanel server, Unixy varnish ratelimit throttling does a good job too.
> 
> I don't know if it's compatible, but I do recommend cloudlinux for shared environment like this (only RHEL-based OS though).


Yes, I had problems with wp-login before, and this time it was xmlrpc. Both are now blocked at the nginx level for the whole server.

How does one check the MySQL processes?

I just learned how to check the apache ones, lol


----------



## TierNet (Apr 21, 2015)

You can check the server logs or look at the highest CPU-consuming processes to detect the culprit. Often a customer's website gets hacked, especially WordPress ones, and a script starts crashing the server, so you need to find the script and fix it.


----------



## libro22 (Apr 21, 2015)

Greg said:


> yes I had problems with wp-login before and now it was xmlrpc and now both are blocked at nginx level for the whole server
> 
> 
> how does one check the mysql processes
> ...


You can do it via the mysql command line, or install phpMyAdmin.
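
On the command line, the process list comes from `SHOW FULL PROCESSLIST` (or `mysqladmin processlist`). With a hundred sites the raw list gets long, so here is a small sketch that counts connections per database and command type. `summarize_processlist` is a made-up helper name, and the usage line assumes the mysql client can already log in, for example through credentials in `~/.my.cnf`.

```shell
#!/bin/sh
# Summarise MySQL connections by database and command type.
# Reads the tab-separated output of SHOW PROCESSLIST on stdin
# (columns: Id, User, Host, db, Command, Time, State, Info).
# summarize_processlist is a hypothetical helper name.
summarize_processlist() {
    awk -F '\t' 'NR > 1 {count[$4 " " $5]++}
                 END {for (k in count) print count[k], k}' | sort -nr
}

# Usage (assumes the mysql client can log in without prompting):
#   mysql -e 'SHOW FULL PROCESSLIST' | summarize_processlist
# For the raw list instead:
#   mysqladmin processlist
```

A database with a large number of Sleep or Query entries during a slowdown is a good place to start looking.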


----------

