Arachni RPC running with many bugs.
Hi,
Currently using arachni-1.5-0.5.11.
We have been using Grid mode with the balance option, but we can never get it running even somewhat stably. A lot of the early issues were around RAM, etc., but all of these are fixed.
So our current setup is: 5 servers with 16 GB RAM each, plus a master that also has 16 GB RAM.
We are using arachni_rpc to initiate the scans.
From arachni master:
ruby arachni_rpcd --address 10.20.50.8 --external-address 10.20.50.8 --port 7331 --port-range 17331-27331 --nickname arachni-master --pool-size 1 --pipe-id "Pipe 8" --weight 1000 --reroute-to-logfile
From dispatcher1:
ruby arachni_rpcd --address 10.20.50.10 --external-address 10.20.50.10 --port 7331 --port-range 17331-27331 --nickname arachni-dispatcher-xx1 --pool-size 10 --pipe-id="Pipe 10" --reroute-to-logfile --neighbour arachni-master:7331
From dispatcher2:
ruby arachni_rpcd --address 10.20.50.11 --external-address 10.20.50.11 --port 7331 --port-range 17331-27331 --nickname arachni-dispatcher-xx2 --pool-size 10 --pipe-id="Pipe 11" --reroute-to-logfile --neighbour arachni-master:7331
One thing I noticed is that all of our dispatchers have the same neighbour: arachni-master. Would this create a problem?
The command we run from the arachni-master:
sudo /opt/arachni/current/bin/arachni_rpc --dispatcher-url=10.20.50.10:7331 --grid --spawns=1 --browser-cluster-ignore-images --scope-auto-redundant=4 --report-save-path=/opt/arachni/reports/testsite.afr --timeout 48:00:00
Is there anything wrong with our setup? All boxes are fully updated Ubuntu.
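For reference, one quick way to sanity-check that the Grid actually formed is the bundled monitor, pointed at any node (a sketch; it assumes the monitor takes the Dispatcher URL as its argument, and the address is the master from the setup above):
ruby arachni_rpcd_monitor 10.20.50.8:7331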
1 Posted by Kevin on 19 Jun, 2017 10:31 AM
One thing I might note that is not clear:
One of the problems is that arachni_rpc stops working, and every time you try to start it, it will just hang without doing any actual work.
Other times it seems as if the connection is lost and the arachni_rpc client is unable to get any reports. Through arachni_rpcd_monitor I can see that the scans are still running.
Support Staff 2 Posted by Tasos Laskos on 19 Jun, 2017 03:39 PM
The way you've set up the Grid, no Dispatcher other than arachni-master will be used, due to the high weight you've assigned to it.
Also, try removing the --spawns option; it's unstable and will be removed.
Support Staff 3 Posted by Tasos Laskos on 19 Jun, 2017 04:38 PM
My bad, got it backwards: arachni-master will never be used.
4 Posted by Kevin on 19 Jun, 2017 09:30 PM
Thanks. Will try without --spawns. The client just stated that I had to specify it.
In regards to the arachni-master thing: we did that specifically to avoid putting a heavy load on arachni-master, as it is managing the scans.
Support Staff 5 Posted by Tasos Laskos on 19 Jun, 2017 10:07 PM
Don't specify --grid either, so that you won't have to specify --spawns; the Dispatchers will still load-balance the scans amongst themselves.
Also, arachni-master isn't managing anything; no single node is more important than the others. Whichever one you ask, it'll search the Grid for the one with the lowest workload score, ask it for an Instance and then pass that information back to you.
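Applied to the command from the opening post, that advice would look like this (a sketch: the original call with --grid and --spawns=1 removed, everything else unchanged):
sudo /opt/arachni/current/bin/arachni_rpc --dispatcher-url=10.20.50.10:7331 --browser-cluster-ignore-images --scope-auto-redundant=4 --report-save-path=/opt/arachni/reports/testsite.afr --timeout 48:00:00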
6 Posted by Kevin on 22 Jun, 2017 02:28 PM
Still very unstable. It is like arachni_rpc is losing its connection to the dispatcher (after pressing Ctrl+C it will hang forever).
And then some of the new scans started over RPC will not start.
I don't know if it is caused by RAM consumption, but in some cases our servers will also just become completely unresponsive. I just think it's hard to control RAM when it is balancing a lot of scans by itself.
Do you have any ideas, or any debug info I could provide that would be helpful?
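One way to gather that kind of debug info is to log resource readings to disk so they survive a lock-up (a generic shell sketch, not an Arachni feature; the log path is a placeholder, and the loop would be run under nohup or screen):
while true; do date; free -m; df -h; ps -e | wc -l; sleep 60; done >> /var/log/arachni-resources.log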
Support Staff 7 Posted by Tasos Laskos on 22 Jun, 2017 02:31 PM
I think you should run fewer scans; it sounds like the servers are having a pretty hard time.
Out of curiosity, how many scans are you running on these machines?
8 Posted by Kevin on 23 Jun, 2017 08:01 AM
Maybe one or two scans each, so around 5-10 scans across 5 servers with good specs.
9 Posted by Kevin on 23 Jun, 2017 08:24 AM
Also, when using arachni_rpcd_monitor I can see a scan is running, but the arachni_rpc client is dead.
Can I stop a scan on a dispatcher's rpcd without restarting the whole thing?
10 Posted by Kevin on 23 Jun, 2017 08:28 AM
Also noticed that the timeout feature is not working. If I specify 48 hours, it does not help; the scan just continues.
Support Staff 11 Posted by Tasos Laskos on 23 Jun, 2017 08:35 AM
12 Posted by Kevin on 23 Jun, 2017 12:09 PM
That is good to hear. Around 60% to 80%. I would not describe it as lagging, but rather as the client being unable to contact the dispatcher. Also, some of the boxes are completely dead, meaning I can't even SSH into them. The Google Cloud console shows ~0% CPU for them at that time, though.
Good to know, thanks.
I understand. It is hard for me to believe that it is a network issue though, given that it is all built in Google Cloud and the servers are right next to each other.
Support Staff 13 Posted by Tasos Laskos on 23 Jun, 2017 12:17 PM
The boxes being completely dead is worrisome. Can you perform an identical scan and periodically check the number of running processes and disk usage, in addition to RAM and CPU?
Theoretically, there could be a bug in the way browsers are spawned, leading to basically a fork bomb; or the tmp files Arachni creates to offload workload to disk could be taking up all the space.
14 Posted by Kevin on 23 Jun, 2017 12:23 PM
Just to make a quick note: by completely dead I mean a hard reset is the only way back.
I have just started ~20 scans and will monitor the RAM and CPU consumption and the number of running processes.
The tmp files taking all the space is definitely a worthy shot. I have 4 scans that have been running for 10-20 minutes on one of the machines, with 1.7 GB of disk space left. Will it exceed that?
Also, thanks for the quick replies. They are greatly appreciated.
Support Staff 15 Posted by Tasos Laskos on 23 Jun, 2017 12:29 PM
Yeah, tmp files can easily exceed 1.7 GB.
The recommended system requirements state 10 GB of available disk space, and that's per scan -- that's on the very generous side, I'll grant you, but still.
There are cases where disk space can grow even past that, and that's a sign of trouble, but it can be mitigated via configuration. We'll cross that bridge when we come to it, though.
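To see whether the tmp files are indeed what is eating the disk, something like this could run on a dispatcher during a scan (a generic shell sketch; it assumes Arachni's scratch files end up under the system temp directory on these boxes):
watch -n5 'du -sh /tmp; df -h /'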
16 Posted by Kevin on 23 Jun, 2017 12:34 PM
So for now, would it be okay to up the servers to 40 GB of disk space each in order to run 4 scans per server?
Support Staff 17 Posted by Tasos Laskos on 23 Jun, 2017 12:35 PM
Yep, give that a shot and see if it makes a difference.
18 Posted by Kevin on 26 Jun, 2017 09:08 AM
Tried increasing all the disks to 40 GB. Ran 5 scans per server, with 4 cores each. Now, Monday morning, I am unable to contact any of the 5 servers.
As they are in Google Cloud I cannot currently see their disk or RAM usage, but I'm assuming that disk errors are the problem.
Do you think I pushed them too hard with a total of 30 scans?
Support Staff 19 Posted by Tasos Laskos on 26 Jun, 2017 09:15 AM
Yeah, better stick with one scan per core.
Also, while the scans are running, can you try watch -n1 df and watch -n1 free over SSH? At the point where it gets stuck we'll know how things look resource-wise.
20 Posted by Kevin on 26 Jun, 2017 10:56 AM
total used free shared buff/cache available
Mem: 15400392 14751468 518416 8768 130508 398496
Swap: 0 0 0
Managed to SSH into one of them again, so they are not completely dead. Looks like free memory is close to zero, though.
Will try to monitor disk and memory with only 1 scan per core.
I can see that arachni_rpc produces no output anymore, but the arachni-dispatcher is still scanning. Can I get the reports somewhere when it finishes, or are they lost?
21 Posted by Kevin on 27 Jun, 2017 08:34 AM
Hi again,
Currently scanning only 3 applications per dispatcher.
Have set up a 5 GB swap file, so we are currently at 21 GB of memory.
But I then got some error logs back and grepped them for memory errors:
10.20.50.10_20992.error.log:[2017-06-27 03:51:39 +0000] [Errno::ENOMEM] Cannot allocate memory - /opt/arachni/arachni-1.5-0.5.11/system/usr/bin/ruby
10.20.50.15_18939.error.log:[2017-06-27 08:16:21 +0000] [Errno::ENOMEM] Cannot allocate memory - /opt/arachni/arachni-1.5-0.5.11/system/usr/bin/ruby
10.20.50.15_23423.error.log:[2017-06-27 06:27:49 +0000] [Errno::ENOMEM] Cannot allocate memory - /opt/arachni/arachni-1.5-0.5.11/system/usr/bin/ruby
Could memory leaks be the problem?
This is our current scan:
arachni_rpc --dispatcher-url=10.20.50.8:7331 --browser-cluster-ignore-images --scope-auto-redundant=4 --timeout=48:00:00 --report-save-path=/opt/arachni/reports/uuid.afr --http-request-queue-size=50 --browser-cluster-pool-size=4 --checks=*,-common_*,-backup_*,-backdoors
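For reference, a 5 GB swap file like the one mentioned above is typically created with standard Linux commands along these lines (the path is an assumption):
sudo fallocate -l 5G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile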
Support Staff 22 Posted by Tasos Laskos on 27 Jun, 2017 02:05 PM
That's a lot of RAM. There could be a leak in the scanner, or it could be that one of the scans just needs a lot of memory; it depends on the web application.
23 Posted by Kevin on 28 Jun, 2017 11:39 AM
So if there is a leak in the scanner, how do I fix it?
Support Staff 24 Posted by Tasos Laskos on 28 Jun, 2017 02:41 PM
You can try playing with the --scope options, especially the --scope-dom ones.
Unfortunately, as far as the scanner is concerned, I've gotten Arachni as far as I can take it, which is why I've been working on a new engine that will solve these kinds of issues.
And since you're using the Grid: one new feature of the new engine, added after I wrote the blog post, is a much smarter Grid that is aware of available system resources and automatically calculates the number of scans that can safely be performed in parallel, so it won't let you shoot yourself in the foot.
Also, a new queue system has been implemented, to which you can post as many scan jobs as you wish and it will safely distribute and manage them for you.
Unfortunately, I don't have an ETA for the new engine; it will probably be a while before a beta is available.
Until then, try experimenting with the available options and have a look at this article: http://support.arachni-scanner.com/kb/general-use/optimizing-for-fa...
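For illustration, the DOM scope options mentioned above would bolt onto the existing call like this (a sketch; the limit values are arbitrary starting points, not recommendations from this thread):
arachni_rpc --dispatcher-url=10.20.50.8:7331 --scope-dom-depth-limit=4 --scope-dom-event-limit=100 --scope-auto-redundant=4 --timeout=48:00:00 --report-save-path=/opt/arachni/reports/uuid.afr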
25 Posted by Kevin on 29 Jun, 2017 08:33 AM
Okay fair enough.
I will be trying the --scope-dom ones. I'm a bit clueless, though, as I have no idea how many JS events are needed for --scope-dom-event-limit.
26 Posted by Kevin on 29 Jun, 2017 08:58 AM
Also one last thing:
I asked before but got no answer: arachni_rpc produces no output anymore, but the arachni-dispatcher is still scanning; I can see that through arachni_rpcd_monitor. Can I get the reports somewhere when it finishes, or are they lost if arachni_rpc cannot gather the result?
Best regards,
Kevin
27 Posted by Kevin on 29 Jun, 2017 09:31 AM
And is there any point in using the 2.0 development builds or nightlies, or is it far-fetched that they would fix the problem?
28 Posted by Kevin on 29 Jun, 2017 02:24 PM
Memory output (from top):
8698 root 20 0 6741740 2.655g 26148 S 0.3 18.1 21:12.44 phantomjs
5767 root 20 0 4594416 1.360g 9504 S 19.3 9.3 8:21.87 phantomjs
6202 root 20 0 4621088 1.253g 10932 S 19.6 8.5 8:02.36 phantomjs
10137 root 20 0 3673920 777200 9428 S 20.3 5.0 4:53.24 phantomjs
9885 root 20 0 3578212 728336 10660 R 18.9 4.7 5:08.96 phan
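Those RES figures can be totalled with a one-liner such as this (a generic shell sketch, assuming procps, whose rss column is reported in KiB):
ps -C phantomjs -o rss= | awk '{sum+=$1} END {printf "%.2f GiB\n", sum/1024/1024}'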
29 Posted by Kevin on 29 Jun, 2017 10:08 PM
FIXED
For anyone else wondering what the issue was: it was PhantomJS. The "ignore images" setting (--browser-cluster-ignore-images) should not be used.
The bug is years old and is described at https://github.com/ariya/phantomjs/issues/12903
It should be noted in the docs that "ignore images" triggers this bug. Anyway, thanks for all the other suggestions, Tasos.
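For completeness, the working call is the one from comment 21 with the problematic option removed (a sketch):
arachni_rpc --dispatcher-url=10.20.50.8:7331 --scope-auto-redundant=4 --timeout=48:00:00 --report-save-path=/opt/arachni/reports/uuid.afr --http-request-queue-size=50 --browser-cluster-pool-size=4 --checks=*,-common_*,-backup_*,-backdoors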
Support Staff 30 Posted by Tasos Laskos on 01 Jul, 2017 11:19 AM
Glad you identified the issue; I hadn't heard of this before. I may need to disable this option in Arachni.
Tasos Laskos closed this discussion on 01 Jul, 2017 11:19 AM.