Arachni seems to be hanging
I am running the latest version (Arachni v1.4 - WebUI v.0.5.10).
The number of pages discovered is 2364; the requests performed and the responses received are both at 2047725. The scan has been running for more than 39 hours and is still going, but Arachni seems to be hanging, with no further requests or responses. The other information is standing still without changing, and it is still showing "scanning".
I do not want to quit it, as it has been running for over 39 hours. Any ideas?
Support Staff 31 Posted by Tasos Laskos on 03 Apr, 2016 02:09 PM
2GB of RAM isn't actually all that much for that site. Just the size of the string objects involved can eat it up; some of the pages are quite large, and when you process them that size gets multiplied many times over.
About the distributed stuff, it doesn't distribute the crawl, only the audit, so what you're seeing is expected.
Since I fixed this, I decided to also profile the system with that site, hoping that there would be a memory leak I could fix to reduce the RAM, but no dice; so far everything is operating properly.
Btw, v1.3.2 did have leaks which were fixed in v1.4, so you can't really compare the two; the browser parts of the system were seriously overhauled.
Support Staff 32 Posted by Tasos Laskos on 03 Apr, 2016 02:47 PM
Fixed: https://github.com/Arachni/arachni/commit/8a9c8bbc36367e0a461cc9a04...
Will let you know once the nightlies are up.
33 Posted by Dave on 03 Apr, 2016 03:55 PM
Hey, nice job, I'm looking forward to trying it out.
Is there any way I can limit the RAM consumption on the scan server?
Maybe by reducing --browser-cluster-pool-size or --browser-cluster-worker-time-to-live?
Or, even better: make it fail in a graceful, non-fatal way?
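For example, something along these lines (placeholder target URL; the values are only there to illustrate the idea, I don't know what sensible numbers would be):
arachni http://example.com --browser-cluster-pool-size=2 --browser-cluster-worker-time-to-live=50  # illustrative values only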
I'll get back to you ASAP with the results of testing the most recent nightly.
Best Regards
Dave
Support Staff 34 Posted by Tasos Laskos on 03 Apr, 2016 04:44 PM
Nightlies are up, let me know how they work.
35 Posted by Dave on 03 Apr, 2016 04:55 PM
Cool, I'm starting a scan right away.
36 Posted by Dave on 05 Apr, 2016 07:06 PM
OK, I've run tests with both short scans (minutes) and long scans (several hours), but I'm not seeing significant differences in the number of timed-out requests.
Each scan still ends with a lot (thousands) of queued requests, and most of them time out.
I'll let you know if I find something interesting, but for now, it looks like the problem is still here.
Support Staff 37 Posted by Tasos Laskos on 06 Apr, 2016 09:30 AM
You mean browser jobs, right?
The requests don't actually time out.
38 Posted by Dave on 06 Apr, 2016 11:03 AM
Yes, browser jobs.
I'll start a full scan with the nightly later today to get the final data.
Support Staff 39 Posted by Tasos Laskos on 06 Apr, 2016 01:09 PM
Hm, what's your browser job timeout set at?
40 Posted by Dave on 06 Apr, 2016 02:51 PM
The browser job timeout is set to 30 sec.
Support Staff 41 Posted by Tasos Laskos on 06 Apr, 2016 07:43 PM
You should set it to something like 120; I'm not sure what the right setting is for your application, but it can't hurt.
I was running a scan for quite some time and only got a handful of timed-out jobs with the above.
The odd timeout now and then is to be expected, but now you should see the jobs being processed much more reliably.
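Something like this, for example (placeholder target URL; pool size taken from your earlier settings):
arachni http://example.com --browser-cluster-pool-size=4 --browser-cluster-job-timeout=120  # 120 is just a starting point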
42 Posted by Dave on 06 Apr, 2016 07:59 PM
Ok, I'll try that as well.
Just out of curiosity: 120 sec is a long time for loading and processing a web page; what is Arachni doing with all that time?
About the memory consumption: I'm running with --http-max-response-size=500000 (the default) and --browser-cluster-pool-size=4, which uses about 1100 MB including the OS.
If I double --http-max-response-size and halve --browser-cluster-pool-size, the Ruby process maxes out the memory every time on my 2GB virtual machine, and eventually the process dies before it finishes.
How can it be that increasing --http-max-response-size has such a dramatic effect?
And can something be done to avoid it?
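To be concrete, these are the two configurations (placeholder target URL; the second line is the doubled/halved variant I described):
arachni http://example.com --http-max-response-size=500000 --browser-cluster-pool-size=4   # ~1100 MB including the OS
arachni http://example.com --http-max-response-size=1000000 --browser-cluster-pool-size=2  # exhausts the 2GB VM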
Best Regards
Support Staff 43 Posted by Tasos Laskos on 06 Apr, 2016 08:13 PM
The system deals with page snapshots, which include transitions (you see these printed in the output), not just loading a page by its URL -- although it does try that, in case it works.
These transitions are basically a list of events that need to be triggered on the root page to restore it to the snapshot's state.
It also waits for timers; in the case of your webapp the max timer is 45s, which will be capped to the value of the HTTP request timeout option (not that they're related, but the request timeout lets you gauge how patient the system should be in general, and having a gazillion configurable timeouts can get complicated).
Then there's whether or not you're dealing with a cold cache, plus the odd timed-out HTTP request due to server stress, a dropped connection or whatever.
I'll check to verify that there's nothing fishy going on with those few timed-out jobs, but a few failed jobs are to be expected, just like a few failed HTTP requests are, regardless of configuration.
Support Staff 44 Posted by Tasos Laskos on 06 Apr, 2016 08:17 PM
About the RAM, that's what the --http-max-response-size option is there for: keeping RAM consumption in check in these circumstances. Your site includes large pages with huge lists, ergo a huge amount of HTML that needs to be parsed and processed, which results in many times more RAM.
I'll try a few optimizations and let you know.
45 Posted by Dave on 06 Apr, 2016 08:32 PM
Ok, thank you, that's good to know.
The large inventory page has almost 2000 <a> tags; 2/3 of them are AJAX links that don't generate a new page. I assume that this can make that one page a very big task to process. Also, the cart page has some AJAX (POST action), not nearly as much, but the inputs are named by the product ID, which gives about 2000 possible variations to test.
I can see that Arachni tests all of them; would it be possible to have a redundancy limit like --scope-auto-redundant applied to POST as well as GET?
That would certainly speed up the scanning.
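Something along these lines is what I had in mind (placeholder target URL and limit; I'm assuming the option takes a numeric limit), only also covering the POST inputs:
arachni http://example.com --scope-auto-redundant=10  # hypothetical example, limit chosen arbitrarily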
Thanks again for all your help
Support Staff 46 Posted by Tasos Laskos on 06 Apr, 2016 08:45 PM
--scope-auto-redundant doesn't apply to GET requests specifically but to resource locations, regardless of HTTP method. I'm not sure whether or not that helps in your case though; I'll have to see one of the requests in question, and I'm not sure which one it is.
47 Posted by Dave on 08 Apr, 2016 01:25 PM
Ok, I've got some more test results:
The Arachni 2.0 nightly with --browser-cluster-pool-size='4' --browser-cluster-job-timeout='30' is actually about 30% faster and has 30% fewer timeouts than Arachni 1.4 with the same settings, on a full scan. So that is very nice.
arachni 2.0 nightly --browser-cluster-pool-size='4' --browser-cluster-job-timeout='120'
and
arachni 2.0 nightly --browser-cluster-pool-size='4' --browser-cluster-job-timeout='60'
Both ran out of memory before completion, but only collected 1 timeout in comparison to the 5,000-6,000 for the shorter timeout.
Now trying: arachni 2.0 nightly --browser-cluster-pool-size='2' --browser-cluster-job-timeout='60'... I'll get back with the results.
This is good progress.
If only there were a less-than-fatal way to handle memory overruns :)
Best regards
Support Staff 48 Posted by Tasos Laskos on 08 Apr, 2016 03:45 PM
The kernel kills the process, so there's not much that can be done by Arachni.
The recommended requirements state at least 2GB of available RAM for Arachni, so 2GB for the entire OS means you're below the threshold.
And the defaults are meant to handle about 95% of cases using those recommended requirements; your case both falls in the 5% and runs on a system with fewer resources -- regardless of how this goes, you should increase your RAM.
Btw, after a certain point the timeout is meant to act as a last resort for when something goes wrong, like in v1.4, so that one bug won't freeze the entire scan.
A large timeout is a good thing in your case; if the system needs more time to do its job you should allow it, and sacrificing coverage to save some RAM isn't a good idea.
Support Staff 49 Posted by Tasos Laskos on 11 Apr, 2016 06:56 PM
May I please run a scan with DOM checks? (XSS, unvalidated redirect, etc.)
I did a crawl for a few hours and got no timeouts, so I'd like to make sure that jobs from the checks don't time out either.
50 Posted by Dave on 12 Apr, 2016 08:51 AM
Yes of course, you can run a scan.
Thank you for your previous reply.
I understand what you are saying (and the mechanics of allocating memory resources); my point was that it's hard to guess how many browser cluster workers I can assign when starting the scan, and given that a full scan can run for days, it would be nice to be able to trade performance for completion.
My idea to achieve this was to check the available memory prior to scanning each new page (or at some other meaningful interval), making sure that at least X amount of memory is available.
This would effectively make the number of active PhantomJS workers dependent on the available memory resources and provide autoscaling in a simple form.
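A rough sketch of what I mean, as a wrapper script (everything here is hypothetical: the placeholder URL, the per-worker memory figures, and reading MemAvailable from /proc/meminfo on Linux):
# Guess a pool size from the currently available memory before starting the scan.
free_mb=$(awk '/MemAvailable/ {print int($2/1024)}' /proc/meminfo)
# Rough guesses: ~300 MB base for Arachni plus ~400 MB per PhantomJS worker.
pool=$(( (free_mb - 300) / 400 ))
if [ "$pool" -lt 1 ]; then pool=1; fi
arachni http://example.com --browser-cluster-pool-size="$pool"
Of course the real thing would have to re-check periodically during the scan, not just at start-up.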
Best Regards
Support Staff 51 Posted by Tasos Laskos on 12 Apr, 2016 10:44 AM
I have thought of that, although I don't remember why I didn't implement it.
It's worth a shot though, and could also work for many other parts of the system; if you know that resources are running out, you can change a lot of other settings to automatically try and make do with what's available.
I actually have the resource monitoring code available because I was working on something similar for the new WebUI, so that's a plus.
You can follow the progress of this feature at: https://github.com/Arachni/arachni/issues/695
It's going to take a while to be implemented though.
Support Staff 52 Posted by Tasos Laskos on 13 Apr, 2016 03:34 PM
The current timed-out jobs seem to be due to occasional HTTP timeouts caused by server stress.
When running with all checks I get the occasional timed-out job because there are a lot of requests being performed; when running only with DOM checks I get no timeouts, even after running for a few hours.
I'll perform a few more scans to profile the system and see if I can optimize it.
53 Posted by Dave on 14 Apr, 2016 07:59 AM
It would really be awesome if you could implement this way of handling memory consumption; the scanning process would be much more robust. I'll be looking forward to that.
Should I create a GitHub issue for it?
So it sounds like the remaining timeouts come down to target server responsiveness; that's good news.
Support Staff 54 Posted by Tasos Laskos on 14 Apr, 2016 08:01 AM
I beat you to it: https://github.com/Arachni/arachni/issues/695
55 Posted by Dave on 14 Apr, 2016 08:06 AM
Perfect :)
Support Staff 56 Posted by Tasos Laskos on 14 Apr, 2016 11:59 AM
You may also want to grab the nightlies, they include some more optimizations.
Tasos Laskos closed this discussion on 14 Apr, 2016 11:59 AM.