Arachni seems to be hanging
I am running the latest version (Arachni v1.4 - WebUI v.0.5.10).
The number of pages discovered is 2364; the requests performed and the responses received are both at 2047725. The scan has been running for more than 39 hours and is still going, but Arachni seems to be hanging, with no further requests or responses. The other information is standing still without changing, and it is still showing "scanning".
I do not want to quit it, as it has been running for over 39 hours. Any ideas?
Support Staff 31 Posted by Tasos Laskos on 03 Apr, 2016 02:09 PM
2GB of RAM isn't actually all that much for that site. Just the size of the string objects involved can eat it up; some of the pages are quite large, and when you process them that size gets multiplied many times over.
About the distributed stuff, it doesn't distribute the crawl, only the audit, so what you're seeing is expected.
Since I fixed this, I decided to also profile the system with that site, hoping that there would be a memory leak I could fix to reduce the RAM, but no dice; so far everything is operating properly.
Btw, v1.3.2 did have leaks which were fixed in v1.4, so you can't really compare the two; the browser parts of the system were seriously overhauled.
Support Staff 32 Posted by Tasos Laskos on 03 Apr, 2016 02:47 PM
Fixed: https://github.com/Arachni/arachni/commit/8a9c8bbc36367e0a461cc9a04...
Will let you know once the nightlies are up.
33 Posted by Dave on 03 Apr, 2016 03:55 PM
Hey, nice job, I'm looking forward to trying it out.
Is there any way I can limit the RAM consumption on the scan server?
Maybe by reducing --browser-cluster-pool-size or --browser-cluster-worker-time-to-live?
Or, even better: make it fail in a graceful, non-fatal way?
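For example, something along these lines (placeholder target URL; the values are only there to illustrate the idea, I don't know what sensible numbers would be):
arachni http://example.com --browser-cluster-pool-size=2 --browser-cluster-worker-time-to-live=50  # illustrative values only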
I'll get back to you ASAP with the results of testing the most recent nightly.
Best Regards
Dave
Support Staff 34 Posted by Tasos Laskos on 03 Apr, 2016 04:44 PM
Nightlies are up, let me know how they work.
35 Posted by Dave on 03 Apr, 2016 04:55 PM
Cool, I'm starting a scan right away.
36 Posted by Dave on 05 Apr, 2016 07:06 PM
OK, I've run tests with both short scans (minutes) and long scans (several hours), but I'm not seeing significant differences in the number of timed-out requests.
Each scan still ends with a lot (thousands) of queued requests, and most of them time out.
I'll let you know if I find something interesting, but for now, it looks like the problem is still here.
Support Staff 37 Posted by Tasos Laskos on 06 Apr, 2016 09:30 AM
You mean browser jobs, right?
The requests don't actually time out.
38 Posted by Dave on 06 Apr, 2016 11:03 AM
Yes, browser jobs.
I'll start a full scan with the nightly later today to get the final data.
Support Staff 39 Posted by Tasos Laskos on 06 Apr, 2016 01:09 PM
Hm, what's your browser job timeout set at?
40 Posted by Dave on 06 Apr, 2016 02:51 PM
The browser job timeout is set to 30 sec.
Support Staff 41 Posted by Tasos Laskos on 06 Apr, 2016 07:43 PM
You should set it to something like 120; I'm not sure what the right setting is for your application, but it can't hurt.
I was running a scan for quite some time and only got a handful of timed-out jobs with the above.
The odd timeout now and then is to be expected, but now you should see the jobs being processed much more reliably.
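Something like this, for example (placeholder target URL; pool size taken from your earlier settings):
arachni http://example.com --browser-cluster-pool-size=4 --browser-cluster-job-timeout=120  # 120 is just a starting point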
42 Posted by Dave on 06 Apr, 2016 07:59 PM
Ok, I'll try that as well.
Just out of curiosity: 120 sec is a long time for loading and processing a web page; what is Arachni doing with all that time?
About the memory consumption: I'm running with --http-max-response-size=500000 (the default) and --browser-cluster-pool-size=4, which uses about 1100 MB including the OS.
If I double --http-max-response-size and halve --browser-cluster-pool-size, the Ruby process maxes out the memory every time on my 2GB virtual machine, and eventually the process dies before it finishes.
How can it be that increasing --http-max-response-size has such a dramatic effect?
And can something be done to avoid it?
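To be concrete, these are the two configurations (placeholder target URL; the second line is the doubled/halved variant I described):
arachni http://example.com --http-max-response-size=500000 --browser-cluster-pool-size=4   # ~1100 MB including the OS
arachni http://example.com --http-max-response-size=1000000 --browser-cluster-pool-size=2  # exhausts the 2GB VM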
Best Regards
Support Staff 43 Posted by Tasos Laskos on 06 Apr, 2016 08:13 PM
The system deals with page snapshots, which include transitions (you see these printed in the output), not just loading a page by its URL -- although it does try that, in case it works.
These transitions are basically a list of events that need to be triggered on the root page to restore it to the snapshot's state.
It also waits for timers; in the case of your webapp the max timer is 45s, which will be capped to the value of the HTTP request timeout option (not that they're related, but the request timeout lets you gauge how patient the system should be in general, and having a gazillion configurable timeouts can get complicated).
Then there's whether or not you're dealing with a cold cache, plus the odd timed-out HTTP request due to server stress, a dropped connection or whatever.
I'll check to verify that there's nothing fishy going on with those few timed-out jobs, but a few failed jobs are to be expected, just like a few failed HTTP requests are, regardless of configuration.
Support Staff 44 Posted by Tasos Laskos on 06 Apr, 2016 08:17 PM
About the RAM, that's what the --http-max-response-size option is there for: keeping RAM consumption in check in these circumstances. Your site includes large pages with huge lists, ergo a huge amount of HTML that needs to be parsed and processed, which results in many times more RAM.
I'll try a few optimizations and let you know.
45 Posted by Dave on 06 Apr, 2016 08:32 PM
Ok, thank you, that's good to know.
The large inventory page has almost 2000 <a> tags; 2/3 of them are AJAX links that don't generate a new page. I assume that this can make that one page a very big task to process. Also, the cart page has some AJAX (POST action), not nearly as much, but the inputs are named by the product ID, which gives about 2000 possible variations to test.
I can see that Arachni tests all of them; would it be possible to have a redundancy limit like --scope-auto-redundant applied to POST as well as GET?
That would certainly speed up the scanning.
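Something along these lines is what I had in mind (placeholder target URL and limit; I'm assuming the option takes a numeric limit), only also covering the POST inputs:
arachni http://example.com --scope-auto-redundant=10  # hypothetical example, limit chosen arbitrarily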
Thanks again for all your help
Support Staff 46 Posted by Tasos Laskos on 06 Apr, 2016 08:45 PM
--scope-auto-redundant doesn't apply to GET requests specifically but to resource locations, regardless of HTTP method. I'm not sure whether or not that helps in your case though; I'll have to see one of the requests in question, and I'm not sure which one it is.
47 Posted by Dave on 08 Apr, 2016 01:25 PM
Ok, I've got some more test results:
The Arachni 2.0 nightly with --browser-cluster-pool-size='4' --browser-cluster-job-timeout='30' is actually about 30% faster and has 30% fewer timeouts than Arachni 1.4 with the same settings, on a full scan. So that is very nice.
arachni 2.0 nightly --browser-cluster-pool-size='4' --browser-cluster-job-timeout='120'
and
arachni 2.0 nightly --browser-cluster-pool-size='4' --browser-cluster-job-timeout='60'
Both ran out of memory before completion, but only collected 1 timeout in comparison to the 5,000-6,000 for the shorter timeout.
Now trying: arachni 2.0 nightly --browser-cluster-pool-size='2' --browser-cluster-job-timeout='60'... I'll get back with the results.
This is good progress.
If only there were a less-than-fatal way to handle memory overruns :)
Best regards
Support Staff 48 Posted by Tasos Laskos on 08 Apr, 2016 03:45 PM
The kernel kills the process, so there's not much that can be done by Arachni.
The recommended requirements state at least 2GB of available RAM for Arachni, so 2GB for the entire OS means you're below the threshold.
And the defaults are meant to handle about 95% of cases using those recommended requirements; your case both falls in the 5% and runs on a system with fewer resources -- regardless of how this goes, you should increase your RAM.
Btw, after a certain point the timeout is meant to act as a last resort for when something goes wrong, like in v1.4, so that one bug won't freeze the entire scan.
A large timeout is a good thing in your case; if the system needs more time to do its job you should allow it, and sacrificing coverage to save some RAM isn't a good idea.
Support Staff 49 Posted by Tasos Laskos on 11 Apr, 2016 06:56 PM
May I please run a scan with DOM checks? (XSS, unvalidated redirect, etc.)
I did a crawl for a few hours and got no timeouts, so I'd like to make sure that jobs from the checks don't time out either.
50 Posted by Dave on 12 Apr, 2016 08:51 AM
Yes of course, you can run a scan.
Thank you for your previous reply.
I understand what you are saying (and the mechanics of allocating memory resources); my point was that it's hard to guess how many browser cluster workers I can assign when starting the scan, and given that a full scan can run for days, it would be nice to be able to trade performance for completion.
My idea to achieve this was to check the available memory prior to scanning each new page (or at some other meaningful interval), making sure that at least X amount of memory is available.
This would effectively make the number of active PhantomJS workers dependent on the available memory resources and provide autoscaling in a simple form.
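A rough sketch of what I mean, as a wrapper script (everything here is hypothetical: the placeholder URL, the per-worker memory figures, and reading MemAvailable from /proc/meminfo on Linux):
# Guess a pool size from the currently available memory before starting the scan.
free_mb=$(awk '/MemAvailable/ {print int($2/1024)}' /proc/meminfo)
# Rough guesses: ~300 MB base for Arachni plus ~400 MB per PhantomJS worker.
pool=$(( (free_mb - 300) / 400 ))
if [ "$pool" -lt 1 ]; then pool=1; fi
arachni http://example.com --browser-cluster-pool-size="$pool"
Of course the real thing would have to re-check periodically during the scan, not just at start-up.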
Best Regards
Support Staff 51 Posted by Tasos Laskos on 12 Apr, 2016 10:44 AM
I have thought of that, although I don't remember why I didn't implement it.
It's worth a shot though, and could also work for many other parts of the system; if you know that resources are running out, you can change a lot of other settings to automatically try and make do with what's available.
I actually have the resource monitoring code available because I was working on something similar for the new WebUI, so that's a plus.
You can follow the progress of this feature at: https://github.com/Arachni/arachni/issues/695
It's going to take a while to be implemented though.
Support Staff 52 Posted by Tasos Laskos on 13 Apr, 2016 03:34 PM
The current timed-out jobs seem to be due to occasional HTTP timeouts caused by server stress.
When running with all checks I get the occasional timed-out job because there are a lot of requests being performed; when running only with DOM checks I get no timeouts, even after running for a few hours.
I'll perform a few more scans to profile the system and see if I can optimize it.
53 Posted by Dave on 14 Apr, 2016 07:59 AM
It would really be awesome if you could implement this way of handling memory consumption; the scanning process would be much more robust. I'll be looking forward to that.
Should I create a GitHub issue for it?
So it sounds like the remaining timeouts come down to target server responsiveness; that's good news.
Support Staff 54 Posted by Tasos Laskos on 14 Apr, 2016 08:01 AM
I beat you to it: https://github.com/Arachni/arachni/issues/695
55 Posted by Dave on 14 Apr, 2016 08:06 AM
Perfect :)
Support Staff 56 Posted by Tasos Laskos on 14 Apr, 2016 11:59 AM
You may also want to grab the nightlies, they include some more optimizations.
Tasos Laskos closed this discussion on 14 Apr, 2016 11:59 AM.