Can't exclude redundant URLs
I'm having trouble configuring Arachni so that it performs the crawl without running indefinitely. Specifically, our website has a time URL parameter that defines the time range of data the user is looking at, and it is controlled by buttons or a calendar picker in the UI. Arachni appears to iterate over all possible values of the time parameter, as in this set of pages listed in the scan result:
[+] https://<website-name>/#overview:time=25079980+60
[+] https://<website-name>/#overview:time=25079981+60
[+] https://<website-name>/#overview:time=25079982+60
[+] https://<website-name>/#overview:time=25079983+60
[+] https://<website-name>/#overview:time=25079984+60
[+] https://<website-name>/#overview:time=25079985+60
[+] https://<website-name>/#overview:time=25079986+60
[+] https://<website-name>/#overview:time=25079987+60
[+] https://<website-name>/#overview:time=25079988+60
[+] https://<website-name>/#overview:time=25079990+60
[+] https://<website-name>/#overview:time=25079991+60
[+] https://<website-name>/#overview:time=25079992+60
[+] https://<website-name>/#overview:time=25079993+60
[+] https://<website-name>/#overview:time=25079994+60
[+] https://<website-name>/#overview:time=25079995+60
[+] https://<website-name>/#overview:time=25079996+60
[+] https://<website-name>/#overview:time=25079997+60
[+] https://<website-name>/#overview:time=25079999+60
[+] https://<website-name>/#overview:time=25080000+60
[+] https://<website-name>/#overview:time=25080001+60
[+] https://<website-name>/#overview:time=25080003+60
I've tried using several of the redundancy options to solve this problem, but none of them seem to help. I've tried:
--scope-redundant-path-pattern='time=:5'
--scope-exclude-pattern='time=(?!25055733\+60.*)'
--scope-auto-redundant=2
--scope-auto-redundant=1
but they all seem to run indefinitely. Any ideas how to get around this?
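As a sanity check on the exclude pattern itself, here is a minimal sketch that runs the regular expression against two of the URLs in plain Python. It assumes Python's `re` engine treats this negative lookahead the same way Arachni's Ruby regex engine does, and `example.test` is a hypothetical stand-in for the redacted host. It only shows what the pattern matches as a string; it says nothing about which URLs Arachni actually applies it to.

```python
import re

# The exclude pattern from the list above, copied verbatim.
pattern = re.compile(r"time=(?!25055733\+60.*)")

# Hypothetical host standing in for the redacted <website-name>.
allowed = "https://example.test/#overview:time=25055733+60"
iterated = "https://example.test/#overview:time=25079980+60"

# The negative lookahead rejects the one time value we want to keep
# and matches every other value of the time parameter.
matches_allowed = pattern.search(allowed) is not None
matches_iterated = pattern.search(iterated) is not None

print(matches_allowed, matches_iterated)
```

So the expression does single out the unwanted time values when tested against the full URL string, fragment included.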
Support Staff 1 Posted by Tasos Laskos on 08 Sep, 2017 04:30 PM
Those options refer to server-side resources identified by their URL. In your case it's the DOM that's creating an infinite number of states based on URL fragments, which affect the client side of things.
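To illustrate the distinction, here is a minimal sketch in plain Python (not Arachni code) that strips the fragment from a few of the URLs listed in the question. The host `example.test` is a hypothetical stand-in for the redacted site name. Since browsers do not send the fragment to the server, every one of those pages collapses to the same server-side resource, while the fragments themselves remain distinct client-side states.

```python
from urllib.parse import urldefrag

# Hypothetical host standing in for the redacted <website-name>.
BASE = "https://example.test/"

# Five of the URLs from the scan output, rebuilt from their time values.
crawled = [f"{BASE}#overview:time={t}+60" for t in range(25079980, 25079985)]

# The fragment (everything after '#') is never sent to the server,
# so stripping it shows what the server actually sees.
server_side = {urldefrag(url).url for url in crawled}
fragments = {urldefrag(url).fragment for url in crawled}

print(len(server_side))  # number of distinct server-side resources
print(len(fragments))    # number of distinct client-side states
```

Under these assumptions there is one server-side URL and five fragments, which is consistent with URL-based redundancy options seeing nothing to deduplicate.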
Try using the nightlies and setting the
--scope-dom-event-limit
option. It doesn't give you as much control as the other options, but it's a start, so give it a shot and let me know how it goes.
2 Posted by ryan.miller on 11 Sep, 2017 05:11 AM
Thanks for the response, but that did not seem to help. I used the nightly build and tried several different values for --scope-dom-event-limit, including 5, 1, and 0, but all resulted in an infinite crawl. I also tried using --scope-dom-depth-limit and even a vector feed trained from the proxy usage, but they all exhibited the same unlimited crawl over the time parameter. I've considered passing a specific set of paths to scan if I have to, but that is obviously not ideal since it would have to be updated as the app changes and it would be nice to have the better coverage of an actual crawl. I would appreciate any other ideas of what I can try to do a crawl but avoid this issue.
3 Posted by ryan.miller on 12 Sep, 2017 01:47 AM
Would it possibly help to use the audit options to avoid this issue? I'm trying to test them out, but I'm not clear on how they work in the CLI. For example, if --audit-ui-forms and --audit-ui-inputs are enabled by default, would passing them then disable them, since they don't take a parameter such as false? I'm also not clear on what should be passed to the --audit-exclude-vector option: is it the names of inputs within my app, or just a shortcut for listing Arachni's audit options by name?
Support Staff 4 Posted by Tasos Laskos on 12 Sep, 2017 04:26 PM
All audit options are enabled by default, but if you start setting your own then all others will be disabled.
From what I understand, though, no existing option will help in your situation; I'll need to add some new ones.
Any chance I can be given access to that site in order to see the cause of the issue and also use it as a test-case?
5 Posted by ryan.miller on 12 Sep, 2017 07:58 PM
I appreciate the help if you want to take a look at the site. We need to know the IP address/range you would be coming from though in order to enable access to the dev version of our site since we would prefer not to put excess load on our prod/beta version. I also have a rb login script with test credentials I can send you that should cover the authentication portion of the test. What would be the best way to exchange that information? Should I start a private discussion?
Support Staff 6 Posted by Tasos Laskos on 19 Sep, 2017 11:10 AM
You can send that info over e-mail at: tasos[dot]laskos[at]arachni-scanner.com