excluding external static content, moving on if scanner "gets stuck"

08 Feb, 2013 01:25 PM


1) Is there an option to exclude scanning of external (i.e. not on same domain or sub-domain as the one being scanned) content (images, stylesheets, etc)?
Reason being, we did a scan last week where one of the pages in scope had images that were hosted on a third-party site (which were fetched repeatedly during the scan) and that caused some trouble.

2) Is there an option to move the scanner to the next page if it "gets stuck"?
We were scanning some large webpages and noticed that on some of them it would just continue performing code injections (and other tests) on the same page for many hours and not move on. This meant that the overall scan did not complete within the allocated time window.

Many thanks for your help.

  1. 1 Posted by Patrick on 08 Feb, 2013 01:51 PM

    I've noticed the auto-redundant option, but I don't think that will help with question (2)?

  2. Support Staff 2 Posted by Tasos Laskos on 08 Feb, 2013 02:47 PM

    1) You can just exclude (-e) directories which you know contain static assets, or just specify the file extensions you want to skip.
    2) A fix for that is in the unstable codebase (which will soon become v0.4.2), you can grab one of the nightly builds if you want to test it and let me know how it works before I release it.

    Let me know if you need more help.

    PS. Arachni doesn't really get stuck; what you saw was the effect of the Trainer, as I call it. If the Trainer finds new elements in a given page it will audit them too -- which kind of looks like the same page is being audited multiple times.
    However, elements may be flagged as new solely because of things like a nonce or some other dynamic (but meaningless) change to their name or URL action, so from now on there's a hard limit on how many times to train from a given page.
    The limit is quite high, 25, but if you've got some feedback for me on that front I'd really like to hear it.
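
    The per-page training limit described above can be pictured with a small sketch. This is not Arachni's actual Trainer code; the class and method names are hypothetical, and only the default of 25 comes from the post.

    ```ruby
    # Hypothetical illustration only; not Arachni's actual Trainer code.
    class TrainingLimiter
      DEFAULT_LIMIT = 25 # the limit mentioned in the post above

      def initialize(limit = DEFAULT_LIMIT)
        @limit  = limit
        @counts = Hash.new(0) # page URL => training rounds recorded so far
      end

      # Returns true and records the round while the page is under the limit,
      # false once the page has already been trained from @limit times.
      def train?(page_url)
        return false if @counts[page_url] >= @limit
        @counts[page_url] += 1
        true
      end
    end

    limiter = TrainingLimiter.new(3)
    p 5.times.map { limiter.train?('http://example.com/form') }
    # prints [true, true, true, false, false]
    ```

    With a cap like this, a page whose forms change only by a nonce stops triggering new audits after a fixed number of rounds instead of looping for hours.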

  3. Tasos Laskos closed this discussion on 08 Feb, 2013 02:47 PM.

  4. Patrick re-opened this discussion on 08 Feb, 2013 03:24 PM

  5. 3 Posted by Patrick on 08 Feb, 2013 03:24 PM

    Great, thanks for your help. One more question:

    Just found out it's not just static content on any external sites that I need to block but any content (also html pages). So I can't use the exclude option.

    Is my best bet to use the include option and set that equal to the target URL being passed to arachni?

  6. Support Staff 4 Posted by Tasos Laskos on 08 Feb, 2013 03:47 PM

    Ah wait, just noticed the "external" part, you shouldn't be able to access any resource outside of the subdomain you're scanning.
    (And even if you've used the --follow-subdomains flag you should still stay within the same domain.)

    You may have found a bug in the older version you've been using but the current code is even more strict about the scope, are you experiencing the same behavior with the nightlies?
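
    As a rough picture of the scope rule described above (same host by default, subdomains only when following them is enabled), here is a minimal sketch. The helper name and its arguments are hypothetical and this is not how Arachni implements its scope checks. It also simplifies: only hosts below the target host count as subdomains, so sibling subdomains are treated as out of scope.

    ```ruby
    require 'uri'

    # Hypothetical helper; not Arachni's implementation.
    # Returns true if +url+ is on the same host as +target_url+, or on a
    # subdomain of that host when +follow_subdomains+ is true.
    def in_scope?(url, target_url, follow_subdomains = false)
      host   = URI.parse(url).host
      target = URI.parse(target_url).host
      return false if host.nil? || target.nil?

      host, target = host.downcase, target.downcase
      return true if host == target
      return false unless follow_subdomains

      # The leading dot prevents "badexample.com" from matching "example.com".
      host.end_with?(".#{target}")
    end

    p in_scope?('http://thirdparty.net/logo.png', 'http://example.com/', true)
    # prints false
    ```

    A resource on a completely different hostname fails this check whether or not subdomains are followed, which is why fetching one points to a bug rather than a configuration issue.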

  7. 5 Posted by Patrick on 08 Feb, 2013 03:54 PM

    Yes, I did have the -f option enabled. Is it safer to turn that off?

    It definitely fetched files from an external source (completely different hostname). It seems like all the external files were linked from inside a CSS stylesheet; could that have anything to do with it?

    I am hoping to run the nightly build version this weekend (if I can ensure it won't hit any external sources this time...) :-)

  8. Support Staff 6 Posted by Tasos Laskos on 08 Feb, 2013 04:21 PM

    Well if you want to audit resources on other subdomains then you were right to turn it on.

    As I don't know how to reproduce the issue, I can't guarantee that the nightlies will behave correctly, so it would help if you could describe the environment you're hitting (how the CSS file is linked, what it contains, etc. [1]) and, of course, tell me the version of Arachni you're currently using.

    Alternatively, you'll just have to keep an eye out when using one of the nightlies and force it to quit if something unexpected happens again.
    And then let me know and be patient while we work it out -- kind, kind user.

    [1] You can of course remove sensitive info like the actual URLs and replace them with placeholders.

  9. 7 Posted by Patrick on 08 Feb, 2013 04:43 PM

    I was using the latest Arachni build from master on github. I am now in the process of updating to the experimental branch, will let you know how that goes this weekend.

    In terms of debugging, I could send you the .afr file for that scan? That might well make more sense to you than me... :-)

    I will remove sensitive information but would have to send an encrypted version by email. Would that be possible?

  10. Support Staff 8 Posted by Tasos Laskos on 08 Feb, 2013 04:51 PM

    Unfortunately that wouldn't help. The AFR file is just a report; to see the external sources being pulled in and track down how that happens, they'd have to be in there unredacted.

    So that depends on your policy, if your client doesn't want anyone knowing that they are your client then both our hands are tied, otherwise it's just another website I can publicly access and crawl -- although not audit.

    But, let's give the experimental branch a shot first and go from there.

  11. 9 Posted by Patrick on 10 Feb, 2013 09:27 AM

    Hi Tasos

    Good news and bad news.. :-)

    Good news is that most of the scans of those massive webhosts completed using the experimental version of arachni in 36ish hours (compared to master branch version that was still traversing pages after 48 hours). We'll look at the results tomorrow, so that's been a great help.

    Bad news is that on a few hosts arachni seems to crash during the scan. I've attached the output to stdout that the crash produces. Does Arachni log to a crashlog or is there any other file that I could send you to investigate this problem?

    I just had another look at last week's scans (where I was scanning using the master branch version of arachni) and noticed the same crash occurred a few times there as well. So it doesn't seem to be a problem that is specific to the experimental branch.

    Once again, many thanks for your help. Much appreciated!

  12. Support Staff 10 Posted by Tasos Laskos on 10 Feb, 2013 01:58 PM

    Crap, that's a Ruby segfault...Please tell me you've been using Ruby 1.9.2, because if that's the case an upgrade to 1.9.3 should take care of the problem.

  13. 11 Posted by Patrick on 10 Feb, 2013 07:41 PM

    Nope.. it's ruby 1.9.3p125 (2012-02-16 revision 34643) [x86_64-linux]

  14. Support Staff 12 Posted by Tasos Laskos on 10 Feb, 2013 07:42 PM

    Do you happen to have the entire segfault output?

  15. 13 Posted by Patrick on 10 Feb, 2013 07:51 PM

    Unfortunately not. Was running arachni in screen and this is as far as I could scroll back. Unless ruby errors get saved somewhere else?

  16. Support Staff 14 Posted by Tasos Laskos on 10 Feb, 2013 08:16 PM

    No unfortunately not... The Ruby devs would probably want to have a look at it but it's missing the backtrace at the top.

    Would it be possible to make provisions so that we'll get the full segfault data next time?

  17. 15 Posted by Patrick on 11 Feb, 2013 11:28 AM

    Yes I will increase the scrollback buffer so I should be able to capture the full segfault data if it happens again during scans next weekend.

    Sorry, I'm not a Ruby person myself; is it best to send this to http://www.ruby-lang.org/bugreport.html?
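
    Besides a larger scrollback buffer, another way to keep the complete crash output is to write everything the scanner prints to a file. The wrapper below is a minimal sketch of that idea using Ruby's standard Open3 library; the method name and the log path are hypothetical, and nothing in it is specific to Arachni.

    ```ruby
    require 'open3'

    # Hypothetical wrapper: runs +command+ (an array of program and arguments),
    # echoes its combined stdout/stderr and appends every line to +log_path+.
    # Returns the Process::Status of the finished command.
    def run_and_log(command, log_path)
      File.open(log_path, 'a') do |log|
        Open3.popen2e(*command) do |stdin, out_err, wait_thr|
          stdin.close
          out_err.each_line do |line|
            print line
            log.write(line)
            log.flush # keep the file current in case the run dies abruptly
          end
          wait_thr.value
        end
      end
    end

    # Example call (placeholder URL and log name):
    #   run_and_log(['arachni', 'http://example.com/'], 'arachni-scan.log')
    ```

    Because the segfault report is written by the child process to stderr, it ends up in the log file in full, including the backtrace at the top that a terminal scrollback may have cut off.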

  18. Support Staff 16 Posted by Tasos Laskos on 11 Feb, 2013 01:51 PM

    Thanks Patrick, and yeah you should follow the instructions on that page to report bugs to the Ruby dev team.

  19. Support Staff 17 Posted by Tasos Laskos on 11 Feb, 2013 02:16 PM

    Ah, you might want to try the latest 1.9.3 revision too (p374 at the moment); I've read that they've fixed a few causes of segfaults.
    I haven't come across one myself since I upgraded (and I did use to get the occasional one), so it's worth a shot.

    PS. The Ruby devs will also advise you to test with the latest version before submitting a bug, so you'd better try it and go from there.

  20. 18 Posted by Patrick on 16 Feb, 2013 11:33 AM

    Hi Tasos

    I've encountered that segfault again and managed to capture the whole stacktrace this time. It is nokogiri that is causing the segfault.

    I'll submit the bug to the relevant parties but if you have any thoughts on this, any help is much appreciated!

  21. 19 Posted by Patrick on 16 Feb, 2013 11:45 AM

    Seems like updating to nokogiri 2.7.7 might solve the segfault problem?


    Which version does arachni experimental branch use?

  22. Support Staff 20 Posted by Tasos Laskos on 16 Feb, 2013 02:14 PM

    Nokogiri is a fancy binding for libxml which seems to be the cause of the segfault.
    When not using the self-contained packages (which bundle a whole lot of crap together) Arachni will be using your system's libxml version, which in this case is: /usr/lib64/libxml2.so.2.7.6

    If you update your installed libxml library to 2.7.7 that could very well solve your problem. On the other hand, I've found that libxml versions higher than 2.8.0 cause other issues, so be prepared if your OS tries to upgrade to > 2.8.0.
    In general, 2.8.0 is your best bet.
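
    The version advice above can be written down as a quick check. This is only a sketch: the method name is hypothetical, and treating 2.7.7 as the lower bound is an assumption based on the suggestion that upgrading to it might help. It uses Gem::Version from RubyGems for the comparison.

    ```ruby
    require 'rubygems' # provides Gem::Version (loaded by default on Ruby 1.9+)

    # Bounds taken from the advice above; the lower bound is an assumption.
    LIBXML_MIN = Gem::Version.new('2.7.7')
    LIBXML_MAX = Gem::Version.new('2.8.0')

    # Hypothetical helper: true if the given libxml version string falls
    # inside the range suggested in this post.
    def libxml_version_ok?(version_string)
      version = Gem::Version.new(version_string)
      version >= LIBXML_MIN && version <= LIBXML_MAX
    end

    %w[2.7.6 2.7.7 2.8.0 2.9.0].each do |v|
      puts "#{v}: #{libxml_version_ok?(v) ? 'within' : 'outside'} the suggested range"
    end
    ```

    Comparing parsed versions rather than raw strings avoids mistakes such as "2.10.0" sorting before "2.8.0".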

    What I do is set up an environment identical to the packages and work inside it to ensure that end-users will have the same experience as I had while developing the thing.

    First, I set up the self-contained env:

    wget -O - https://raw.github.com/Arachni/build-scripts/master/bootstrap.sh | ARACHNI_BUILD_ENV=development bash

    Then put this in my .rvmrc:

    # Clear RVM's env overrides to make room for ours
    rvm reset
    # Get us into the package env
    export env_root=/home/<username>/arachni-build-dir/arachni/system
    source $env_root/environment
    export PATH=$env_root/gems/bin:$PATH

    If the above doesn't fit your purposes the currently preferred lib versions (which are pretty much guaranteed to work) can be seen at: https://github.com/Arachni/build-scripts/blob/master/build.sh#L91

  23. 21 Posted by Patrick on 22 Feb, 2013 01:16 PM

    Having done more research on this, I think the problem I was experiencing is likely to be due to this bug in nokogiri:

    The problem was fixed in commit https://github.com/sparklemotion/nokogiri/commit/287a6ca4cc3403aad8...

    Unfortunately Team nokogiri have not yet created a new release with this bugfix and compiling nokogiri from source is not straightforward (https://github.com/sparklemotion/nokogiri/blob/master/Y_U_NO_GEMSPE...)

    Any ideas how to fix this in the short term?


  24. Support Staff 22 Posted by Tasos Laskos on 22 Feb, 2013 02:56 PM

    No problem, I'll tell the Gemfile to grab Nokogiri from the repo, instead of the latest gem, until they release it.
    I just hope that there aren't any incompatibilities but I'll run the tests and let you know how it goes.

  25. Support Staff 23 Posted by Tasos Laskos on 22 Feb, 2013 03:06 PM

    Sorry, just saw the link. We can maybe ask them to build an RC since this is a serious blocker.

  26. 24 Posted by Patrick on 22 Feb, 2013 03:10 PM

    Yeah I posted on nokogiri-talk to ask for an RC: https://groups.google.com/forum/?fromgroups=#!forum/nokogiri-talk

    But my post hasn't been approved yet. Will let you know once they respond.

    Seems like there are more people with the same problem: https://groups.google.com/forum/?fromgroups=#!topic/nokogiri-talk/z...

  27. 25 Posted by Patrick on 22 Feb, 2013 03:40 PM

    Looks like there will be a nokogiri RC very soon:


  28. Support Staff 26 Posted by Tasos Laskos on 23 Feb, 2013 03:54 AM

    1.5.7.rc1 is out: https://rubygems.org/gems/nokogiri/versions/1.5.7.rc1

    Will give it a try tomorrow.

  29. Support Staff 27 Posted by Tasos Laskos on 23 Feb, 2013 05:40 PM

    Tests are green, pushed to experimental.
    Re-open if there are any problems.
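
    For readers who manage their own Bundler setup, pinning the release candidate mentioned above might look like the fragment below. This is an assumption for illustration; the thread does not show the exact change that was pushed to the experimental branch.

    ```ruby
    # Gemfile fragment (hypothetical; not taken from the Arachni repository).
    # Naming the pre-release version explicitly lets Bundler resolve it.
    gem 'nokogiri', '1.5.7.rc1'
    ```

    Once a final 1.5.7 gem is published, the pin can be relaxed to an ordinary version constraint.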

  30. Tasos Laskos closed this discussion on 23 Feb, 2013 05:40 PM.

  31. Patrick re-opened this discussion on 04 Mar, 2013 10:46 AM

  32. 28 Posted by Patrick on 04 Mar, 2013 10:46 AM

    Hi Tasos

    Just to let you know that the bugs I reported seem to be fixed now. I ran a few relatively large scans at the weekend and all of them completed without any problems. Looks like the experimental branch might be ready for a new release soon? :-)

  33. Support Staff 29 Posted by Tasos Laskos on 04 Mar, 2013 03:29 PM

    I was wondering how that was going, thanks for letting me know.

    Yep, the Framework experimental branch is pretty much stable but here's what needs to be done before releasing:

    • Implement some WebUI features I received via user feedback.
    • Rails 4 needs to be released (for the new WebUI), and some other gems need to release Rails 4 compatible versions.
    • Tests need to be written for the WebUI.
    • Port arachni_rpcd to use the cleaned up RPC API and support the new distributed features.

    So, it's not going to be that soon...let's say soon-ish.

  34. Tasos Laskos closed this discussion on 04 Mar, 2013 03:29 PM.
