Sitemap does not include all of the pages

Frank

07 Apr, 2016 10:45 PM

I'm reviewing the results of my scan, and I see the Sitemap is not complete. It is missing pages which I feel should have been discovered.

My scan:
~/Downloads/arachni-2.0dev-1.0dev/bin/arachni --report-save-path=report.afr --scope-exclude-pattern=signout --http-cookie-jar cookies.txt [redacted URL] --checks -

In one of the pages in the resulting site map, I see several links that were not discovered:

Menu Item Links
<a href="/player/my_classes" class="ng-binding">
      My Classes
      <span ng-show="isSelected('my_classes')" class="hidden-text ng-binding ng-hide" aria-hidden="true">- Selected</span>
    </a>

<a href="/player/grader_assessments" class="ng-binding">
      Grading
      <span class="js-count">
        <span class="hidden-text ng-binding">with</span>
        <span data-url="/player/grader_assessments/queue_count" style="display:none;" class="countflag js-countflag"></span>
        <span data-i18n="menu.count_desc.grading" class="hidden-text js-desc"></span>
      </span>
    </a>

<a href="/player/kb_elements?reset_context=true" class="ng-binding">
      Answer Forum
      <span he-can-access="PERMISSIONS.ANSWER_FORUM.POST_AS_INSTRUCTOR" class="js-count ng-hide">
        <span class="hidden-text ng-binding">with</span>
        <span data-url="/player/kb_elements/unanswered_or_flagged_count" style="" class="countflag js-countflag"></span>
        <span data-i18n="menu.count_desc.answer_forum" class="hidden-text js-desc"></span>
        <span ng-show="isSelected('answer_forum')" class="hidden-text ng-binding ng-hide" aria-hidden="true">- Selected</span>
      </span>
    </a>

Other Links
<a ui-sref="course_detail({courseId: course.id})" href="/player/ng/courses/202">

How does Arachni discover new pages?

  1. Support Staff 1 Posted by Tasos Laskos on 08 Apr, 2016 08:20 AM

    I'm guessing the data-url ones are missing right?

  2. 2 Posted by Frank on 08 Apr, 2016 02:31 PM

    They are all missing:

    /player/my_classes
    /player/grader_assessments
    /player/grader_assessments/queue_count
    /player/kb_elements?reset_context=true
    /player/kb_elements/unanswered_or_flagged_count
    /player/ng/courses/202

  3. Support Staff 3 Posted by Tasos Laskos on 08 Apr, 2016 03:47 PM

    This should be fixed in the nightlies, give them a try and let me know how they do.

    Cheers

  4. 4 Posted by Frank on 08 Apr, 2016 04:24 PM

    About to download and test...

  5. 5 Posted by Frank on 08 Apr, 2016 06:16 PM

    I tried the nightly, and none of those links appeared in my sitemap.

  6. Support Staff 6 Posted by Tasos Laskos on 08 Apr, 2016 06:17 PM

    Still less than you were expecting though?
    I mean, was the issue resolved?

  7. 7 Posted by Frank on 08 Apr, 2016 09:29 PM

    The issue was not resolved. What I sent was only a sample of the links I expected to see in the sitemap. None of those sample links showed up, and neither did the rest.

  8. Support Staff 8 Posted by Tasos Laskos on 08 Apr, 2016 09:33 PM

    My bad, no idea why I read the exact opposite of what you wrote.

    Unfortunately, I can't do much without access to the webapp, in order to reproduce the issue and identify the problem.
    Would that be possible?

  9. 9 Posted by Frank on 08 Apr, 2016 10:14 PM

    I'm not able to provide you access.

    What I did notice is that all of those links are within this div:
    <div id="main-content" ui-view></div>

    When viewing the source, the div is empty, but within a browser, it is populated with the content.

    Could this be an issue with PhantomJS and AngularJS?
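
The difference between the page source and the rendered DOM described above can be reproduced offline. The sketch below is only an illustration, not Arachni's crawler: it uses Python's standard `html.parser`, and the `rendered_html` string is a hand-trimmed assumption built from two of the links quoted in the first post. An extractor that reads only the raw response finds nothing, while the same extractor run on the rendered markup finds the menu links.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags in an HTML string."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return every anchor href found in the given markup, in order."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# What "view source" shows: the ui-view container is empty.
static_html = '<div id="main-content" ui-view></div>'

# Assumed shape of the DOM after AngularJS has rendered the view
# (trimmed to two of the links quoted in the first post).
rendered_html = (
    '<div id="main-content" ui-view>'
    '<a href="/player/my_classes" class="ng-binding">My Classes</a>'
    '<a href="/player/grader_assessments" class="ng-binding">Grading</a>'
    '</div>'
)

static_links = extract_links(static_html)
rendered_links = extract_links(rendered_html)
print(static_links)
print(rendered_links)
```

If a scanner only ever sees the first form, for example because the page was captured before the view was filled in, the sitemap would be missing exactly these links.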

  10. Support Staff 10 Posted by Tasos Laskos on 09 Apr, 2016 08:14 AM

    I really can't say, although I'm not aware of any issues with AngularJS.
    Any chance you can provide me with a demo webapp that behaves similarly?

  11. 11 Posted by Frank on 09 Apr, 2016 04:23 PM

    I am not the developer of the app, nor do I have access to the code to reproduce something similar.

    Now that I have identified that all of the content, including links, is within this div:
    <div id="main-content" ui-view></div>
    I recall seeing a debug message stating that there was an issue parsing #main-content. If that is the case, then it makes sense that the sitemap is incomplete.

    Unfortunately, the site is down and will not be back up until Monday. At that time I will try to reproduce the error.

  12. Support Staff 12 Posted by Tasos Laskos on 09 Apr, 2016 04:26 PM

    If it mentioned parsing, then that must have come from the URL parser; it won't be related.
    I don't think I'll be able to help without access to the webapp.

  13. 13 Posted by Frank on 11 Apr, 2016 03:14 PM

    As mentioned, all of the missing links are within the div with id="main-content".

    Here is the error message:

    PhantomJS is launching GhostDriver...
    895: Working
    [INFO  - 2016-04-11T14:42:54.475Z] GhostDriver - Main - running on port 20697
    
     [!] [browser#spawn_phantomjs:1368] BrowserCluster Worker#70104961570780: PhantomJS is ready.
     [*] BrowserCluster: Spawned #6 with PID 896 [lifeline at PID 895].
     [*] BrowserCluster: Initialization completed with 6 browsers in the pool.
    
     [*] [HTTP: 200] [redacted URL]
     [~] Identified as: linux, nginx
     [!] [uri#parse:113] Failed to parse '#main-content'.
     [!] [uri#parse:114] Error: Failed to parse URL.
     [!] [uri#parse:115] /Users/Downloads/arachni-2.0dev-1.0dev/system/gems/bundler/gems/arachni-6888521525fb/lib/arachni/uri.rb:418:in `initialize'
     [!] [uri#parse:115] /Users/Downloads/arachni-2.0dev-1.0dev/system/gems/bundler/gems/arachni-6888521525fb/lib/arachni/uri.rb:111:in `new'
     [!] [uri#parse:115] /Users/Downloads/arachni-2.0dev-1.0dev/system/gems/bundler/gems/arachni-6888521525fb/lib/arachni/uri.rb:111:in `block in parse'
     [!] [uri#parse:115] /Users/Downloads/arachni-2.0dev-1.0dev/system/gems/bundler/gems/arachni-6888521525fb/lib/arachni/support/cache/base.rb:108:in `call'
     [!] [uri#parse:115] /Users/Downloads/arachni-2.0dev-1.0dev/system/gems/bundler/gems/arachni-6888521525fb/lib/arachni/support/cache/base.rb:108:in `fetch'
     [!] [uri#parse:115] /Users/Downloads/arachni-2.0dev-1.0dev/system/gems/bundler/gems/arachni-6888521525fb/lib/arachni/uri.rb:109:in `parse'
     [!] [uri#parse:115] /Users/Downloads/arachni-2.0dev-1.0dev/system/gems/bundler/gems/arachni-6888521525fb/lib/arachni/uri.rb:321:in `to_absolute'
     [!] [uri#parse:115] /Users/Downloads/arachni-2.0dev-1.0dev/system/gems/bundler/gems/arachni-6888521525fb/lib/arachni/utilities.rb:153:in `to_absolute'
     [!] [uri#parse:115] /Users/Downloads/arachni-2.0dev-1.0dev/system/gems/bundler/gems/arachni-6888521525fb/lib/arachni/parser.rb:139:in `to_absolute'
     [!] [uri#parse:115] /Users/Downloads/arachni-2.0dev-1.0dev/system/gems/bundler/gems/arachni-6888521525fb/lib/arachni/parser.rb:382:in `block in run_extractors'
     [!] [uri#parse:115] /Users/Downloads/arachni-2.0dev-1.0dev/system/usr/lib/ruby/2.2.0/set.rb:283:in `each_key'
     [!] [uri#parse:115] /Users/Downloads/arachni-2.0dev-1.0dev/system/usr/lib/ruby/2.2.0/set.rb:283:in `each'
     [!] [uri#parse:115] /Users/Downloads/arachni-2.0dev-1.0dev/system/gems/bundler/gems/arachni-6888521525fb/lib/arachni/parser.rb:381:in `map'
     [!] [uri#parse:115] /Users/Downloads/arachni-2.0dev-1.0dev/system/gems/bundler/gems/arachni-6888521525fb/lib/arachni/parser.rb:381:in `run_extractors'
     [!] [uri#parse:115] /Users/Downloads/arachni-2.0dev-1.0dev/system/gems/bundler/gems/arachni-6888521525fb/lib/arachni/parser.rb:353:in `paths'
     [!] [uri#parse:115] /Users/Downloads/arachni-2.0dev-1.0dev/system/gems/bundler/gems/arachni-6888521525fb/lib/arachni/page.rb:307:in `paths'
     [!] [uri#parse:115] /Users/Downloads/arachni-2.0dev-1.0dev/system/gems/bundler/gems/arachni-6888521525fb/lib/arachni/framework/parts/data.rb:207:in `push_paths_from_page'
     [!] [uri#parse:115] /Users/Downloads/arachni-2.0dev-1.0dev/system/gems/bundler/gems/arachni-6888521525fb/lib/arachni/framework/parts/audit.rb:98:in `audit_page'
     [!] [uri#parse:115] /Users/Downloads/arachni-2.0dev-1.0dev/system/gems/bundler/gems/arachni-6888521525fb/lib/arachni/framework/parts/audit.rb:223:in `audit_queues'
     [!] [uri#parse:115] /Users/Downloads/arachni-2.0dev-1.0dev/system/gems/bundler/gems/arachni-6888521525fb/lib/arachni/framework/parts/audit.rb:197:in `block in audit'
     [!] [uri#parse:115] /Users/Downloads/arachni-2.0dev-1.0dev/system/gems/bundler/gems/arachni-6888521525fb/lib/arachni/framework/parts/audit.rb:177:in `loop'
     [!] [uri#parse:115] /Users/Downloads/arachni-2.0dev-1.0dev/system/gems/bundler/gems/arachni-6888521525fb/lib/arachni/framework/parts/audit.rb:177:in `audit'
     [!] [uri#parse:115] /Users/Downloads/arachni-2.0dev-1.0dev/system/gems/bundler/gems/arachni-6888521525fb/lib/arachni/framework.rb:117:in `block in run'
     [!] [uri#parse:115] /Users/Downloads/arachni-2.0dev-1.0dev/system/gems/bundler/gems/arachni-6888521525fb/lib/arachni/utilities.rb:425:in `call'
     [!] [uri#parse:115] /Users/Downloads/arachni-2.0dev-1.0dev/system/gems/bundler/gems/arachni-6888521525fb/lib/arachni/utilities.rb:425:in `exception_jail'
     [!] [uri#parse:115] /Users/Downloads/arachni-2.0dev-1.0dev/system/gems/bundler/gems/arachni-6888521525fb/lib/arachni/framework.rb:117:in `run'
     [!] [uri#parse:115] /Users/Downloads/arachni-2.0dev-1.0dev/system/gems/bundler/gems/arachni-6888521525fb/ui/cli/framework.rb:63:in `block in run'
     [~] Analysis resulted in 6 usable paths.
     [~] DOM depth: 0 (Limit: 5)
    

    As I mentioned, I can't give you access to the site, but could probably provide you with the HTML of the page that is being parsed.

  14. Support Staff 14 Posted by Tasos Laskos on 11 Apr, 2016 03:21 PM

    Yeah, that's just from the URL parser. It doesn't have anything to do with the problem; it's coincidental.
    Unfortunately, there's nothing I can do without access, so I'm closing this discussion.
    If something changes feel free to re-open it.

    Cheers
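
One way to see why a failure on '#main-content' would not cost any pages: a fragment-only reference resolves to the page it appears on, so it never names a new page to crawl. The sketch below shows this with Python's standard `urllib.parse`. It is not Arachni's URL parser, and `https://example.test/player/` is a made-up stand-in for the redacted URL.

```python
from urllib.parse import urldefrag, urljoin

# Hypothetical page URL; the real one is redacted in the thread.
page_url = "https://example.test/player/"

hrefs = [
    "#main-content",
    "/player/my_classes",
    "/player/kb_elements?reset_context=true",
]


def crawlable_targets(base, candidates):
    """Resolve hrefs against base and keep only those naming another page."""
    targets = []
    for href in candidates:
        # Drop the fragment: it addresses a spot inside a page, not a page.
        absolute, _fragment = urldefrag(urljoin(base, href))
        if absolute != base and absolute not in targets:
            targets.append(absolute)
    return targets


fragment_only = urljoin(page_url, "#main-content")
targets = crawlable_targets(page_url, hrefs)
print(fragment_only)
print(targets)
```

The fragment-only reference collapses back onto the current page, while the two path references survive as crawl targets. That is consistent with the log line "Analysis resulted in 6 usable paths" appearing right after the parse error.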

  15. Tasos Laskos closed this discussion on 11 Apr, 2016 03:21 PM.

  16. Tasos Laskos re-opened this discussion on 14 Apr, 2016 10:46 AM

  17. Support Staff 15 Posted by Tasos Laskos on 14 Apr, 2016 10:46 AM

    It just occurred to me, you may need to set this option: https://github.com/Arachni/arachni/wiki/Command-line-user-interface...

  18. Tasos Laskos closed this discussion on 18 Apr, 2016 08:06 PM.

  19. Frank re-opened this discussion on 19 Apr, 2016 10:21 PM

  20. 16 Posted by Frank on 19 Apr, 2016 10:21 PM

    I tried the option two ways:

    --browser-cluster-wait-for-element='URL:#main-content'

    and

    --browser-cluster-wait-for-element='^((?!#).)*$:#main-content'

    I still received an incomplete sitemap and the log shows:

    [!] [uri#parse:113] Failed to parse '#main-content'.
    [!] [uri#parse:114] Error: Failed to parse URL.
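
For reference, the pattern half of the second attempt, `^((?!#).)*$`, matches any URL that contains no `#` character. The quick check below uses Python's `re` module as a stand-in for the scanner's own regex engine, which is an assumption, and the example URLs are invented. It only illustrates which URLs that pattern selects and says nothing about how the option value is split or applied.

```python
import re

# Pattern half of: --browser-cluster-wait-for-element='^((?!#).)*$:#main-content'
pattern = re.compile(r"^((?!#).)*$")

# Made-up URLs standing in for the redacted site.
urls = [
    "https://example.test/player/my_classes",
    "https://example.test/player/#main-content",
]

# True where the pattern matches, i.e. where the URL has no '#' in it.
matches = {url: bool(pattern.search(url)) for url in urls}
for url, matched in matches.items():
    print(url, matched)
```

So the pattern itself should select ordinary page URLs. What remains open is whether `#main-content` is a useful element to wait for.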
    
  21. Support Staff 17 Posted by Tasos Laskos on 20 Apr, 2016 09:29 AM

    Well, it was worth a try.

  22. Tasos Laskos closed this discussion on 20 Apr, 2016 09:29 AM.

  23. Tasos Laskos re-opened this discussion on 20 Apr, 2016 09:32 AM

  24. Support Staff 18 Posted by Tasos Laskos on 20 Apr, 2016 09:32 AM

    No, wait, main-content is always there; it's just empty by default. Set the option to wait for something inside the div, preferably something that appears last, so that the entire page will have loaded.
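
A possible way to follow this advice is to point the CSS half of the option at one of the links that only exists after the view has rendered. The helper below is a hypothetical convenience, not part of Arachni. It just formats a value in the same `PATTERN:CSS` shape used earlier in the thread. The `player` pattern and the choice of the "My Classes" link as the element that appears last are both assumptions, since the real URL and load order are not known.

```python
import shlex


def wait_for_element_option(url_pattern, css_selector):
    """Format a PATTERN:CSS value for --browser-cluster-wait-for-element."""
    return f"--browser-cluster-wait-for-element={url_pattern}:{css_selector}"


# Assumption: page URLs contain "player", and this link is rendered inside
# the ui-view container late enough to signal that the view has loaded.
option = wait_for_element_option(
    "player",
    '#main-content a[href="/player/my_classes"]',
)

# Quote the argument so it can be pasted into a shell command line.
quoted = shlex.quote(option)
print(quoted)
```

Unlike `#main-content` on its own, this selector cannot match while the container is still empty.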

  25. Tasos Laskos closed this discussion on 04 May, 2016 05:42 PM.
