Sitemap does not include all of the pages
I'm reviewing the results of my scan, and I see the Sitemap is not complete. It is missing pages which I feel should have been discovered.
My scan:
~/Downloads/arachni-2.0dev-1.0dev/bin/arachni
--report-save-path=report.afr --scope-exclude-pattern=signout
--http-cookie-jar cookies.txt [redacted URL] --checks -
In one of the pages in the resulting site map, I see several links that were not discovered:
Menu Item Links
<a href="/player/my_classes" class="ng-binding">
My Classes
<span ng-show="isSelected('my_classes')" class="hidden-text ng-binding ng-hide" aria-hidden="true">- Selected</span>
</a>
<a href="/player/grader_assessments" class="ng-binding">
Grading
<span class="js-count">
<span class="hidden-text ng-binding">with</span>
<span data-url="/player/grader_assessments/queue_count" style="display:none;" class="countflag js-countflag"></span>
<span data-i18n="menu.count_desc.grading" class="hidden-text js-desc"></span>
</span>
</a>
<a href="/player/kb_elements?reset_context=true" class="ng-binding">
Answer Forum
<span he-can-access="PERMISSIONS.ANSWER_FORUM.POST_AS_INSTRUCTOR" class="js-count ng-hide">
<span class="hidden-text ng-binding">with</span>
<span data-url="/player/kb_elements/unanswered_or_flagged_count" style="" class="countflag js-countflag"></span>
<span data-i18n="menu.count_desc.answer_forum" class="hidden-text js-desc"></span>
<span ng-show="isSelected('answer_forum')" class="hidden-text ng-binding ng-hide" aria-hidden="true">- Selected</span>
</span>
</a>
Other Links
<a ui-sref="course_detail({courseId: course.id})" href="/player/ng/courses/202">
How does Arachni discover new pages?
Support Staff 1 Posted by Tasos Laskos on 08 Apr, 2016 08:20 AM
I'm guessing the data-url ones are missing, right?
2 Posted by Frank on 08 Apr, 2016 02:31 PM
They are all missing:
/player/my_classes
/player/grader_assessments
/player/grader_assessments/queue_count
/player/kb_elements?reset_context=true
/player/kb_elements/unanswered_or_flagged_count
/player/ng/courses/202
Support Staff 3 Posted by Tasos Laskos on 08 Apr, 2016 03:47 PM
This should be fixed in the nightlies, give them a try and let me know how they do.
Cheers
4 Posted by Frank on 08 Apr, 2016 04:24 PM
About to download and test...
5 Posted by Frank on 08 Apr, 2016 06:16 PM
I tried the nightly, and none of those links appeared in my sitemap.
Support Staff 6 Posted by Tasos Laskos on 08 Apr, 2016 06:17 PM
Still less than you were expecting though?
I mean, was the issue resolved?
7 Posted by Frank on 08 Apr, 2016 09:29 PM
The issue was not resolved. What I sent was an example of SOME of the links I expected to see in the sitemap. None of those example links showed up, and neither did the rest of the missing pages.
Support Staff 8 Posted by Tasos Laskos on 08 Apr, 2016 09:33 PM
My bad, no idea why I read the exact opposite of what you wrote.
Unfortunately, I can't do much without access to the webapp, in order to reproduce the issue and identify the problem.
Would that be possible?
9 Posted by Frank on 08 Apr, 2016 10:14 PM
I'm not able to provide you access.
What I did notice is that all of those links are within this div:
<div id="main-content" ui-view></div>
When viewing the source, the div is empty, but within a browser, it is populated with the content.
Could this be an issue with phantomjs and angularjs?
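To illustrate the observation above, here is a minimal Python sketch of why links that only exist after client-side rendering are invisible to anything that reads the raw source. It is not how Arachni works internally. The `rendered_dom` string is a hypothetical snapshot built from two of the links quoted earlier in this thread, and `LinkCollector` and `extract_links` are illustrative names.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags in an HTML string."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html):
    """Return every <a href> found in the given HTML, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# What "view source" shows: the ui-view container is empty.
static_source = '<div id="main-content" ui-view></div>'

# Hypothetical snapshot of the same div after AngularJS has rendered it.
rendered_dom = """
<div id="main-content" ui-view>
  <a href="/player/my_classes" class="ng-binding">My Classes</a>
  <a href="/player/grader_assessments" class="ng-binding">Grading</a>
</div>
"""

print(extract_links(static_source))  # no links in the raw source
print(extract_links(rendered_dom))   # links appear only after rendering
```

If the page is inspected before the framework has filled the container, the result is the same as the first call: nothing to follow.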
Support Staff 10 Posted by Tasos Laskos on 09 Apr, 2016 08:14 AM
I really can't say, although I'm not aware of any issues with AngularJS.
Any chance you can provide me with a demo webapp that behaves similarly?
11 Posted by Frank on 09 Apr, 2016 04:23 PM
I am not the developer of the app, nor do I have access to the code to reproduce something similar.
Now that I have identified that all of the content, including the links, is within this div:
<div id="main-content" ui-view></div>
I recall seeing a debug message stating that there was an issue parsing #main-content. If that is the case, it would make sense that the sitemap is incomplete.
Unfortunately, the site is down and will not be back up until Monday. At that time I will try to reproduce the error.
Support Staff 12 Posted by Tasos Laskos on 09 Apr, 2016 04:26 PM
If it mentioned parsing then that must have been a URL parsing thing, it won't be related.
I don't think I'll be able to help without access to the webapp.
13 Posted by Frank on 11 Apr, 2016 03:14 PM
As mentioned, all of the missing links are within the div with id="main-content".
Here is the error message:
As I mentioned, I can't give you access to the site, but could probably provide you with the HTML of the page that is being parsed.
Support Staff 14 Posted by Tasos Laskos on 11 Apr, 2016 03:21 PM
Yeah that's just from the URL parser, it doesn't have anything to do with the problem, it's coincidental.
Unfortunately, there's nothing I can do without access so I'm closing this discussion.
If something changes feel free to re-open it.
Cheers
Tasos Laskos closed this discussion on 11 Apr, 2016 03:21 PM.
Tasos Laskos re-opened this discussion on 14 Apr, 2016 10:46 AM
Support Staff 15 Posted by Tasos Laskos on 14 Apr, 2016 10:46 AM
It just occurred to me, you may need to set this option: https://github.com/Arachni/arachni/wiki/Command-line-user-interface...
Tasos Laskos closed this discussion on 18 Apr, 2016 08:06 PM.
Frank re-opened this discussion on 19 Apr, 2016 10:21 PM
16 Posted by Frank on 19 Apr, 2016 10:21 PM
I tried the option two ways:
--browser-cluster-wait-for-element='URL:#main-content'
and
--browser-cluster-wait-for-element='^((?!#).)*$:#main-content'
I still received an incomplete sitemap and the log shows:
Support Staff 17 Posted by Tasos Laskos on 20 Apr, 2016 09:29 AM
Well, it was worth a try.
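For reference, the second pattern tried above, `^((?!#).)*$`, is a negative-lookahead regex that matches only strings containing no `#`. The Python sketch below checks that behaviour. It assumes the part before `:#main-content` is treated as a regular expression matched against the page URL, and the `example.test` URLs are hypothetical stand-ins for the redacted host.

```python
import re

# The PATTERN half of the second attempt (the part before ':#main-content').
no_hash = re.compile(r"^((?!#).)*$")

# Hypothetical URLs; example.test stands in for the redacted host.
urls = [
    "http://example.test/player/my_classes",
    "http://example.test/player/ng/courses/202",
    "http://example.test/player#/courses/202",
]

# Map each URL to whether the pattern matches it.
results = {url: bool(no_hash.match(url)) for url in urls}

for url, matched in results.items():
    print(url, matched)
```

Only the URL containing a `#` fragment fails to match, so under that assumption the pattern would apply the wait to every fragment-free page.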
Tasos Laskos closed this discussion on 20 Apr, 2016 09:29 AM.
Tasos Laskos re-opened this discussion on 20 Apr, 2016 09:32 AM
Support Staff 18 Posted by Tasos Laskos on 20 Apr, 2016 09:32 AM
No, wait, main-content is always there, it's just empty by default. Set the option to wait for something inside the div, something that appears last, so that the entire page will have loaded.
Tasos Laskos closed this discussion on 04 May, 2016 05:42 PM.