--include='='

user021

31 Jul, 2013 02:26 PM

Hello, I am trying to make Arachni crawl only URLs containing "=", but the scan ends right away as soon as I start it:

./arachni --audit-link --modules=sqli --include='=' -v --user-agent=M --link-count=2111 --http-req-limit=25 --auto-redundant=1 --fuzz-methods --follow-subdomains 'http://www.juicycouture.com/'

  1. Posted by Tasos Laskos (Support Staff) on 31 Jul, 2013 02:35 PM

    http://www.juicycouture.com/ doesn't contain a '='. ;)

  2. Tasos Laskos closed this discussion on 31 Jul, 2013 02:35 PM.

  3. Tasos Laskos re-opened this discussion on 31 Jul, 2013 02:38 PM.

  4. Posted by Tasos Laskos (Support Staff) on 31 Jul, 2013 02:38 PM

    That's kind of dumb, though. I'll check to see whether there's a reason not to exclude the seed URL from the filters.

  5. Posted by user021 on 31 Jul, 2013 02:44 PM

    Ok

  6. Posted by Tasos Laskos (Support Staff) on 31 Jul, 2013 04:39 PM

    Yeah, it turns out there's a reason, and even though it may seem weird, it makes sense semantically, doesn't it?
    Basically, you're forcing it to follow something you want excluded, so all in all I think it's OK to leave the behavior as is.

    You can provide it with a seed URL that contains '=' and be done with it.
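
    For instance, something along these lines (a sketch reusing your own flags; the search path and query are hypothetical, just to illustrate a seed URL that already matches the --include filter):

    ./arachni --audit-link --modules=sqli --include='=' 'http://www.juicycouture.com/search?q=shoes'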

  7. Tasos Laskos closed this discussion on 31 Jul, 2013 04:39 PM.

  8. user021 re-opened this discussion on 31 Jul, 2013 05:34 PM.

  9. Posted by user021 on 31 Jul, 2013 05:34 PM

    Providing a URL that contains '=' and that happens to be deeper in the structure works, but since the crawler can't go up, it only crawls a few pages. With the trainer module that number doubles. I was just wondering: I know the trainer has a limit, but if that limit were removed, would it be possible to crawl the whole website from the inside out... like a mole? xD

  10. Posted by Tasos Laskos (Support Staff) on 31 Jul, 2013 06:18 PM

    Well, no not really. The only difference between the Trainer and the Spider, when it comes to discovering pages, is that the Trainer gets some responses from the audit (which includes submitted forms and discovery module results) while the Spider only performs GET requests. In this case it just so happened that the Trainer somehow got a response for a shallower page.

    What I mean is that it probably won't do what you want it to do even if you remove the limit, not completely.

    What you want to do is not really possible: you want to both include and exclude pages. That would work if the structure were flat, but websites are trees; you have to follow things you don't want in order to find things you do want.

    However, what you can do is perform a crawl (just a crawl) without restrictions and then pass the AFR to the rescan plugin along with the restrictions you want. That would allow the system to get hold of a flat structure and then filter out the paths based on the filters you provide.
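
    Roughly like this (a sketch from memory of the 0.4.x CLI; in particular, treat the module-less first pass and the rescan plugin's option name as assumptions and verify them against ./arachni --help):

    # Phase 1: crawl with no --include/--exclude restrictions and save the AFR.
    # (Assumption: an empty module list results in a crawl-only run.)
    ./arachni --modules=- --repsave=juicy_crawl 'http://www.juicycouture.com/'

    # Phase 2: feed the saved sitemap back via the rescan plugin and apply the
    # '=' filter, now that the full flat list of URLs is known.
    ./arachni --audit-link --modules=sqli --include='=' --plugin='rescan:afr=juicy_crawl.afr' 'http://www.juicycouture.com/'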

    Let me know how it works.

  11. Posted by user021 on 31 Jul, 2013 08:49 PM

    Yeah, it works, and it makes sense why there's no other way. BTW, is there a lot of code I would have to delete in order to remove the trainer limit? Just wondering.

  12. Posted by Tasos Laskos (Support Staff) on 31 Jul, 2013 09:03 PM

    https://github.com/Arachni/arachni/blob/experimental/lib/arachni/tr...

    You can set that constant to something like 9999999.
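
    If you are patching a checkout, that is a one-line edit. Both the file path and the constant name below are guesses for illustration (the link above is truncated), so confirm them in the source first:

    # Hypothetical path and constant name; open the file from the link above and confirm both.
    sed -i 's/MAX_TRAININGS[[:space:]]*=[[:space:]]*[0-9]*/MAX_TRAININGS = 9999999/' lib/arachni/trainer.rb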

  13. Tasos Laskos closed this discussion on 31 Jul, 2013 09:03 PM.
