I'll give you the quick version of what it does and then I'll break it down for you...
Disallowz is a project I just threw up on Bitbucket that consists of some scripts I wrote to do the following:
The verify script:
Looks up the domain's A record to verify that the domain even exists.
Sends a web request for that domain and analyzes the server response codes.
If redirects are found, it reports the final destination to you. (This is needed for the rob0tz script.)
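The verify steps above can be sketched with the Python standard library. This is just an illustration of the flow described, not the actual script; the function name and output format are my own:

```python
import socket
import urllib.request

def verify(domain):
    """Sketch of the verify flow: A record lookup, then a web
    request that follows redirects to the final destination."""
    # Step 1: resolve the domain's A record to confirm the domain exists.
    try:
        ip = socket.gethostbyname(domain)
    except socket.gaierror:
        return f"{domain}: no A record found"

    # Step 2: send a web request; urllib follows redirects automatically,
    # so geturl() gives the final destination and status the response code.
    response = urllib.request.urlopen(f"http://{domain}/")
    return f"{domain} -> {ip} | final URL: {response.geturl()} ({response.status})"
```

If the lookup fails you know the domain is a dead end before you ever bother requesting robots.txt.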
The rob0tz script:
rob0tz asks the user for the full URL path so it can make the right request. If you do not know the full URL path, the verify script will do that work for you. rob0tz requests the site's robots.txt file and analyzes the 'Disallow:' filters. It then generates a masterlist.txt from what it finds and sends requests for those directories or files, reporting the results back to the user. This is very useful if you want to work through a large robots.txt file, or just get a quick look at which items are good, forbidden, not found, etc.
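The rob0tz flow — grab robots.txt, pull out the 'Disallow:' paths, then probe each one and note the status code — can be sketched like this. Again, a rough illustration of the steps described rather than the real script, and the function names are mine:

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin

def parse_disallows(robots_txt):
    """Pull the paths out of 'Disallow:' lines (the masterlist step)."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.strip()
        if line.lower().startswith("disallow:"):
            path = line.split(":", 1)[1].strip()
            if path:  # a bare 'Disallow:' disallows nothing, so skip it
                paths.append(path)
    return paths

def probe(base_url):
    """Request robots.txt, then probe each disallowed path and
    collect its HTTP status code."""
    robots = urllib.request.urlopen(urljoin(base_url, "/robots.txt")).read().decode()
    results = {}
    for path in parse_disallows(robots):
        try:
            resp = urllib.request.urlopen(urljoin(base_url, path))
            results[path] = resp.status   # e.g. 200 = good
        except urllib.error.HTTPError as err:
            results[path] = err.code      # e.g. 403 forbidden, 404 not found
    return results
```

Keeping the parsing separate from the requests makes it easy to eyeball what the masterlist will contain before you start firing off requests.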
Anyhow, it's not perfect, and since people can put all kinds of crazy stuff into robots.txt, I'm sure it never will be, but it has worked well for me. Please let me know if you have any questions!