Python script not working properly

Hello!
I’m trying to run a tool called subdomainer on openSUSE 11.2. The script is not working properly. I use the following command:
python subdomainer.py -d microsoft.com -l 50 -m google

The script executes and then displays the following


*Subdomainer Ver. 1.3b *
*Coded by Christian Martorella *
*Edge-Security Research *
*laramies2k@yahoo.com.ar *


Searching for microsoft.com in google

No results were found.

It displays the “No results were found” message for every domain I try. Is there some missing library that the script requires? I know my syntax is correct, and I don’t know what else the problem could be.

Guys, I don’t think it’s a tough question. Just try it once and you’ll see what I mean. I couldn’t figure it out because I don’t have a programming background.

Hi
Two-year-old code; I’m guessing the websites have changed… ask the
developer to update?


Cheers Malcolm °¿° (Linux Counter #276890)
SUSE Linux Enterprise Desktop 11 (x86_64) Kernel 2.6.32.12-0.7-default
up 7 days 14:28, 3 users, load average: 0.00, 0.03, 0.00
GPU GeForce 8600 GTS Silent - Driver Version: 256.35

No, the code hasn’t changed. Here’s the source code:
http://www.edge-security.com/soft/subdomainer.py

Hi
That’s what I meant, with the code, it possibly needs updating as
things may have changed on the google, yahoo etc sites so the code
doesn’t work as expected.



On 06/24/10 21:59, malcolmlewis wrote:
>

> Hi
> That’s what I meant, with the code, it possibly needs updating as
> things may have changed on the google, yahoo etc sites so the code
> doesn’t work as expected.

E.g. with the code:

def howmanygoo(w):
    h = httplib.HTTP('www.google.com')
    h.putrequest('GET', '/search?num=10&hl=en&btnG=B%C3%BAsqueda+en+Google&meta=&q="' + w + '"')
    h.putheader('Host', 'www.google.com')
    h.putheader('User-agent', 'Internet Explorer 6.0 ')
    h.endheaders()
    returncode, returnmsg, headers = h.getreply()
    data = h.getfile().read()
    r1 = re.compile('about <b>[0123456789,]*</b> for')
                                                ^^^^

Whereas Google returns (for ‘microsoft.com’):
“About 29,900,000 results (0.29 seconds)”
So the regex fails (no ’ for’) and thus returns zero…

OP needs to look at what the search pages return, and update the re.compile lines accordingly.
Luckily Python is a very easy language to get to know, so editing this simple program shouldn’t be hard.
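As a sketch of what that update might look like: the original pattern no longer matches the results line Google now returns, so a pattern based on the current wording (an assumption, drawn only from the sample quoted above) could replace it:

```python
import re

# The results line Google currently returns (quoted above)
sample = "About 29,900,000 results (0.29 seconds)"

# The script's original pattern - no longer matches
old = re.compile('about <b>[0123456789,]*</b> for')
# A replacement guessed from the current wording (an assumption)
new = re.compile(r'About ([0-9,]+) results')

print(old.search(sample))             # None - the old pattern fails
print(new.search(sample).group(1))    # 29,900,000
```

The same kind of change would be needed in the Yahoo and MSN functions, each against whatever their results pages now return.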

Theo

You can also query Google (and probably other search engines as well) using SOAP requests. You’ll have cleaner code and I guess it is more likely to continue working properly over a period of time.

I see a couple of problems with the script (I just tried it with Google). First, I got nothing back from:

data=h.getfile().read()

around line 114 in def howmanygoo(w)
so I changed it to

data=h.getfile().readlines()[0]

Now, this is not ideal (it will throw an IndexError if the list is empty, or an AttributeError if getfile() returns None), but it serves to illustrate that getfile().read() was either not being used properly or was simply not working. Either way, with readlines() I got a list with one entry: a very long string containing the HTML of the Google search page, which is (I assume) what the author was trying to get in the first place.
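A slightly safer variant of that one-line fix (a sketch, with a hypothetical helper name) guards against the empty-response case instead of raising IndexError:

```python
import io

def first_line_or_empty(fileobj):
    # Hypothetical helper: behaves like h.getfile().readlines()[0], but
    # returns an empty byte string instead of raising IndexError when the
    # response body is empty.
    lines = fileobj.readlines()
    return lines[0] if lines else b''

# Quick check with an in-memory stand-in for the HTTP response body:
print(first_line_or_empty(io.BytesIO(b'<html>...</html>\n')))  # b'<html>...</html>\n'
print(first_line_or_empty(io.BytesIO(b'')))                    # b''
```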

The next problem is the hardcoded regular expression that looks for

re.compile('about <b>[0123456789,]*</b> for')

Google have evidently changed their formatting since the script was written. I imagine the author was looking for:

<div id=resultStats>About 306,000,000 results<nobr> (0.07 seconds) </nobr></div>

which is in the html returned from the Google search.

So I imagine changing the regular expression would sort it out - at least until Google changes the formatting again.
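For instance, a replacement pattern keyed on that div (the markup is taken from the sample above, so treat the exact attribute spelling as an assumption that may change again at any time) could also pull the count out as a number:

```python
import re

# The resultStats div as returned in the Google search HTML (quoted above)
html = ('<div id=resultStats>About 306,000,000 results'
        '<nobr> (0.07 seconds) </nobr></div>')

# Pattern anchored on the div id - an assumption about Google's current markup
m = re.search(r'id=resultStats>About ([0-9,]+) results', html)
count = int(m.group(1).replace(',', ''))  # strip commas, convert to int
print(count)  # 306000000
```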

The best solution would probably be to rewrite the script using the available APIs from Yahoo, Google, etc. That way it wouldn’t break every time a results page was reformatted; it would only need updating if the vendor changed the API itself.
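As a sketch of that API route, using Google’s Custom Search JSON API as one example (the key and engine id below are placeholders you would have to obtain from Google; only the URL construction runs here):

```python
import urllib.parse

def build_search_url(query, key='YOUR_API_KEY', cx='YOUR_ENGINE_ID'):
    # Builds a request URL for Google's Custom Search JSON API.
    # key/cx are placeholder credentials, not real values.
    params = urllib.parse.urlencode({'key': key, 'cx': cx, 'q': query})
    return 'https://www.googleapis.com/customsearch/v1?' + params

url = build_search_url('site:microsoft.com')
print(url)
# A real request would then fetch and decode the JSON, e.g.:
#   import json, urllib.request
#   data = json.load(urllib.request.urlopen(url))
#   print(data['searchInformation']['totalResults'])
```

Because the response is structured JSON rather than scraped HTML, the count comes back as a named field instead of something a regex has to fish out of the page.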