good morning dear community
I’m pretty new to programming and i am trying to learn the basics of the PERL.At the moment i digg into the Perl LWP::UserAgent. note; This code below runs and give me back the content of the parsed site: what i want is to
#!/usr/bin/perl
use strict; #
use warnings; #
use diagnostics; #
use LWP::UserAgent;
$ua = LWP::UserAgent->new;
$ua->agent("$0/0.1 " . $ua->agent);
# $ua->agent("Mozilla/8.0") # pretend we are very capable browser
$req = HTTP::Request->new(GET => 'http://dms-schule.bildung.hessen.de/suchen/suche_schul_db.html?show_school=5503');
$req->header('Accept' => 'text/html');
# send request
$res = $ua->request($req);
# check the outcome
if ($res->is_success) {
print $res->content;
} else {
print "Error: " . $res->status_line . "
";
}
as mentioned above: the code runs well and nicely: i want to build in a loop to fetch more pages. Well i want to fetch pages
**from **
Bildungsserver Hessen - Datenbank aller Schulen in Hessen**show_school=**01
**to **
Bildungsserver Hessen - Datenbank aller Schulen in Hessenshow_school=10000
- the one that have no results i want to drop (but that has to be done later with some additional code. For the proof of concept i want to get all the urls - let us say printed out that the LWP::userAgent fetches…
the quesions are:
- how to enter the loop in correct way.
- how to make the prorgamme to print out all the URLs that are fetched.
(later on i want to parse the sites with content) but thats a part that i have do design and code later on.
Here the code that has a build in loop - to make USER-Agent to itterate over a bunch of targets.
# first get a list of all schools
my $ua = LWP::UserAgent->new;
$ua->agent("Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7");
#pretending to be firefox on linux.
for my $i (0..10000) {
my $request = HTTP::Request->new(GET => sprintf("http://dms-schule.bildung.hessen.de/suchen/suche_schul_db.html?show_school=5503,%d", $i));
$request->header('Accept' => 'text/html');
my $response = $ua->request($request);
if ($response->is_success) {
$pagecontent = $response -> content;
}
# now we can do whatever with the $pagecontent
}
my $request = POST $url,
# check the outcome
if ($res->is_success) {
print $res->content; # please print out all the URLS that were fetched! Thx my dear!
} else {
print "Error: " . $res->status_line . "
";
}
do you have any idea how to insert the loop correctly - and how to get the programme to print out all the urls (not the content)!!! Please let me know if i have do be more descriptive!
dilbertone:)