Good evening, dear friends, and hello to all Linux fans!
Please review my DOM element parser: my first scraper [with online demo], see below!
Note: I'm trying to learn the basics of PHP and Perl and related things. And yes, I'm still very new to programming. I'm working on arrays and on storing the data in a database!
I'm interested in good sites that teach the very basics, ideally with some examples as well. And yes, this site is a great place! I have seen great examples of code and very interesting questions here.
At the moment I'm working on a little script with a DOM element parser, and I have a question: how can the parser be [re]designed and extended so that it produces a list of all modules and their creators?
In other words, the parser should start from this page
The CPAN Search Site - search.cpan.org
and follow the sub-pages to collect the information on each Perl module and its creator; see the following:
David Warring - search.cpan.org
Chad Wagner - search.cpan.org
Walter Higgins - search.cpan.org
See the script in an online demo here:
<?php
######################################
# Basic PHP scraper
######################################
require 'scraperwiki/simple_html_dom.php';

# Fetch the author index page and dump the raw HTML
$html = scraperwiki::scrape("http://search.cpan.org/author/?W");
print $html;

# Use the PHP Simple HTML DOM Parser to extract <td> tags
$dom = new simple_html_dom();
$dom->load($html);
foreach ($dom->find('td') as $data) {
    print $data->plaintext . "\n";
    # Store each cell's text in the ScraperWiki datastore
    scraperwiki::save(array('data'), array('data' => $data->plaintext));
}
?>
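To get the module/creator list described above, the scraper needs two levels: first collect the link to each author's sub-page from the index, then visit each sub-page for the module names. Here is a minimal sketch of the first step using PHP's built-in DOM extension (so it runs without the simple_html_dom library). The HTML snippet is a made-up stand-in for the author index; the real page layout on search.cpan.org may differ, so the selectors would need adjusting there.

```php
<?php
# Sketch: extract author names and sub-page links from an index page.
# The snippet below only imitates a page like search.cpan.org/author/?W.
$html = '<table><tr>
  <td><a href="/~dwarring/">David Warring</a></td>
  <td><a href="/~chadwagner/">Chad Wagner</a></td>
</tr></table>';

$dom = new DOMDocument();
$dom->loadHTML($html);

$authors = array();
foreach ($dom->getElementsByTagName('a') as $link) {
    # Map each author name to the sub-page URL we would visit next
    $authors[$link->textContent] = $link->getAttribute('href');
}
print_r($authors);
```

In the real scraper you would then fetch each collected href (e.g. with scraperwiki::scrape) and run a second pass over that page to pull out the module names, saving one row per (module, creator) pair.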
Well, that looks nice, but I want the behaviour described above.
At the moment I want to learn more about handling the data. Printing the data to the screen is nice, but it's only a start.
If I need to extend the explanation, just let me know!
Love to hear from you,
dilbert_one :)
**btw:** Besides this, it would be great to collect the data in arrays and then store it in a database!
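For the array-plus-database part, here is a minimal sketch using PHP's built-in SQLite3 class, independent of ScraperWiki. The module names and the table/column names are made up for illustration; only the creator names come from the post above.

```php
<?php
# Sketch: collect scraped rows in an array, then store them in SQLite.
# Module names and the table schema are illustrative placeholders.
$rows = array(
    array('module' => 'Example::ModuleA', 'creator' => 'David Warring'),
    array('module' => 'Example::ModuleB', 'creator' => 'Chad Wagner'),
);

$db = new SQLite3(':memory:');   # pass a file path instead to persist the data
$db->exec('CREATE TABLE modules (module TEXT, creator TEXT)');

# Insert every collected row via a prepared statement
$stmt = $db->prepare('INSERT INTO modules (module, creator) VALUES (:m, :c)');
foreach ($rows as $row) {
    $stmt->bindValue(':m', $row['module'], SQLITE3_TEXT);
    $stmt->bindValue(':c', $row['creator'], SQLITE3_TEXT);
    $stmt->execute();
    $stmt->reset();
}

$count = $db->querySingle('SELECT COUNT(*) FROM modules');
print "stored $count rows\n";
```

Inside the scraping loop you would append each (module, creator) pair to `$rows` instead of hard-coding it, and run the inserts once at the end.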