HTML::TableExtract - please review my code

Good day, dear community,

Finally back again!

I want to parse a site, and I want to learn something in the process. Please give me a **helping hand and review my code!**

Weitere Schulinformationen (further school information)

A very simple site with only one table!

I decided to do this with the module HTML::TableExtract (use HTML::TableExtract;) and hope it does the trick and parses the page well. See the module's page on CPAN: HTML::TableExtract - search.cpan.org

TableExtract is a great tool, and I am pretty sure that it does a great job!

We need to provide something that uniquely identifies the table in question.
This can be the content of its headers or its HTML attributes. In this case there is only one table in the document (see the link above), so we don't even need to do that. Still, we should pass something to the constructor, so why not provide the class of the table?

Also, we should not process the table by columns. Have a look: the first column of this table consists of labels and the second column consists of values. To get the labels and values at the same time, we should process the table row by row.


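#!/usr/bin/perl

use strict;
use warnings;
use HTML::TableExtract;
use YAML;

# Match the table by its class attribute.
my $te = HTML::TableExtract->new(
    attribs => { class => 'bp_ergebnis_tab_info' },
);

# Parse a local copy of the page.
$te->parse_file('t.html');

# Dump every matched table, column by column, as YAML.
for my $table ( $te->tables ) {
    print Dump $table->columns;
}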
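
The text above argues for row-by-row processing, while the posted loop dumps columns. A minimal sketch of the row-wise variant could look like the following. It assumes every row has exactly two cells (label, value); the inline sample markup and its contents are invented for illustration and are not taken from the real page.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TableExtract;

# Invented sample markup (assumption); the real page will differ.
my $html = <<'HTML';
<table class="bp_ergebnis_tab_info">
  <tr><td>Schulname:</td><td>Beispielschule</td></tr>
  <tr><td>Ort:</td><td>Musterstadt</td></tr>
</table>
HTML

my $te = HTML::TableExtract->new(
    attribs => { class => 'bp_ergebnis_tab_info' },
);
$te->parse($html);

# Walk each matched table row by row and pair label with value.
my %info;
for my $table ( $te->tables ) {
    for my $row ( $table->rows ) {
        my ( $label, $value ) = @$row;
        next unless defined $label && defined $value;   # skip empty cells
        $info{$label} = $value;
    }
}

# Print the pairs in a stable order.
printf "%s => %s\n", $_, $info{$_} for sort keys %info;
```

For a local file, parse_file('t.html') from the original script can be used in place of parse($html).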



What do you think?

a. Is the code appropriate?
b. Am I thinking along the right lines, and can I make the results a bit cleaner?

I would love to hear from you!

HTML is flat text with structure. From your post I'm not sure what your intent really is:

  1. Is the end goal to
    a) just print the results of one HTML page?
    b) print the results of several similar HTML pages, or of dissimilar ones?
    c) place the results in file(s)?
    d) place the results into some special database?
  2. Result formatting
    a) Keep the HTML in the results
    b) Strip off the HTML and produce clean data
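
For option 2b, HTML::TableExtract already returns the visible text of each cell unless its keep_html option is set, so what usually remains is whitespace cleanup. A small sketch of such a cleanup step in plain Perl follows; the helper name clean_cell is hypothetical and not part of any module.

```perl
use strict;
use warnings;

# Hypothetical helper: normalise the text of one table cell.
sub clean_cell {
    my ($text) = @_;
    return '' unless defined $text;   # empty cells may come back as undef
    $text =~ s/\x{A0}/ /g;            # turn non-breaking spaces into plain spaces
    $text =~ s/\s+/ /g;               # collapse runs of whitespace and newlines
    $text =~ s/^ | $//g;              # trim what is left at either end
    return $text;
}

print clean_cell("  Muster \n stadt \t"), "\n";
```

Applying this to both label and value before storing them keeps later comparisons and database lookups predictable.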

There are many ways to do things; you just need to keep the end result in mind. In the course of creating an application, for example, my code reads HTML pages from the web and compares them to local copies and to a database. Then, based upon the results, it either adds to the pages and reposts them or creates new pages and links them into the existing ones.

Hello Techwiz,

Many thanks, that is a great answer. I want to store the results in a MySQL database, so I need to use Perl DBI. I hope that this is not too difficult. In this example I need to store nine or ten lines of text. Note that I have about 1000 files with that kind of result, so it is necessary (and worthwhile) to store the results in a database. I will try to figure it out later today, then come back and report all my findings. Regards, dilbertone
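
A rough sketch of the storage step with DBI might look like this. Everything specific here is an assumption made for illustration: the table school_info with columns source_file, label and content must already exist, the environment variable names are invented, and the hash stands in for whatever the parser produces for one file. With about 1000 files, the same prepared insert would simply be called once per file inside a loop.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# Insert one row per label/value pair, tagged with the source file name.
# Returns the number of rows sent to the database.
sub store_info {
    my ( $dbh, $file, $info ) = @_;
    my $sth = $dbh->prepare(
        'INSERT INTO school_info (source_file, label, content) VALUES (?, ?, ?)'
    );
    my $count = 0;
    for my $label ( sort keys %$info ) {
        $sth->execute( $file, $label, $info->{$label} );   # placeholders handle quoting
        $count++;
    }
    return $count;
}

# Connect only when a DSN is supplied (variable names are assumptions), e.g.
#   SCHOOL_DSN='dbi:mysql:database=schools;host=localhost'
if ( my $dsn = $ENV{SCHOOL_DSN} ) {
    my $dbh = DBI->connect( $dsn, $ENV{SCHOOL_DB_USER}, $ENV{SCHOOL_DB_PASS},
        { RaiseError => 1, AutoCommit => 0 } );

    # Placeholder data; in practice this comes from parsing one file.
    my %info = ( 'Schulname:' => 'Beispielschule' );

    store_info( $dbh, 't.html', \%info );
    $dbh->commit;        # make the inserts permanent
    $dbh->disconnect;
}
```

Using placeholders (the ? marks) rather than interpolating values into the SQL string avoids quoting problems when a value contains apostrophes.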

Using Perl DBI is one of over 1000 ways you can populate a MySQL database. So if Perl DBI proves too difficult for the parsing and storage, alternatives could be multistage scripts using pipes, writing a C program, or using Python, Glade, Qt, GTK, KBasic, XBasic, Ruby, Pascal, assembly … and the list goes on.

Good luck.
