Trouble handling data from parsed file
That is hundreds of lines of code. If you are just trying to extract the table out of the HTML, use HTML::TableExtract.
For example:
#!/usr/bin/perl
use strict; use warnings;
use HTML::TableExtract;
use YAML;
my $te = HTML::TableExtract->new(
    headers => [
        '',
        'novel miRNAs',
        'known miRBase miRNAs',
        '',
        '',
    ],
    slice_columns => 0,
);
$te->parse_file('t.html');
for my $table ( $te->tables ) {
    for my $row ( $table->rows ) {
        print Dump $row;
    }
}
Parsing a HTML table in Perl
You need to call subTree on each tr element so that the search for td elements is limited to that row.
#!/usr/bin/env perl
use warnings;
use strict;
use HTML::TagParser;
my $html = HTML::TagParser->new( 'foo.html' ); # Change this to your file
my $nrow = 0;
for my $tr ( $html->getElementsByTagName("tr") ) {
    my $ncol = 0;
    for my $td ( $tr->subTree->getElementsByTagName("td") ) {
        print "Row [$nrow], Col [" . $ncol++ . "], Value [" . $td->innerText() . "]\n";
    }
    $nrow++;
}
This produces the following output (notice that the th rows are omitted, which is why the numbering starts at row 1):
Row [1], Col [0], Value [1027]
Row [1], Col [1], Value [21cs_337]
Row [1], Col [2], Value [0]
Row [1], Col [3], Value [catch-all caught]
Row [1], Col [4], Value [reason]
Row [2], Col [0], Value [10288]
Row [2], Col [1], Value [21cs_437]
Row [2], Col [2], Value [0]
Row [2], Col [3], Value [badfetch]
Row [2], Col [4], Value [reason]
Parsing data with Perl: capturing a range of text
This is my final solution. In this particular case I'm searching for all switchports whose port-security maximum is not equal to 1. This is just an example; the search pattern can be swapped for any other configuration line. I'm also excluding certain interfaces from the results even when that configuration is applied to them.
#!/usr/bin/perl
use strict;
use warnings;

my $MDIR = '/currentConfig';
# list of interfaces you don't want to see, used to filter the output
my @omit = (
    'MANAGEMENT.PORT',
    'sup.mgmt',
    'Internal.EtherSwitch',
    'Router',
    'ip address \d',
    'STRA',
);
# join with '|' to form the regex
my $dontwant = join( '|', @omit );
# search criteria: any maximum other than exactly 1
# (a plain [^1] would also skip values such as 10 or 12)
my $search = 'switchport port-security maximum (?!1\b)\d';

opendir( my $dh, $MDIR ) or die $!;
my @dirContents = readdir $dh;
closedir $dh;

my @finalint;
foreach my $file (@dirContents) {
    next unless -f "$MDIR/$file";    # skip '.', '..' and subdirectories
    open( my $in, '<', "$MDIR/$file" ) or die $!;
    # set the record separator to '!'; local restores it after each iteration
    local $/ = '!';
    my @inFile = <$in>;
    close $in;
    # since the record separator has been changed, '^' won't match the beginning of a line
    my @ints = grep { /\ninterface/i } @inFile;
    foreach my $int (@ints) {
        if ( $int =~ m/$search/i && $int !~ m/$dontwant/ ) {
            push( @finalint, $int );
        }
    }
}
# just list the interfaces found; I'll use this to make it comma separated
print for @finalint;
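To see what the '!' record separator does without needing a directory of config files, here is a minimal self-contained sketch that reads from an in-memory filehandle. The sample config text and the sub name interface_blocks are made up for illustration, and the search pattern uses a negative lookahead so that values such as 10 also count as "not 1".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample config; a real one would be read from a file on disk.
my $config = <<'END';
!
interface GigabitEthernet0/1
 switchport port-security maximum 1
!
interface GigabitEthernet0/2
 switchport port-security maximum 3
!
interface GigabitEthernet0/3
 description MANAGEMENT PORT
 switchport port-security maximum 10
!
END

# Split the text into '!'-terminated records and keep the interface
# blocks that match $search but do not match $omit.
sub interface_blocks {
    my ( $text, $search, $omit ) = @_;
    open( my $in, '<', \$text ) or die $!;    # in-memory filehandle
    local $/ = '!';                           # restored when the sub returns
    my @records = <$in>;
    close $in;
    return grep { /\ninterface/i && /$search/i && !/$omit/ } @records;
}

my @found = interface_blocks(
    $config,
    'switchport port-security maximum (?!1\b)\d',
    'MANAGEMENT.PORT',
);
print @found;
```

With this sample, only the GigabitEthernet0/2 block is printed: 0/1 has a maximum of exactly 1, and 0/3 is filtered out by the omit pattern even though its maximum is 10.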
How can I extract HTML table data using Perl?
HTML::TableExtract sounds exactly like what you are looking for.
Is there a parser for the output from Perl's Text::Table?
That seems somewhat convoluted. If you had the information before converting it into a table, why try to parse it back out of its presentation form? It's like having a text file, converting it to LaTeX, then to PostScript, and then trying to get the text back from the PostScript file.
I'm sure there's a way to parse the output of Text::Table, but the workflow itself looks flawed. I'd aim to write the data out in a simpler format such as YAML (alongside the Text::Table output, if you really need that as well), which can then be trivially restored to the original data structure.
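As a rough sketch of that idea, the snippet below dumps a data structure to YAML and loads it back. It assumes the YAML module from CPAN is installed; the rows themselves are invented sample data standing in for whatever was fed to Text::Table.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use YAML qw(Dump Load);

# Hypothetical rows; in practice this is the data behind the table.
my $rows = [
    { name => 'alpha', count => 3 },
    { name => 'beta',  count => 7 },
];

my $yaml     = Dump($rows);    # serialize for storage or transfer
my $restored = Load($yaml);    # rebuild the original structure

print $yaml;
print "second row: $restored->[1]{name} = $restored->[1]{count}\n";
```

The YAML text can be written to a file next to the formatted table, so later code reads the data back with Load instead of parsing column layout.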
How to use the Perl TableExtract rows method when there are duplicate Header fields
I found the answer: it is necessary to pass "slice_columns => 0" to the HTML::TableExtract constructor.
I'm not exactly sure why this is necessary. The HTML::TableExtract documentation on CPAN says "Columns that are not beneath one of the provided headers will be ignored unless slice_columns was set to 0. Columns will, by default, be rearranged into the same order as the headers you provide (see the automap parameter for more information) unless slice_columns is 0."
In my table, every column is under a provided header. There must be an interaction in the case where headers are not unique, and setting slice_columns to 0 avoids the issue.
my $te = HTML::TableExtract->new(
    headers       => \@headers,
    slice_columns => 0,
);