I load a lot of data from many sources into a database for further analysis. One of the common forms of data files is an excel spreadsheet. A client's system may produce a daily log of activity, and I need to aggregate that into one place to study long term trends.
Storing the files in a sub-directory (folder to you windows users), and running a perl to load all those files into a database is an easy way to get the data where I need it. See last weeks post for the perl to cycle through all files in a sub-directory.
You'll need to add this statement after the "use strict;" statement:
use Spreadsheet::ParseExcel;
You'll notice a couple of comments in the code about "having your way with it". Once of the things you can do is load the excel file into arrays for later fun stuff. Here's the code you can insert at either of these comments:
my $workbook = Spreadsheet::ParseExcel::Workbook->Parse($CurrentFile);
my($row, $col, $sheet, $WkC);
foreach my $sheet (@{$workbook->{Worksheet}}) {
for (my $row = $sheet->{MinRow}; defined $sheet->{MaxRow} && $row <= $sheet->{MaxRow} ; $row++) {
for (my $col = $sheet->{MinCol}; defined $sheet->{MaxCol} && $col <= $sheet->{MaxCol} ; $col++) {
# ----- Header Row
if ($row == $sheet->{MinRow}) {
my $column_name = $sheet->{Cells}[0][$col]->{Val};
$FieldName[$col] = $column_name;
} else {
$WkC = $sheet->{Cells}[$row][$col];
# ----- you could decide to skip a column
# ----- if there's no field name in row 1
# next if ( !(defined($FieldName[$col])) || ($FieldName[$col] eq '') ); # skip row if it doesn't have a header
# ----- cleanup any data here
if (($WkC) && ($WkC->Value ne '') && ($FieldName[$col] eq 'SomeSpecificFieldName')) {
my $temp_value =$WkC->Value;
$temp_value =~ s/\'/\'\'/g; # deal with stinking '
# ----- Here we might do something interesting with each cell data
}
# ----- Here we can do something interesting with every cell data
} # end of else for row 0 handling
} # end of column handling
# ----- Now that you have the row loaded,
# ----- You can do something interesting with it here
} # end of row handling
} # ----- end of worksheet handling
I hope you find this helpful.