How to Read Data from Excel Sheet in Linux Using Shell Script

How to read an Excel file in a shell script

On Linux you have several choices, but all of them involve a scripting language and, most likely, installing an extra module.

With Perl you could read Excel files with, for example, this module:
https://metacpan.org/pod/Spreadsheet::Read

Using Python you might want to use xlrd (note that xlrd 2.0 and later reads only the old .xls format, not .xlsx):
https://pypi.python.org/pypi/xlrd

And using Ruby you could go for:
https://github.com/zdavatz/spreadsheet/blob/master/GUIDE.md

So whatever you prefer, there are tools to help you.
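
For the newer .xlsx format in particular, an extra module is not strictly required, because such a file is a ZIP archive of XML parts. The following is only a rough sketch using Python's standard library, not a replacement for the modules above. It assumes the sheet you want is stored at xl/worksheets/sheet1.xml (common, but not guaranteed), the helper name read_xlsx_rows is made up for this illustration, and cells that Excel omits (empty cells) are not padded, so columns can shift.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by the SpreadsheetML parts inside an .xlsx archive
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = {"m": MAIN_NS}


def read_xlsx_rows(source, sheet_path="xl/worksheets/sheet1.xml"):
    """Return the rows of one worksheet as lists of strings.

    source may be a file name or a binary file-like object.
    sheet_path is an assumption about where the sheet lives in the archive.
    """
    with zipfile.ZipFile(source) as zf:
        shared = []
        # Text cells usually point into a shared string table.
        if "xl/sharedStrings.xml" in zf.namelist():
            root = ET.fromstring(zf.read("xl/sharedStrings.xml"))
            for si in root.findall("m:si", NS):
                parts = [t.text or "" for t in si.iter("{%s}t" % MAIN_NS)]
                shared.append("".join(parts))
        sheet = ET.fromstring(zf.read(sheet_path))

    rows = []
    for row in sheet.iter("{%s}row" % MAIN_NS):
        values = []
        for cell in row.findall("m:c", NS):
            v = cell.find("m:v", NS)
            text = v.text if v is not None and v.text is not None else ""
            # t="s" means the value is an index into the shared strings
            if cell.get("t") == "s" and text:
                text = shared[int(text)]
            values.append(text)
        rows.append(values)
    return rows
```

Calling read_xlsx_rows("book.xlsx") would return one list per row; numbers and dates come back as the raw strings stored in the file, so anything beyond a quick look is better served by one of the modules listed above.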

CSV Format

If you can get your data as a CSV (comma-separated values) file, it is even easier, because no extra modules are needed.

For example, in Perl you could use the split function. Now that I roughly know the format of your CSV file, here is a simple sample:

#!/usr/bin/perl
use strict;
use warnings;

# put full path to your csv file here
my $file = "/Users/daniel/dev/perl/test.csv";

# open file and read data
open(my $data, '<', $file) or die "Could not read '$file' $!\n";

# loop through all lines of data
while (my $line = <$data>) {

    # one line
    chomp $line;

    # split fields from line by comma
    my @fields = split ",", $line;

    # get size of split array
    my $size = $#fields + 1;

    # loop through all fields in array
    for (my $i = 0; $i < $size; $i++) {

        # first element should be user
        my $user = $fields[$i];
        print "User is $user";

        # now check if there is another field following
        if (++$i < $size) {

            # second field should be loop
            my $loop = $fields[$i];
            print ", Loop is $loop";

            # now here you can call your command
            # i used "echo" as test, replace it with whatever
            system("echo", $user, $loop);

        } else {
            # got only user but no loop
            print "NO LOOP FOR USER?";
        }
        print "\n";
    }
}

So this goes through all lines of your CSV file, looks for User,Loop pairs and passes them to a system command. For this sample I used echo, but you should replace it with your command. Note that a plain split on commas does not handle quoted fields that themselves contain commas.

Looks like I did your homework :D
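
If Python is closer to hand than Perl, the same User,Loop walk could be sketched with the standard csv module, which also copes with quoted fields. This is an illustration under assumptions rather than a port of the script above: the function name user_loop_pairs and the sample data are made up, and a missing loop value is reported as None.

```python
import csv
import io


def user_loop_pairs(lines):
    """Yield (user, loop) tuples from CSV lines; loop is None if missing."""
    for fields in csv.reader(lines):
        # Walk the row two fields at a time: user first, then loop.
        for i in range(0, len(fields), 2):
            user = fields[i]
            loop = fields[i + 1] if i + 1 < len(fields) else None
            yield user, loop


if __name__ == "__main__":
    # Hypothetical sample data; the second line has a quoted comma
    # and a trailing user without a loop.
    sample = io.StringIO('alice,1\n"smith, bob",2,carol\n')
    for user, loop in user_loop_pairs(sample):
        if loop is None:
            print("User is %s NO LOOP FOR USER?" % user)
        else:
            print("User is %s, Loop is %s" % (user, loop))
```

To run a command for each pair, subprocess.run(["echo", user, loop]) would play the role of the system() call in the Perl version.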

Adding data to Excel from a text file using shell

Assuming your output1 and output2 are in the files file1.txt and file2.txt, and the last line of output1 can be ignored:

paste -d"," file1.txt file2.txt > mergedfile.csv
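
One thing paste does not do is quote values that already contain a comma, which would shift the columns when the file is opened in a spreadsheet. A possible Python sketch of the same line-by-line merge, using the csv module for quoting, is shown here. The function name paste_csv is invented for this example, and it pads the shorter input with empty cells, much as paste does.

```python
import csv
import io
from itertools import zip_longest


def paste_csv(left_lines, right_lines, out):
    """Write one CSV row per pair of lines taken from the two inputs."""
    writer = csv.writer(out, lineterminator="\n")
    # zip_longest keeps going until the longer input is exhausted
    for left, right in zip_longest(left_lines, right_lines, fillvalue=""):
        writer.writerow([left.rstrip("\n"), right.rstrip("\n")])


if __name__ == "__main__":
    # Hypothetical stand-ins for the contents of file1.txt and file2.txt
    first = io.StringIO("alpha\nbeta, gamma\n")
    second = io.StringIO("1\n2\n")
    merged = io.StringIO()
    paste_csv(first, second, merged)
    print(merged.getvalue(), end="")
```

With real files, the two inputs would be opened with open("file1.txt") and open("file2.txt"), and the output with open("mergedfile.csv", "w", newline="").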

Shell script to read and update a CSV file


# line counter, used to skip the header line
i=1
while IFS=, read -r f1 f2
do
    # skip the first (header) line
    test $i -eq 1 && ((i=i+1)) && continue
    # skip blank lines
    [ -z "$f1" ] && continue
    echo "tenantID is :$f1"
    echo "status is :$f2"

    # The below command will update the schema with data
    java -Xmx512m -XX:MaxPermSize=128m -jar biz.jar -install -readers=seed - delegator=default#$f1

    echo "$f1, XXX" >> TenantIdOut.csv
done < TenantId.csv

Added 2 lines of code:

  1. If f1 is empty, continue. This skips blank lines.
  2. Write f1 and the new status to a new file; outside the loop you can then replace the input file with this new file.
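
The same read, skip and rewrite pattern could be sketched in Python roughly as follows. Everything named here is an assumption made for illustration: update_status is a made-up helper, the input is taken to be comma-separated with a single header line, and the optional run callback stands in for the java call.

```python
import csv
import io


def update_status(src, dst, new_status, run=None):
    """Copy tenant IDs from src to dst, replacing the status column."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    next(reader, None)  # skip the header line
    for row in reader:
        # skip blank lines and rows with an empty first field
        if not row or not row[0].strip():
            continue
        tenant_id = row[0].strip()
        if run is not None:
            run(tenant_id)  # e.g. launch the schema update for this tenant
        writer.writerow([tenant_id, new_status])


if __name__ == "__main__":
    # Hypothetical file contents, including a blank line
    source = io.StringIO("tenantID,status\nt1,new\n\nt2,new\n")
    target = io.StringIO()
    update_status(source, target, "XXX", run=lambda t: print("tenantID is :" + t))
    print(target.getvalue(), end="")
```

Once the output file is complete, os.replace("TenantIdOut.csv", "TenantId.csv") would swap it in for the input file, matching the rename step described above.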

Writing to an Excel sheet using Bash

You could write Excel files from bash, Perl, Python and so on. I think each programming language has its own solution.

bash

You could use join or awk, and I think there are other solutions too.

join

If you want to join two files on a common column, look at these posts: Bash join command and join in bash like in SAS

awk

You could write a CSV file and give it an .xls extension; Excel, Gnumeric and other programs will then open it like a spreadsheet, although the file itself is still plain text.

ls -R -ltr / | head -50 | awk '{if ($5 >0) print $5,$9}' OFS="," > sample.xls

Once you modify the file with Excel, Gnumeric or another program and save it as a real xls file, you can no longer read it with bash. That is why @Geekasaur recommended the Perl or Python solutions.
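
Whether a given file is still plain text or has become a real binary workbook can be checked from its first bytes: binary .xls files are OLE2 compound documents, and .xlsx files are ZIP archives. The sketch below only classifies the container, and the names sniff_bytes and sniff_file are made up for this example. Other formats share these containers (old .doc files are OLE2 too, and many formats are ZIP-based), so treat the result as a hint.

```python
# Leading bytes ("magic numbers") of the two container formats
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # binary .xls and other OLE2 files
ZIP_MAGIC = b"PK\x03\x04"                        # .xlsx and other ZIP-based files


def sniff_bytes(header):
    """Classify a file from its first few bytes."""
    if header.startswith(OLE2_MAGIC):
        return "ole2"
    if header.startswith(ZIP_MAGIC):
        return "zip"
    return "other"  # for example CSV text that was merely renamed


def sniff_file(path):
    """Read the first 8 bytes of path and classify them."""
    with open(path, "rb") as fh:
        return sniff_bytes(fh.read(8))


if __name__ == "__main__":
    # A renamed CSV starts with ordinary text, not a container signature.
    print(sniff_bytes(b"4096,home\n"))
```

A file that reports "other" can still be processed with awk, cut and friends; one that reports "ole2" or "zip" needs one of the Perl or Python modules.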

perl

You could write xls in Perl; here is a sample:

#!/usr/bin/perl
use strict;
use warnings;
use Spreadsheet::WriteExcel;

my $workbook  = Spreadsheet::WriteExcel->new("test.xls");
my $worksheet = $workbook->add_worksheet();
open(my $fh, '<', "file") or die "Cannot open file: $!\n";
# $x is the current row, $y the current column
my ($x, $y) = (0, 0);
while (<$fh>) {
    chomp;
    # split the line on whitespace, one cell per field
    my @list = split /\s+/, $_;
    foreach my $c (@list) {
        $worksheet->write($x, $y++, $c);
    }
    $x++; $y = 0;
}
close($fh);
$workbook->close();

You could then modify the xls file with the Spreadsheet::ParseExcel package: see How can I modify an existing Excel workbook with Perl? and reading and writing sample [Editor's note: This link is broken and has been reported to IBM]

python

You could write a real xls file in Python; here is a sample (written for Python 2, using the pyExcelerator library):

#!/usr/local/bin/python
# Tool to convert CSV files (with configurable delimiter and text wrap
# character) to Excel spreadsheets.
import string
import sys
import getopt
import re
import os
import os.path
import csv
from pyExcelerator import *

def usage():
    """ Display the usage """
    print "Usage:" + sys.argv[0] + " [OPTIONS] csvfile"
    print "OPTIONS:"
    print "--title|-t: If set, the first line is the title line"
    print "--lines|-l n: Split output into files of n lines or less each"
    print "--sep|-s c [def:,] : The character to use for field delimiter"
    print "--output|-o : output file name/pattern"
    print "--help|-h : print this information"
    sys.exit(2)

def openExcelSheet(outputFileName):
    """ Opens a reference to an Excel WorkBook and Worksheet objects """
    workbook = Workbook()
    worksheet = workbook.add_sheet("Sheet 1")
    return workbook, worksheet

def writeExcelHeader(worksheet, titleCols):
    """ Write the header line into the worksheet """
    cno = 0
    for titleCol in titleCols:
        worksheet.write(0, cno, titleCol)
        cno = cno + 1

def writeExcelRow(worksheet, lno, columns):
    """ Write a non-header row into the worksheet """
    cno = 0
    for column in columns:
        worksheet.write(lno, cno, column)
        cno = cno + 1

def closeExcelSheet(workbook, outputFileName):
    """ Saves the in-memory WorkBook object into the specified file """
    workbook.save(outputFileName)

def getDefaultOutputFileName(inputFileName):
    """ Returns the name of the default output file based on the value
        of the input file. The default output file is always created in
        the current working directory. This can be overridden using the
        -o or --output option to explicitly specify an output file """
    baseName = os.path.basename(inputFileName)
    rootName = os.path.splitext(baseName)[0]
    return string.join([rootName, "xls"], '.')

def renameOutputFile(outputFileName, fno):
    """ Renames the output file name by appending the current file number
        to it """
    dirName, baseName = os.path.split(outputFileName)
    rootName, extName = os.path.splitext(baseName)
    backupFileBaseName = string.join([string.join([rootName, str(fno)], '-'), extName], '')
    backupFileName = os.path.join(dirName, backupFileBaseName)
    try:
        os.rename(outputFileName, backupFileName)
    except OSError:
        print "Error renaming output file:", outputFileName, "to", backupFileName, "...aborting"
        sys.exit(-1)

def validateOpts(opts):
    """ Returns option values specified, or the default if none """
    titlePresent = False
    linesPerFile = -1
    outputFileName = ""
    sepChar = ","
    for option, argval in opts:
        if (option in ("-t", "--title")):
            titlePresent = True
        if (option in ("-l", "--lines")):
            linesPerFile = int(argval)
        if (option in ("-s", "--sep")):
            sepChar = argval
        if (option in ("-o", "--output")):
            outputFileName = argval
        if (option in ("-h", "--help")):
            usage()
    return titlePresent, linesPerFile, sepChar, outputFileName

def main():
    """ This is how we are called """
    try:
        opts,args = getopt.getopt(sys.argv[1:], "tl:s:o:h", ["title", "lines=", "sep=", "output=", "help"])
    except getopt.GetoptError:
        usage()
    if (len(args) != 1):
        usage()
    inputFileName = args[0]
    try:
        inputFile = open(inputFileName, 'r')
    except IOError:
        print "File not found:", inputFileName, "...aborting"
        sys.exit(-1)
    titlePresent, linesPerFile, sepChar, outputFileName = validateOpts(opts)
    if (outputFileName == ""):
        outputFileName = getDefaultOutputFileName(inputFileName)
    workbook, worksheet = openExcelSheet(outputFileName)
    fno = 0
    lno = 0
    titleCols = []
    reader = csv.reader(inputFile, delimiter=sepChar)
    for line in reader:
        if (lno == 0 and titlePresent):
            if (len(titleCols) == 0):
                titleCols = line
            writeExcelHeader(worksheet, titleCols)
        else:
            writeExcelRow(worksheet, lno, line)
        lno = lno + 1
        if (linesPerFile != -1 and lno >= linesPerFile):
            closeExcelSheet(workbook, outputFileName)
            renameOutputFile(outputFileName, fno)
            fno = fno + 1
            lno = 0
            workbook, worksheet = openExcelSheet(outputFileName)
    inputFile.close()
    closeExcelSheet(workbook, outputFileName)
    if (fno > 0):
        renameOutputFile(outputFileName, fno)

if __name__ == "__main__":
    main()

You could also convert to csv with this sourceforge project.
And if you can convert to csv, you can rewrite the xls by modifying the script.
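
The --lines option of the Python script above splits the output into several workbooks of at most n rows each. For readers on Python 3, that chunking step alone could be sketched as below. split_rows is a hypothetical helper and not part of the original script, and writing each chunk to a workbook is left to whichever Excel library is installed.

```python
import csv
import io
from itertools import islice


def split_rows(rows, lines_per_file):
    """Yield lists of at most lines_per_file rows each."""
    it = iter(rows)
    while True:
        # take the next batch of rows; an empty batch means we are done
        chunk = list(islice(it, lines_per_file))
        if not chunk:
            return
        yield chunk


if __name__ == "__main__":
    # Hypothetical CSV content standing in for the input file
    reader = csv.reader(io.StringIO("a,1\nb,2\nc,3\n"))
    for number, chunk in enumerate(split_rows(reader, 2)):
        print("file", number, "gets", len(chunk), "rows")
```

Each chunk would then be written to its own output file, numbered in the same way the script's renameOutputFile function numbers its workbooks.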


