How to Convert Xls to Xlsx

How to convert xls to xlsx using R

In case someone else is also interested, I really didn't figure out how to do this in R but I did found this https://www.extendoffice.com/documents/excel/1349-excel-batch-convert-xls-to-xlsx.html#a2 and I can keep the exact same format from the original file.

Converting xls files to xlsx using python

use pandas

import pandas as pd
df = pd.read_excel("file.xls")
df.to_excel("file.xlsx")

Powershell - Convert .XLS file to .XLSX

The error message is clear. You forgot to close the try{..} block with an ending bracket } and a try{..} should be followed up by either one or more catch{..} blocks and optionally a finally{..} block.

You can read about that on about Try Catch Finally.

Then, there are some other things wrong and/or can be improved upon in your code as well.

  • $folderpath is not defined and should be the source folder $downloadfolder
  • use -Filter instead of -Include as it is much faster. Also you have left out the dot in '*.xls'
  • append switch -File to the Get-ChildItem cmdlet to make sure you will not receive and try to process directories as well
  • you can save the converted .xlsx files directly to the uploadfolder, no need to create first and then move
  • to remove the used COM objects, release them from memory first and then initiate the Garbage Collect.

    Do this after you have quit Excel.
# set folders
$downloadfolder = "C:\Users\Test" # folder where the .xls files are
$uploadfolder = "C:\Users\Test\Upload" # folder that uploads the .xlsx files
$backupfolder = "C:\Users\Test\Backup" # folder that has .xls files as backup

# open and convert xls to xlsx
Add-Type -AssemblyName Microsoft.Office.Interop.Excel
$xlFixedFormat = [Microsoft.Office.Interop.Excel.XlFileFormat]::xlOpenXMLWorkbook

$excel = New-Object -ComObject Excel.Application
$excel.Visible = $false # it is much faster if Excel is not visible

# loop through the .xls files and process them
Get-ChildItem -Path $downloadfolder -Filter '*.xls' -Recurse -File |
ForEach-Object {
try {
$xlsfilename = $_.FullName
#copy file to backup folder
Copy-Item -Path $xlsfilename -Destination $backupfolder -Force
# open the xls
Write-Host "Converting $xlsfilename"
$workbook = $excel.Workbooks.Open($xlsfilename)
# save converted file (as xlsx) directly to the upload folder
$newfilename = Join-Path -Path $uploadfolder -ChildPath ('{0}.xlsx' -f $_.BaseName)
$workbook.SaveAs($newfilename, $xlFixedFormat)
$workbook.Close()
#remove old file
Write-Host "Delete old file '$xlsfilename'"
Remove-Item -Path $xlsfilename -Force
}
catch {
# write out a warning as to why something went wrong
Write-Warning "Could not convert '$xlsfilename':`r`n$($_.Exception.Message)"
}
}
# close excel
$excel.Quit()
# garbage collection
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($workbook)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($excel)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()

Converting .xls to .xlsx files with VBA doesn't seem to reduce size

I seem to have figured it out.
For some reason if you don't reopen and close with saving changes a freshly converted file excel doesn't reduce files size. Maybe it compresses file in the background or something - not sure. Right now I just added a few lines opening and closing converted files with saving changes - works for me! Sometimes reducing size so drastically that it becomes 10MBs less

Convert xls to xlsx with excess headers

Answering the second part of my question. Still not certain on how to take in xls files.

If I convert the file to a CSV file, then use this command to remove the top few lines. skiprows is the method to use that cuts out the top section of a csv or xlsx file during the df's intialization.

df = pd.read_csv('file_name.csv', skiprows = 8)

How to convert xls to xlsx with Python Azure Webjob

On Azure WebApps for Windows, due to there is a secure environment called Azure Web App Sandbox which will restricts many operations include Win32k.sys (User32/GDI32) Restrictions. So you can not call any COM components via win32com that cause your issue.

In pure Python, there are many soultions to convert xls to xlsx, such as the common two as below.

  1. Install pyexcel via pip install pyexcel pyexcel-xls pyexcel-xlsx to do it.

    import pyexcel as p

    p.save_book_as(file_name='<your input file>.xls', dest_file_name='<your output file>.xlsx')
  2. Install pandas via pip install pandas to do it.

    import pandas as pd

    dataFrame = pd.read_excel('<your input file>.xls')
    dataFrame.to_excel('<your output file>.xlsx', index=False)

Note: the two solutions above just can convert these xls files of common data types which like string, number, not rich content (chart or image, etc.)

If you want to convert a xls file with rich content or without any format changes via win32com with Excel.Application, you have to use Azure Windows VM to run your script and integrate with your WebApp.



Related Topics



Leave a reply



Submit