Combine a folder of text files into a CSV with each content in a cell
This can be written slightly more compactly using pathlib.
>>> import csv
>>> import os
>>> from pathlib import Path
>>> os.chdir('c:/scratch/folder to process')
>>> # newline='' stops the csv module from writing blank rows on Windows
>>> with open('big.csv', 'w', newline='') as out_file:
...     csv_out = csv.writer(out_file)
...     csv_out.writerow(['FileName', 'Content'])
...     for fileName in Path('.').glob('*.txt'):
...         # read_text() opens, reads and closes the file in one call
...         csv_out.writerow([str(fileName), fileName.read_text().strip()])
Each item yielded by this glob is a Path object, which gives access to both the full pathname and the filename, so there is no need for string concatenation.
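To make that concrete, here is a small sketch of what each item from the glob exposes. The helper name describe_text_files, the temporary folder and the file name notes.txt are all made up for illustration; they are not part of the original answer.

```python
import tempfile
from pathlib import Path

def describe_text_files(folder):
    """Return (file name, absolute path, stripped content) for each .txt file."""
    rows = []
    for item in sorted(Path(folder).glob('*.txt')):
        # item.name is the bare file name; item.resolve() is the absolute path
        rows.append((item.name, str(item.resolve()), item.read_text().strip()))
    return rows

# Hypothetical demo: a throwaway folder containing a single text file
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / 'notes.txt').write_text('hello\n')
    for name, full_path, content in describe_text_files(tmp):
        print(name, full_path, content)
```

Both pieces of information come from the same object, which is why the loop never has to join a directory name and a file name by hand.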
EDIT: I've examined one of the text files and found that one of the characters that chokes processing looks like 'fi' but is actually a ligature, that is, the two letters combined into a single character. Given the likely practical use to which this CSV will be put, I suggest the following processing, which no longer chokes on odd characters like that one. I also strip out line endings, because I suspect they would make CSV processing more complicated; that is a possible topic for another question.
import csv
from pathlib import Path

with open('big.csv', 'w', encoding='Latin-1', newline='') as out_file:
    csv_out = csv.writer(out_file)
    csv_out.writerow(['FileName', 'Content'])
    for fileName in Path('.').glob('*.txt'):
        lines = []
        # Read raw bytes; Latin-1 maps every byte to a character, so decoding cannot fail
        with open(str(fileName.absolute()), 'rb') as one_text:
            for line in one_text.readlines():
                lines.append(line.decode(encoding='Latin-1', errors='ignore').strip())
        # Join the stripped lines with spaces so each file fits in one cell
        csv_out.writerow([str(fileName), ' '.join(lines)])
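For what it's worth, the 'fi' character described above is most likely the ligature U+FB01. The following is only a sketch of an alternative, assuming the files are actually UTF-8 encoded (which I have not verified): Unicode NFKC normalization turns the ligature back into the two plain letters, and all whitespace, including line endings, is collapsed to single spaces. The function name clean_text is made up for this example.

```python
import unicodedata

def clean_text(raw_bytes):
    """Decode UTF-8 bytes, expand compatibility characters, and flatten to one line."""
    # errors='ignore' drops any bytes that are not valid UTF-8 (assumed encoding)
    text = raw_bytes.decode('utf-8', errors='ignore')
    # NFKC replaces compatibility characters such as the U+FB01 ligature with 'fi'
    text = unicodedata.normalize('NFKC', text)
    # split() with no arguments splits on any whitespace, including line endings
    return ' '.join(text.split())

sample = 'The \ufb01rst line\nand the second\n'.encode('utf-8')
print(clean_text(sample))  # The first line and the second
```

If this route is taken, the output file would more naturally be opened with encoding='utf-8' than 'Latin-1'.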
Multiple specific text files into CSV in python
As general advice: the pandas library is pretty useful for things like this. If I understood your problem correctly, this should basically do it:
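import os
import pandas as pd

# Raw strings keep the backslashes in Windows paths literal
dirpath = r'C:\Users\gputman\Desktop\Control_File_Tracker\Input'
output = r'C:\Users\gputman\Desktop\Control_File_Tracker\Output\New Microsoft Excel Worksheet.csv'

# Assumption: the inputs are the .txt files in dirpath (the original left `files` undefined)
files = [os.path.join(dirpath, name) for name in os.listdir(dirpath) if name.endswith('.txt')]

frames = []
for filename in files:
    # Each file holds "key: value" lines; transposing turns it into a single row
    frames.append(pd.read_csv(filename, sep=':', index_col=0, header=None).T)

# DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
csvout = pd.concat(frames)
csvout.to_csv(output)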
For an explanation of the code, see this question/answer, which explains how to read a transposed text file with pandas.
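To illustrate what the transpose accomplishes without requiring pandas, here is a rough standard-library sketch. It assumes each input file consists of lines of the form key: value, which is my reading of the read_csv call above rather than something the question confirms; the function names and the sample keys are invented.

```python
import csv
import io

def parse_key_value(text, sep=':'):
    """Turn 'key: value' lines into a dict, i.e. one row keyed by column name."""
    row = {}
    for line in text.splitlines():
        if sep not in line:
            continue  # skip blank or malformed lines
        key, value = line.split(sep, 1)  # split on the first separator only
        row[key.strip()] = value.strip()
    return row

def rows_to_csv(rows):
    """Write a list of dicts as CSV text, using the union of keys as the header."""
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator='\n')
    writer.writeheader()
    writer.writerows(rows)  # keys missing from a row are written as empty cells
    return buffer.getvalue()

# Hypothetical sample contents standing in for two input files
first = 'Name: alpha\nCount: 3\n'
second = 'Name: beta\nCount: 5\n'
print(rows_to_csv([parse_key_value(first), parse_key_value(second)]))
```

Each file becomes one row and each key becomes a column, which is the same shape the transposed DataFrame has before it is written out.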
PowerShell: How to upload data from multiple txt files into a single xlsx or csv file
If you're certain the text files all have the same format, you can treat them as tab-delimited CSV files, import them, and save the merged result as shown below:
(Get-ChildItem -Path 'X:\Somewhere' -Filter '*.txt' -File).FullName |
    Import-Csv -Delimiter "`t" |
    Export-Csv 'X:\SomewhereElse\merged.csv' -UseCulture -NoTypeInformation
Using the -UseCulture switch means the merged CSV is written out using the delimiter your local Excel expects, so when done you can just double-click the file to open it in Excel.