Python parse CSV ignoring comma with double-quotes
This should do it:
import csv

lines = '''"AAA", "BBB", "Test, Test", "CCC"
"111", "222, 333", "XXX", "YYY, ZZZ"'''.splitlines()
# skipinitialspace=True drops the space after each comma so the quotes are recognized
for row in csv.reader(lines, quotechar='"', delimiter=',',
                      quoting=csv.QUOTE_ALL, skipinitialspace=True):
    print(row)
# Output:
# ['AAA', 'BBB', 'Test, Test', 'CCC']
# ['111', '222, 333', 'XXX', 'YYY, ZZZ']
csv ignore comma inside double quote
Just add an escape character to deal with backslash-escaped quotes in the CSV:
csv.reader(f, doublequote=True, quoting=csv.QUOTE_ALL, escapechar='\\')
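As a rough illustration of what escapechar does, here is a minimal sketch using an in-memory list instead of a file. The sample line is made up for the example, and skipinitialspace=True is added only because the sample has a space after the comma.

```python
import csv

# Hypothetical input: quotes inside a quoted field are escaped with a backslash
lines = ['"He said \\"hi\\", then left", "BBB"']

# escapechar='\\' makes the reader take the character after a backslash literally
rows = list(csv.reader(lines, doublequote=True, quoting=csv.QUOTE_ALL,
                       escapechar='\\', skipinitialspace=True))
print(rows)
```

The escaped quotes and the comma should both end up inside the first field rather than splitting it.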
How to read CSV file ignoring commas between quotes with Pandas
You might try
data = pd.read_csv('testfile.csv', sep=',', quotechar='"',
                   skipinitialspace=True, encoding='utf-8')
which tells pandas to skip the space that follows each comma; otherwise the quote is not the first character of the field and pandas can't recognize it as an opening quote.
EDIT: Apparently the read_csv call above did not work for the author of the question, so here is a complete script that produces the wanted result.
I have Python 3.8.9 and pandas 1.2.3.
itworks.py
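import pandas as pd

# Write a small sample file with a quoted field that contains a comma
with open("testfile.csv", "w") as f:
    f.write("""column1,column2,column3
a, b, c
a, c, "c, d"
""")

# skipinitialspace=True skips the space after each comma, so "c, d" is read as one quoted field
data = pd.read_csv("testfile.csv", sep=",", quotechar='"', skipinitialspace=True, encoding="utf-8")
print(data)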
$ python itworks.py
  column1 column2 column3
0       a       b       c
1       a       c    c, d
$
Try to reproduce this minimal example.
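To see why the space after the comma matters, the same effect can be reproduced with the standard csv module. This is only a sketch for illustration: it uses csv.reader rather than pandas, on one row taken from the sample file above.

```python
import csv

# One row from the sample file: the quoted field is preceded by a space
line = ['a, c, "c, d"']

# Default settings: the field starts with a space, so the quote is treated as ordinary text
without_skip = next(csv.reader(line))

# With skipinitialspace=True the leading space is dropped and the quote opens a quoted field
with_skip = next(csv.reader(line, skipinitialspace=True))

print(without_skip)
print(with_skip)
```

Without the option the row splits into four fields; with it, the row has three fields and the comma stays inside the last one.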
Parse csv with quotes and commas
I found the solution in another post. All I had to do was pass two extra arguments to read_csv:
pd.read_csv('dataset.csv', escapechar='\\', encoding='utf-8')
It's working fine now.
CSV file has commas in data, which python interprets as extra columns
From the Python docs:
>>> import csv
>>> with open('eggs.csv', newline='') as csvfile:
...     spamreader = csv.reader(csvfile, delimiter=' ', quotechar='|')
...     for row in spamreader:
...         print(', '.join(row))
Spam, Spam, Spam, Spam, Spam, Baked Beans
Spam, Lovely Spam, Wonderful Spam
So if you want to convert your file into a list of lists:
import csv

myFileAsArray = []
# In Python 3, open the file in text mode with newline='' as the csv docs recommend
with open('eggs.csv', newline='') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=' ', quotechar='|')
    for row in spamreader:
        myFileAsArray.append(row)
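The docs example uses a space delimiter and '|' as the quote character. For an ordinary comma-separated file with double-quoted fields, the reader's default settings should already keep commas inside quotes within one column. A minimal sketch, with made-up sample data held in an io.StringIO in place of a real file:

```python
import csv
import io

# Hypothetical sample data standing in for a real comma-separated file
sample = io.StringIO('name,address\n"Smith, John","12 Main St, Springfield"\n')

# Defaults are delimiter=',' and quotechar='"', so quoted commas do not split a field
rows = list(csv.reader(sample))
print(rows)
```

Each data row comes back as a two-element list, matching the two header columns.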