Delete from csv values and change column names when writing to a CSV
See manual write.table {utils}.
help(write.csv)
write.csv(X, quote = FALSE)
The justification for quoting the fields by default is that unquoted fields containing commas will be misinterpreted.
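The same ambiguity can be reproduced outside R. The following is a minimal sketch using Python's standard csv module; the record is made up for illustration and is not taken from the question.

```python
import csv
import io

# Hypothetical record: the first field itself contains a comma.
record = ["Smith, John", "42"]

# Unquoted: joining the fields by hand produces an ambiguous line,
# which a CSV parser splits into three fields instead of two.
unquoted_line = ",".join(record)
unquoted_fields = next(csv.reader([unquoted_line]))

# Quoted: csv.writer quotes any field containing the delimiter
# (QUOTE_MINIMAL is its default), so the record survives a round trip.
buffer = io.StringIO()
csv.writer(buffer).writerow(record)
quoted_line = buffer.getvalue().strip()
quoted_fields = next(csv.reader([quoted_line]))

print(unquoted_fields)
print(quoted_fields)
```

Turning quoting off is therefore only safe when no field can contain the separator.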
How do you remove the column name row when exporting a pandas DataFrame?
You can write to CSV without the header using header=False and without the index using index=False. If desired, you can also modify the separator using sep.
CSV example, omitting the header row:
df.to_csv('filename.csv', header=False)
TSV (tab-separated) example, omitting the index column:
df.to_csv('filename.tsv', sep='\t', index=False)
How to delete a particular column in csv file without pandas library
In this case, the csv.DictReader and csv.DictWriter classes are very handy:
import csv
with open("input.csv", newline="") as instream, open("output.csv", "w", newline="") as outstream:
    # Set up the input
    reader = csv.DictReader(instream)
    # Set up the output fields. Build a new list instead of removing items
    # from reader.fieldnames, which would misalign the remaining columns
    # unless the removed ones happen to come last.
    output_fields = [
        name for name in reader.fieldnames
        if name not in ("Department", "Allocation")
    ]
    # Set up the output
    writer = csv.DictWriter(
        outstream,
        fieldnames=output_fields,
        extrasaction="ignore",  # Ignore extra dictionary keys/values
    )
    # Write to the output
    writer.writeheader()
    writer.writerows(reader)
Notes
For input, each row will be a dictionary such as
{'Name': 'Birla', 'Age': '49', 'YearofService': '12', 'Department': 'Welding', 'Allocation': 'Production'}
For output, we remove the columns (fields) that we don't need; see output_fields.
The extrasaction parameter tells DictWriter to ignore extra keys/values from the dictionaries.
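To see what extrasaction changes, here is a small sketch that writes to in-memory buffers instead of files. The row reuses values from the example above; the names strict, lenient and result are only illustrative.

```python
import csv
import io

row = {"Name": "Birla", "Age": "49", "Department": "Welding"}
fields = ["Name", "Age"]

# Default (extrasaction="raise"): a key missing from fieldnames is an error.
strict = csv.DictWriter(io.StringIO(), fieldnames=fields)
try:
    strict.writerow(row)
    raised = False
except ValueError:
    raised = True

# extrasaction="ignore": the extra "Department" key is silently dropped.
out = io.StringIO()
lenient = csv.DictWriter(out, fieldnames=fields, extrasaction="ignore")
lenient.writeheader()
lenient.writerow(row)
result = out.getvalue()

print(raised)
print(result)
```

Without extrasaction="ignore", the unwanted columns would have to be deleted from every row dictionary before writing.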
Update
To remove columns from a CSV file in place, we need to:
- Open the input file, read all the rows, close it
- Open it again to write.
Here is the code, which I modified from the above:
import csv
with open("input.csv", newline="") as instream:
    # Set up the input and read every row before the file is closed
    reader = csv.DictReader(instream)
    rows = list(reader)
    # Set up the output fields
    output_fields = reader.fieldnames
    output_fields.remove("Department")
    output_fields.remove("Allocation")
with open("input.csv", "w", newline="") as outstream:
    # Set up the output
    writer = csv.DictWriter(
        outstream,
        fieldnames=output_fields,
        extrasaction="ignore",  # Ignore extra dictionary keys/values
    )
    # Write to the output
    writer.writeheader()
    writer.writerows(rows)
Import csv: remove filename from column names in first row
Try replacing this:
rows[0].replace((file.replace("099_2019_01_01_","")).replace(".csv","")+"-","")
with this in your code:
x=file.replace('099_2019_01_01_','').replace('.csv', '')
rows[0]=[i.replace(x+'-', '') for i in rows[0]]
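Here is a short self-contained sketch of why this works. The file name and the rows are hypothetical and only follow the naming pattern from the question.

```python
# Hypothetical file name and data; only the prefix pattern comes from the question.
file = "099_2019_01_01_sensor.csv"
rows = [["time", "sensor-temp", "sensor-hum"], ["00:00", "21.5", "40"]]

# rows[0] is a list, which has no replace method, and str.replace returns
# a new string rather than changing anything in place. So each header cell
# is rewritten and the resulting list is assigned back to rows[0].
x = file.replace("099_2019_01_01_", "").replace(".csv", "")
rows[0] = [i.replace(x + "-", "") for i in rows[0]]

print(rows[0])
```

Only the header row is touched; the data rows stay as they were.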
How do you delete a column of values in a csv file but not the first item?
Updating the file is going to involve rewriting the whole thing. The code below shows one way of accomplishing this which involves initially writing all the changes into a separate temporary file, and then replacing the original file with it after all the changes have been written to the temporary one.
You can only avoid writing a separate file by reading the entire file into memory, making the changes, and then overwriting the original file with them.
To avoid deleting the column from the header row, it's simply handled separately at the very beginning. The code below illustrates how to do everything:
import csv
import os
from pathlib import Path
from tempfile import NamedTemporaryFile
filepath = Path('file.csv')
with open(filepath, 'r', newline='') as csv_file, \
     NamedTemporaryFile('w', newline='', dir=filepath.parent,
                        delete=False) as tmp_file:
    csv_reader = csv.reader(csv_file)
    csv_writer = csv.writer(tmp_file)
    # First copy the header.
    header = next(csv_reader)
    csv_writer.writerow(header)
    # Copy rows of data leaving out first column.
    for row in csv_reader:
        csv_writer.writerow(row[1:])
# Replace original file with updated version.
os.replace(tmp_file.name, filepath)
print('fini')
How to drop a specific column of csv file while reading it using pandas?
If you know the column names in advance, you can do it by setting the usecols parameter.
When you know which columns to use
Suppose you have a CSV file with columns ['id','name','last_name'] and you want just ['name','last_name']. You can do it as below:
import pandas as pd
df = pd.read_csv("sample.csv", usecols = ['name','last_name'])
When you want the first N columns
If you don't know the column names but want the first N columns of the DataFrame, you can do it with:
import pandas as pd
df = pd.read_csv("sample.csv", usecols = [i for i in range(n)])
Edit
When you know the name of the column to be dropped
import pandas as pd
# Read column names from file
cols = list(pd.read_csv("sample_data.csv", nrows=1))
print(cols)
# Use a list comprehension to leave the unwanted column out of usecols
df = pd.read_csv("sample_data.csv", usecols=[i for i in cols if i != 'name'])
Replacing and deleting columns from a csv using python
@anuj
I think SafeDev's solution is optimal, but if you don't want to use pandas, just make a few small changes to your code.
for row in reader:
    if row:
        if row[7] in delete:
            continue
        elif row[7] in replace:
            key = row[7]
            row[7] = replace[key][0]
            row[10] = replace[key][1]
            result.append(row)
        else:
            result.append(row)
Hope this solves your issue.
Java- CSV / Delete column in csv file
The only way to delete a column in a CSV file is to remove the header and the information of this column in the whole file, that is for each row of the file. Even if you use a third party library it will do this internally.
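The idea is language-independent. As a rough sketch in Python (used here for consistency with the other examples on this page, not Java), the helper below copies every row while skipping one column; drop_column and the sample data are illustrative names and values, not part of any library.

```python
import csv
import io

def drop_column(source, target, column):
    """Copy CSV data from source to target, omitting the named column."""
    reader = csv.reader(source)
    writer = csv.writer(target)
    header = next(reader)
    index = header.index(column)  # raises ValueError if the column is absent
    # The header and every data row have to be rewritten without that index.
    writer.writerow(header[:index] + header[index + 1:])
    for row in reader:
        writer.writerow(row[:index] + row[index + 1:])

# In-memory stand-ins for the input and output files.
source = io.StringIO("Name,Department,Age\nBirla,Welding,49\n")
target = io.StringIO()
drop_column(source, target, "Department")

print(target.getvalue())
```

A third-party CSV library would do the same work internally: read each record, leave out one field, and write the rest.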