Reading Only Specific Columns from a CSV File Out of Many

Reading only specific columns from a CSV file out of many

This code does not yet skip the header row, but it lets you choose which columns to extract. Reading the file with a StreamReader might be much faster.
You will also need a constructor for your object.

using System.Collections.Generic;
using System.IO;

var temp = File.ReadAllLines(@"C:\myFile.csv");
var myExtraction = new List<MyMappedCSVFile>();
foreach (string line in temp)
{
    var delimitedLine = line.Split('\t'); // set your separator, in this case tab

    // keep only the first and fourth columns
    myExtraction.Add(new MyMappedCSVFile(delimitedLine[0], delimitedLine[3]));
}

Code for your Object:

public class MyMappedCSVFile
{
    public string ProfileID { get; set; }
    public string Date { get; set; }

    public MyMappedCSVFile(string profile, string date)
    {
        ProfileID = profile;
        Date = date;
    }
}
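For comparison, here is a rough Python sketch of the same idea that also skips the header row and streams the input instead of loading it all at once. The `MappedRow` class, the `extract_columns` helper, and the sample data are made up for illustration; only the tab separator and the column positions (0 and 3) come from the code above.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class MappedRow:
    # Hypothetical counterpart of MyMappedCSVFile
    profile_id: str
    date: str


def extract_columns(lines, delimiter="\t"):
    """Stream rows, skip the header, keep the first and fourth columns."""
    reader = csv.reader(lines, delimiter=delimiter)
    next(reader, None)  # skip the header row (no error if the input is empty)
    # Rows with fewer than four fields are ignored rather than raising IndexError
    return [MappedRow(row[0], row[3]) for row in reader if len(row) > 3]


# Hypothetical sample data standing in for the real file;
# in practice pass an open file object, e.g. open(path, newline="")
sample = io.StringIO(
    "id\ta\tb\tdate\n"
    "1\tx\ty\t2020-01-01\n"
    "2\tx\ty\t2020-01-02\n"
)
rows = extract_columns(sample)
```

Because `csv.reader` pulls lines lazily from whatever iterable it is given, only one line needs to be in memory at a time while parsing.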

Read specific columns from a csv file with csv module?

The only way you would be getting the last column from this code is if you don't include your print statement in your for loop.

This is most likely the end of your code, with the print statement outside the loop:

for row in reader:
    content = list(row[i] for i in included_cols)
print(content)

You want it to be this, with the print statement inside the loop:

for row in reader:
    content = list(row[i] for i in included_cols)
    print(content)
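Here is a self-contained sketch of that loop in Python 3, in case it helps to see it run end to end. The sample data and the `selected` list are assumptions added for illustration; `included_cols` is the name used above.

```python
import csv
import io

# Hypothetical sample data; in practice pass an open file object instead
data = io.StringIO("id,name,city,zip\n1,Ann,Oslo,0150\n2,Bob,Rome,00100\n")

included_cols = [1, 3]  # zero-based indices of the columns to keep

selected = []
reader = csv.reader(data)
for row in reader:
    content = [row[i] for i in included_cols]
    # Work done inside the loop body runs for every row, header included
    selected.append(content)
```

After the loop, `selected` holds one list per row. If the append sat outside the loop, only the values from the final row would be kept.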

Now that we have covered your mistake, I would like to take this time to introduce you to the pandas module.

Pandas is spectacular for dealing with csv files, and the following code would be all you need to read a csv and save an entire column into a variable:

import pandas as pd
df = pd.read_csv(csv_file)
saved_column = df.column_name #you can also use df['column_name']

So if you wanted to save all of the info in your column Names into a variable, this is all you need to do:

names = df.Names

It's a great module and I suggest you look into it. If your print statement was already inside the for loop and it was still only printing the last column, which shouldn't happen, let me know and I will revisit my assumption. Your posted code has a lot of indentation errors, so it was hard to know what was supposed to be where. Hope this was helpful!
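If you only need a few columns, pandas can also skip the rest while parsing, via the `usecols` parameter of `read_csv`. A minimal sketch, assuming a file with a Names column as in the example above (the other column names and the data are invented):

```python
import io

import pandas as pd

# Hypothetical sample data; in practice pass a file path to read_csv
csv_text = io.StringIO("Names,Age,City\nAnn,31,Oslo\nBob,45,Rome\n")

# Only the listed columns are loaded; Age is never materialized
df = pd.read_csv(csv_text, usecols=["Names", "City"])
names = df["Names"]
```

On wide files this tends to save memory compared with reading everything and selecting columns afterwards.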

How to select specific columns from read_csv which start with specific word?

You can pass a callable to usecols. pandas calls it with each column name and keeps only the columns for which it returns True, so the file is read just once:

df = pd.read_csv('data.csv', usecols=lambda col: col.startswith('A_') or col.startswith('X_'))
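An alternative is to read the file in two passes: first only the header row, to work out which column names match, then the actual data restricted to those columns. A minimal sketch, assuming the same `A_` and `X_` prefixes; the sample data is invented and stands in for `data.csv`:

```python
import io

import pandas as pd

# Hypothetical sample data standing in for data.csv
csv_text = "A_1,B_1,X_1,A_2\n1,2,3,4\n5,6,7,8\n"

# First pass: nrows=0 parses the header only and returns an empty frame
header = pd.read_csv(io.StringIO(csv_text), nrows=0).columns
# str.startswith accepts a tuple of prefixes
wanted = [c for c in header if c.startswith(("A_", "X_"))]

# Second pass: load only the matching columns
df = pd.read_csv(io.StringIO(csv_text), usecols=wanted)
```

The two-pass form is handy when you want the list of matching names up front, for example to fail early if none match.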

Pull out specific columns from multiple CSV files in a directory in Python

The below works:

import glob

import pandas as pd

dfOut = []

for myfile in glob.glob("*.csv"):
    tmp = pd.read_csv(myfile, encoding='latin-1')

    # find the column whose name contains "would recommend"
    matching = [s for s in tmp.columns if "would recommend" in s]
    if len(matching) > 0:
        tmp.rename(columns={matching[0]: 'Recommend'}, inplace=True)
        tmp = tmp[['Subunit', 'Recommend']]
        dfOut.append(tmp)

# stack the per-file frames into one DataFrame
df = pd.concat(dfOut)
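One thing to watch for: `pd.concat` raises a ValueError when given an empty list, which happens if no file has a matching column. Here is a hedged variant wrapped in a function that guards against that case. The `collect_recommend` name and the demo files are assumptions for illustration; the column names come from the code above.

```python
import glob
import os
import tempfile

import pandas as pd


def collect_recommend(directory):
    """Gather Subunit and the 'would recommend' column from every CSV in directory."""
    frames = []
    for path in sorted(glob.glob(os.path.join(directory, "*.csv"))):
        tmp = pd.read_csv(path, encoding="latin-1")
        matching = [c for c in tmp.columns if "would recommend" in c]
        # Skip files that lack either of the two needed columns
        if matching and "Subunit" in tmp.columns:
            tmp = tmp.rename(columns={matching[0]: "Recommend"})
            frames.append(tmp[["Subunit", "Recommend"]])
    # Guard: pd.concat fails on an empty list
    if not frames:
        return pd.DataFrame(columns=["Subunit", "Recommend"])
    return pd.concat(frames, ignore_index=True)


# Hypothetical demo files written to a temporary directory
with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "a.csv"), "w") as f:
        f.write("Subunit,I would recommend this,Other\nX,5,1\n")
    with open(os.path.join(d, "b.csv"), "w") as f:
        f.write("Subunit,Other\nY,2\n")
    result = collect_recommend(d)
```

In the demo, `b.csv` has no matching column and is skipped, so only the row from `a.csv` ends up in `result`.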

