How to remove Empty Cell from data frame row wise
Thanks to @Perennial for his suggestions above. Finally, I did it as follows.
import pandas as pd

new_lines = []
with open('data.csv', 'r') as csv:
    # skip the header line
    csv.readline()
    for line in csv.readlines():
        words = line.strip().split(',')
        # drop empty or whitespace-only cells
        new_words = [w for w in words if w and w.strip()]
        # skip lines that are now empty
        if len(new_words) != 0:
            new_lines.append(','.join(new_words))

df = pd.DataFrame(new_lines)
df.to_csv('results.csv', sep=',')
@Scott's solution is elegant, but for some reason it always throws a MemoryError in my case.
One more thing: I do not want the row numbers in the resulting file. If anyone can help with that, great; for now I delete that column using Excel :)
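For reference, pandas can skip writing the row index entirely via the index parameter of to_csv, which avoids the manual Excel cleanup; a minimal sketch with made-up data:

```python
import pandas as pd

# By default, to_csv writes the row index as an unnamed first column;
# index=False omits it.
df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
with_index = df.to_csv()                 # header starts with ',a,b'
without_index = df.to_csv(index=False)   # header starts with 'a,b'
print(without_index)
```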
How do I completely delete a row with an empty cell in R?
@Rui Barradas, your suggestion solved the problem.
textparcali <- textparcali[textparcali$word != "", ]
The same problem already existed as another issue, but a different solution was offered there, and it also worked.
Python Pandas remove empty cells in dataframe
The columns form a MultiIndex, so it is necessary to flatten the column names first:
a = pd.concat([ask, bid], axis=1, keys=['RateAsk', 'RateBid'])
a.columns = a.columns.map('_'.join)
Then use boolean indexing, filtering for the rows where the column RateAsk_open is non-empty and not NaN:
a = a[(a['RateAsk_open'] != '') & a['RateAsk_open'].notnull()]
But if you want to drop only the rows where all elements are missing:
a = a.dropna(how='all')
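To see the flatten-then-filter flow end to end, here is a small sketch with made-up ask/bid frames (only the RateAsk_open name comes from the answer; the data is invented):

```python
import numpy as np
import pandas as pd

# Toy ask/bid frames; the 'open' column and its values are assumptions.
ask = pd.DataFrame({'open': ['1.1', '', np.nan, '1.4']})
bid = pd.DataFrame({'open': ['1.0', '0.9', '1.2', np.nan]})

# concat with keys creates MultiIndex columns ('RateAsk', 'open'), ...
a = pd.concat([ask, bid], axis=1, keys=['RateAsk', 'RateBid'])
# ... which '_'.join flattens into 'RateAsk_open', 'RateBid_open'
a.columns = a.columns.map('_'.join)

# keep rows where RateAsk_open is neither the empty string nor NaN
filtered = a[(a['RateAsk_open'] != '') & a['RateAsk_open'].notnull()]
print(filtered)
```

Rows 1 (empty string) and 2 (NaN) are dropped; both conditions are needed because `!= ''` alone evaluates True for NaN.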
Pandas - How to exclude blank cells when returning a row
You can chain a filter on the row:
print(safety.loc[safety['Drug_name'] == searchterm][lambda x: x != ''])
Or if you only need to drop NA, use dropna:
print(safety.loc[safety['Drug_name'] == searchterm].dropna())
Using a chained filter, you can also remove both empty strings and NA:
print(safety.loc[safety['Drug_name'] == searchterm][lambda x: (x != '') & x.notnull()])
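One detail worth knowing: indexing a DataFrame with a boolean mask like `[lambda x: x != '']` does not drop the blank cells, it replaces them with NaN, so a trailing dropna is what actually removes them. A small sketch with an invented safety table (only Drug_name comes from the question; the other columns are made up):

```python
import pandas as pd

# Hypothetical safety table; effect1/effect2 are illustrative column names.
safety = pd.DataFrame({
    'Drug_name': ['aspirin', 'ibuprofen'],
    'effect1': ['nausea', ''],
    'effect2': ['', 'rash'],
})
searchterm = 'aspirin'

# The boolean-mask step turns cells that are '' into NaN ...
row = safety.loc[safety['Drug_name'] == searchterm][lambda x: x != '']
# ... and dropna(axis=1) then drops the columns that became NaN.
cleaned = row.dropna(axis=1)
print(cleaned)
```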
Move non-empty cells to the left in pandas DataFrame
Here's what I did:
I stacked your dataframe into a longer format, then grouped by the Name column. Within each group I drop the NaNs, then reindex to the full h1 through h4 set, thereby re-creating your NaNs on the right.
from io import StringIO
import pandas

def defragment(x):
    # keep the non-NaN values and left-align them against the column set;
    # the later reindex re-creates the NaNs on the right
    values = x.dropna().values
    return pandas.Series(values, index=df.columns[:len(values)])

datastring = StringIO("""\
Name h1 h2 h3 h4
A 1 nan 2 3
B nan nan 1 3
C 1 3 2 nan""")

df = pandas.read_table(datastring, sep=r'\s+').set_index('Name')
long_index = pandas.MultiIndex.from_product([df.index, df.columns])

print(
    df.stack()
      .groupby(level='Name')
      .apply(defragment)
      .reindex(long_index)
      .unstack()
)
And so I get:
h1 h2 h3 h4
A 1 2 3 NaN
B 1 3 NaN NaN
C 1 3 2 NaN
How to replace empty cells with 0 and change strings to integers where possible in a pandas dataframe?
You are not saving your changes inside the function: assign the replacement back to the column and return the dataframe.
import numpy as np

def recode_empty_cells(dataframe, list_of_columns):
    for column in list_of_columns:
        # turn whitespace-only cells into NaN, then fill NaN with 0
        dataframe[column] = dataframe[column].replace(r'\s+', np.nan, regex=True)
        dataframe[column] = dataframe[column].fillna(0)
    return dataframe
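A quick way to exercise the fix on a tiny made-up frame (the function is repeated so the snippet runs standalone; the column names and values are invented):

```python
import numpy as np
import pandas as pd

def recode_empty_cells(dataframe, list_of_columns):
    for column in list_of_columns:
        # whitespace-only cells -> NaN, then NaN -> 0
        dataframe[column] = dataframe[column].replace(r'\s+', np.nan, regex=True)
        dataframe[column] = dataframe[column].fillna(0)
    return dataframe

df = pd.DataFrame({'a': ['1', ' ', np.nan], 'b': ['x', 'y', '  ']})
df = recode_empty_cells(df, ['a', 'b'])
print(df)
```

Both the blank string cells and the original NaN end up as 0, while real values are untouched.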
Remove Multiple Empty Columns for String
One option using base R apply is to first calculate the number of columns that will be present in the final dataframe (cols), then filter the empty values from each row and pad each row back to that width with empty values using rep:
cols <- max(rowSums(df != ""))
as.data.frame(t(apply(df, 1, function(x) {
  vals <- x[x != ""]
  c(vals, rep("", cols - length(vals)))
})))
# V1 V2 V3
#1 aaa ccc
#2 aaa bbb
#3 bbb ccc ddd
Another option with gather/spread would be to add a new column for the row number, convert the data to long format using gather, filter out the empty values, group_by each row, create new column names using paste0, and finally convert back to wide format using spread:
library(dplyr)
library(tidyr)
df %>%
  mutate(row = row_number()) %>%
  gather(key, value, -row) %>%
  filter(value != "") %>%
  group_by(row) %>%
  mutate(key = paste0("new", row_number())) %>%
  spread(key, value, fill = "") %>%
  ungroup() %>%
  select(-row)
# new1 new2 new3
# <chr> <chr> <chr>
#1 aaa ccc ""
#2 aaa bbb ""
#3 bbb ccc ddd
Find empty cells in rows of a column | Dataframe pandas
If "empty" means a missing value (NaN) or None, use Series.isna:
cond = df['Teacher'].isna()
If "empty" means zero or more spaces, use Series.str.contains:
cond = df['Teacher'].str.contains(r'^\s*$', na=False)
If "empty" means the empty string, compare against it directly:
cond = df['Teacher'] == ''
import numpy as np
import pandas as pd

df = pd.DataFrame({'Teacher': ['', ' ', None, np.nan, 'Richard']})
cond1 = df['Teacher'].isna()
cond2 = df['Teacher'].str.contains(r'^\s*$', na=False)
cond3 = df['Teacher'] == ''

df = df.assign(cond1=cond1, cond2=cond2, cond3=cond3)
print(df)
   Teacher  cond1  cond2  cond3
0           False   True   True
1           False   True  False
2     None   True  False  False
3      NaN   True  False  False
4  Richard  False  False  False
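If the goal is to drop every kind of "empty" at once, the conditions above can be combined; note that the whitespace pattern `^\s*$` already matches the empty string, so two of the three checks suffice. A minimal sketch on the same data:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Teacher': ['', '  ', None, np.nan, 'Richard']})

# Treat NaN/None, whitespace-only, and '' all as empty; str.contains
# with na=False returns False for the missing values, which isna covers.
empty = df['Teacher'].isna() | df['Teacher'].str.contains(r'^\s*$', na=False)
print(df[~empty])
```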