To Find Whether a Column Exists in Data Frame or Not

How to check if a column exists in Pandas

This will work:

if 'A' in df:

But for clarity, I'd probably write it as:

if 'A' in df.columns:
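
A minimal runnable sketch of both forms follows. The DataFrame and its column names ('A', 'B') are made up for illustration; any DataFrame should behave the same way.

```python
import pandas as pd

# Hypothetical example data, not taken from the original question.
df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# `in` on a DataFrame tests column labels, not cell values.
has_a = "A" in df
has_a_explicit = "A" in df.columns
has_z = "Z" in df.columns

print(has_a, has_a_explicit, has_z)
```

Both spellings give the same answer; the `df.columns` form just makes it obvious to a reader that labels are being tested rather than values.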

To find whether a column exists in data frame or not

Assuming that the name of your data frame is dat and that your column name to check is "d", you can use the %in% operator:

if ("d" %in% colnames(dat)) {
  cat("Yep, it's in there!\n")
}

To check if a few values in a dataframe column exist in another dataframe column

You can compare columns:

print(df1['one'].isin(df2['one']))
0    True
1    True
2    True
3    True
Name: one, dtype: bool

Or flatten all values of df2 into a 1-D array, convert it to a list, and test every cell of df1 against it:

print(df1.isin(df2.to_numpy().ravel().tolist()))
    one
0  True
1  True
2  True
3  True
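
To see the comparison produce a mixed result, here is a small self-contained sketch. The contents of df1 and df2 are assumptions chosen so that one value is missing; they are not the data behind the output above.

```python
import pandas as pd

# Hypothetical data: the value 7 exists only in df1.
df1 = pd.DataFrame({"one": [1, 2, 3, 7]})
df2 = pd.DataFrame({"one": [1, 2, 3, 4]})

# Row-wise membership: is each df1 value present anywhere in df2['one']?
mask = df1["one"].isin(df2["one"])

# Values of df1['one'] that do not occur in df2['one'].
missing = df1.loc[~mask, "one"].tolist()

print(mask.tolist(), missing)
```

Negating the mask with `~` is a convenient way to pull out the rows that have no match.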

Pandas: Check if column exists in df from a list of columns

Here is how I would approach it:

import numpy as np

for col in column_list:
    if col not in df.columns:
        df[col] = np.nan  # add the missing column, filled with NaN
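
As an alternative to the loop, the same effect can probably be had in one step with `DataFrame.reindex`, which fills labels it does not find with NaN. This is only a sketch: the frame and the contents of `column_list` below are invented for the example.

```python
import pandas as pd

# Hypothetical inputs; 'b' already exists, 'c' and 'd' do not.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
column_list = ["b", "c", "d"]

# Keep existing columns in order, then append only the missing ones.
missing = [col for col in column_list if col not in df.columns]
out = df.reindex(columns=[*df.columns, *missing])

print(out)
```

Unlike the loop, `reindex` returns a new DataFrame and leaves `df` untouched, which may or may not be what you want.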

How to check if a column exists in a matrix or data frame?

I hope you realise that columns "Gender" or "Age" either do or don't exist for all rows in the data frame?

An easy way to check is to take the names of the data frame and compare the columns you are interested in against those names to see if they are included in that set. For example, some data as per your question:

df <- data.frame(Name = "Ben", Age = 12, Address = "CA", ContactNo = 1234567)

Note the names attribute for the data frame df:

names(df)

> names(df)
[1] "Name" "Age" "Address" "ContactNo"

Then you can check to see if the variables of interest are in the set of variables in the data frame:

c("Gender", "Age") %in% names(df)

> c("Gender", "Age") %in% names(df)
[1] FALSE TRUE

For a matrix, you need the colnames attribute, accessed via the colnames() extractor function, instead of the names attribute and names().
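
For readers working in pandas rather than R, a rough analogue of the vectorised `%in%` check might look like the sketch below. The data mirrors the example above; the variable names `wanted`, `present` and `absent` are illustrative only.

```python
import pandas as pd

# Same example record as above, rebuilt as a pandas DataFrame.
df = pd.DataFrame(
    {"Name": ["Ben"], "Age": [12], "Address": ["CA"], "ContactNo": [1234567]}
)

wanted = ["Gender", "Age"]

# One boolean per requested name, in the order given.
present = pd.Index(wanted).isin(df.columns)

# Names that are requested but absent from the frame.
absent = [name for name in wanted if name not in df.columns]

print(present.tolist(), absent)
```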

Check if a column exists and if not add it

You can use this dummy data frame df and the vector colToAdd of column names to try out adding columns only when they do not already exist:

df <- data.frame(A = rnorm(5) , B = rnorm(5) , C = rnorm(5))

colToAdd <- c("B" , "D")

Then apply the check: if the column already exists, NULL is produced; otherwise the new column is generated, e.g. with rnorm(5):

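# simplify = FALSE keeps the result a named list even when every column is missing
add <- sapply(colToAdd , \(x) if(!(x %in% colnames(df))) rnorm(5) , simplify = FALSE)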

data.frame(do.call(cbind , c(df , add)))

Output:

           A          B          C          D
1  1.5681665 -0.1767517  0.6658019 -0.8477818
2 -0.5814281 -1.0720196  0.5343765 -0.8259426
3 -0.5649507 -1.1552189 -0.8525945  1.0447395
4  1.2024881 -0.6584889 -0.1551638  0.5726059
5  0.7927576  0.5340098 -0.5139548 -0.7805733

Find whether a column exists in R data frame and if not then create it

If you have a vector of names that you know should be in the data frame, the following checks whether each name already has a column. If not, it creates one with value 0.

x <- c( "qc1","qc2","itv1","itv2", "no" )
d <- data.frame( no = 123, qc6 = 12, qc5 = 12, qc3 = 14, itv6 = 8, itv5 = 9, itv3 = 9)

d[x[!(x %in% colnames(d))]] = 0
d

This gives the output:

 no  qc6 qc5 qc3 itv6 itv5 itv3 qc1 qc2 itv1 itv2
123   12  12  14    8    9    9   0   0    0    0

How to see if column name exists in a dataframe, and if not create the column with a default value?

Check whether the column name is in df.columns. If it is not, create the column with the default value 10:

if "Col_4" in df.columns:
    print("Col_4 exists")
else:
    df["Col_4"] = 10
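
When several columns each need their own default, the same check can be driven from a dictionary. This is a sketch under assumptions: the frame, the column names other than Col_4, and the `defaults` mapping are all made up for the example.

```python
import pandas as pd

# Hypothetical frame: Col_2 exists already, Col_4 does not.
df = pd.DataFrame({"Col_1": [1, 2], "Col_2": [3, 4]})

# Illustrative mapping of column name -> default value.
defaults = {"Col_2": 0, "Col_4": 10}

for name, value in defaults.items():
    if name not in df.columns:
        # Scalar assignment broadcasts the default to every row.
        df[name] = value

print(df)
```

Existing columns are left alone, so Col_2 keeps its original values and only Col_4 is created.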

