How to check if a column exists in Pandas
This will work:
if 'A' in df:
But for clarity, I'd probably write it as:
if 'A' in df.columns:
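As a minimal sketch of the two checks above, assuming a small made-up frame (the data and the `required` set are illustrative only, not from the question): membership on a DataFrame tests column labels, not cell values, and a set difference reports several missing columns at once.

```python
import pandas as pd

# Hypothetical example data
df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

has_a = "A" in df                    # membership on the frame checks column labels
has_a_explicit = "A" in df.columns   # same check, stated explicitly
has_z = "Z" in df.columns            # a label that is not present

# `in` looks at labels only: the value 1 is in column A, but 1 is not a column name
value_is_label = 1 in df

# Checking several names at once with a set difference
required = {"A", "B", "C"}
missing = required - set(df.columns)

print(has_a, has_a_explicit, has_z, value_is_label, missing)
```

The explicit `df.columns` form makes it obvious to a reader that labels, not values, are being tested.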
To find whether a column exists in data frame or not
Assuming that the name of your data frame is dat and that the column name to check is "d", you can use the %in% operator:
if ("d" %in% colnames(dat)) {
  cat("Yep, it's in there!\n")
}
To check whether values in one DataFrame column exist in another DataFrame column
You can compare columns:
print(df1['one'].isin(df2['one']))
0 True
1 True
2 True
3 True
Name: one, dtype: bool
Or convert the values of the DataFrame to a 1-D array and then to a list:
print(df1.isin(df2.to_numpy().ravel().tolist()))
one
0 True
1 True
2 True
3 True
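For a self-contained illustration of the column comparison, here is a sketch with made-up data (the values in `df1` and `df2` are assumptions, chosen so that one value is not found). It also shows how the boolean mask could be used to list the values that have no match.

```python
import pandas as pd

# Hypothetical frames; 9 appears only in df1
df1 = pd.DataFrame({"one": [1, 2, 3, 9]})
df2 = pd.DataFrame({"one": [3, 2, 1, 5]})

# True where the df1 value occurs anywhere in df2["one"] (row position is irrelevant)
mask = df1["one"].isin(df2["one"])

all_present = bool(mask.all())               # are all df1 values found in df2?
not_found = df1.loc[~mask, "one"].tolist()   # values of df1 missing from df2

print(mask.tolist(), all_present, not_found)
```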
Pandas: Check if column exists in df from a list of columns
Here is how I would approach it:
import numpy as np
# Add each missing column, filled with NaN
for col in column_list:
    if col not in df.columns:
        df[col] = np.nan
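A possible variant of the same idea, sketched with made-up data (the frame contents and the names in `column_list` are assumptions): collect the missing names first, then add them all in one `assign` call instead of mutating the frame inside a loop. This relies on the column names being strings, since they are passed as keyword arguments.

```python
import numpy as np
import pandas as pd

# Hypothetical frame and list of expected columns
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
column_list = ["a", "c", "d"]

# Names from the list that the frame does not have yet
missing = [col for col in column_list if col not in df.columns]

# Add every missing column at once, filled with NaN (string names only)
df = df.assign(**{col: np.nan for col in missing})

print(list(df.columns))
```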
How to check if a column exists in a matrix or data frame?
I hope you realise that columns "Gender" or "Age" either do or don't exist for all rows in the data frame?
An easy way to check is to take the names of the data frame and compare the columns you are interested in with those names to see whether they are included in that set. For example, some data as per your question:
df <- data.frame(Name = "Ben", Age = 12, Address = "CA", ContactNo = 1234567)
Note the names attribute for the data frame df:
names(df)
> names(df)
[1] "Name" "Age" "Address" "ContactNo"
Then you can check to see if the variables of interest are in the set of variables in the data frame:
c("Gender", "Age") %in% names(df)
> c("Gender", "Age") %in% names(df)
[1] FALSE TRUE
For a matrix, you need the colnames attribute, accessed via the colnames() extractor function, instead of the names attribute and names().
Check if a column exists and if not add it
You can use this dummy data frame df and the vector colToAdd, which lists the columns to add if they do not already exist:
df <- data.frame(A = rnorm(5) , B = rnorm(5) , C = rnorm(5))
colToAdd <- c("B" , "D")
Then apply the check: if the column already exists, NULL is produced; otherwise the new column is generated, e.g. with rnorm(5):
add <- sapply(colToAdd , \(x) if(!(x %in% colnames(df))) rnorm(5))
data.frame(do.call(cbind , c(df , add)))
- output
A B C D
1 1.5681665 -0.1767517 0.6658019 -0.8477818
2 -0.5814281 -1.0720196 0.5343765 -0.8259426
3 -0.5649507 -1.1552189 -0.8525945 1.0447395
4 1.2024881 -0.6584889 -0.1551638 0.5726059
5 0.7927576 0.5340098 -0.5139548 -0.7805733
find whether a column exists in R data frame and if not then create it
If you have a vector of names that you know should be in the data frame, the following checks whether each one already has a column. If not, it creates one with value 0.
x <- c( "qc1","qc2","itv1","itv2", "no" )
d <- data.frame( no = 123, qc6 = 12, qc5 = 12, qc3 = 14, itv6 = 8, itv5 = 9, itv3 = 9)
d[x[!(x %in% colnames(d))]] = 0
d
This gives the output:
no qc6 qc5 qc3 itv6 itv5 itv3 qc1 qc2 itv1 itv2
123 12 12 14 8 9 9 0 0 0 0
How to see if column name exists in a dataframe, and if not create the column with a default value?
Check whether the column name is in df.columns. If it is not, create the column with a default value of 10:
if "Col_4" in df.columns:
    print("Col_4 exists")
else:
    # Column is missing: create it with the default value
    df["Col_4"] = 10
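If the check is needed in several places, it could be wrapped in a small helper. The sketch below is one way to do that; `ensure_column` is a hypothetical name, not a pandas function, and the example frame is made up.

```python
import pandas as pd

def ensure_column(frame, name, default):
    """Return `frame` unchanged if `name` is a column; otherwise return a copy
    with `name` added and filled with `default`. (Hypothetical helper.)"""
    if name in frame.columns:
        return frame
    out = frame.copy()
    out[name] = default  # a scalar default is broadcast to every row
    return out

# Hypothetical data
df = pd.DataFrame({"Col_1": [1, 2, 3]})
df = ensure_column(df, "Col_4", 10)  # Col_4 is missing, so it is created
df = ensure_column(df, "Col_1", 0)   # Col_1 exists, so its values are kept

print(df)
```

Returning a copy rather than modifying the argument is a design choice here; assigning in place, as in the answer above, works just as well for a one-off script.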