Changing from Upper to Lower Case in Several Data Frames

Convert whole dataframe from lower case to upper case with Pandas

astype(str) casts each column (a Series) to strings, and the .str accessor then applies upper() to every value. Note that after this, the dtype of every column is object, including columns that were numeric before.

In [17]: df
Out[17]:
     regiment  company  deaths  battles  size
0  Nighthawks      1st     kkk        5     l
1  Nighthawks      1st      52       42    ll
2  Nighthawks      2nd      25        2     l
3  Nighthawks      2nd     616        2     m

In [18]: df.apply(lambda x: x.astype(str).str.upper())
Out[18]:
     regiment  company  deaths  battles  size
0  NIGHTHAWKS      1ST     KKK        5     L
1  NIGHTHAWKS      1ST      52       42    LL
2  NIGHTHAWKS      2ND      25        2     L
3  NIGHTHAWKS      2ND     616        2     M

You can later convert the 'battles' column to numeric again, using to_numeric():

In [42]: df2 = df.apply(lambda x: x.astype(str).str.upper())

In [43]: df2['battles'] = pd.to_numeric(df2['battles'])

In [44]: df2
Out[44]:
     regiment  company  deaths  battles  size
0  NIGHTHAWKS      1ST     KKK        5     L
1  NIGHTHAWKS      1ST      52       42    LL
2  NIGHTHAWKS      2ND      25        2     L
3  NIGHTHAWKS      2ND     616        2     M

In [45]: df2.dtypes
Out[45]:
regiment    object
company     object
deaths      object
battles      int64
size        object
dtype: object
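
If the numeric columns should keep their dtype in the first place, one option is to upper-case only the values that are actually Python strings. The following is a minimal sketch, not the method from the answer above; the sample frame and the helper name upper_strings are made up for illustration.

```python
import pandas as pd

# Hypothetical sample frame, loosely modelled on the one above.
df = pd.DataFrame({
    "regiment": ["Nighthawks", "Nighthawks"],
    "company": ["1st", "2nd"],
    "battles": [5, 42],
})


def upper_strings(frame):
    """Return a new frame with str values upper-cased; other values are untouched."""
    return frame.apply(
        lambda col: col.map(lambda v: v.upper() if isinstance(v, str) else v)
    )


result = upper_strings(df)
print(result)
print(result.dtypes)
```

Because the integers are passed through unchanged, no to_numeric() round trip should be needed afterwards. The trade-off is that the per-value check is slower than the vectorized .str methods on large frames.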

Changing from upper to lower case in several data frames

Since you wish to keep all of your data frames in the global environment, this is a situation in which I would prefer a for loop. The loop operates on the global environment directly, whereas lapply returns a new list that you would then have to assign back yourself.

dfList <- c("df1", "df2", "df3", "df4")
for (i in dfList) {
  tmp <- get(i)
  assign(i, setNames(tmp, tolower(names(tmp))))
}

How do I set column names to lower case for multiple dataframes?

The following should work:

dfList <- lapply(lapply(dfs,get),function(x) {colnames(x) <- tolower(colnames(x));x})

Problems like this generally stem from the fact that you haven't placed all your data frames in a single data structure, and then are forced to use something awkward, like get.

Note that in my code, I use lapply and get to create a single list of data frames first, and then alter their colnames.

You should also be aware that your lowercols function is rather un-R-like. R functions generally aren't written to return nothing and act only through side effects. If you try to write functions that way (which is possible), you will probably make your life difficult and run into scoping issues. Note that in my second lapply I explicitly return the modified data frame.
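
For readers working in pandas rather than R, the same "keep the frames in one container" idea can be sketched as follows. This is an illustration under assumptions, not a translation of the R answer; the dictionary frames and its sample columns are hypothetical.

```python
import pandas as pd

# Hypothetical frames kept together in one dict instead of as loose variables.
frames = {
    "df1": pd.DataFrame({"ID": [1, 2], "Name": ["a", "b"]}),
    "df2": pd.DataFrame({"ID": [3], "Score": [0.5]}),
}

# rename() accepts a callable and returns a new frame, so nothing is
# modified through side effects; the dict is simply rebuilt.
frames = {name: frame.rename(columns=str.lower) for name, frame in frames.items()}

for name, frame in frames.items():
    print(name, list(frame.columns))
```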

Convert from lowercase to uppercase all values in all character variables in dataframe

Starting with the following sample data:

df <- data.frame(v1=letters[1:5],v2=1:5,v3=letters[10:14],stringsAsFactors=FALSE)

  v1 v2 v3
1  a  1  j
2  b  2  k
3  c  3  l
4  d  4  m
5  e  5  n

You can use:

data.frame(lapply(df, function(v) {
  if (is.character(v)) return(toupper(v))
  else return(v)
}))

Which gives:

  v1 v2 v3
1  A  1  J
2  B  2  K
3  C  3  L
4  D  4  M
5  E  5  N

How can I make pandas dataframe column headers all lowercase?

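You can do it like this (on Python 3, wrap map in list(), because map returns an iterator, which older pandas versions reject as column labels):

data.columns = list(map(str.lower, data.columns))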

or

data.columns = [x.lower() for x in data.columns]

example:

>>> data = pd.DataFrame({'A':range(3), 'B':range(3,0,-1), 'C':list('abc')})
>>> data
   A  B  C
0  0  3  a
1  1  2  b
2  2  1  c
>>> data.columns = list(map(str.lower, data.columns))
>>> data
   a  b  c
0  0  3  a
1  1  2  b
2  2  1  c
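
Two further spellings that stay within pandas are the vectorized .str accessor on the column index and rename() with a callable. The sketch below reuses the sample frame from the example; the variable name renamed is only for illustration.

```python
import pandas as pd

data = pd.DataFrame({"A": range(3), "B": range(3, 0, -1), "C": list("abc")})

# Option 1: the column Index has the same .str accessor as a Series.
data.columns = data.columns.str.lower()
print(list(data.columns))

# Option 2: rename() takes a callable and returns a new frame,
# leaving the original untouched.
renamed = data.rename(columns=str.upper)
print(list(renamed.columns))
```

The rename() form is convenient in method chains because it does not assign to data.columns in place.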

Convert column values to lower case only if they are string

The test in your lambda function isn't quite right, though you weren't far off:

df.apply(lambda x: x.str.lower() if(x.dtype == 'object') else x)

With the data frame and output:

>>> df = pd.DataFrame(
...     [
...         {'OS': 'Microsoft Windows', 'Count': 3},
...         {'OS': 'Mac OS X', 'Count': 4},
...         {'OS': 'Linux', 'Count': 234},
...         {'OS': 'Dont have a preference', 'Count': 0},
...         {'OS': 'I prefer Windows and Unix', 'Count': 3},
...         {'OS': 'Unix', 'Count': 2},
...         {'OS': 'VMS', 'Count': 1},
...         {'OS': 'DOS or ZX Spectrum', 'Count': 2},
...     ]
... )
>>> df = df.apply(lambda x: x.str.lower() if x.dtype=='object' else x)
>>> print(df)
                          OS  Count
0          microsoft windows      3
1                   mac os x      4
2                      linux    234
3     dont have a preference      0
4  i prefer windows and unix      3
5                       unix      2
6                        vms      1
7         dos or zx spectrum      2
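
One caveat, assuming the column really is of object dtype: such a column can also hold non-string values, and .str.lower() turns those into NaN. The sketch below uses a made-up mixed Series to show the effect and one possible way to fall back to the original values.

```python
import pandas as pd

# Hypothetical object column that mixes strings and a number.
s = pd.Series(["Linux", 7, "VMS"], dtype=object)

# The non-string entry cannot be lower-cased and comes back as NaN.
lowered = s.str.lower()

# Keep the lower-cased value where there is one, else the original value.
fixed = lowered.where(lowered.notna(), s)

print(lowered.tolist())
print(fixed.tolist())
```

Values that were already missing in the original stay missing, since the fallback simply copies them back.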

How to lowercase a pandas dataframe string column if it has missing values?

Use pandas' vectorized string methods. As the documentation puts it:

these methods exclude missing/NA values automatically

.str.lower() is the very first example there:

>>> df['x'].str.lower()
0    one
1    two
2    NaN
Name: x, dtype: object
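
A self-contained version of this example might look as follows. The input frame is an assumption, since the original question's data is not shown here.

```python
import numpy as np
import pandas as pd

# Assumed input: a string column with one missing value.
df = pd.DataFrame({"x": ["ONE", "Two", np.nan]})

# The missing entry is skipped rather than raising an error.
lowered = df["x"].str.lower()
print(lowered)
```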

