Read Column Names as Date Format

Looking to format column names to dates after reading in a CSV

Using as.yearmon from the zoo package, you can try:

names(df) <- zoo::as.yearmon(names(df), 'X%Y.%m')

Or in base R, pasting an arbitrary day onto each name so it can be parsed as a full date:

names(df) <- format(as.Date(paste0(names(df), '.01'), 'X%Y.%m.%d'), '%b-%Y')

As an example:

x <- c('X2017.04', 'X2017.05', 'X2017.06')
format(as.Date(paste0(x, '.01'), 'X%Y.%m.%d'), '%b-%Y')
#[1] "Apr-2017" "May-2017" "Jun-2017"

How to convert data frame headers from excel values to date format

Here is a literal answer to your question, but I don't recommend it.

Table1 <- data.frame("43220"=c(1,7), "43251"=c(2,8), check.names = FALSE)
Table1
#   43220 43251
# 1     1     2
# 2     7     8
as.Date(as.integer(names(Table1)), origin = "1899-12-30")
# [1] "2018-04-30" "2018-05-31"
format(as.Date(as.integer(names(Table1)), origin = "1899-12-30"), format="%b %y")
# [1] "Apr 18" "May 18"
names(Table1) <- format(as.Date(as.integer(names(Table1)), origin = "1899-12-30"), format="%b %y")
Table1
#   Apr 18 May 18
# 1      1      2
# 2      7      8

In general, I discourage keeping "data-like things" as column names unless the data is being shown in a report or rendering. It is often (but not always) easier to work with data in a "long" format, where the date is a value in a column rather than a column name:

Table1 <- data.frame("43220"=c(1,7), "43251"=c(2,8), check.names = FALSE)
Table1long <- tidyr::pivot_longer(Table1, everything(), names_to = "date")
Table1long
# # A tibble: 4 x 2
#   date  value
#   <chr> <dbl>
# 1 43220     1
# 2 43251     2
# 3 43220     7
# 4 43251     8
Table1long$date <- as.Date(as.integer(Table1long$date), origin = "1899-12-30")
Table1long
# # A tibble: 4 x 2
#   date       value
#   <date>     <dbl>
# 1 2018-04-30     1
# 2 2018-05-31     2
# 3 2018-04-30     7
# 4 2018-05-31     8

The date column now holds actual Date values, which behave like numbers, so things like addition and differencing work naturally. Once you are done munging and calculating on the data and are ready for reports, you can format it with format(., format="%b %y") or similar.

An alternative to tidyr is reshape2:

reshape2::melt(Table1, -seq_len(ncol(Table1)), variable.name = "date")
#    date value
# 1 43220     1
# 2 43220     7
# 3 43251     2
# 4 43251     8
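
For readers working in pandas rather than R, the same wide-to-long reshape and Excel serial conversion might be sketched as follows. This is only an illustration under assumptions: the frame table1 mirrors the Table1 example above, the names long, "date" and "value" are chosen here, and the origin "1899-12-30" is the same one used in the R code.

```python
import pandas as pd

# Wide frame whose headers are Excel serial dates stored as strings
table1 = pd.DataFrame({"43220": [1, 7], "43251": [2, 8]})

# Reshape to long format: one row per (header, value) pair
long = table1.melt(var_name="date", value_name="value")

# Interpret each serial as a number of days since the Excel origin
long["date"] = pd.to_datetime(
    long["date"].astype(int), unit="D", origin="1899-12-30"
)

print(long)
```

As in the R version, formatting for display (for example with long["date"].dt.strftime(...)) can wait until the calculations are done.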

Replace column headers date, into year and month, without changing remaining column headers

A clean way to do this is to check the type of each element of the columns Index and format it to the desired string if it is a timestamp.

Example:

import pandas as pd

# simplified example for dataframe columns
cols = [              'Key',    'COUNTRY',            'MARKET',
                 'A1',               'CAB',           'type',
   pd.Timestamp("2021-06-01 00:00:00"), pd.Timestamp("2021-07-01 00:00:00")]

cols_new = [k.strftime("%Y_%B_%d") if isinstance(k, pd.Timestamp) else k for k in cols]

print(cols_new)
# ['Key', 'COUNTRY', 'MARKET', 'A1', 'CAB', 'type', '2021_June_01', '2021_July_01']

# update in one line:
# df.columns = [k.strftime("%Y_%B_%d") if isinstance(k, pd.Timestamp) else k for k in df.columns]
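
The same idea applied to an actual DataFrame could look like the sketch below. The frame and the helper year_month_label are hypothetical, made up for illustration. The format "%Y_%m" is used here because it does not depend on the locale, whereas "%B" prints the month name in the current locale.

```python
import pandas as pd

# Hypothetical frame whose headers mix plain strings and timestamps
df = pd.DataFrame(
    [[1, "DE", 10, 20]],
    columns=["Key", "COUNTRY",
             pd.Timestamp("2021-06-01"), pd.Timestamp("2021-07-01")],
)

def year_month_label(col):
    # Format timestamp headers as YYYY_MM; return anything else unchanged
    return col.strftime("%Y_%m") if isinstance(col, pd.Timestamp) else col

df.columns = [year_month_label(c) for c in df.columns]
print(list(df.columns))
# ['Key', 'COUNTRY', '2021_06', '2021_07']
```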

Change dataframe column names from string format to datetime

Selecting with loc does not change the column values themselves, so you get a KeyError.

You need to assign the converted output back to columns:

df.columns = pd.to_datetime(df.columns)

Sample:

import numpy as np
import pandas as pd

cols = ['2000-01-01', '2000-02-01', '2000-03-01', '2000-04-01', '2000-05-01']
vals = np.arange(5)
df = pd.DataFrame(columns = cols, data=[vals])
print (df)
   2000-01-01  2000-02-01  2000-03-01  2000-04-01  2000-05-01
0           0           1           2           3           4

print (df.columns)
Index(['2000-01-01', '2000-02-01', '2000-03-01', '2000-04-01', '2000-05-01'], dtype='object')

df.columns = pd.to_datetime(df.columns)

print (df.columns)
DatetimeIndex(['2000-01-01', '2000-02-01', '2000-03-01', '2000-04-01',
               '2000-05-01'],
              dtype='datetime64[ns]', freq=None)

It is also possible to convert to periods:

print (df.columns)
Index(['2000-01-01', '2000-02-01', '2000-03-01', '2000-04-01', '2000-05-01'], dtype='object')

df.columns = pd.to_datetime(df.columns).to_period('M')

print (df.columns)
PeriodIndex(['2000-01', '2000-02', '2000-03', '2000-04', '2000-05'],
             dtype='period[M]', freq='M')
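
Once the converted labels have been assigned back, selecting by a datetime key should work. The sketch below assumes a shortened version of the sample frame above. Passing an explicit format is optional and is shown only as one way to make the parsing unambiguous.

```python
import numpy as np
import pandas as pd

cols = ['2000-01-01', '2000-02-01', '2000-03-01']
df = pd.DataFrame(columns=cols, data=[np.arange(3)])

# Assign the parsed labels back; converting without assigning changes nothing
df.columns = pd.to_datetime(df.columns, format="%Y-%m-%d")

# The columns are now a DatetimeIndex, so a Timestamp key can be looked up
value = df.loc[0, pd.Timestamp("2000-02-01")]
print(value)
```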
