Data.Frame Without Ruining Column Names

You can stop R from changing the names to syntactically valid names by setting check.names = FALSE. See ?data.frame for details.

# assuming your data is in a list called my_list
do.call(data.frame, c(my_list, check.names = FALSE))

How to sort each row of a data frame WITHOUT losing the column names

Store the names and apply them:

nm <- names(df)
sorted_df <- as.data.frame(t(apply(df, 1, sort)))
names(sorted_df) <- nm

You could compress this down to a single line if you prefer:

sorted_df = setNames(as.data.frame(t(apply(df, 1, sort))), names(df))

Python - Pandas - Copy column names to new dataframe without bringing data

You could do it like this:

new_df = df.copy()
new_df[['5:10', '6:10', '7:10']] = ''

or, more concisely:

new_df = df.copy()
new_df[new_df.columns[1:]] = ''

But why not just create a new dataframe with new_df = df.copy() and then perform your computations without blanking the dataframe? I don't think you need to do that, and it just adds time to the process.
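If the goal really is to carry over only the column names, one possible sketch is shown below. The frame and its values are made up for illustration; only the column labels are borrowed from the answer above. It contrasts blanking a copy with building an empty frame from df.columns.

```python
import pandas as pd

# Hypothetical data; only the column labels matter here.
df = pd.DataFrame(
    {"name": ["a", "b"], "5:10": [1, 2], "6:10": [3, 4], "7:10": [5, 6]}
)

# Option 1: keep the rows, blank every column except the first
# (the approach from the answer above).
blanked = df.copy()
blanked[blanked.columns[1:]] = ""

# Option 2: keep only the column labels and no rows at all.
empty = pd.DataFrame(columns=df.columns)
```

Option 2 avoids copying any data, which may matter if the original frame is large.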

Removing the rows from dataframe till the actual column names are found

You can first find the index of the row that holds the real column names, then set the columns from it and keep only the rows after it.

import pandas as pd

df = pd.read_csv("d.csv", sep=r'\s+', header=None)
col_index = df.index[(df == ["ID", "Name", "Year"]).all(axis=1)].item()  # index of the real header row

df.columns = df.iloc[col_index].to_numpy()  # set valid columns
df = df.iloc[col_index + 1:]  # keep only the rows after the header row
df
  ID  Name       Year
3  1  John  Sophomore
4  2  Lisa     Junior
5  3    Ed     Senior

Or, if you want to set ID as the index, use this as the filtering step instead:

df = df.iloc[col_index + 1 :].set_index('ID')
df
    Name       Year
ID
1   John  Sophomore
2   Lisa     Junior
3     Ed     Senior
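To try the approach without a file on disk, here is a minimal self-contained sketch. The junk rows in raw are invented for illustration, and io.StringIO stands in for the d.csv file.

```python
import io
import pandas as pd

# Hypothetical file contents: three junk rows precede the real header.
raw = """x y z
foo bar baz
more junk here
ID Name Year
1 John Sophomore
2 Lisa Junior
3 Ed Senior
"""

df = pd.read_csv(io.StringIO(raw), sep=r"\s+", header=None)

# Index label of the single row whose cells match the expected header.
expected = ["ID", "Name", "Year"]
col_index = df.index[(df == expected).all(axis=1)].item()

# Promote that row to column labels and keep only the rows below it.
df.columns = df.iloc[col_index].to_numpy()
df = df.iloc[col_index + 1:].reset_index(drop=True)
```

Because the junk rows force every column to be read as strings, the ID values are still strings afterwards; pd.to_numeric can convert them if needed.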

How to append a list to dataframe without using column names?

You can do something like

df.loc[len(df)] = [1, 2, 3, 4]
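A small sketch of the same idea on a made-up four-column frame (the column names and values are hypothetical):

```python
import pandas as pd

# Hypothetical four-column frame with the default 0..n-1 index.
df = pd.DataFrame([[10, 20, 30, 40], [50, 60, 70, 80]], columns=list("abcd"))

# len(df) is the next unused integer label, so .loc creates a new row there.
df.loc[len(df)] = [1, 2, 3, 4]
```

This relies on the default integer index. If the index has gaps or does not start at 0, len(df) may be an existing label, and the assignment would overwrite that row instead of appending.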

Removing header column from pandas dataframe

I think you can't remove column names; you can only reset them to a range based on the shape:

print(df.shape[1])
2

print(list(range(df.shape[1])))
[0, 1]

df.columns = range(df.shape[1])
print(df)
    0   1
0  23  12
1  21  44
2  98  21

This is the same as round-tripping through to_csv and read_csv:

import io

print(df.to_csv(header=False, index=False))
23,12
21,44
98,21

print(pd.read_csv(io.StringIO(df.to_csv(header=False, index=False)), header=None))
    0   1
0  23  12
1  21  44
2  98  21

Another solution uses skiprows to skip the header line when reading the data back:

print(df.to_csv(index=False))
A,B
23,12
21,44
98,21

print(pd.read_csv(io.StringIO(df.to_csv(index=False)), header=None, skiprows=1))
    0   1
0  23  12
1  21  44
2  98  21
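For reference, a compact Python 3 sketch of the first two ideas. The frame is reconstructed from the values printed above, so treat its construction as an assumption.

```python
import pandas as pd

# Frame assumed to match the values shown above.
df = pd.DataFrame({"A": [23, 21, 98], "B": [12, 44, 21]})

# The labels cannot be removed outright; replace them with positions 0..n-1.
df.columns = range(df.shape[1])

# Writing without the header row leaves the labels out of the CSV text.
csv_text = df.to_csv(header=False, index=False)
```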

subsetting data.frame without column names

What you want is a numeric vector instead of a data.frame. You can use as.numeric to do the conversion:

> as.numeric(df[1,])
[1] 7.5 5.0 5.0 2.0 7.5 2.0 2.0 5.0

Pandas create empty DataFrame with only column names

You can create an empty DataFrame with either column names or an Index:

In [4]: import pandas as pd
In [5]: df = pd.DataFrame(columns=['A','B','C','D','E','F','G'])
In [6]: df
Out[6]:
Empty DataFrame
Columns: [A, B, C, D, E, F, G]
Index: []

Or

In [7]: df = pd.DataFrame(index=range(1,10))
In [8]: df
Out[8]:
Empty DataFrame
Columns: []
Index: [1, 2, 3, 4, 5, 6, 7, 8, 9]
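A short sketch of checking the result and, optionally, fixing the column dtypes up front. The shortened column list and the dtypes chosen here are arbitrary assumptions for illustration.

```python
import pandas as pd

cols = ["A", "B", "C"]
df = pd.DataFrame(columns=cols)

# No rows, but the column labels are in place.
is_empty = df.empty
labels = list(df.columns)

# Optional: give each column a dtype by building it from empty Series.
typed = pd.DataFrame(
    {
        "A": pd.Series(dtype="int64"),
        "B": pd.Series(dtype="float64"),
        "C": pd.Series(dtype="object"),
    }
)
```

Columns created with columns= alone default to the object dtype, so the second form can help if later code expects numeric columns.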

Edit:
Even after your amendment with the .to_html, I can't reproduce. This:

df = pd.DataFrame(columns=['A','B','C','D','E','F','G'])
df.to_html('test.html')

Produces:

<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th></th>
<th>A</th>
<th>B</th>
<th>C</th>
<th>D</th>
<th>E</th>
<th>F</th>
<th>G</th>
</tr>
</thead>
<tbody>
</tbody>
</table>

Renaming columns of a pandas dataframe without column names

If you want the index as the keys in your dict, you don't need to rename it.

df = pd.DataFrame.from_dict(dicts, orient='index')  # dict keys (the names) become the index
df.columns = ['number']  # the single non-index column holds the numbers
df.index.name = 'name'

Or instead of changing the index name you can make a new column:

df = df.reset_index()  # the old, unnamed index becomes a column called 'index'; the new index is 0..n-1
df['name'] = df['index']  # new column with the names
del df['index']  # delete the old column
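Both variants can be combined in a self-contained sketch. The contents of dicts are hypothetical (a mapping from name to number is assumed), and rename_axis is swapped in here as a shorter way to name the index before resetting it.

```python
import pandas as pd

# Hypothetical input: a mapping from name to number.
dicts = {"alice": 1, "bob": 2}

df = pd.DataFrame.from_dict(dicts, orient="index")  # keys become the index
df.columns = ["number"]

# Name the index, then move it into a regular column in one step.
df = df.rename_axis("name").reset_index()
```

After this, df has the columns name and number plus a default integer index.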

