How to Convert a Pandas DataFrame to a PyTorch Tensor

How do I convert a Pandas dataframe to a PyTorch tensor?

The question body doesn't specify anything beyond the title, so I'll just show how to convert a DataFrame into a PyTorch tensor.

Without any information about your data, I'm using float values as example targets here.

Convert Pandas dataframe to PyTorch tensor?

import pandas as pd
import torch
import random

# creating dummy targets (float values)
targets_data = [random.random() for i in range(10)]

# creating DataFrame from targets_data
targets_df = pd.DataFrame(data=targets_data)
targets_df.columns = ['targets']

# creating tensor from targets_df
torch_tensor = torch.tensor(targets_df['targets'].values)

# printing out result
print(torch_tensor)

Output:

tensor([ 0.5827,  0.5881,  0.1543,  0.6815,  0.9400,  0.8683,  0.4289,
0.5940, 0.6438, 0.7514], dtype=torch.float64)

Tested with PyTorch 0.4.0.

I hope this helps, if you have any further questions - just ask. :)
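The example above converts a single column, and the result is float64 because that is the default float dtype in pandas. As a minimal sketch of the multi-column case, assuming all columns are numeric, you could convert the whole DataFrame at once and cast to float32, which is what most PyTorch layers expect by default. The frame and the column names `feature_a` and `feature_b` are made up for illustration.

```python
import pandas as pd
import torch

# Hypothetical example frame with mixed numeric dtypes (not from the question)
df = pd.DataFrame({
    "feature_a": [1, 2, 3],
    "feature_b": [0.5, 1.5, 2.5],
})

# to_numpy with an explicit dtype yields one homogeneous float32 array;
# torch.tensor then copies it into a new tensor
features = torch.tensor(df.to_numpy(dtype="float32"))

print(features.shape)  # torch.Size([3, 2])
print(features.dtype)  # torch.float32
```

If the DataFrame contains non-numeric columns, the cast will fail, so those columns would need to be dropped or encoded first.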

PyTorch tensor from pandas columns of vectors

Suppose you have this DataFrame, in which each cell of every column holds a vector of 2 numbers:

import torch
import pandas as pd

df = pd.DataFrame({'a': [[ 3, 29],[ 3, 29]],
                   'b': [[94, 170],[ 3, 29]],
                   'c': [[31, 115],[ 3, 29]]})


To convert this DataFrame to a PyTorch tensor, convert its values to a list and pass that list to torch.Tensor (which produces float32 values):

t = torch.Tensor(list(df.values))

#output

tensor([[[  3.,  29.],
         [ 94., 170.],
         [ 31., 115.]],

        [[  3.,  29.],
         [  3.,  29.],
         [  3.,  29.]]])

The shape of t is [2, 3, 2]: 2 rows, 3 columns, and 2 elements inside each list.
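As an alternative sketch (not the method from the answer above), you could turn the values into plain nested Python lists with `tolist()` and pass them to `torch.tensor` with an explicit dtype. This assumes every cell holds a list of the same length; ragged lists would raise an error.

```python
import pandas as pd
import torch

df = pd.DataFrame({'a': [[3, 29], [3, 29]],
                   'b': [[94, 170], [3, 29]],
                   'c': [[31, 115], [3, 29]]})

# df.values is an object array whose elements are Python lists;
# tolist() turns it into nested lists: rows -> columns -> vector elements
nested = df.values.tolist()

# An explicit dtype avoids relying on the default type inference
t = torch.tensor(nested, dtype=torch.float32)

print(t.shape)  # torch.Size([2, 3, 2])
```

Without the `dtype` argument, `torch.tensor` would infer an integer tensor here, since all the values are Python ints.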

Convert list of two dimensional DataFrame to Torch Tensor

I don't think you can convert a list of DataFrames in a single command, but you can convert the list of DataFrames into a list of arrays or tensors and then stack that list.

E.g.

import pandas as pd
import numpy as np
import torch

data = [pd.DataFrame(np.zeros((5,50))) for x in range(100)]

list_of_arrays = [np.array(df) for df in data]
torch.tensor(np.stack(list_of_arrays))

#or

list_of_tensors = [torch.tensor(np.array(df)) for df in data]
torch.stack(list_of_tensors)
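To check that the two variants agree, here is a small sketch that compares their results and shapes. It also shows `torch.cat` for the case where you would rather join the frames along the row axis than add a new leading dimension; whether that is what you want depends on your data, so treat it as an assumption.

```python
import numpy as np
import pandas as pd
import torch

data = [pd.DataFrame(np.zeros((5, 50))) for _ in range(100)]

# Variant 1: stack the NumPy arrays first, then build one tensor
stacked_np = torch.tensor(np.stack([df.to_numpy() for df in data]))

# Variant 2: build one tensor per DataFrame, then stack the tensors
list_of_tensors = [torch.tensor(df.to_numpy()) for df in data]
stacked_torch = torch.stack(list_of_tensors)

print(stacked_np.shape)                         # torch.Size([100, 5, 50])
print(torch.equal(stacked_np, stacked_torch))   # True

# torch.cat joins along an existing dimension instead of creating a new one
concatenated = torch.cat(list_of_tensors, dim=0)
print(concatenated.shape)                       # torch.Size([500, 50])
```

Both stacking variants require every DataFrame to have the same shape.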

