Python's Equivalent for R's Dput() Function

Print pandas data frame for reproducible example (equivalent to dput in R)

If binary data is acceptable, you can use the pickle library. It can serialize and deserialize most arbitrary objects, provided their class definitions are available when loading (which holds for dataframes as long as pandas is installed).
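As a rough sketch of the pickle route, the snippet below serializes a small dataframe to bytes and loads it back. The dataframe contents and variable names are made up for illustration. Note that unpickling can execute arbitrary code, so only load pickles from sources you trust.

```python
import pickle

import pandas as pd

# Illustrative dataframe; the values are arbitrary.
df = pd.DataFrame({"a": [1, 2], "b": [3, 3]})

# Serialize to a bytes object (use pickle.dump with a file opened in 'wb' mode to write to disk).
payload = pickle.dumps(df)

# Deserialize; pandas must be importable for this to work.
restored = pickle.loads(payload)

print(restored.equals(df))
```

The bytes are not human-readable, so this suits passing data between sessions rather than pasting it into a question.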

If you need a human-readable format, you can create a Python dictionary from your dataframe with df_dict = df.to_dict(), and print this dictionary (to look at it and maybe copy-paste), or dump it to a JSON string.

When you want to convert a dict back to pandas, use df = pd.DataFrame.from_dict(df_dict).

A minimal example of building a dataframe from a dict and converting it back:

import pandas as pd
df = pd.DataFrame.from_dict({'a': {0: 1, 1: 2}, 'b': {0: 3, 1: 3}})
print(df.to_dict())

which prints {'a': {0: 1, 1: 2}, 'b': {0: 3, 1: 3}}, a dictionary literal you can copy and paste.
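The JSON variant mentioned above can be sketched as follows. One caveat worth knowing: JSON object keys are always strings, so the integer index labels come back as '0' and '1' and need converting if the original index matters. The variable names here are illustrative.

```python
import json

import pandas as pd

df = pd.DataFrame.from_dict({"a": {0: 1, 1: 2}, "b": {0: 3, 1: 3}})

# Dump the dict form to a JSON string; integer keys are written as strings.
json_text = json.dumps(df.to_dict())

# Load it back: the inner keys (the row labels) are now strings.
loaded = json.loads(json_text)
restored = pd.DataFrame.from_dict(loaded)

# Convert the row labels back to integers to match the original index.
restored.index = restored.index.astype(int)

print(json_text)
print(restored)
```

For plain copy-paste within Python, printing df.to_dict() directly avoids this key conversion.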

Is there a R function equivalent to Python's split()?

You can use strsplit to split the string and as.numeric to convert the pieces to numbers:

# read input, value is stored as a string
cf = readline(prompt='Enter cashflow:')
pr = readline(prompt='Enter corresponding probability:')

# split on ' ' (strsplit returns a list), flatten it with unlist, then convert to numeric
cf = as.numeric(unlist(strsplit(cf, ' ')))
pr = as.numeric(unlist(strsplit(pr, ' ')))

e = cf*pr
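For comparison, here is a sketch of the same steps on the Python side, which is what the R code above mirrors. The input strings are hard-coded stand-ins for what input() would return, and the values are made up.

```python
# Stand-ins for values read with input(); both arrive as plain strings.
cf_text = "100 200 300"
pr_text = "0.2 0.5 0.3"

# split() with no argument splits on runs of whitespace; float() converts each piece.
cf = [float(piece) for piece in cf_text.split()]
pr = [float(piece) for piece in pr_text.split()]

# Element-wise product, the counterpart of cf * pr on R vectors.
e = [c * p for c, p in zip(cf, pr)]

print(e)
```

Unlike R vectors, Python lists do not multiply element-wise, hence the explicit zip.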

Equivalent to R's dput in Julia

I think you are looking for repr:

julia> A = rand(2, 2);

julia> repr(A)
"[0.427705 0.0971806; 0.395074 0.168961]"

Read a Pandas dataframe into R

If the data is already available on the Python side (here through a function my_data()), you can pull it into R with reticulate:

library(reticulate)
d <- py_to_r(py_eval('my_data().reset_index()'))
d

which should give you:

  year month       DKF GDT       GSB       HKZ       SLG       SRL       UAB       UKE
1 2000     1 1.3517226 NaN 1.4273286 1.3554525 1.5487504 2.0594033 1.5548904 1.2689103
2 2000     2 1.2830497 NaN 1.3664731 1.4951631 1.5415838 1.3791095 1.5095901 1.4488509
3 2000     3 1.2455233 NaN 1.1615090 1.2105682 1.2282127 1.7505922 1.0071871 1.1215747
4 2000     4 0.7299107 NaN 0.7851300 0.8072930 0.6324440 0.8446109 0.9035629 0.9697809
5 2000     5 0.7835660 NaN 0.7422343 0.9269770 0.7158862 1.2051921 0.4465355 0.8580554

If you are dealing with time series data, you can convert it to an xts object:

xts::xts(d[-(1:2)], zoo::as.yearmon(paste(d[,1], d[,2]), '%Y %m'))

               DKF GDT       GSB       HKZ       SLG       SRL       UAB       UKE       UKF
Jan 2000 1.3517226 NaN 1.4273286 1.3554525 1.5487504 2.0594033 1.5548904 1.2689103 1.3249434
Feb 2000 1.2830497 NaN 1.3664731 1.4951631 1.5415838 1.3791095 1.5095901 1.4488509 1.4834390
Mar 2000 1.2455233 NaN 1.1615090 1.2105682 1.2282127 1.7505922 1.0071871 1.1215747 1.1720744
Apr 2000 0.7299107 NaN 0.7851300 0.8072930 0.6324440 0.8446109 0.9035629 0.9697809 0.9721228
May 2000 0.7835660 NaN 0.7422343 0.9269770 0.7158862 1.2051921 0.4465355 0.8580554 0.8515449
Jun 2000 0.9378140 NaN 1.0315809 0.8022017 0.8454897 1.0101836 0.7661409 0.7055803 0.6193634
Jul 2000 0.6590646 NaN 0.7242494 0.6965228 0.4551865 0.6508079 0.5077752 0.5691778 0.5370455
Aug 2000 0.6744555 NaN 0.5766163 0.4565830 0.5826159 0.7407540 0.3953245 0.5093119 0.4304210

And in ftable format:

a <- ftable(xtabs(value~., pivot_longer(d, -c(Year, month))), row.vars = 1:2)
replace(a, a==0, NaN)
name DKF GDT GSB HKZ SLG SRL UAB UKE UKF UOR UTH WH1 WH3
Year month
2000 1 1.3517226 NaN 1.4273286 1.3554525 1.5487504 2.0594033 1.5548904 1.2689103 1.3249434 NaN 1.3359813 1.4796781 1.4338344
2 1.2830497 NaN 1.3664731 1.4951631 1.5415838 1.3791095 1.5095901 1.4488509 1.4834390 NaN 1.4988411 1.4655891 1.3849536
3 1.2455233 NaN 1.1615090 1.2105682 1.2282127 1.7505922 1.0071871 1.1215747 1.1720744 NaN 1.1487176 1.1907311 1.1535912
4 0.7299107 NaN 0.7851300 0.8072930 0.6324440 0.8446109 0.9035629 0.9697809 0.9721228 NaN 0.9738686 0.7624104 0.7809207
5 0.7835660 NaN 0.7422343 0.9269770 0.7158862 1.2051921 0.4465355 0.8580554 0.8515449 NaN 0.8745675 0.6570087 0.6605269
6 0.9378140 NaN 1.0315809 0.8022017 0.8454897 1.0101836 0.7661409 0.7055803 0.6193634 NaN 0.6443392 1.0220573 1.0027971
7 0.6590646 NaN 0.7242494 0.6965228 0.4551865 0.6508079 0.5077752 0.5691778 0.5370455 NaN 0.5668678 0.7128321 0.7459095
8 0.6744555 NaN 0.5766163 0.4565830 0.5826159 0.7407540 0.3953245 0.5093119 0.4304210 NaN 0.4364251 0.5808601 0.6482779
9 1.0028705 NaN 1.0332126 0.8919033 0.9977976 0.8294642 0.9392679 0.9217522 0.8263760 NaN 0.8719464 1.0547152 1.0724845
10 0.9667005 NaN 1.1857205 1.3495322 1.1389897 1.1360024 1.3380420 1.2829829 1.3281730 NaN 1.3778435 1.2250852 1.1842061
11 1.0678912 NaN 1.3665028 1.5861421 1.0930165 0.7332514 1.0678859 1.3798931 1.4011762 NaN 1.5008642 1.4736262 1.3706178
12 0.9112284 NaN 1.1573820 1.4914961 0.8898335 0.7179855 1.4888334 1.4692246 1.6555818 NaN 1.5967169 1.1945190 1.1447962
2001 1 0.8881926 NaN 0.9902550 1.1734640 0.8842005 1.2405939 0.9310732 1.2514806 1.3357608 NaN 1.3473679 1.0201318 1.0056653
2 1.0094984 NaN 0.9724394 0.9458569 1.0275629 1.3443045 1.2383118 0.9622780 1.0341980 NaN 1.0010888 1.0128601 1.0245672
3 0.9234819 NaN 1.0488150 1.0691428 0.8746407 0.6103847 1.0869269 1.1523633 1.1212607 NaN 1.1149275 0.9979750 1.0022481
4 0.8256953 NaN 0.8273479 1.1488059 0.7131039 0.7889615 1.0308784 1.1708995 1.1261248 NaN 1.1446114 0.7657471 0.7777727
5 0.7823149 NaN 0.7430594 1.0912435 0.6837955 0.8836293 0.3581697 0.8924892 0.8756814 NaN 0.9704882 0.6834538 0.7053399
6 0.5865619 NaN 0.7536325 0.7592430 0.4790986 0.6398611 0.7424054 0.6389295 0.5645819 NaN 0.5855553 0.7291586 0.7832286
7 0.5986398 NaN 0.6349135 0.7202558 0.4988441 0.9101470 0.5807096 0.6533906 0.6105116 NaN 0.6431702 0.5463985 0.5623766
8 0.9152978 NaN 0.7826830 0.7608616 0.8223210 0.7110037 0.5961037 0.7189733 0.6219141 NaN 0.6673872 0.7511936 0.7462191
9 0.7957461 NaN 1.0233964 1.1328043 0.7417117 0.6944427 1.0412149 1.0480404 1.0519839 NaN 1.0184463 0.9255564 0.9638197
10 1.2289234 NaN 1.4129187 1.4500452 1.3284355 0.7431633 1.5274150 1.4336149 1.3938912 NaN 1.4218742 1.5042704 1.4669123
11 1.2159037 NaN 1.2605810 1.0657650 1.3409614 1.6415510 1.0127948 1.0187401 1.0481970 NaN 1.0718512 1.3057348 1.2973547
12 0.9411120 NaN 0.9576845 1.1213623 0.9324084 1.3732159 1.2298052 1.2025621 1.2687238 NaN 1.2613274 0.9515946 0.9414454

Is there an equivalent of R's dput() for Matlab?

UPDATE 1: Added recursion and support for cells!

UPDATE 2: Added support for structures!

UPDATE 3: Added support for logicals, integers, complex doubles. Added unit tests. Posted to FileExchange at: http://www.mathworks.com/matlabcentral/fileexchange/34076

NOTE: Check github at https://github.com/johncolby/dput for all further updates.


There is no built-in equivalent, but the template to create one is simple enough, so I thought I'd start making it. Just loop over the variables and write a string equivalent depending on the type of the data.

I started a git repository for this, so feel free to fork it and help me out with different data types. I'll post it on FileExchange when the basic types are complete (double, char, struct, cell at least).

https://github.com/johncolby/dput

Starting with some example variables

x = 1:10;
y = 3;
z = magic(3);
mystr = ['line1'; 'line2'];
mystruct = struct('index', num2cell(1:3), 'color', {'red', 'blue', 'green'}, 'misc', {'string' 4 num2cell(magic(3))});
mycell = {1:3, 'test'; [], 1};

the basic usage is:

>> dput(x, y, z, mystr, mystruct, mycell)

ans =

x = reshape([1.000000 2.000000 3.000000 4.000000 5.000000 6.000000 7.000000 8.000000 9.000000 10.000000 ],[1 10]) ;
y = reshape([3.000000 ],[1 1]) ;
z = reshape([8.000000 3.000000 4.000000 1.000000 5.000000 9.000000 6.000000 7.000000 2.000000 ],[3 3]) ;
mystr = reshape('lliinnee12',[2 5]) ;
mystruct = struct('index',reshape({reshape([1.000000 ],[1 1]) reshape([2.000000 ],[1 1]) reshape([3.000000 ],[1 1]) },[1 3]),'color',reshape({reshape('red',[1 3]) reshape('blue',[1 4]) reshape('green',[1 5]) },[1 3]),'misc',reshape({reshape('string',[1 6]) reshape([4.000000 ],[1 1]) reshape({reshape([8.000000 ],[1 1]) reshape([3.000000 ],[1 1]) reshape([4.000000 ],[1 1]) reshape([1.000000 ],[1 1]) reshape([5.000000 ],[1 1]) reshape([9.000000 ],[1 1]) reshape([6.000000 ],[1 1]) reshape([7.000000 ],[1 1]) reshape([2.000000 ],[1 1]) },[3 3]) },[1 3]));
mycell = reshape({reshape([1.000000 2.000000 3.000000 ],[1 3]) reshape([ ],[0 0]) reshape('test',[1 4]) reshape([1.000000 ],[1 1]) },[2 2]) ;

Then you can just paste the text online to make a reproducible example, and others can copy/paste back into MATLAB to regenerate the variables. Just like for R!
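The template described above (loop over the variables and emit a string per data type) is not specific to MATLAB. As a loose illustration only, and not the code from the linked repository, here is a sketch of the same idea in Python for a few built-in types. The names to_source and dput are hypothetical helpers defined here.

```python
import math


def to_source(value):
    """Return Python source text that rebuilds `value` (hypothetical helper)."""
    if isinstance(value, float) and not math.isfinite(value):
        # repr() of inf/nan is not valid source, so wrap it in a float() call.
        return f"float({str(value)!r})"
    if value is None or isinstance(value, (bool, int, float, str)):
        return repr(value)
    if isinstance(value, list):
        return "[" + ", ".join(to_source(v) for v in value) + "]"
    if isinstance(value, tuple):
        inner = ", ".join(to_source(v) for v in value)
        # A one-element tuple needs a trailing comma.
        return "(" + inner + ("," if len(value) == 1 else "") + ")"
    if isinstance(value, dict):
        items = ", ".join(f"{to_source(k)}: {to_source(v)}" for k, v in value.items())
        return "{" + items + "}"
    raise TypeError(f"unsupported type: {type(value).__name__}")


def dput(**variables):
    """Emit one assignment statement per named variable."""
    return "\n".join(f"{name} = {to_source(value)}" for name, value in variables.items())


text = dput(x=list(range(1, 11)), y=3, mystr=["line1", "line2"], mycell=[[1, 2, 3], "test", [], 1])
print(text)

# Pasting the text back in (simulated here with exec) regenerates the variables.
namespace = {}
exec(text, namespace)
```

Supporting further types means adding one branch per type, in the same way the MATLAB version grew to cover cells, structures and logicals.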

Python equivalent to dplyr's summarize

You are looking for the agg function (aggregate is an alias). Note that the grouping column must be passed as a string. For example:

pd.merge(ordr_pr, prods, how='inner', on='product_id').groupby('order_id').agg({'product_name': list})
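To make the merge-and-aggregate pattern concrete, here is a self-contained sketch. The contents of ordr_pr and prods are invented for illustration; only the column names order_id, product_id and product_name come from the answer above.

```python
import pandas as pd

# Made-up order lines: which product appears in which order.
ordr_pr = pd.DataFrame({"order_id": [1, 1, 2], "product_id": [10, 11, 10]})

# Made-up product lookup table.
prods = pd.DataFrame({"product_id": [10, 11], "product_name": ["apple", "bread"]})

# Join on product_id, then collect the product names of each order into a list.
merged = pd.merge(ordr_pr, prods, how="inner", on="product_id")
summary = merged.groupby("order_id").agg({"product_name": list})

print(summary)
```

Any other reducer (for example 'count' or 'nunique') can replace list in the dict passed to agg, which is roughly what summarize does after group_by in dplyr.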

