Python's Equivalent for R's Dput() Function

Print pandas data frame for reproducible example (equivalent to dput in R)

If binary data is acceptable, you can use the pickle library. It can serialize and deserialize most Python objects, provided their class definitions are importable when loading, which holds for dataframes as long as pandas is installed.
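As a minimal sketch of the pickle route (the column names and values below are placeholders, not taken from the question), a dataframe can be round-tripped through bytes like this:

```python
import pickle

import pandas as pd

# Placeholder dataframe for illustration only.
df = pd.DataFrame({'a': [1, 2], 'b': [3, 3]})

# Serialize to bytes; pickle.dump(df, f) would write to an open binary file instead.
payload = pickle.dumps(df)

# Deserialize; pandas must be importable in the process that loads the bytes.
restored = pickle.loads(payload)

print(restored.equals(df))
```

Only unpickle data from sources you trust, since loading a pickle can execute arbitrary code.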

If you need a human-readable format, you can create a Python dictionary from your dataframe with df_dict = df.to_dict(), and print this dictionary (to look at it and maybe copy-paste), or dump it to a JSON string.

When you want to convert a dict back to pandas, use df = pd.DataFrame.from_dict(df_dict).

A minimal example of decoding and encoding:

import pandas as pd
df = pd.DataFrame.from_dict({'a': {0: 1, 1: 2}, 'b': {0: 3, 1: 3}})
print(df.to_dict())

This prints {'a': {0: 1, 1: 2}, 'b': {0: 3, 1: 3}}, which can be copied and pasted as is.
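If the index does not matter, the 'list' orientation gives a more compact, dput-like constructor call. This is only a sketch and assumes a default integer index, because orient='list' drops the index:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [3, 3]})

# orient='list' maps each column name to a plain Python list (the index is dropped).
as_dict = df.to_dict(orient='list')

# Build a one-line snippet that others can paste to recreate the dataframe.
snippet = f"pd.DataFrame({as_dict!r})"
print(snippet)

# Evaluating the snippet rebuilds an equal dataframe.
restored = eval(snippet)
```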

Is there a R function equivalent to Python's split()?

You can use strsplit to split the string and as.numeric to convert the pieces to numbers:

# read input, value is stored as a string
cf = readline(prompt='Enter cashflow:')
pr = readline(prompt='Enter corresponding probability:')

# split on ' ', flatten the list returned by strsplit, then convert to numeric
cf = as.numeric(unlist(strsplit(cf, ' ')))
pr = as.numeric(unlist(strsplit(pr, ' ')))

e = cf*pr
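
For comparison, the Python idiom that the R code above mirrors looks roughly like this. The input strings are hard-coded placeholders standing in for what input() would return:

```python
# Placeholder inputs; interactively these would come from input('Enter cashflow:').
cf_text = '100 200 300'
pr_text = '0.5 0.25 0.25'

# str.split(' ') plays the role of strsplit; float() plays the role of as.numeric.
cf = [float(x) for x in cf_text.split(' ')]
pr = [float(x) for x in pr_text.split(' ')]

# Element-wise product, matching R's vectorized cf * pr.
e = [c * p for c, p in zip(cf, pr)]
print(e)
```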

Equivalent to R's dput in Julia

I think you are looking for repr:

julia> A = rand(2, 2);

julia> repr(A)
"[0.427705 0.0971806; 0.395074 0.168961]"

Read a Pandas dataframe into R

If you have the data already in R, you can run:

library(reticulate)
d <- py_to_r(py_eval('my_data().reset_index()'))
d

which should give you:

  year month       DKF       GDT       GSB       HKZ       SLG       SRL       UAB       UKE
1  2000     1 1.3517226       NaN 1.4273286 1.3554525 1.5487504 2.0594033 1.5548904 1.2689103
2  2000     2 1.2830497       NaN 1.3664731 1.4951631 1.5415838 1.3791095 1.5095901 1.4488509
3  2000     3 1.2455233       NaN 1.1615090 1.2105682 1.2282127 1.7505922 1.0071871 1.1215747
4  2000     4 0.7299107       NaN 0.7851300 0.8072930 0.6324440 0.8446109 0.9035629 0.9697809
5  2000     5 0.7835660       NaN 0.7422343 0.9269770 0.7158862 1.2051921 0.4465355 0.8580554
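
The call above assumes a Python function my_data() that returns a dataframe indexed by year and month; its definition is not shown in the answer. A hypothetical version, reusing a few values from the output above, shows what reset_index() contributes:

```python
import pandas as pd


def my_data():
    """Hypothetical stand-in: a dataframe indexed by (year, month)."""
    index = pd.MultiIndex.from_tuples([(2000, 1), (2000, 2)], names=['year', 'month'])
    return pd.DataFrame(
        {'DKF': [1.3517226, 1.2830497], 'GSB': [1.4273286, 1.3664731]},
        index=index,
    )


# reset_index() moves the index levels into ordinary columns.
flat = my_data().reset_index()
print(list(flat.columns))
```

This is presumably why year and month show up as regular columns in the R data frame.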

If you are dealing with time series data:

xts::xts(d[-(1:2)], zoo::as.yearmon(paste(d[,1], d[,2]), '%Y %m'))

               DKF       GDT       GSB       HKZ       SLG       SRL       UAB       UKE       UKF
Jan 2000 1.3517226       NaN 1.4273286 1.3554525 1.5487504 2.0594033 1.5548904 1.2689103 1.3249434
Feb 2000 1.2830497       NaN 1.3664731 1.4951631 1.5415838 1.3791095 1.5095901 1.4488509 1.4834390
Mar 2000 1.2455233       NaN 1.1615090 1.2105682 1.2282127 1.7505922 1.0071871 1.1215747 1.1720744
Apr 2000 0.7299107       NaN 0.7851300 0.8072930 0.6324440 0.8446109 0.9035629 0.9697809 0.9721228
May 2000 0.7835660       NaN 0.7422343 0.9269770 0.7158862 1.2051921 0.4465355 0.8580554 0.8515449
Jun 2000 0.9378140       NaN 1.0315809 0.8022017 0.8454897 1.0101836 0.7661409 0.7055803 0.6193634
Jul 2000 0.6590646       NaN 0.7242494 0.6965228 0.4551865 0.6508079 0.5077752 0.5691778 0.5370455
Aug 2000 0.6744555       NaN 0.5766163 0.4565830 0.5826159 0.7407540 0.3953245 0.5093119 0.4304210

And in ftable format:

a <- ftable(xtabs(value~., pivot_longer(d, -c(Year, month))), row.vars = 1:2)
replace(a, a==0, NaN)
           name       DKF       GDT       GSB       HKZ       SLG       SRL       UAB       UKE       UKF       UOR       UTH       WH1       WH3
Year month                                                                                                                                       
2000 1          1.3517226       NaN 1.4273286 1.3554525 1.5487504 2.0594033 1.5548904 1.2689103 1.3249434       NaN 1.3359813 1.4796781 1.4338344
     2          1.2830497       NaN 1.3664731 1.4951631 1.5415838 1.3791095 1.5095901 1.4488509 1.4834390       NaN 1.4988411 1.4655891 1.3849536
     3          1.2455233       NaN 1.1615090 1.2105682 1.2282127 1.7505922 1.0071871 1.1215747 1.1720744       NaN 1.1487176 1.1907311 1.1535912
     4          0.7299107       NaN 0.7851300 0.8072930 0.6324440 0.8446109 0.9035629 0.9697809 0.9721228       NaN 0.9738686 0.7624104 0.7809207
     5          0.7835660       NaN 0.7422343 0.9269770 0.7158862 1.2051921 0.4465355 0.8580554 0.8515449       NaN 0.8745675 0.6570087 0.6605269
     6          0.9378140       NaN 1.0315809 0.8022017 0.8454897 1.0101836 0.7661409 0.7055803 0.6193634       NaN 0.6443392 1.0220573 1.0027971
     7          0.6590646       NaN 0.7242494 0.6965228 0.4551865 0.6508079 0.5077752 0.5691778 0.5370455       NaN 0.5668678 0.7128321 0.7459095
     8          0.6744555       NaN 0.5766163 0.4565830 0.5826159 0.7407540 0.3953245 0.5093119 0.4304210       NaN 0.4364251 0.5808601 0.6482779
     9          1.0028705       NaN 1.0332126 0.8919033 0.9977976 0.8294642 0.9392679 0.9217522 0.8263760       NaN 0.8719464 1.0547152 1.0724845
     10         0.9667005       NaN 1.1857205 1.3495322 1.1389897 1.1360024 1.3380420 1.2829829 1.3281730       NaN 1.3778435 1.2250852 1.1842061
     11         1.0678912       NaN 1.3665028 1.5861421 1.0930165 0.7332514 1.0678859 1.3798931 1.4011762       NaN 1.5008642 1.4736262 1.3706178
     12         0.9112284       NaN 1.1573820 1.4914961 0.8898335 0.7179855 1.4888334 1.4692246 1.6555818       NaN 1.5967169 1.1945190 1.1447962
2001 1          0.8881926       NaN 0.9902550 1.1734640 0.8842005 1.2405939 0.9310732 1.2514806 1.3357608       NaN 1.3473679 1.0201318 1.0056653
     2          1.0094984       NaN 0.9724394 0.9458569 1.0275629 1.3443045 1.2383118 0.9622780 1.0341980       NaN 1.0010888 1.0128601 1.0245672
     3          0.9234819       NaN 1.0488150 1.0691428 0.8746407 0.6103847 1.0869269 1.1523633 1.1212607       NaN 1.1149275 0.9979750 1.0022481
     4          0.8256953       NaN 0.8273479 1.1488059 0.7131039 0.7889615 1.0308784 1.1708995 1.1261248       NaN 1.1446114 0.7657471 0.7777727
     5          0.7823149       NaN 0.7430594 1.0912435 0.6837955 0.8836293 0.3581697 0.8924892 0.8756814       NaN 0.9704882 0.6834538 0.7053399
     6          0.5865619       NaN 0.7536325 0.7592430 0.4790986 0.6398611 0.7424054 0.6389295 0.5645819       NaN 0.5855553 0.7291586 0.7832286
     7          0.5986398       NaN 0.6349135 0.7202558 0.4988441 0.9101470 0.5807096 0.6533906 0.6105116       NaN 0.6431702 0.5463985 0.5623766
     8          0.9152978       NaN 0.7826830 0.7608616 0.8223210 0.7110037 0.5961037 0.7189733 0.6219141       NaN 0.6673872 0.7511936 0.7462191
     9          0.7957461       NaN 1.0233964 1.1328043 0.7417117 0.6944427 1.0412149 1.0480404 1.0519839       NaN 1.0184463 0.9255564 0.9638197
     10         1.2289234       NaN 1.4129187 1.4500452 1.3284355 0.7431633 1.5274150 1.4336149 1.3938912       NaN 1.4218742 1.5042704 1.4669123
     11         1.2159037       NaN 1.2605810 1.0657650 1.3409614 1.6415510 1.0127948 1.0187401 1.0481970       NaN 1.0718512 1.3057348 1.2973547
     12         0.9411120       NaN 0.9576845 1.1213623 0.9324084 1.3732159 1.2298052 1.2025621 1.2687238       NaN 1.2613274 0.9515946 0.9414454

Is there an equivalent of R's dput() for Matlab?

UPDATE 1: Added recursion and support for cells!

UPDATE 2: Added support for structures!

UPDATE 3: Added support for logicals, integers, complex doubles. Added unit tests. Posted to FileExchange at: http://www.mathworks.com/matlabcentral/fileexchange/34076

NOTE: Check github at https://github.com/johncolby/dput for all further updates.

There is no built-in equivalent, but the template to create one is simple enough, so I thought I'd start making it. Just loop over the variables and write a string equivalent depending on the type of the data.

I started a git repository for this, so feel free to fork it and help me out with different data types. I'll post it on FileExchange when the basic types are complete (double, char, struct, cell at least).

https://github.com/johncolby/dput

Starting with some example variables

x = 1:10;
y = 3;
z = magic(3);
mystr = ['line1'; 'line2'];
mystruct = struct('index', num2cell(1:3), 'color', {'red', 'blue', 'green'}, 'misc', {'string' 4 num2cell(magic(3))});
mycell = {1:3, 'test'; [], 1};

the basic usage is:

>> dput(x, y, z, mystr, mystruct, mycell)

ans =

x        = reshape([1.000000 2.000000 3.000000 4.000000 5.000000 6.000000 7.000000 8.000000 9.000000 10.000000 ],[1  10]);
y        = reshape([3.000000 ],[1  1]);
z        = reshape([8.000000 3.000000 4.000000 1.000000 5.000000 9.000000 6.000000 7.000000 2.000000 ],[3  3]);
mystr    = reshape('lliinnee12',[2  5]);
mystruct = struct('index',reshape({reshape([1.000000 ],[1  1]) reshape([2.000000 ],[1  1]) reshape([3.000000 ],[1  1]) },[1  3]),'color',reshape({reshape('red',[1  3]) reshape('blue',[1  4]) reshape('green',[1  5]) },[1  3]),'misc',reshape({reshape('string',[1  6]) reshape([4.000000 ],[1  1]) reshape({reshape([8.000000 ],[1  1]) reshape([3.000000 ],[1  1]) reshape([4.000000 ],[1  1]) reshape([1.000000 ],[1  1]) reshape([5.000000 ],[1  1]) reshape([9.000000 ],[1  1]) reshape([6.000000 ],[1  1]) reshape([7.000000 ],[1  1]) reshape([2.000000 ],[1  1]) },[3  3]) },[1  3]));
mycell   = reshape({reshape([1.000000 2.000000 3.000000 ],[1  3]) reshape([ ],[0  0]) reshape('test',[1  4]) reshape([1.000000 ],[1  1]) },[2  2]);

Then you can just paste the text online to make a reproducible example, and others can copy/paste back into MATLAB to regenerate the variables. Just like for R!
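
The approach described above (loop over the variables and emit a string per data type, recursing into containers) can be sketched in Python. This illustrates the idea only and is not a port of the MATLAB code; to_source and dput_text are made-up names:

```python
def to_source(value):
    """Return an expression string that rebuilds `value` (illustrative sketch)."""
    # Scalars: repr() already yields valid source (nan/inf floats are not handled).
    if value is None or isinstance(value, (bool, int, float, complex, str)):
        return repr(value)
    # Containers: recurse into the elements, like the cell and struct cases.
    if isinstance(value, list):
        return '[' + ', '.join(to_source(v) for v in value) + ']'
    if isinstance(value, dict):
        items = (f'{to_source(k)}: {to_source(v)}' for k, v in value.items())
        return '{' + ', '.join(items) + '}'
    raise TypeError(f'unsupported type: {type(value).__name__}')


def dput_text(**variables):
    """Emit one assignment statement per variable."""
    return '\n'.join(f'{name} = {to_source(val)}' for name, val in variables.items())


text = dput_text(x=list(range(1, 11)), y=3, mycell=[[1, 2, 3], 'test', [], 1])
print(text)

# Pasting the text back in (simulated here with exec) regenerates the variables.
namespace = {}
exec(text, namespace)
```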

Python equivalent to dplyr's summarize

You are looking for the aggregate function, or its alias agg. For example:

pd.merge(ordr_pr, prods, how='inner', on='product_id').groupby('order_id').agg({'product_name': list})
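
A self-contained sketch of that call follows. The sample rows are made up, and only the frame and column names come from the line above:

```python
import pandas as pd

# Made-up sample data; only the names come from the snippet above.
ordr_pr = pd.DataFrame({'order_id': [1, 1, 2], 'product_id': [10, 20, 10]})
prods = pd.DataFrame({'product_id': [10, 20], 'product_name': ['apple', 'bread']})

merged = pd.merge(ordr_pr, prods, how='inner', on='product_id')

# One row per order, with the product names collected into a list.
per_order = merged.groupby('order_id').agg({'product_name': list})
print(per_order)
```

Passing the column name as the string 'order_id' tells groupby to group on that column, much like group_by(order_id) followed by summarize in dplyr.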