Pandas: How to easily share a sample dataframe using df.to_dict()?
The answer:
In many situations, an approach using df.to_dict()
will do the job perfectly! Here are two cases that come to mind:
Case 1: You've got a dataframe built or loaded in Python from a local source
Case 2: You've got a table in another application (like Excel)
The details:
Case 1: You've got a dataframe built or loaded from a local source
Given that you've got a pandas dataframe named df, just:
- run df.to_dict() in your console or editor,
- copy the output, which is formatted as a dictionary, and
- paste the content into pd.DataFrame(<output>) and include that chunk in your now reproducible code snippet.
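Put together, the round trip looks like this (the dataframe below is a small stand-in for your real one):

```python
import pandas as pd

# A small stand-in for your real dataframe
df = pd.DataFrame({"x": [1, 2], "y": [3.5, 4.5]})

# Step 1: print the dict representation and copy it from your console
d = df.to_dict()
print(d)  # {'x': {0: 1, 1: 2}, 'y': {0: 3.5, 1: 4.5}}

# Step 2: anyone reading your question can paste that output straight
# back into the constructor to rebuild the same dataframe
df_restored = pd.DataFrame(d)
print(df_restored.equals(df))  # True
```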
Case 2: You've got a table in another application (like Excel)
Depending on the source and its separator (like ',', ';' or '\s+', where the latter means any whitespace), you can simply:
- hit Ctrl+C to copy the contents,
- run df = pd.read_clipboard(sep='\s+') in your console or editor,
- run df.to_dict(), and
- include the output in df = pd.DataFrame(<output>)
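Since pd.read_clipboard() effectively feeds the clipboard text to pd.read_csv, the same parsing step can be sketched without a real clipboard by using io.StringIO (the table text below is made up):

```python
import io
import pandas as pd

# Made-up table text standing in for what you'd have copied with Ctrl+C
clipboard_text = """a   b
1   x
2   y"""

# pd.read_clipboard(sep='\s+') performs this same parse on the clipboard contents
df = pd.read_csv(io.StringIO(clipboard_text), sep=r"\s+")
print(df.to_dict())  # {'a': {0: 1, 1: 2}, 'b': {0: 'x', 1: 'y'}}
```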
In this case, the start of your question would look something like this:
import pandas as pd
df = pd.DataFrame({0: {0: 0.25474768796402636, 1: 0.5792136563952824, 2: 0.5950396800676201},
1: {0: 0.9071073567355232, 1: 0.1657288354283053, 2: 0.4962367707789421},
2: {0: 0.7440601352930207, 1: 0.7755487356392468, 2: 0.5230707257648775}})
Of course, this gets a little clumsy with larger dataframes. But very often, anyone who seeks to answer your question only needs a small sample of your real-world data to take its structure into consideration.
And there are two ways you can handle larger dataframes:
- run df.head(20).to_dict() to only include the first 20 rows, and
- change the format of your dict using, for example, df.to_dict('split') (there are other options besides 'split') to reshape your output into a dict that requires fewer lines.
Here's an example using the iris dataset, which is available from plotly express, among other places.
If you just run:
import plotly.express as px
import pandas as pd
df = px.data.iris()
df.to_dict()
This will produce an output of nearly 1000 lines, which won't be very practical as a reproducible sample. But if you include .head(10)
, you'll get:
{'sepal_length': {0: 5.1, 1: 4.9, 2: 4.7, 3: 4.6, 4: 5.0, 5: 5.4, 6: 4.6, 7: 5.0, 8: 4.4, 9: 4.9},
'sepal_width': {0: 3.5, 1: 3.0, 2: 3.2, 3: 3.1, 4: 3.6, 5: 3.9, 6: 3.4, 7: 3.4, 8: 2.9, 9: 3.1},
'petal_length': {0: 1.4, 1: 1.4, 2: 1.3, 3: 1.5, 4: 1.4, 5: 1.7, 6: 1.4, 7: 1.5, 8: 1.4, 9: 1.5},
'petal_width': {0: 0.2, 1: 0.2, 2: 0.2, 3: 0.2, 4: 0.2, 5: 0.4, 6: 0.3, 7: 0.2, 8: 0.2, 9: 0.1},
'species': {0: 'setosa', 1: 'setosa', 2: 'setosa', 3: 'setosa', 4: 'setosa', 5: 'setosa', 6: 'setosa', 7: 'setosa', 8: 'setosa', 9: 'setosa'},
'species_id': {0: 1, 1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1}}
And now we're getting somewhere. But depending on the structure and content of the data, this may still not cover its complexity satisfactorily. You can, however, include more data on fewer lines by using to_dict('split'),
like this:
import plotly.express as px
df = px.data.iris().head(10)
df.to_dict('split')
Now your output will look like:
{'index': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
'columns': ['sepal_length',
'sepal_width',
'petal_length',
'petal_width',
'species',
'species_id'],
'data': [[5.1, 3.5, 1.4, 0.2, 'setosa', 1],
[4.9, 3.0, 1.4, 0.2, 'setosa', 1],
[4.7, 3.2, 1.3, 0.2, 'setosa', 1],
[4.6, 3.1, 1.5, 0.2, 'setosa', 1],
[5.0, 3.6, 1.4, 0.2, 'setosa', 1],
[5.4, 3.9, 1.7, 0.4, 'setosa', 1],
[4.6, 3.4, 1.4, 0.3, 'setosa', 1],
[5.0, 3.4, 1.5, 0.2, 'setosa', 1],
[4.4, 2.9, 1.4, 0.2, 'setosa', 1],
[4.9, 3.1, 1.5, 0.1, 'setosa', 1]]}
And now you can easily increase the number in .head(10)
without cluttering your question too much. But there's one minor drawback: you can no longer use the output directly in pd.DataFrame
. But if you include a few specifications with regard to index, columns, and data,
you'll be just fine. So for this particular dataset, my preferred approach would be:
import pandas as pd
import plotly.express as px
sample = {'index': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14],
'columns': ['sepal_length',
'sepal_width',
'petal_length',
'petal_width',
'species',
'species_id'],
'data': [[5.1, 3.5, 1.4, 0.2, 'setosa', 1],
[4.9, 3.0, 1.4, 0.2, 'setosa', 1],
[4.7, 3.2, 1.3, 0.2, 'setosa', 1],
[4.6, 3.1, 1.5, 0.2, 'setosa', 1],
[5.0, 3.6, 1.4, 0.2, 'setosa', 1],
[5.4, 3.9, 1.7, 0.4, 'setosa', 1],
[4.6, 3.4, 1.4, 0.3, 'setosa', 1],
[5.0, 3.4, 1.5, 0.2, 'setosa', 1],
[4.4, 2.9, 1.4, 0.2, 'setosa', 1],
[4.9, 3.1, 1.5, 0.1, 'setosa', 1],
[5.4, 3.7, 1.5, 0.2, 'setosa', 1],
[4.8, 3.4, 1.6, 0.2, 'setosa', 1],
[4.8, 3.0, 1.4, 0.1, 'setosa', 1],
[4.3, 3.0, 1.1, 0.1, 'setosa', 1],
[5.8, 4.0, 1.2, 0.2, 'setosa', 1]]}
df = pd.DataFrame(index=sample['index'], columns=sample['columns'], data=sample['data'])
df
Now you'll have this dataframe to work with:
sepal_length sepal_width petal_length petal_width species species_id
0 5.1 3.5 1.4 0.2 setosa 1
1 4.9 3.0 1.4 0.2 setosa 1
2 4.7 3.2 1.3 0.2 setosa 1
3 4.6 3.1 1.5 0.2 setosa 1
4 5.0 3.6 1.4 0.2 setosa 1
5 5.4 3.9 1.7 0.4 setosa 1
6 4.6 3.4 1.4 0.3 setosa 1
7 5.0 3.4 1.5 0.2 setosa 1
8 4.4 2.9 1.4 0.2 setosa 1
9 4.9 3.1 1.5 0.1 setosa 1
10 5.4 3.7 1.5 0.2 setosa 1
11 4.8 3.4 1.6 0.2 setosa 1
12 4.8 3.0 1.4 0.1 setosa 1
13 4.3 3.0 1.1 0.1 setosa 1
14 5.8 4.0 1.2 0.2 setosa 1
Which will increase your chances of receiving useful answers significantly!
Edit:
Note that a dict produced by df.to_dict() may contain values like 1: Timestamp('2020-01-02 00:00:00'). Pasting it back into pd.DataFrame() will then raise a NameError unless you also include from pandas import Timestamp.
Convert a Pandas DataFrame to a dictionary
The to_dict()
method sets the column names as dictionary keys so you'll need to reshape your DataFrame slightly. Setting the 'ID' column as the index and then transposing the DataFrame is one way to achieve this.
to_dict()
also accepts an 'orient' argument which you'll need in order to output a list of values for each column. Otherwise, a dictionary of the form {index: value}
will be returned for each column.
These steps can be done with the following line:
>>> df.set_index('ID').T.to_dict('list')
{'p': [1, 3, 2], 'q': [4, 3, 2], 'r': [4, 0, 9]}
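The question's dataframe isn't shown above, but a hypothetical one consistent with that output makes the snippet runnable (the column names A/B/C are made up):

```python
import pandas as pd

# Hypothetical input; the 'ID' column holds the future dictionary keys
df = pd.DataFrame({
    "ID": ["p", "q", "r"],
    "A": [1, 4, 4],
    "B": [3, 3, 0],
    "C": [2, 2, 9],
})

# Set 'ID' as the index, transpose so IDs become columns, then export lists
result = df.set_index("ID").T.to_dict("list")
print(result)  # {'p': [1, 3, 2], 'q': [4, 3, 2], 'r': [4, 0, 9]}
```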
In case a different dictionary format is needed, here are examples of the possible orient arguments. Consider the following simple DataFrame:
>>> df = pd.DataFrame({'a': ['red', 'yellow', 'blue'], 'b': [0.5, 0.25, 0.125]})
>>> df
a b
0 red 0.500
1 yellow 0.250
2 blue 0.125
Then the options are as follows.
dict - the default: column names are keys, values are dictionaries of index:data pairs
>>> df.to_dict('dict')
{'a': {0: 'red', 1: 'yellow', 2: 'blue'},
'b': {0: 0.5, 1: 0.25, 2: 0.125}}
list - keys are column names, values are lists of column data
>>> df.to_dict('list')
{'a': ['red', 'yellow', 'blue'],
'b': [0.5, 0.25, 0.125]}
series - like 'list', but values are Series
>>> df.to_dict('series')
{'a': 0 red
1 yellow
2 blue
Name: a, dtype: object,
'b': 0 0.500
1 0.250
2 0.125
Name: b, dtype: float64}
split - splits columns/data/index as keys with values being column names, data values by row and index labels respectively
>>> df.to_dict('split')
{'columns': ['a', 'b'],
'data': [['red', 0.5], ['yellow', 0.25], ['blue', 0.125]],
'index': [0, 1, 2]}
records - each row becomes a dictionary where key is column name and value is the data in the cell
>>> df.to_dict('records')
[{'a': 'red', 'b': 0.5},
{'a': 'yellow', 'b': 0.25},
{'a': 'blue', 'b': 0.125}]
index - like 'records', but a dictionary of dictionaries with keys as index labels (rather than a list)
>>> df.to_dict('index')
{0: {'a': 'red', 'b': 0.5},
1: {'a': 'yellow', 'b': 0.25},
2: {'a': 'blue', 'b': 0.125}}
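For completeness, here's a sketch of how some of these orients map back to a DataFrame, which is useful when you receive such a dict: 'split' feeds the constructor's keyword arguments, while 'index' needs from_dict with the matching orient.

```python
import pandas as pd

df = pd.DataFrame({'a': ['red', 'yellow', 'blue'], 'b': [0.5, 0.25, 0.125]})

# 'dict' (the default) goes straight back into the constructor
assert pd.DataFrame(df.to_dict()).equals(df)

# 'split' maps onto the constructor's data/index/columns keywords
d = df.to_dict('split')
restored = pd.DataFrame(data=d['data'], index=d['index'], columns=d['columns'])
assert restored.equals(df)

# 'index' needs from_dict with the matching orient
assert pd.DataFrame.from_dict(df.to_dict('index'), orient='index').equals(df)
```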
Return generator instead of list from df.to_dict()
There is no way to get a generator directly from to_dict(orient='records')
. However, it is possible to adapt the to_dict
source code into a generator that yields one record at a time instead of returning a list:
from pandas.core.common import standardize_mapping
from pandas.core.dtypes.cast import maybe_box_native

def dataframe_records_gen(df_):
    columns = df_.columns.tolist()
    into_c = standardize_mapping(dict)
    for row in df_.itertuples(index=False, name=None):
        yield into_c(
            (k, maybe_box_native(v)) for k, v in dict(zip(columns, row)).items()
        )
Sample Code:
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2],
    'B': [3, 4]
})

# Using Generator
for row in dataframe_records_gen(df):
    print(row)

# For Comparison with to_dict function
print("to_dict", df.to_dict(orient='records'))
Output:
{'A': 1, 'B': 3}
{'A': 2, 'B': 4}
to_dict [{'A': 1, 'B': 3}, {'A': 2, 'B': 4}]
For more natural syntax, it's also possible to register a custom accessor:
import pandas as pd
from pandas.core.common import standardize_mapping
from pandas.core.dtypes.cast import maybe_box_native

@pd.api.extensions.register_dataframe_accessor("gen")
class GenAccessor:
    def __init__(self, pandas_obj):
        self._obj = pandas_obj

    def records(self):
        columns = self._obj.columns.tolist()
        into_c = standardize_mapping(dict)
        for row in self._obj.itertuples(index=False, name=None):
            yield into_c(
                (k, maybe_box_native(v))
                for k, v in dict(zip(columns, row)).items()
            )
Which makes this generator accessible via the gen
accessor in this case:
df = pd.DataFrame({
    'A': [1, 2],
    'B': [3, 4]
})

# Using Generator through registered custom accessor
for row in df.gen.records():
    print(row)

# For Comparison with to_dict function
print("to_dict", df.to_dict(orient='records'))
Output:
{'A': 1, 'B': 3}
{'A': 2, 'B': 4}
to_dict [{'A': 1, 'B': 3}, {'A': 2, 'B': 4}]
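Either way, the payoff is lazy consumption: rows can be pulled one at a time with itertools.islice instead of materialising the whole list. Here's a simplified stand-alone version of the same idea (it skips the native-type boxing from pandas' internals for brevity):

```python
from itertools import islice

import pandas as pd

# Simplified records generator: itertuples is itself lazy, so each row dict
# is built only when requested (values keep their NumPy scalar types here,
# unlike to_dict('records'), which boxes them to native Python types)
def records_gen(df):
    columns = df.columns.tolist()
    for row in df.itertuples(index=False, name=None):
        yield dict(zip(columns, row))

df = pd.DataFrame({'A': range(1_000_000), 'B': range(1_000_000)})

# Only the first two row dicts are ever built
first_two = list(islice(records_gen(df), 2))
```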
Pandas DataFrame to List of Dictionaries
Use df.to_dict('records'), which gives the output without having to transpose externally.
In [2]: df.to_dict('records')
Out[2]:
[{'customer': 1, 'item1': 'apple', 'item2': 'milk', 'item3': 'tomato'},
 {'customer': 2, 'item1': 'water', 'item2': 'orange', 'item3': 'potato'},
 {'customer': 3, 'item1': 'juice', 'item2': 'mango', 'item3': 'chips'}]
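The question's dataframe isn't reproduced above, so here's a hypothetical one consistent with that output to make the snippet runnable:

```python
import pandas as pd

# Hypothetical input matching the records shown above
df = pd.DataFrame({
    'customer': [1, 2, 3],
    'item1': ['apple', 'water', 'juice'],
    'item2': ['milk', 'orange', 'mango'],
    'item3': ['tomato', 'potato', 'chips'],
})

records = df.to_dict('records')
print(records[0])  # {'customer': 1, 'item1': 'apple', 'item2': 'milk', 'item3': 'tomato'}
```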
Converting a pandas DataFrame to a list of dictionaries
First rename the columns to the desired output keys, then set the orient
argument to 'records'
when exporting to a dict with to_dict().
df.columns = ["parameterCode", "parameterName", "unit", "result", "uRange", "lRange"]
res = df.to_dict(orient='records')
print(res)
[{'parameterCode': 27162, 'parameterName': 'MPV (Mean Platelet Volume)', 'unit': 'fL', 'result': '12.3', 'uRange': 6.5, 'lRange': 12.0},
 {'parameterCode': 29789, 'parameterName': 'Platelet Count', 'unit': 'thou/mm3', 'result': '156', 'uRange': 150.0, 'lRange': 450.0},
 {'parameterCode': 29790, 'parameterName': 'Segmented Neutrophils', 'unit': '%', 'result': '72', 'uRange': 40.0, 'lRange': 80.0},
 {'parameterCode': 29791, 'parameterName': 'Lymphocytes', 'unit': '%', 'result': '22', 'uRange': 20.0, 'lRange': 40.0},
 {'parameterCode': 29792, 'parameterName': 'Monocytes', 'unit': '%', 'result': '3.7', 'uRange': 2.0, 'lRange': 10.0},
 {'parameterCode': 29793, 'parameterName': 'Eosinophils', 'unit': '%', 'result': '1.8', 'uRange': 1.0, 'lRange': 6.0},
 {'parameterCode': 29794, 'parameterName': 'Basophils', 'unit': '%', 'result': '0.5', 'uRange': nan, 'lRange': 2.0},
 {'parameterCode': 29795, 'parameterName': 'Neutrophils', 'unit': 'thou/mm3', 'result': '6.74', 'uRange': 2.0, 'lRange': 7.0},
 {'parameterCode': 29796, 'parameterName': 'Lymphocytes', 'unit': 'thou/mm3', 'result': '2.06', 'uRange': 1.0, 'lRange': 3.0},
 {'parameterCode': 29797, 'parameterName': 'Monocytes', 'unit': 'thou/mm3', 'result': '0.35', 'uRange': 0.2, 'lRange': 1.0},
 {'parameterCode': 29798, 'parameterName': 'Eosinophils', 'unit': 'thou/mm3', 'result': '0.17', 'uRange': 0.02, 'lRange': 0.5},
 {'parameterCode': 29799, 'parameterName': 'Basophils', 'unit': 'thou/mm3', 'result': '0.05', 'uRange': 0.01, 'lRange': 0.1},
 {'parameterCode': 29806, 'parameterName': 'Hemoglobin', 'unit': 'g/dL', 'result': '7', 'uRange': 11.5, 'lRange': 15.0},
 {'parameterCode': 29807, 'parameterName': 'Packed Cell Volume (PCV)', 'unit': '%', 'result': '26.7', 'uRange': 36.0, 'lRange': 46.0},
 {'parameterCode': 29808, 'parameterName': 'Total Leukocyte Count (TLC)', 'unit': 'thou/mm3', 'result': '9.36', 'uRange': 4.0, 'lRange': 10.0},
 {'parameterCode': 29809, 'parameterName': 'RBC Count', 'unit': 'mill/mm3', 'result': '3.54', 'uRange': 3.8, 'lRange': 4.8},
 {'parameterCode': 29810, 'parameterName': 'MCV', 'unit': 'fL', 'result': '75.4', 'uRange': 80.0, 'lRange': 100.0},
 {'parameterCode': 29811, 'parameterName': 'MCH', 'unit': 'pg', 'result': '19.8', 'uRange': 27.0, 'lRange': 32.0},
 {'parameterCode': 29812, 'parameterName': 'MCHC', 'unit': 'g/dL', 'result': '26.2', 'uRange': 32.0, 'lRange': 35.0},
 {'parameterCode': 29813, 'parameterName': 'Red Cell Distribution Width (RDW)', 'unit': '%', 'result': '17.8', 'uRange': 11.5, 'lRange': 14.5}]
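A minimal, self-contained sketch of the same rename-then-export step, with two made-up rows rather than the full lab report:

```python
import pandas as pd

# Two made-up rows in the shape of the question's data
df = pd.DataFrame([
    [27162, 'MPV (Mean Platelet Volume)', 'fL', '12.3', 6.5, 12.0],
    [29789, 'Platelet Count', 'thou/mm3', '156', 150.0, 450.0],
])

# Rename the columns to the desired keys, then export row-wise
df.columns = ["parameterCode", "parameterName", "unit", "result", "uRange", "lRange"]
res = df.to_dict(orient='records')
print(res[0]['unit'])  # fL
```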
Sampling data from the pandas dataframe
If you're re-assigning frequency to the original dataframe, that's probably the issue. Make sure you don't have duplicate labels and weights going into your sampling.
Using your summary data I can generate 5000 samples which do have (roughly) the same distribution as the original:
In [1]: import pandas as pd
In [2]: summary = pd.DataFrame(
...: [
...: ['A', 0.350019],
...: ['B', 0.209966],
...: ['C', 0.126553],
...: ['D', 0.100983],
...: ['E', 0.053767],
...: ['F', 0.039378],
...: ['G', 0.029529],
...: ['H', 0.019056],
...: ['I', 0.016783],
...: ['J', 0.014813],
...: ['K', 0.014152],
...: ['L', 0.013477],
...: ['M', 0.009444],
...: ['N', 0.002082],
...: ],
...: columns=['label', 'freq']
...: )
You can sample from the summary table, weighting each unique label with the frequency in the original dataset:
In [3]: summary.label.sample(
...: n=5000,
...: weights=summary.freq,
...: replace=True,
...: ).value_counts(normalize=True)
Out[3]:
label
A 0.3448
B 0.2198
C 0.1356
D 0.0952
E 0.0488
F 0.0322
G 0.0284
H 0.0234
I 0.0168
J 0.0162
K 0.0146
L 0.0140
M 0.0090
N 0.0012
dtype: float64
Alternatively, you could simply skip the calculation of the frequencies altogether - pandas will do this for you:
In [7]: df = pd.DataFrame(np.random.choice(["A", "B", "C", "D"], size=20_000, p=[0.6, 0.3, 0.05, 0.05]), columns=["label"])
In [8]: df.label.sample(5000, replace=True).value_counts(normalize=True)
Out[8]:
A 0.5994
B 0.2930
C 0.0576
D 0.0500
Name: label, dtype: float64
The issue with the code in your question is that you end up weighting based on frequency and based on the explicit weights (which also account for frequency):
In [2]: df = pd.DataFrame(np.random.choice(["A", "B", "C", "D"], size=20_000, p=[0.6, 0.3, 0.05, 0.05]), columns=["label"])
In [3]: df['frequency'] = df.groupby('label')['label'].transform('count')
In [4]: df
Out[4]:
label frequency
0 A 11908
1 A 11908
2 B 5994
3 B 5994
4 D 1033
... ... ...
19995 A 11908
19996 D 1033
19997 A 11908
19998 A 11908
19999 A 11908
The result is roughly equal to the normalized square of each frequency:
In [6]: freqs = np.array([0.6, 0.3, 0.05, 0.05])
In [7]: (freqs ** 2) / (freqs ** 2).sum()
Out[7]:
array([0.79120879, 0.1978022 , 0.00549451, 0.00549451])
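That claim can be checked with a quick, seeded simulation (the exact proportions vary with the seed, so no hard numbers are promised here):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
freqs = np.array([0.6, 0.3, 0.05, 0.05])

# Rebuild the setup: attach per-row frequency counts, then sample weighted
# by those counts, which is the double-counting described above
df = pd.DataFrame({'label': rng.choice(list('ABCD'), size=20_000, p=freqs)})
df['frequency'] = df.groupby('label')['label'].transform('count')

observed = (
    df['label']
    .sample(5_000, weights=df['frequency'], replace=True, random_state=0)
    .value_counts(normalize=True)
)

# The observed share of 'A' lands near the normalized squared frequency
expected = (freqs ** 2) / (freqs ** 2).sum()
print(round(observed['A'], 2), round(expected[0], 2))
```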
Convert Python dict into a dataframe
The error here arises because the DataFrame constructor is called with scalar values (it expects each value to be a list/dict/..., i.e. a column of values):
pd.DataFrame(d)
ValueError: If using all scalar values, you must pass an index
You could take the items from the dictionary (i.e. the key-value pairs):
In [11]: pd.DataFrame(d.items()) # or list(d.items()) in python 3
Out[11]:
0 1
0 2012-07-02 392
1 2012-07-06 392
2 2012-06-29 391
3 2012-06-28 391
...
In [12]: pd.DataFrame(d.items(), columns=['Date', 'DateValue'])
Out[12]:
Date DateValue
0 2012-07-02 392
1 2012-07-06 392
2 2012-06-29 391
But I think it makes more sense to pass the dict to the Series constructor:
In [21]: s = pd.Series(d, name='DateValue')
In [22]: s
Out[22]:
2012-06-08    388
2012-06-09    388
2012-06-10    388
Name: DateValue, dtype: int64
In [23]: s.index.name = 'Date'
In [24]: s.reset_index()
Out[24]:
         Date  DateValue
0  2012-06-08        388
1  2012-06-09        388
2  2012-06-10        388
How to convert dataframe to dictionary in pandas WITHOUT index
When I see your dataset with 2 columns I see a series and not a dataframe.
Try this: d = df.set_index('name')['coverage'].to_dict()
which will convert your dataframe to a series and output that.
However, if your intent is to have more columns and not a common key you could store them in an array instead using 'records'. d = df.to_dict('r')
.
`
Runnable code:
import pandas as pd
df = pd.DataFrame({
'name': ['Jason'],
'coverage': [25.1]
})
print(df.to_dict())
print(df.set_index('name')['coverage'].to_dict())
print(df.to_dict('records'))
Returns:
{'name': {0: 'Jason'}, 'coverage': {0: 25.1}}
{'Jason': 25.1}
[{'name': 'Jason', 'coverage': 25.1}]
And one more thing: try to avoid using dict as a variable name, since it shadows the built-in type.
How to make good reproducible pandas examples
Note: The ideas here are pretty generic for Stack Overflow, indeed for questions in general.
Disclaimer: Writing a good question is hard.
The Good:
do include a small* example DataFrame, either as runnable code:
In [1]: df = pd.DataFrame([[1, 2], [1, 3], [4, 6]], columns=['A', 'B'])
or make it "copy and pasteable" using
pd.read_clipboard(sep='\s\s+')
. You can format the text for Stack Overflow: highlight it and use Ctrl+K (or prepend four spaces to each line), or place three backticks (```) above and below your code, with your code unindented:
In [2]: df
Out[2]:
   A  B
0  1  2
1  1  3
2  4  6
Test pd.read_clipboard(sep='\s\s+') yourself.
* I really do mean small. The vast majority of example DataFrames could be fewer than 6 rows [citation needed], and I bet I can do it in 5 rows. Can you reproduce the error with df = df.head()? If not, fiddle around to see if you can make up a small DataFrame which exhibits the issue you are facing.
* Every rule has an exception. The obvious one is for performance issues (in which case definitely use %timeit and possibly %prun), where you should generate:
df = pd.DataFrame(np.random.randn(100000000, 10))
Consider using np.random.seed so we have the exact same frame. Saying that, "make this code fast for me" is not strictly on topic for the site.
write out the outcome you desire (similarly to above):
In [3]: iwantthis
Out[3]:
   A  B
0  1  5
1  4  6
Explain where the numbers come from: the 5 is the sum of the B column for the rows where A is 1.
do show the code you've tried:
In [4]: df.groupby('A').sum()
Out[4]:
   B
A
1  5
4  6
But say what's incorrect: the A column is in the index rather than a column.
do show you've done some research (search the documentation, search Stack Overflow), and give a summary:
The docstring for sum simply states "Compute sum of group values". The groupby documentation doesn't give any examples for this.
Aside: the answer here is to use df.groupby('A', as_index=False).sum().
if it's relevant that you have Timestamp columns, e.g. you're resampling or something, then be explicit and apply pd.to_datetime to them for good measure**:
df['date'] = pd.to_datetime(df['date'])  # this column ought to be dates
** Sometimes this is the issue itself: they were strings.
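A tiny sketch of that failure mode: strings that merely look like dates don't behave like timestamps until converted.

```python
import pandas as pd

# These only *look* like dates; the dtype is plain object (strings)
df = pd.DataFrame({'date': ['2021-01-01', '2021-01-02'], 'value': [1, 2]})
print(df['date'].dtype)  # object

# After conversion, datetime-only operations (resampling, .dt, ...) work
df['date'] = pd.to_datetime(df['date'])
print(df['date'].dtype)  # datetime64[ns]
print(df.set_index('date').resample('D')['value'].sum())
```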
The Bad:
don't include a MultiIndex, which we can't copy and paste (see above). This is kind of a grievance with pandas' default display, but nonetheless annoying:
In [11]: df
Out[11]:
     C
A B
1 2  3
  2  6
The correct way is to include an ordinary DataFrame with a set_index call:
In [12]: df = pd.DataFrame([[1, 2, 3], [1, 2, 6]], columns=['A', 'B', 'C']).set_index(['A', 'B'])
In [13]: df
Out[13]:
     C
A B
1 2  3
  2  6
do provide insight into what it is when giving the outcome you want:
   B
A
1  1
5  0
Be specific about how you got the numbers (what are they)... double check they're correct.
If your code throws an error, do include the entire stack trace (this can be edited out later if it's too noisy). Show the line number (and the corresponding line of your code which it's raising against).
The Ugly:
don't link to a CSV file we don't have access to (ideally don't link to an external source at all...)
df = pd.read_csv('my_secret_file.csv') # ideally with lots of parsing options
Most data is proprietary, we get that: make up similar data and see if you can reproduce the problem (something small).
don't explain the situation vaguely in words, like you have a DataFrame which is "large", mention some of the column names in passing (be sure not to mention their dtypes). Try and go into lots of detail about something which is completely meaningless without seeing the actual context. Presumably no one is even going to read to the end of this paragraph.
Essays are bad, it's easier with small examples.
don't include 10+ (100+??) lines of data munging before getting to your actual question.
Please, we see enough of this in our day jobs. We want to help, but not like this....
Cut the intro, and just show the relevant DataFrames (or small versions of them) in the step which is causing you trouble.
Anyway, have fun learning Python, NumPy and Pandas!