How to Calculate Range Between the Dataframe Values Using Python

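import pandas as pd

df = pd.DataFrame({
    '8. Requirement': ['.685-.695', '.340-.350', '.737-.740', 'foo', '42'],
    '9.Results': [.68, .345, '.739', '.68', 'bar']
})
# or df = pd.read_csv('filename.csv', sep='\t')

# Pull the lower and upper bounds out of strings shaped like "min-max";
# rows that do not match the pattern get NaN in both columns
bounds = df['8. Requirement'].str.extract(r'(\d*\.?\d+)-(\d*\.?\d+)')
df = df.join(bounds.rename(columns={0: 'min', 1: 'max'}))
# Non-numeric results become NaN, and any comparison with NaN is False
df['OK'] = pd.to_numeric(df['9.Results'], errors='coerce').between(df['min'].astype(float), df['max'].astype(float))
print(df)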

Output:

  8. Requirement 9.Results   min   max     OK
0      .685-.695      0.68  .685  .695  False
1      .340-.350     0.345  .340  .350   True
2      .737-.740      .739  .737  .740   True
3            foo       .68   NaN   NaN  False
4             42       bar   NaN   NaN  False

Calculating the range of values in a Pandas DataFrame using groupby function

There is no High column in df, so replace High with the name of your own column:

df.groupby("ChildID").apply(lambda x: x['abdomcirc'].max() - x['abdomcirc'].min())
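
As a self-contained sketch of the per-group range, the snippet below builds a small made-up frame (the values are illustrative, only the column names come from the line above) and computes max minus min for each ChildID. It uses agg on the selected column rather than apply on the whole group, which should give the same numbers.

```python
import pandas as pd

# Hypothetical measurements; only the column names come from the answer above
df = pd.DataFrame({
    "ChildID": [1, 1, 1, 2, 2],
    "abdomcirc": [30.0, 32.5, 31.0, 28.0, 29.5],
})

# Range per child: largest value minus smallest value within each group
ranges = df.groupby("ChildID")["abdomcirc"].agg(lambda s: s.max() - s.min())
print(ranges)
```

The result is a Series indexed by ChildID, so it can be joined back onto the original frame if needed.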

Range of values in Pandas

You could try defining a custom range function, such as:

def calc_range(x):
    return np.max(x) - np.min(x)

and then passing it as a function in agg:

data.groupby("DPI").agg({"SUM_ALL" :["count",pd.Series.mode,"mean","median","min","max", calc_range]})
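
A minimal runnable version of this idea, assuming a toy frame with the DPI and SUM_ALL columns (the numbers are made up). The custom function shows up in the result under its own name, next to the built-in aggregations.

```python
import numpy as np
import pandas as pd

def calc_range(x):
    """Spread of a group: largest value minus smallest value."""
    return np.max(x) - np.min(x)

# Hypothetical data; DPI and SUM_ALL are the column names used above
data = pd.DataFrame({
    "DPI": [300, 300, 300, 600, 600],
    "SUM_ALL": [10, 14, 11, 20, 26],
})

# One row per DPI, one column per aggregation
summary = data.groupby("DPI").agg({"SUM_ALL": ["count", "mean", "min", "max", calc_range]})
print(summary)
```

The columns of summary form a two-level index, so the range for one group is read with summary.loc[300, ("SUM_ALL", "calc_range")].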

Determine the range of a value using a lookup table

You can use a few vectorized NumPy operations to build boolean masks, then use the masks to select the labels:

import numpy as np

a = numbers['number'].values   # numpy array of numbers
r = ranges.set_index('range')  # dataframe of min/max with labels as index

# each mask has one row per number and one column per range
m1 = (a >= r['range_min'].values[:, None]).T  # is the number at or above each min?
m2 = (a < r['range_max'].values[:, None]).T   # is the number below each max?
m3 = m1 & m2                                  # both conditions hold
# NB. the two operations could be done without the intermediate variables m1/m2

m4 = m3.sum(1)  # how many ranges match each number?
# 0  -> out_of_range
# 1  -> take the label of the matching range
# 2+ -> overlap

# now we select the label according to the conditions
numbers['detected_range'] = np.select(
    [m4 == 0, m4 >= 2],           # out_of_range and overlap
    ['out_of_range', 'overlap'],
    # otherwise take the label of the single matching range
    default=np.take(r.index, m3.argmax(1)),
)

Output:

   number detected_range
0      50   out_of_range
1      65   out_of_range
2      75              C
3      85              B
4      90        overlap

edit:

This works with any number of intervals in ranges.

Example output with an extra interval ['D', 50, 51]:

   number detected_range
0      50              D
1      65   out_of_range
2      75              C
3      85              B
4      90        overlap
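
The inputs numbers and ranges are not shown above, so here is a self-contained sketch of the same masking technique with made-up data. The interval bounds are assumptions, chosen only so that the result agrees with the first output listed; the real lookup table may well differ.

```python
import numpy as np
import pandas as pd

# Assumed inputs: the question's data is not shown, so these values are
# invented to be consistent with the first output above
numbers = pd.DataFrame({"number": [50, 65, 75, 85, 90]})
ranges = pd.DataFrame({
    "range": ["A", "B", "C"],
    "range_min": [90, 80, 70],
    "range_max": [100, 95, 80],
})

a = numbers["number"].to_numpy()
lo = ranges["range_min"].to_numpy()
hi = ranges["range_max"].to_numpy()
labels = ranges["range"].to_numpy()

# inside[i, j] is True when number i lies in the half-open interval [lo[j], hi[j])
inside = (a[:, None] >= lo) & (a[:, None] < hi)
hits = inside.sum(axis=1)  # number of matching intervals per number

numbers["detected_range"] = np.select(
    [hits == 0, hits >= 2],
    ["out_of_range", "overlap"],
    default=labels[inside.argmax(axis=1)],  # label of the single match
)
print(numbers)
```

Broadcasting a column vector of numbers against the row of bounds avoids the transposes, at the cost of building a numbers-by-intervals boolean matrix in memory.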

Calculating a range based on the fields of a pandas DataFrame

Try this:

d.apply(lambda x: np.arange(x['start'], x['end']+1), axis=1)

Output:

0          [1, 2, 3, 4]
1    [4, 5, 6, 7, 8, 9]
2      [6, 7, 8, 9, 10]
3    [8, 9, 10, 11, 12]
dtype: object

Note: np.arange and range expect scalar start and stop values, not a pd.Series, so use apply row-wise (axis=1) to build one range per row.
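
For a runnable version, the sketch below assumes a frame d with start and end columns whose values were picked to match the output above. It also shows a list comprehension over the two columns as an alternative to apply.

```python
import numpy as np
import pandas as pd

# Hypothetical frame chosen to match the output shown above
d = pd.DataFrame({"start": [1, 4, 6, 8], "end": [4, 9, 10, 12]})

# One inclusive integer range per row; +1 because np.arange excludes the stop value
rows = d.apply(lambda x: np.arange(x["start"], x["end"] + 1), axis=1)

# Alternative without apply: iterate over the two columns in parallel
as_lists = [list(range(s, e + 1)) for s, e in zip(d["start"], d["end"])]

print(rows)
print(as_lists)
```

On larger frames the comprehension is usually the faster of the two, since it skips building a Series object for every row.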

How to select a range of values in a pandas dataframe column?

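Use between with inclusive="neither" for strict inequalities:

df['two'].between(-0.5, 0.5, inclusive="neither")

The inclusive parameter determines which endpoints are included: "both" (the default) uses <= on each side, "neither" uses < on each side, and "left" or "right" include only that endpoint. These string values are available from pandas 1.3; earlier versions accepted only True or False, and the boolean form no longer works in pandas 2.0. Mixed inequalities can also be coded explicitly:

(df['two'] >= -0.5) & (df['two'] < 0.5)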
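
The endpoint behaviour is easiest to see on a small example. This sketch assumes pandas 1.3 or later, where inclusive takes the strings "both", "neither", "left" and "right"; the data is made up.

```python
import pandas as pd

# Hypothetical column; "two" is the column name used above
df = pd.DataFrame({"two": [-0.5, -0.2, 0.0, 0.5, 0.9]})

strict = df["two"].between(-0.5, 0.5, inclusive="neither")  # -0.5 <  x <  0.5
closed = df["two"].between(-0.5, 0.5)                       # -0.5 <= x <= 0.5
left = df["two"].between(-0.5, 0.5, inclusive="left")       # -0.5 <= x <  0.5
explicit = (df["two"] >= -0.5) & (df["two"] < 0.5)          # same mask as "left"

# Each mask is a boolean Series and can be used to filter rows
print(df[strict])
```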

How can I calculate the average true range in pandas?

It looks like you might be trying to do the following:

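import pandas as pd
from numpy.random import rand

# columns must be an ordered list; a set literal ({...}) has no defined order
df = pd.DataFrame(rand(10, 5), columns=['High-Low', 'Low-close', 'B', 'A', 'High-close'])

# row-wise maximum of the three candidate ranges
cols = ['High-Low', 'High-close', 'Low-close']
df['true_range'] = df[cols].max(axis=1)
print(df)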

The output will look like this (the values are random, so yours will differ):

   High-Low  Low-close         B         A  High-close  true_range
0  0.916121   0.026572  0.082619  0.672000    0.605287    0.916121
1  0.622589   0.944646  0.638486  0.905139    0.262275    0.944646
2  0.611374   0.756191  0.829803  0.828205    0.614956    0.756191
3  0.810638   0.501693  0.504800  0.069532    0.283825    0.810638
4  0.984463   0.900823  0.434061  0.905273    0.518056    0.984463
5  0.377742   0.480266  0.018676  0.383831    0.819448    0.819448
6  0.473753   0.652077  0.730400  0.305507    0.396969    0.652077
7  0.427047   0.733135  0.526076  0.542852    0.719194    0.733135
8  0.911629   0.633997  0.101848  0.020811    0.327233    0.911629
9  0.244624   0.893365  0.278941  0.354696    0.678280    0.893365

If this isn't what you had in mind, it would be helpful to clarify your question by providing a small example where you clearly identify the columns and the index in your DataFrame and what you mean by "true range".
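
If the goal is the usual technical-analysis definition, the true range of a bar is the largest of high minus low, |high minus previous close| and |low minus previous close|, and the average true range is a moving average of it. The sketch below assumes columns named high, low and close, made-up prices, and a 3-bar simple moving average. Wilder's original ATR uses a different smoothing, so treat this as one common variant, not the only definition.

```python
import pandas as pd

# Hypothetical price bars; the column names high/low/close are assumptions
prices = pd.DataFrame({
    "high":  [10.0, 11.0, 12.0, 11.5, 13.0],
    "low":   [9.0, 10.0, 10.5, 10.0, 11.0],
    "close": [9.5, 10.5, 11.0, 10.5, 12.5],
})

prev_close = prices["close"].shift(1)

# The three candidate ranges for each bar
candidates = pd.concat(
    [
        prices["high"] - prices["low"],
        (prices["high"] - prev_close).abs(),
        (prices["low"] - prev_close).abs(),
    ],
    axis=1,
)

# True range is the row-wise maximum; the NaN values caused by the first
# bar's missing previous close are skipped by max()
prices["true_range"] = candidates.max(axis=1)

# Simple moving average of the true range over an arbitrary 3-bar window
prices["atr"] = prices["true_range"].rolling(window=3).mean()
print(prices)
```

The first two atr values are NaN because the rolling window is not yet full.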

Pandas / python to autogenerate new values through a range

Here's one way to do it:

import pandas as pd
df = pd.DataFrame({
    'value': ['a', 'b'],
    'range1': [0, 4],
    'range2': [3, 6],
    'color': ['blue', 'yellow']
})
# build one label per integer in [range1, range2), numbered from range1 + 1
df['output'] = df.apply(lambda x: [x['value'] + str(i + 1) for i in range(x['range1'], x['range2'])], axis=1)
# one row per label
df = df.explode('output', ignore_index=True)[['output', 'value', 'color']]
print(df)

Output:

  output value   color
0     a1     a    blue
1     a2     a    blue
2     a3     a    blue
3     b5     b  yellow
4     b6     b  yellow

