Calculate and Print the Average Value of Strings in a Column

This is one way with awk:

$ awk '{a[$1]+=$2; ++b[$1]} END {for (i in a) print i, a[i]/b[i]}' file
435.212 108.899
435.25 108.9
435.238 108.9
435.262 108.9
435.275 108.9

Explanation

{a[$1]+=$2; ++b[$1]}

  • Add each z value (2nd column) to a running sum in the array a, keyed by the x value (1st column).
  • Count the number of rows seen for each x value in the array b.

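END {for (i in a) print i, a[i]/b[i]}

  • Loop over the keys of a and print each key with its average (sum divided by count). The iteration order of for (i in a) is unspecified in awk, so the lines may not come out sorted.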

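To use a different number format (four decimal places, for example), use printf instead of print. The key is printed with %s here, because %d would truncate a key such as 435.212 to 435:

printf "%s %.4f\n", i, a[i]/b[i]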
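
For comparison, the same sum-and-count approach can be sketched in Python. This is only an illustration under assumptions: the function name column_averages and the sample lines are hypothetical, and the input is assumed to be whitespace-separated with the key in the first column and a numeric value in the second.

```python
from collections import defaultdict

def column_averages(lines):
    """Average the 2nd column for each distinct value in the 1st column."""
    sums = defaultdict(float)   # plays the role of the awk array a
    counts = defaultdict(int)   # plays the role of the awk array b
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # assumption: blank or short lines are skipped
        key, value = fields[0], float(fields[1])
        sums[key] += value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}

# Hypothetical sample data, not taken from the original question
sample = ["435.25 108.8", "435.25 109.0", "435.262 108.9"]
for key, avg in column_averages(sample).items():
    print(f"{key} {avg:.4f}")
```

The keys are kept as strings, so they are printed exactly as they appear in the input.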

How to get the average length of the strings in each column in csv?

You can zip the rows and map the columns to len and use statistics.mean to calculate averages:

import csv
from statistics import mean
with open('someFile.csv', 'r', newline='') as f, open('results.csv', 'w', newline='') as output:
    reader = csv.reader(f, delimiter=' ', skipinitialspace=True)
    headers = next(reader)
    writer = csv.writer(output, delimiter=' ')
    writer.writerow(headers)
    writer.writerow([mean(map(len, col)) for col in zip(*reader)])
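
The least obvious step is the transposition: zip(*rows) turns a sequence of rows into a sequence of columns. A minimal sketch with in-memory data, where the helper name average_column_lengths and the sample text are made up for illustration:

```python
import csv
import io
from statistics import mean

def average_column_lengths(text, delimiter=" "):
    """Return (headers, average string length per column) for delimited text."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, skipinitialspace=True)
    headers = next(reader)    # first row holds the column names
    columns = zip(*reader)    # transpose the remaining rows into columns
    return headers, [mean(map(len, col)) for col in columns]

# Hypothetical sample data
sample = "name city\nAnn Oslo\nRoberto Rome\n"
headers, averages = average_column_lengths(sample)
print(headers, averages)
```

Note that zip stops at the shortest row, so rows with fewer fields than the others cause trailing columns to be dropped.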

How to calculate mean values for a particular column in dataframe?

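The code below applies a custom function that checks the first character of each element: a value written as '>x' is replaced by the midpoint between x and upper, a value written as '<x' by the midpoint between lower and x, and a plain number is converted as-is. Missing values stay NaN.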

import numpy as np
import pandas as pd
upper = 30
lower = 0

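df = pd.DataFrame({'col1': ['>20', np.nan, '<5', '12', '>1', '<10', np.nan, '8']})

def avg(val):
    # Missing values stay missing
    if pd.isna(val):
        return np.nan
    char = val[0]
    if char == '>':
        # Midpoint between the stated value and the upper bound
        res = (float(val[1:]) + upper) / 2
    elif char == '<':
        # Midpoint between the lower bound and the stated value
        res = (float(val[1:]) + lower) / 2
    else:
        res = float(val)
    return res

print(df["col1"].apply(avg))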

Output:

0    25.0
1     NaN
2     2.5
3    12.0
4    15.5
5     5.0
6     NaN
7     8.0
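
If a single mean for the whole column is wanted after the conversion, the missing entries have to be skipped. Below is a pandas-free sketch of the same rule, assuming the same bounds of 30 and 0; the names to_number and column_mean are illustrative and not part of the original answer.

```python
import math
from statistics import fmean

UPPER = 30  # assumed upper bound for values written as '>x'
LOWER = 0   # assumed lower bound for values written as '<x'

def to_number(val):
    """Convert '>x', '<x' or 'x' to a float; missing values become NaN."""
    if val is None or (isinstance(val, float) and math.isnan(val)):
        return math.nan
    if val[0] == ">":
        return (float(val[1:]) + UPPER) / 2
    if val[0] == "<":
        return (float(val[1:]) + LOWER) / 2
    return float(val)

def column_mean(values):
    """Mean of the converted values, ignoring missing entries."""
    numbers = [n for n in map(to_number, values) if not math.isnan(n)]
    return fmean(numbers) if numbers else math.nan

col = [">20", None, "<5", "12", ">1", "<10", None, "8"]
print(column_mean(col))
```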

Bash column dependent average

An awk way without arrays, assuming that rows with the same first-column value are adjacent:

 awk 'x~/./&&x!=$1{printf "%d\t%.1f\n",x,y/z;y=z=""}
{x=$1;z++;y+=$2}END{printf "%d\t%.1f\n",x,y/z}' file

9 152.0
391 284.8
394 206.7
450 193.3
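
The same single-pass idea, which relies on rows with equal keys being adjacent, can be sketched in Python with itertools.groupby. The function name grouped_averages and the sample rows are hypothetical:

```python
from itertools import groupby

def grouped_averages(rows):
    """Yield (key, average) for each run of consecutive rows sharing a key."""
    for key, group in groupby(rows, key=lambda row: row[0]):
        values = [value for _, value in group]
        yield key, sum(values) / len(values)

# Hypothetical sample data, already grouped by the first column
rows = [(9, 150.0), (9, 154.0), (391, 284.8)]
for key, avg in grouped_averages(rows):
    print(f"{key}\t{avg:.1f}")
```

As with the awk version, a key that reappears after a different key starts a new group, so ungrouped input should be sorted first.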

Finding average length of strings in a list

total_avg = sum( map(len, strings) ) / len(strings)

The problem in your code is this line:
total_size = sum(all_lengths)
There is no need to recalculate the sum on every iteration of the loop.
Compute it once, after the loop has finished.
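
A sketch of the loop-based version with the sum moved out of the loop, next to the one-line form. The names all_lengths and total_size follow the line quoted above; the function name average_length and the sample values are invented for illustration.

```python
def average_length(strings):
    """Average string length, summing once after the loop."""
    all_lengths = []
    for s in strings:
        all_lengths.append(len(s))   # only collect lengths inside the loop
    total_size = sum(all_lengths)    # sum once, after the loop
    return total_size / len(strings)

# Hypothetical sample data
strings = ["apple", "fig", "banana"]
print(average_length(strings))
print(sum(map(len, strings)) / len(strings))  # one-line equivalent
```

Both forms raise ZeroDivisionError for an empty list, so an empty input needs its own check.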

Calculating average of values in column of an array?

The problem is here:

while ((line = in.readLine()) != null) {
    builder.append(line);
    builder.append("\n");
    index++; // increment the index to move the next one up for the next line

    String[] temp = line.split(",");
    c4 = Double.parseDouble(temp[3]);
    c5 = Double.parseDouble(temp[4]);
    c6 = Double.parseDouble(temp[5]);
}

You're storing your values in temporary variables (c4, c5, c6).
These variables are reassigned on every iteration of the while loop, so only the values from the last line survive.

You can do one of two things:

  1. Keep a running sum and a row count, then calculate the average at the end: average = sum / count.
  2. Store all values in ArrayLists, and calculate the average at the end.

Example:

double c4avg = 0, c5avg = 0, c6avg = 0;

while ((line = in.readLine()) != null) {
    builder.append(line);
    builder.append("\n");
    index++; // increment the index to move the next one up for the next line

    String[] temp = line.split(",");
    // Accumulate the running sums (these variables hold sums until the division below)
    c4avg += Double.parseDouble(temp[3]);
    c5avg += Double.parseDouble(temp[4]);
    c6avg += Double.parseDouble(temp[5]);
}
// Divide by the total number of rows to get the averages
// (this assumes index started at 0, so it equals the row count here)
c4avg /= index;
c5avg /= index;
c6avg /= index;
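
The second option, storing every value and averaging at the end, is sketched below in Python rather than Java, to match the other added examples on this page. It assumes, as in the loop above, comma-separated lines with the values of interest in the 4th to 6th fields; the function name column_means and the sample lines are hypothetical.

```python
from statistics import fmean

def column_means(lines, indexes=(3, 4, 5)):
    """Collect the chosen comma-separated fields per column, then average each."""
    columns = {i: [] for i in indexes}
    for line in lines:
        fields = line.split(",")
        for i in indexes:
            columns[i].append(float(fields[i]))
    return {i: fmean(values) for i, values in columns.items()}

# Hypothetical sample data
lines = ["a,b,c,1.0,2.0,3.0", "d,e,f,3.0,4.0,5.0"]
print(column_means(lines))
```

Keeping all values costs more memory than a running sum, but it leaves the data available for other statistics such as the median.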

