Find the Average of Fields in the Columns

How to get the average of a column in MySQL

The built-in AVG aggregate function can be used like so:

select avg(rating) from table_name

Note that, like most aggregate functions, AVG ignores NULL values: the average of 1, 2, NULL is 1.5, not 1.0. Also, MySQL returns a DECIMAL when you average DECIMAL or integer columns, so read the result into the appropriate C# datatype.
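
To see the NULL behaviour concretely, here is a minimal sketch in Python. It assumes an in-memory SQLite database as a stand-in for MySQL (SQLite's AVG also skips NULLs, but it returns a float rather than a DECIMAL); the table and column names follow the query above.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption, for portability).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (rating INTEGER)")
conn.executemany("INSERT INTO table_name (rating) VALUES (?)", [(1,), (2,), (None,)])

# AVG skips the NULL row, so it divides by 2 rows, not 3.
avg_rating, = conn.execute("SELECT AVG(rating) FROM table_name").fetchone()

# Counting NULL as 0 has to be requested explicitly with COALESCE.
avg_with_zero, = conn.execute(
    "SELECT AVG(COALESCE(rating, 0)) FROM table_name"
).fetchone()

print(avg_rating, avg_with_zero)
```

If NULL ratings should count as zero, the COALESCE form is the one to use.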

How to find average of a particular field in Scala

You can simply do the following:

import scala.util.Try

val text = sc.textFile("/neerja/input.txt")

val fourth = text.map(line => line.split("\\t"))
      .map(arr => Try(arr(4).toDouble) getOrElse(0.0)).mean()

println(fourth)

This should give you the average of the subject in the fifth column (index 4).
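
For readers who want to check the logic without a Spark cluster, the same idea can be sketched in plain Python. The helper name `column_mean` and the sample rows are hypothetical; the fallback to 0.0 mirrors `Try(...) getOrElse(0.0)` above.

```python
def column_mean(lines, index, default=0.0):
    """Average one tab-separated column, substituting `default` for bad cells."""
    values = []
    for line in lines:
        fields = line.split("\t")
        try:
            values.append(float(fields[index]))
        except (IndexError, ValueError):
            # Same role as Try(...).getOrElse(0.0): a missing or
            # non-numeric cell is counted as `default`.
            values.append(default)
    return sum(values) / len(values)

# Hypothetical sample rows: a serial number followed by four subject scores.
sample = [
    "1\t10.0\t2.0\t3.0\t4.0",
    "2\t20.0\t4.0\t5.0\tn/a",
    "3\t30.0\t6.0\t7.0\t8.0",
]
fifth_column_mean = column_mean(sample, 4)
print(fifth_column_mean)
```

Note the design consequence: an unparsable cell still counts as a row, so it pulls the mean towards 0.0 rather than being skipped.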

Update:

If the average of all the subject columns is required, I would suggest creating a DataFrame. DataFrames are optimized compared with plain RDDs, and many built-in functions are available for computation.

To create a DataFrame for the given data, you need a schema:

import org.apache.spark.sql.types.{DoubleType, IntegerType, StructField, StructType}
val schema = StructType(Seq(
  StructField("Sn", IntegerType, true),
  StructField("subject1", DoubleType, true),
  StructField("subject2", DoubleType, true),
  StructField("subject3", DoubleType, true),
  StructField("subject4", DoubleType, true)
))

An RDD[Row] then needs to be created:

import org.apache.spark.sql.Row

val data = text.map(line => line.split("\\t"))
  .map(arr => Row.fromSeq(Seq(
    arr(0).toInt,
    // Parse each subject as a Double; fall back to 0.0 if the cell is not numeric
    Try(arr(1).toDouble) getOrElse(0.0),
    Try(arr(2).toDouble) getOrElse(0.0),
    Try(arr(3).toDouble) getOrElse(0.0),
    Try(arr(4).toDouble) getOrElse(0.0))))

Finally, the DataFrame is created:

val df = sqlContext.createDataFrame(data, schema)

The average of each column can then be calculated with the mean function (from org.apache.spark.sql.functions):

df.select(mean("subject1").as("averageOFS1"),mean("subject2").as("averageOFS2"),mean("subject3").as("averageOFS3"),mean("subject4").as("averageOFS4")).show(false)

which should give you a DataFrame like this:

+------------------+-----------------+-----------+-----------------+
|averageOFS1       |averageOFS2      |averageOFS3|averageOFS4      |
+------------------+-----------------+-----------+-----------------+
|21.796166666666668|4.661666666666666|5.24965    |7.919609688333335|
+------------------+-----------------+-----------+-----------------+

pandas get column average/mean

If you only want the mean of the weight column, select the column (which is a Series) and call .mean():

In [479]: df
Out[479]: 
         ID  birthyear    weight
0    619040       1962  0.123123
1    600161       1963  0.981742
2  25602033       1963  1.312312
3    624870       1987  0.942120

In [480]: df["weight"].mean()
Out[480]: 0.83982437500000007

Find the average of two combined columns in sql

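Provided neither column contains NULLs, AVG(col1) = SUM(col1)/COUNT(*) and AVG(col2) = SUM(col2)/COUNT(*), therefore AVG(col1) + AVG(col2) = (SUM(col1)+SUM(col2))/COUNT(*).

Also, because addition is commutative and associative, SUM(col1)+SUM(col2) = SUM(col1+col2), so (SUM(col1)+SUM(col2))/COUNT(*) = SUM(col1+col2)/COUNT(*) = AVG(col1+col2). With NULLs present the identity can fail: AVG skips NULLs separately in each column, while col1+col2 is NULL whenever either operand is NULL.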
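
The identity, and the way NULLs break it, can be checked with a small sketch. It assumes an in-memory SQLite database and a hypothetical table `t` with made-up values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 REAL, col2 REAL)")  # hypothetical table
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1.0, 4.0), (2.0, 5.0), (3.0, 6.0)])

query = "SELECT AVG(col1 + col2), AVG(col1) + AVG(col2) FROM t"

# Without NULLs, both expressions agree.
combined, separate = conn.execute(query).fetchone()
print(combined, separate)

# Add a row with a NULL: col1 + col2 is NULL for that row and is skipped,
# while AVG(col1) still includes the 10.0, so the two sides diverge.
conn.execute("INSERT INTO t VALUES (?, ?)", (10.0, None))
combined_null, separate_null = conn.execute(query).fetchone()
print(combined_null, separate_null)
```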

Calculate AVERAGE from 2 columns for each row in SQL

You need to add the fields together and divide by the number of fields. If your Average field is of DECIMAL type, you don't really need the ROUND function: any digits beyond the declared scale are dropped when the value is stored (rounded or truncated, depending on the database):

UPDATE table_name 
SET AVERAGE = (grade1 + grade2) / 2;

In your example you are only averaging two fields, so Average decimal(3,1) would work for you: assuming the grades are whole numbers, the fractional part can only ever be .0 or .5, and the ROUND function is not needed.
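
A minimal sketch of the same UPDATE, assuming an in-memory SQLite database and made-up integer grades. One caveat it illustrates: MySQL's `/` operator returns a fractional result, but SQLite, PostgreSQL, and SQL Server perform integer division when both operands are integers, so the sketch divides by 2.0 instead of 2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_name (grade1 INTEGER, grade2 INTEGER, average REAL)"
)
# Hypothetical sample grades.
conn.executemany(
    "INSERT INTO table_name (grade1, grade2) VALUES (?, ?)", [(7, 8), (10, 6)]
)

# Dividing by 2.0 forces real division; with integer operands,
# SQLite's "/ 2" would store 7 instead of 7.5 for the first row.
conn.execute("UPDATE table_name SET average = (grade1 + grade2) / 2.0")

averages = [row[0] for row in conn.execute(
    "SELECT average FROM table_name ORDER BY rowid"
)]
print(averages)
```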

Row-wise average for a subset of columns with missing values

You can simply do:

df['avg'] = df.mean(axis=1)

       Monday  Tuesday  Wednesday        avg
Mike       42      NaN         12  27.000000
Jenna     NaN      NaN         15  15.000000
Jon        21        4          1   8.666667

This works because .mean() skips missing values by default (skipna=True).

To average only a subset of the columns, select them first:

df['avg'] = df[['Monday', 'Tuesday']].mean(axis=1)

       Monday  Tuesday  Wednesday   avg
Mike       42      NaN         12  42.0
Jenna     NaN      NaN         15   NaN
Jon        21        4          1  12.5
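
To make the skip-missing rule explicit, here is a plain-Python sketch of what the row-wise mean does, without pandas. The helper `row_mean` is hypothetical; the data matches the table above.

```python
import math

def row_mean(values):
    """Mean of the non-missing values; NaN if every value is missing."""
    present = [v for v in values if not math.isnan(v)]
    return sum(present) / len(present) if present else math.nan

nan = math.nan
rows = {
    "Mike": {"Monday": 42, "Tuesday": nan, "Wednesday": 12},
    "Jenna": {"Monday": nan, "Tuesday": nan, "Wednesday": 15},
    "Jon": {"Monday": 21, "Tuesday": 4, "Wednesday": 1},
}

# Average over all three days, then over Monday and Tuesday only.
all_days = {name: row_mean(days.values()) for name, days in rows.items()}
subset = {
    name: row_mean([days["Monday"], days["Tuesday"]])
    for name, days in rows.items()
}
print(all_days)
print(subset)
```

Jenna's subset average is NaN because both selected values are missing, which matches the pandas output above.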

calculate average in separate column over period and group by date Standard SQL BigQuery

You can use COALESCE to return the average grouped by date and, when that is NULL (every rate for the date is NULL), fall back to the overall average of the column, computed in a subquery:

select date, coalesce(avg(rate), (select avg(rate) from my_table))
from my_table
group by date
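
The fallback can be tried out with a small sketch. It assumes an in-memory SQLite database instead of BigQuery, and the sample dates and rates are made up; the `avg_rate` alias and the ORDER BY are additions for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (date TEXT, rate REAL)")
# Hypothetical sample data: the only rate on 2021-01-02 is NULL.
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [
        ("2021-01-01", 2.0),
        ("2021-01-01", 4.0),
        ("2021-01-02", None),
        ("2021-01-03", 9.0),
    ],
)

rows = conn.execute(
    """
    SELECT date,
           COALESCE(AVG(rate), (SELECT AVG(rate) FROM my_table)) AS avg_rate
    FROM my_table
    GROUP BY date
    ORDER BY date
    """
).fetchall()
print(rows)
```

The date whose only rate is NULL receives the overall average (5.0 here), while the other dates keep their own averages.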