Reading multiple files and calculating mean based on user input
This is how I fixed it:
pollutantmean <- function(directory, pollutant, id = 1:332) {
  # set the path
  path = directory
  # get the file list in that directory
  fileList = list.files(path)
  # extract the file names and store as numeric for comparison
  file.names = as.numeric(sub("\\.csv$", "", fileList))
  # select files to be imported based on the user input or default
  selected.files = fileList[match(id, file.names)]
  # import data
  Data = lapply(file.path(path, selected.files), read.csv)
  # convert into data frame
  Data = do.call(rbind.data.frame, Data)
  # calculate mean
  mean(Data[, pollutant], na.rm = TRUE)
}
My last question: the function should take "specdata" (the name of the directory where all the CSV files are located) as its directory argument. Is there a directory-type object in R?
Suppose I call the function as:
pollutantmean(specdata, "niterate", 1:10)
It should get the path of the specdata directory, which is in my working directory. How can I do that?
Reading data from .txt file and calculating mean in Python
You can build a list of values by splitting the file contents on '\n', skipping empty entries, and converting the rest to float. Then compute the mean of that list with mean from the statistics module:
from statistics import mean
with open('inputdata.txt', 'r') as fin:
    # skip blank entries (e.g. after a trailing newline), which float() cannot parse
    data = [float(x) for x in fin.read().split('\n') if x.strip()]
average = mean(data)
print(average)
How to loop through text files, find the average of each, and store it in a dataframe in R?
Select the numeric columns, unlist them to a vector, and calculate the mean.
library(dplyr)
library(purrr)
library(vroom)
map_dbl(Filenames, ~ vroom(.x) %>%
          select(where(is.numeric)) %>%
          unlist %>% mean(na.rm = TRUE)) -> mean_values
mean_values
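For anyone doing the same thing in Python, here is a minimal standard-library sketch of the per-file mean. The function names numeric_mean and mean_per_file are made up for illustration, and it assumes each file is a comma-separated file with a header row. Note one simplification: it keeps every cell that parses as a number, rather than selecting whole numeric columns as the R code does.

```python
import csv
import math
from statistics import mean


def numeric_mean(path):
    """Mean of every cell in a CSV file that parses as a number."""
    values = []
    with open(path, newline="") as fin:
        reader = csv.reader(fin)
        next(reader, None)  # assumption: the first row is a header
        for row in reader:
            for cell in row:
                try:
                    number = float(cell)
                except ValueError:
                    continue  # text, empty cells, "NA", ...
                if not math.isnan(number):  # ignore literal "nan" cells
                    values.append(number)
    # A file without any numeric cell yields NaN instead of raising.
    return mean(values) if values else float("nan")


def mean_per_file(filenames):
    """Map each file name to the mean of its numeric cells."""
    return {name: numeric_mean(name) for name in filenames}
```

Calling mean_per_file on your list of paths returns a dictionary keyed by file name, which plays the role of the mean_values vector above.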
Column means over multiple files
Using the data.table library:
library(data.table)
# reading each file as a data.table. Bonus - fread is much faster than read.csv
m <- lapply(Files, fread, header=TRUE, comment.char="#")
#compiling into one dataset
m2 <- rbindlist(m)
#calculating mean by id over each column
m2[,lapply(.SD,mean),by="id"]
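The same "stack the files, then average each column per id" idea can be sketched in plain Python. This is only an illustration under a few assumptions: the function name column_means_by_id is hypothetical, every file has a header row containing an id column, every other cell is numeric, and lines starting with "#" are comments to be skipped.

```python
import csv
from collections import defaultdict


def column_means_by_id(paths, id_column="id"):
    """Per-id mean of every non-id column, pooled over all files."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for path in paths:
        with open(path, newline="") as fin:
            # Drop comment lines before the CSV parser sees them.
            data_lines = (line for line in fin if not line.startswith("#"))
            for row in csv.DictReader(data_lines):
                key = row[id_column]
                for column, cell in row.items():
                    if column == id_column:
                        continue
                    # Assumption: all remaining cells are numeric.
                    sums[key][column] += float(cell)
                    counts[key][column] += 1
    return {
        key: {col: sums[key][col] / counts[key][col] for col in sums[key]}
        for key in sums
    }
```

Keeping running sums and counts avoids holding all the rows in memory, which is one reason to prefer this over building one big table first when the files are large.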
Building a mean across several csv files
Based on your example, e.g. 16 files for 10:25 (010.csv, 011.csv, 012.csv, etc.).
Assuming the order of the files in the directory follows your naming convention, you could try:
# [10:25] is hard-coded here; in production use your function parameter instead
csvFiles <- list.files(pattern = "\\.csv$")[10:25]
df_list <- lapply(X = csvFiles, read.csv, header = TRUE)
names(df_list) <- csvFiles  # OPTIONAL: label the rows (after rbind) with the csv file names
df <- do.call("rbind", df_list)
mean(df[, "columnName"])
You should be able to adapt these snippets and incorporate them into your routine.
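If you would rather not rely on the position of the files in the directory listing, you can match on the numeric value of the file name instead. Here is a rough Python sketch of that idea; the function name column_mean is made up, and it assumes files named like 010.csv with a header row, where empty cells and "NA" mean missing.

```python
import csv
import os
from statistics import mean


def column_mean(directory, column, ids):
    """Pooled mean of `column` over the files whose numeric name is in `ids`."""
    wanted = set(ids)
    values = []
    for name in sorted(os.listdir(directory)):
        stem, ext = os.path.splitext(name)
        # Compare by numeric value, so "010.csv" matches id 10.
        if ext.lower() != ".csv" or not stem.isdigit() or int(stem) not in wanted:
            continue
        with open(os.path.join(directory, name), newline="") as fin:
            for row in csv.DictReader(fin):
                cell = row[column]
                if cell not in ("", "NA"):  # assumption: these mark missing values
                    values.append(float(cell))
    # statistics.mean raises StatisticsError if nothing was selected.
    return mean(values)
```

The directory is passed as an ordinary string, and a relative name is resolved against the current working directory. A call such as column_mean("some_dir", "columnName", range(10, 26)) would cover 010.csv through 025.csv, where both names are placeholders.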
Bash: Finding average of entries from multiple columns after reading a CSV text file
This tries to fix the OP's attempt and adds logic to compute the average of averages once the whole file has been read. I wrote it on mobile and couldn't test it, but it should work if I understood the OP's description correctly.
awk -F, '
$2~/[24680]$/{
  count++
  for(i=3;i<=7;i++){
    sum+=$i
  }
  tot+=sum/5
  sum=0
}
END{
  print "Average of averages is: " (count?tot/count:"NaN")
}
' user-list.txt > superuser.txt
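To sanity-check the logic, here is the same computation as a small Python sketch (the function name average_of_averages is made up). It assumes comma-separated lines whose fields 3 to 7 are numeric. Unlike awk, which treats missing fields as 0, it simply skips rows with fewer than seven fields.

```python
def average_of_averages(lines):
    """Average the per-row means of fields 3..7 for rows whose 2nd field ends in an even digit."""
    total = 0.0
    count = 0
    for line in lines:
        fields = line.rstrip("\r\n").split(",")
        # awk fields are 1-based: $2 is fields[1], $3..$7 are fields[2:7].
        if len(fields) < 7 or not fields[1].endswith(tuple("02468")):
            continue
        # Assumption: the five selected fields all parse as numbers.
        total += sum(float(x) for x in fields[2:7]) / 5
        count += 1
    # Mirror the awk END block: NaN when no row matched.
    return total / count if count else float("nan")
```

You can pass it an open file directly, e.g. average_of_averages(open("user-list.txt")), since it only needs an iterable of lines.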