Percentage value with GNU Diff
Something like this perhaps?
Two files, A1 and A2.
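$ sdiff -B -b -s A1 A2 | wc -l
This gives you the number of lines that differ. Running wc -l on A1 gives the total number of lines; divide the first number by the second to get the fraction that changed.
The -b and -B options ignore whitespace changes and blank lines, and -s suppresses the lines common to both files.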
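If you would rather do the division in a script, the same idea can be sketched in Python. This is not sdiff itself: it swaps in the standard library's difflib to find the changed lines, it does not ignore blanks the way -b and -B do, and the function name percent_lines_changed is made up for illustration.

```python
import difflib


def percent_lines_changed(old_lines, new_lines):
    """Return the share of old_lines touched by a change, as a percentage."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    changed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Count a changed block by its longer side, roughly one row
            # per differing line in a side-by-side diff.
            changed += max(i2 - i1, j2 - j1)
    total = len(old_lines)
    return 100.0 * changed / total if total else 0.0


# Made-up sample data: one of four lines differs.
old = ["alpha", "beta", "gamma", "delta"]
new = ["alpha", "BETA", "gamma", "delta"]
print(percent_lines_changed(old, new))  # 25.0
```

For real files you would pass in the result of reading each file and calling splitlines() on it.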
I'm trying to make a function to calculate percent difference for all pair combinations within a group in a vector
Using tidyverse:
library(tidyverse)
df %>%
  group_by(grp = str_extract(Levelname, "\\w+")) %>%
  summarise(pair = combn(Levelname, 2, str_c, collapse = " - "),
            perc_diff = combn(y, 2, function(x) 200 * abs(diff(x)) / sum(x)),
            .groups = 'drop')
# A tibble: 12 x 3
   grp   pair      perc_diff
   <chr> <chr>         <dbl>
 1 B     B 1 - B 2     45.1
 2 B     B 1 - B 3     26.1
 3 B     B 1 - B 4     15.3
 4 B     B 2 - B 3     19.7
 5 B     B 2 - B 4     30.4
 6 B     B 3 - B 4     10.9
 7 D     D 1 - D 2     39.8
 8 D     D 1 - D 3      9.42
 9 D     D 1 - D 4     38.6
10 D     D 2 - D 3     30.6
11 D     D 2 - D 4      1.24
12 D     D 3 - D 4     29.4
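For comparison, here is a rough Python sketch of the same calculation for a single group. It uses the formula from the answer above, 200 * |a - b| / (a + b), on every pair. The sample values and the function names are made up for illustration and are not the data from the question.

```python
from itertools import combinations


def percent_diff(a, b):
    """Symmetric percent difference: 200 * |a - b| / (a + b)."""
    return 200.0 * abs(a - b) / (a + b)


def pairwise_percent_diff(values_by_name):
    """Map 'name1 - name2' to the percent difference for every pair."""
    return {
        f"{n1} - {n2}": percent_diff(v1, v2)
        for (n1, v1), (n2, v2) in combinations(values_by_name.items(), 2)
    }


# Hypothetical values for one group.
group_b = {"B 1": 10.0, "B 2": 30.0, "B 3": 20.0}
for pair, pct in pairwise_percent_diff(group_b).items():
    print(pair, round(pct, 1))
```

To cover several groups, you would call pairwise_percent_diff once per group, which is what group_by does in the tidyverse version.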
Calculate percentage between two values
I'm not familiar with the (x,0) syntax you're using.
But I see that you have
(COUNT(ApprovalProvision.ClaimNumber),0) - (COUNT(Submitted.ClaimNumber),0)
    /COUNT(Submitted.ClaimNumber) * 100
Shouldn't it be:
( COUNT(ApprovalProvision.ClaimNumber) - COUNT(Submitted.ClaimNumber) )
    /COUNT(Submitted.ClaimNumber) * 100
Division and multiplication bind tighter than subtraction, so the original expression computes COUNT(ApprovalProvision.ClaimNumber) - 100: COUNT(Submitted.ClaimNumber) divided by itself is 1, and 1 times 100 is 100.
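The precedence issue is easy to check outside the database. Here is a small Python sketch with made-up counts (approved and submitted are hypothetical stand-ins for the two COUNT results). One caveat: Python's / is floating-point division, while some SQL dialects do integer division on integer operands, so the exact numbers can differ there.

```python
# Hypothetical stand-ins for the two COUNT(...) results.
approved = 100
submitted = 2

# Without parentheses, the division and multiplication run first.
without_parens = approved - submitted / submitted * 100
# With parentheses, the difference is computed first, then scaled.
with_parens = (approved - submitted) / submitted * 100

print(without_parens)  # 0.0, i.e. approved - 100
print(with_parens)     # 4900.0
```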
The 4900 number actually sounds right. Let's take the following example: you have 2 apples, then you're given 98 more, so you now have 100 apples.
An increase of 98% would have meant that from 2 apples you end with 3.96 apples.
An increase of 100% means from 2 apples you end with 4 apples. An increase of 1000% means from 2 apples you end with 22 apples. So 4000% means you end with 82 apples, and 5000% means from 2 apples you reach 102 apples.
(100-2)/2*100 = 98/2*100 = 49*100 = 4900, so there is a 4900% increase in the number of apples if you start with 2 and reach 100.
Now if you flip the 2 and the 100, starting with 100 and ending with 2,
(2-100)/100*100 = -98, so a -98% change in apples, or a 98% decrease.
Hope this solves your problem.
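The apple arithmetic can be checked with a few lines of Python. The function name percent_change is made up for illustration, and the zero check is an added assumption, since the formula is undefined when the starting value is 0.

```python
def percent_change(old, new):
    """Percent change from old to new: (new - old) / old * 100."""
    if old == 0:
        raise ValueError("percent change is undefined when the starting value is 0")
    return 100.0 * (new - old) / old


print(percent_change(2, 100))  # 4900.0
print(percent_change(100, 2))  # -98.0
print(percent_change(2, 4))    # 100.0
```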
How to check how much data has changed without storing two versions
I don't know of a generic algorithm that does this. But given your constraints, I think it's pretty straightforward.
Calculate a 32-bit hash of every line in the CSV and store the hashes in a sorted array. You then compare the stored hashes with the hashes of the new file. If 10% of your hashes have changed, then likely 10% of your file has changed (as a percentage of lines).
If this is too large, then calculate the 32-bit hash of each CSV line, but store only the last 8 bits of each hash in a histogram. E.g. if you have 10 hashes where the last byte was 0, then hist[0] = 10. You can then compute roughly how many lines have changed by comparing the old and new histograms.
This structure would be really small: 256 32-bit counters, or about 1 KB.
This is not perfect since when a line changes it moves to another bucket, but some lines in that bucket may also come out, masking the ones that went in. This is a problem with hash collisions. As you store more bits the data structure gets larger, but more accurate since the hash collisions will be fewer.
You can increase or decrease your odds of a hash collision by changing the number of hash bits you use in your histogram. For example, if you used the lower 12 bits of each hash, you would get far fewer collisions, and the data structure would be 4096 32-bit counters, or about 16 KB.
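A minimal Python sketch of the histogram idea follows. Several things here are assumptions rather than part of the answer above: zlib.crc32 stands in for "a 32-bit hash", the function names are invented, and the estimator (half the sum of absolute bucket differences) is just one simple way to turn two histograms into a rough count. As described above, it undercounts whenever a changed line lands back in a bucket that also lost a line.

```python
import zlib


def line_histogram(lines, bits=8):
    """Count lines per bucket, using the low `bits` bits of a 32-bit hash."""
    hist = [0] * (1 << bits)
    mask = (1 << bits) - 1
    for line in lines:
        # crc32 is an assumption; any 32-bit hash would do here.
        hist[zlib.crc32(line.encode("utf-8")) & mask] += 1
    return hist


def estimate_changed_lines(old_hist, new_hist):
    """Rough estimate: a changed line leaves one bucket and enters another."""
    return sum(abs(o - n) for o, n in zip(old_hist, new_hist)) / 2


# Made-up CSV lines; only the histogram of the old file needs to be stored.
old_lines = ["1,apple,3", "2,pear,5", "3,plum,7", "4,fig,9"]
new_lines = ["1,apple,3", "2,pear,6", "3,plum,7", "4,fig,9"]

stored = line_histogram(old_lines)
estimate = estimate_changed_lines(stored, line_histogram(new_lines))
print(f"about {100.0 * estimate / len(old_lines):.0f}% of lines changed")
```

Passing bits=12 gives the larger 4096-bucket variant with fewer collisions.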
calculate percentage difference - python
You are losing precision when performing (val_2)/val_1 (in Python 2, dividing one int by another truncates the result),
so convert either one of them to float so that the division yields a float, and then round and convert the result to int.
values = [0.11889, 0.07485, 0.01070, 0.03076, 0.01606]
values = [int(round(i*100)) for i in values]
conversion_values = []
for x in range(1, len(values), 1):
    val_1 = values[x-1]
    if val_1 == 0.0:  # Check if val_1 is 0 to avoid dividing by zero.
        conversion_values.append('-')
    else:
        val_2 = values[x]
        diff = int(round((float(val_2)/val_1)*100))  # change to float --> round --> int
        conversion_values.append(diff)
print(conversion_values)
Output:
[58, 14, 300, 67]
Kusto query to get percentage value of events over time
There are a couple of ways to achieve this. The first is to calculate the hourly average as an additional column and then calculate each minute's difference from that hourly average:
let minuteValues = customEvents
    | where name == "EventICareAbout"
    | extend channel = customDimensions["ChannelName"]
    | summarize events = count() by bin(timestamp, 1m), tostring(channel)
    | extend Day = startofday(timestamp), hour = hourofday(timestamp);
let hourlyAverage = customEvents
    | where name == "EventICareAbout"
    | extend channel = customDimensions["ChannelName"]
    | summarize events = count() by bin(timestamp, 1m), tostring(channel)
    | summarize hourlyAvgEvents = avg(events) by bin(timestamp, 1h), tostring(channel)
    | extend Day = startofday(timestamp), hour = hourofday(timestamp);
minuteValues
| lookup hourlyAverage on hour, Day
| extend Diff = events - hourlyAvgEvents
Another option is to use the built-in Anomaly detection
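To make the logic of the first approach easier to follow, here is a rough Python sketch of the same three steps on an in-memory list of events: count per minute, average those counts per hour, then subtract. It is not Kusto and not a replacement for the query. The function name and sample data are made up, and unlike the lookup above it matches the hourly average per channel as well as per hour, which is an assumption about the intended result.

```python
from collections import Counter, defaultdict
from datetime import datetime


def minute_diffs(events):
    """events: iterable of (timestamp, channel) pairs.

    Returns {(minute, channel): (count, hourly_avg, diff)}.
    """
    # Step 1: events per minute and channel, like bin(timestamp, 1m).
    per_minute = Counter(
        (ts.replace(second=0, microsecond=0), channel) for ts, channel in events
    )
    # Step 2: average of the per-minute counts within each hour and channel.
    by_hour = defaultdict(list)
    for (minute, channel), count in per_minute.items():
        by_hour[(minute.replace(minute=0), channel)].append(count)
    hourly_avg = {key: sum(counts) / len(counts) for key, counts in by_hour.items()}
    # Step 3: difference of each minute from its hourly average.
    result = {}
    for (minute, channel), count in per_minute.items():
        avg = hourly_avg[(minute.replace(minute=0), channel)]
        result[(minute, channel)] = (count, avg, count - avg)
    return result


# Made-up sample: three events at 10:00 and one at 10:01 on channel "a".
sample = [
    (datetime(2024, 1, 1, 10, 0, 5), "a"),
    (datetime(2024, 1, 1, 10, 0, 20), "a"),
    (datetime(2024, 1, 1, 10, 0, 40), "a"),
    (datetime(2024, 1, 1, 10, 1, 30), "a"),
]
for key, value in sorted(minute_diffs(sample).items()):
    print(key, value)
```

As in the query, minutes with no events do not appear at all, so they do not pull the hourly average down.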
How do I calculate percentages from two tables in Django
When you define a ForeignKey, it creates a "reverse" field on the related object. You can name this using related_name; otherwise it defaults to <modelname>_set (modelname is lowercased), in this case donation_set.
That's probably what you were missing. The code will be something like
@property
def percent_raised(self):
    # Sum every donation linked to this object through the reverse relation.
    total = 0.0
    for donation in self.donation_set.all():
        total += float(donation.raised)
    return total / float(self.target_donation) * 100.0
It's more efficient in this case, but much less generalizable, to calculate the sum of donations in the DB query using an aggregation function. See the aggregation cheat sheet in the Django documentation (the third example uses Avg, but in this case you'd want Sum, not Avg).
percent symbol in Bash, what's it used for?
Delete the shortest match of string in $var from the beginning:
${var#string}
Delete the longest match of string in $var from the beginning:
${var##string}
Delete the shortest match of string in $var from the end:
${var%string}
Delete the longest match of string in $var from the end:
${var%%string}
Try:
var=foobarbar
echo "${var%b*r}"
> foobar
echo "${var%%b*r}"
> foo
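If the shortest versus longest distinction is still unclear, this rough Python emulation may help. It is only a sketch: the function names are invented, and it uses the standard library's fnmatch for pattern matching, which is close to shell globbing but not identical to Bash in every case.

```python
from fnmatch import fnmatchcase


def strip_suffix(value, pattern, longest=False):
    """Emulate ${var%pattern} (shortest) or ${var%%pattern} (longest)."""
    starts = range(len(value) + 1)
    if not longest:
        # Try the shortest suffix first by starting from the end.
        starts = reversed(starts)
    for i in starts:
        if fnmatchcase(value[i:], pattern):
            return value[:i]
    return value  # no match: the value is left unchanged


def strip_prefix(value, pattern, longest=False):
    """Emulate ${var#pattern} (shortest) or ${var##pattern} (longest)."""
    ends = range(len(value) + 1)
    if longest:
        # Try the longest prefix first by starting from the full string.
        ends = reversed(ends)
    for i in ends:
        if fnmatchcase(value[:i], pattern):
            return value[i:]
    return value


var = "foobarbar"
print(strip_suffix(var, "b*r"))                # foobar
print(strip_suffix(var, "b*r", longest=True))  # foo
print(strip_prefix(var, "*b"))                 # arbar
print(strip_prefix(var, "*b", longest=True))   # ar
```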