Difference between mean(c(1,2,21)) and mean(1,2,21)
mean(c(1,2,21))
#[1] 8
This passes a vector of three elements to the mean function, which calculates the mean of those three elements.
mean(1,2,21)
#[1] 1
This passes 1 as the first argument, 2 as the second argument and 21 as the third argument to the mean function. mean passes these arguments to mean.default. In help("mean.default") you can find the arguments of this function:
- x: The object you want the mean for.
- trim: the fraction (0 to 0.5) of observations to be trimmed from each end of x before the mean is computed. Values of trim outside that range are taken as the nearest endpoint.
- na.rm: a logical value indicating whether NA values should be stripped before the computation proceeds. (Since you pass a numeric value, it is coerced to logical automatically.)
The second argument, 2, lies outside the allowed range for trim and is taken as 0.5; the third argument, 21, is coerced to TRUE. So you calculate this:
mean.default(1, 0.5, TRUE)
[1] 1
C++: what does (a << b) mean?
1 << 1 means:
00000000 00000001 changes to 00000000 00000010
1 << 8 means:
00000000 00000001 changes to 00000001 00000000
It's a bit shift operation. Each position you shift to the left multiplies the value on the left of the operator by 2. So, 2 << 1 = 4 and 2 << 2 = 8. A shift is a cheap operation, although modern compilers usually turn a multiplication by a power of two into a shift on their own.
Also, you can do 4 >> 1 = 2 (and 5 >> 1 = 2 since you round down) as the inverse operation.
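The doubling and halving described above can be checked with a short sketch. It is written in Python rather than C++ (a choice made only for illustration); for non-negative integers the << and >> operators give the same results in both languages. The names left_shifts and right_shifts are illustrative.

```python
# Left shift by n multiplies a non-negative integer by 2**n.
left_shifts = {(value, n): value << n for value, n in [(1, 1), (1, 8), (2, 1), (2, 2)]}

# Right shift by n floor-divides by 2**n, so the low bits are discarded.
right_shifts = {(value, n): value >> n for value, n in [(4, 1), (5, 1)]}

for (value, n), result in left_shifts.items():
    # Show each result next to its 16-bit binary pattern.
    print(f"{value} << {n} = {result:3d}  ({result:016b})")

for (value, n), result in right_shifts.items():
    print(f"{value} >> {n} = {result}")
```

The binary column makes it visible that 1 << 8 moves the single set bit from the low byte into the high byte.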
What is the difference between i = i + 1 and i += 1 in a 'for' loop?
The difference is that one modifies the data structure itself (an in-place operation), b += 1, while the other just reassigns the variable, a = a + 1.
Just for completeness: x += y does not always perform an in-place operation; there are (at least) three exceptions:
- If x doesn't implement an __iadd__ method, then the x += y statement is just shorthand for x = x + y. This would be the case if x was something like an int.
- If __iadd__ returns NotImplemented, Python falls back to x = x + y.
- The __iadd__ method could theoretically be implemented to not work in place. It'd be really weird to do that, though.
As it happens, your bs are numpy.ndarrays, which implement __iadd__ and return themselves, so your second loop modifies the original array in place.
You can read more on this in the Python documentation of "Emulating Numeric Types".
These [__i*__] methods are called to implement the augmented arithmetic assignments (+=, -=, *=, @=, /=, //=, %=, **=, <<=, >>=, &=, ^=, |=). These methods should attempt to do the operation in-place (modifying self) and return the result (which could be, but does not have to be, self). If a specific method is not defined, the augmented assignment falls back to the normal methods. For instance, if x is an instance of a class with an __iadd__() method, x += y is equivalent to x = x.__iadd__(y). Otherwise, x.__add__(y) and y.__radd__(x) are considered, as with the evaluation of x + y. In certain situations, augmented assignment can result in unexpected errors (see Why does a_tuple[i] += ["item"] raise an exception when the addition works?), but this behavior is in fact part of the data model.
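The in-place versus rebinding distinction can be sketched with built-in types only. This is a minimal illustration, not the code from the question: it uses a list (which implements __iadd__ in place) as a stand-in for the numpy array, and the variable names are illustrative.

```python
# list implements __iadd__, so += mutates the object every alias points to.
a = [1, 2]
alias_a = a
a += [3]
print(alias_a)   # the alias sees the change

# a = a + ... builds a new list and rebinds the name; the alias is untouched.
b = [1, 2]
alias_b = b
b = b + [3]
print(alias_b)   # still the original two elements

# int has no __iadd__, so += falls back to n = n + 1 and rebinds the name.
n = 1
alias_n = n
n += 1
print(alias_n)

# The tuple case from the quoted documentation: the list inside the tuple is
# extended in place, then assigning back into the tuple raises TypeError.
t = ([1],)
try:
    t[0] += [2]
    raised = False
except TypeError:
    raised = True
print(raised, t)
```

The last case shows why the documentation calls the error unexpected: the in-place extension has already happened by the time the assignment into the tuple fails.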
The difference between bracket [ ] and double bracket [[ ]] for accessing the elements of a list or dataframe
The R Language Definition is handy for answering these types of questions:
- http://cran.r-project.org/doc/manuals/R-lang.html#Indexing
R has three basic indexing operators, with syntax displayed by the following examples
x[i]
x[i, j]
x[[i]]
x[[i, j]]
x$a
x$"a"
For vectors and matrices the [[ forms are rarely used, although they have some slight semantic differences from the [ form (e.g. it drops any names or dimnames attribute, and that partial matching is used for character indices). When indexing multi-dimensional structures with a single index, x[[i]] or x[i] will return the ith sequential element of x.
For lists, one generally uses [[ to select any single element, whereas [ returns a list of the selected elements.
The [[ form allows only a single element to be selected using integer or character indices, whereas [ allows indexing by vectors. Note though that for a list, the index can be a vector and each element of the vector is applied in turn to the list, the selected component, the selected component of that component, and so on. The result is still a single element.
Closest subsequent index for a specified value
Find the location of each value (numeric or character)
int = c(1, 1, 0, 5, 2, 0, 0, 2)
value = 0
idx = which(int == value)
## [1] 3 6 7
Expand the index to indicate the nearest value of interest, using an NA after the last value in int.
nearest = rep(NA, length(int))
nearest[1:max(idx)] = rep(idx, diff(c(0, idx)))
## [1] 3 3 3 6 6 6 7 NA
Use simple arithmetic to find the difference between the index of the current value and the index of the nearest value
abs(seq_along(int) - nearest)
## [1] 2 1 0 2 1 0 0 NA
Written as a function
f <- function(x, value) {
    idx = which(x == value)
    nearest = rep(NA, length(x))
    if (length(idx)) # non-NA values only if `value` in `x`
        nearest[1:max(idx)] = rep(idx, diff(c(0, idx)))
    abs(seq_along(x) - nearest)
}
We have
> f(int, 0)
[1] 2 1 0 2 1 0 0 NA
> f(int, 1)
[1] 0 0 NA NA NA NA NA NA
> f(int, 2)
[1] 4 3 2 1 0 2 1 0
> f(char, "A")
[1] 0 2 1 0 0
> f(char, "B")
[1] 1 0 NA NA NA
> f(char, "C")
[1] 2 1 0 NA NA
The solution doesn't involve recursion or R-level loops, so it should be fast even for long vectors.
NA problem when calculating mean by group
df <- within(df, {new = ave(old, groupID, FUN= function(x) mean(x, na.rm=TRUE))})
Use this if you don't want to rewrite all your input data in a different (numeric) format.
Compute differences between all variable pairs in R
Using base R:
df_dist <- t(apply(df, 1, dist))
colnames(df_dist) <- apply(combn(names(df), 2), 2, paste0, collapse = "_")
If you really want to use a tidy approach, you could go with c_across, but this also removes the names and is much slower if your data is huge.
What is the difference between '/' and '//' when used for division?
In Python 3.x, 5 / 2 will return 2.5 and 5 // 2 will return 2. The former is floating point division, and the latter is floor division, sometimes also called integer division.
In Python 2.2 or later in the 2.x line, there is no difference for integers unless you perform a from __future__ import division, which causes Python 2.x to adopt the 3.x behavior.
Regardless of the future import, 5.0 // 2 will return 2.0 since that's the floor division result of the operation.
You can find a detailed description at PEP 238: Changing the Division Operator.
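The Python 3 behavior described above can be checked with a short sketch. The variable names are illustrative, and the negative operand is an extra case not discussed in the answer, included to show that // floors rather than truncates.

```python
true_div = 5 / 2         # true division always yields a float
floor_div = 5 // 2       # floor division of two ints yields an int
float_floor = 5.0 // 2   # with a float operand the floored result is a float
negative_floor = -5 // 2 # floors toward negative infinity, not toward zero

print(true_div, floor_div, float_floor, negative_floor)
print(type(floor_div).__name__, type(float_floor).__name__)
```

The negative case is where "integer division" as a name can mislead: truncation toward zero would give -2, while floor division gives -3.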
What does .view() do in PyTorch?
view() reshapes the tensor without copying memory, similar to numpy's reshape().
Given a tensor a with 16 elements:
import torch
a = torch.arange(1, 17)  # 16 elements, 1 through 16 (torch.range is deprecated)
To reshape this tensor into a 4 x 4 tensor, use:
a = a.view(4, 4)
Now a will be a 4 x 4 tensor. Note that after the reshape the total number of elements needs to remain the same. Reshaping the tensor a to a 3 x 5 tensor would fail, because 3 x 5 = 15 is not 16.
What is the meaning of parameter -1?
If you don't know how many rows you want but are sure of the number of columns, you can pass -1 for the number of rows. (This extends to tensors with more dimensions, but only one axis can be -1.) It is a way of telling the library: "give me a tensor that has this many columns, and compute the number of rows needed to make that happen".
This can be seen in this model definition code. After the line x = self.pool(F.relu(self.conv2(x))) in the forward function, you will have a feature map of depth 16. You have to flatten this to give it to the fully connected layer, so you tell PyTorch to reshape the tensor to a specific number of columns and let it decide the number of rows by itself.
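The arithmetic behind -1 can be sketched in plain Python. This is not PyTorch's implementation; infer_view_shape is a hypothetical helper that only mirrors the rule described above: at most one dimension may be -1, and it is filled in so that the total number of elements stays the same.

```python
from math import prod


def infer_view_shape(numel, shape):
    """Hypothetical helper: resolve a single -1 in `shape` for `numel` elements."""
    unknown = [i for i, dim in enumerate(shape) if dim == -1]
    if len(unknown) > 1:
        raise ValueError("only one dimension can be -1")
    # Product of the dimensions that were given explicitly.
    known = prod(dim for dim in shape if dim != -1)
    if not unknown:
        if known != numel:
            raise ValueError(f"shape {shape} does not hold {numel} elements")
        return tuple(shape)
    if known == 0 or numel % known != 0:
        raise ValueError(f"shape {shape} does not hold {numel} elements")
    resolved = list(shape)
    resolved[unknown[0]] = numel // known  # the inferred dimension
    return tuple(resolved)


print(infer_view_shape(16, (4, 4)))    # explicit shape, element count matches
print(infer_view_shape(16, (-1, 8)))   # rows inferred from 8 columns
try:
    infer_view_shape(16, (3, 5))       # 15 != 16, so this is rejected
except ValueError as err:
    print("rejected:", err)
```

With 16 elements and 8 columns, the only row count that keeps the element count unchanged is 2, which is what the -1 resolves to.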