How to generalize outer to n dimensions?
This is one way. First use Vectorize and outer to define a function that creates an n-dimensional array in which each entry is the list of arguments the given function will be applied to:
list_args <- Vectorize( function(a,b) c( as.list(a), as.list(b) ),
                        SIMPLIFY = FALSE)
make_args_mtx <- function( alist ) {
  Reduce(function(x, y) outer(x, y, list_args), alist)
}
Now multi.outer just needs to invoke apply and do.call on this "args-matrix":
multi.outer <- function(f, ... ) {
  args <- make_args_mtx(list(...))
  apply(args, seq_along(dim(args)), function(a) do.call(f, a[[1]] ) )
}
Let's try this with an example function:
fun <- function(a,b,c) paste(a,b,c)
ans <- multi.outer(fun, LETTERS[1:2], c(3, 4, 5), letters[6:7] )
> ans
, , 1

     [,1]    [,2]    [,3]
[1,] "A 3 f" "A 4 f" "A 5 f"
[2,] "B 3 f" "B 4 f" "B 5 f"

, , 2

     [,1]    [,2]    [,3]
[1,] "A 3 g" "A 4 g" "A 5 g"
[2,] "B 3 g" "B 4 g" "B 5 g"
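For readers more at home in Python, the same idea can be sketched without any libraries. This is only an illustration, not a port of the R code above: multi_outer is a name chosen here, and it returns nested lists rather than a true array.

```python
def multi_outer(f, *vectors):
    """Apply f to every combination of one element per input vector.

    The result is a nested list whose depth equals len(vectors), so
    result[i][j][k] == f(vectors[0][i], vectors[1][j], vectors[2][k]).
    """
    def build(prefix, remaining):
        # No vectors left: all arguments are chosen, so call f.
        if not remaining:
            return f(*prefix)
        head, *tail = remaining
        # One nesting level per vector, in argument order.
        return [build(prefix + (x,), tail) for x in head]

    return build((), list(vectors))


ans = multi_outer(lambda a, b, c: f"{a} {b} {c}", "AB", [3, 4, 5], "fg")
print(ans[1][2][0])  # B 5 f
```

The indexing order matches the R result: the first index picks the letter, the second the number, the third the lowercase letter.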
Creating nonsense words - using outer() with three+ dimensions in R
You could create all the different combinations with expand.grid:
apply(expand.grid(initial_consonants, vowels, final_consonants), 1, function(x)create_CVC_words(x[1], x[2], x[3]))
Does this do what you want?
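The same "enumerate every combination, then apply the function" pattern looks like this in Python with itertools.product. The three input lists and the body of create_cvc_word are placeholders invented for this sketch, since the question's own definitions are not reproduced here.

```python
from itertools import product

# Hypothetical inputs; the question's actual lists are not shown above.
initial_consonants = ["b", "d"]
vowels = ["a", "i"]
final_consonants = ["t", "g"]


def create_cvc_word(c1, v, c2):
    # Stand-in for the question's create_CVC_words(): plain concatenation.
    return c1 + v + c2


# product() yields one tuple per combination, like the rows of expand.grid.
words = [create_cvc_word(c1, v, c2)
         for c1, v, c2 in product(initial_consonants, vowels, final_consonants)]
print(words)
```

Note that product varies the last input fastest, whereas expand.grid varies the first column fastest, so the words come out in a different order.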
How does inner product generalize to higher-dimensional arrays?
Suppose I have two conformable matrices and want to do an inner product.
a ← 5 2 ⍴ ⍳ 10
b ← 2 6 ⍴ ⍳ 10
a
1 2
3 4
5 6
7 8
9 10
b
1 2 3 4 5 6
7 8 9 10 1 2
a +.= b
1 0 0 0 0 1
0 0 1 0 0 0
0 0 0 0 1 0
0 1 0 0 0 0
0 0 0 1 0 0
a +.× b
15 18 21 24 7 10
31 38 45 52 19 26
47 58 69 80 31 42
63 78 93 108 43 58
79 98 117 136 55 74
What's important here is that the trailing dimension of a, that is, ¯1↑⍴a, matches the leading dimension of b, or 1↑⍴b. Similarly, the shape of the result is the catenation of the shapes of both arguments, less the trailing dimension of a and the leading dimension of b, or (¯1↓⍴a),1↓⍴b.
Suppose now I had higher-dimensional arrays, then the same rules would apply. The trailing dimension of a must match the leading dimension of b etc.
The non-obvious generalisation to higher dimensions is that this operation is no different than an inner product of two matrices, provided you collapse the relevant dimensions.
a ← 5 1 2 1 2 1 2 ⍴ ⍳ 40
b ← 2 3 4 5 ⍴ ⍳ 120
To collapse the dimensions, simply multiply together all but the last dimension of a, and all but the first dimension of b.
a1 ← 20 2 ⍴ ⍳ 40
b1 ← 2 60 ⍴ ⍳ 120
Do the operation
r1 ← a1 +.× b1
Lastly, put the collapsed dimensions back.
r ← 5 1 2 1 2 1 3 4 5 ⍴ r1
Try it!
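The collapse, multiply, restore recipe can be cross-checked in NumPy, which also stores arrays in row-major order. This is a sketch under the assumption that ⍳ counts from 1; np.tensordot with axes=1 is used only as an independent reference for the +.× result.

```python
import numpy as np

# Same shapes and contents as the APL arrays a and b (index origin 1 assumed).
a = (np.arange(40) + 1).reshape(5, 1, 2, 1, 2, 1, 2)
b = (np.arange(120) + 1).reshape(2, 3, 4, 5)

# Collapse: all but the last axis of a, all but the first axis of b.
a1 = a.reshape(-1, a.shape[-1])   # shape (20, 2)
b1 = b.reshape(b.shape[0], -1)    # shape (2, 60)

# Ordinary matrix product on the collapsed arrays.
r1 = a1 @ b1                      # shape (20, 60)

# Put the collapsed dimensions back.
r = r1.reshape(a.shape[:-1] + b.shape[1:])

# Reference: contract the last axis of a with the first axis of b.
assert np.array_equal(r, np.tensordot(a, b, axes=1))
print(r.shape)  # (5, 1, 2, 1, 2, 1, 3, 4, 5)
```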
Apply function with outer taking the columns of two matrices as the elements of interest
You are pretty close. As described in this related question, all you need is the Vectorize() function to convert your Fun() function into a vectorized version:
VecFun <- Vectorize( Fun )
Then you can simply do:
outer(d.cols, r.cols, VecFun )
E.g. if you define
Fun <- function(a,b) sum(a+b)
and the matrices r and d are defined as follows:
J <- 5
D <- 3
R <- 4
d <- matrix( 1:(J*D), J, D)
r <- matrix( 1:(J*R), J, R)
then you get this:
> outer(d.cols, r.cols, VecFun)
   1   2   3   4
1 30  55  80 105
2 55  80 105 130
3 80 105 130 155
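As a cross-check of the table above, here is a NumPy sketch that applies the same function to every pair of columns. It is an illustrative translation, not the answer's method: fun and result are names chosen here, and an explicit double loop replaces outer with Vectorize.

```python
import numpy as np

J, D, R = 5, 3, 4
# order="F" fills column by column, matching R's matrix().
d = np.arange(1, J * D + 1).reshape((J, D), order="F")
r = np.arange(1, J * R + 1).reshape((J, R), order="F")


def fun(a, b):
    # Same as the R example: sum of the elementwise sum of two columns.
    return np.sum(a + b)


# result[i, j] pairs column i of d with column j of r.
result = np.array([[fun(d[:, i], r[:, j]) for j in range(R)]
                   for i in range(D)])
print(result)
```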
Is there a way to form sparse n-dimensional array in Python3?
In the spirit of coo format I could generate a 3d sparse array representation:
In [106]: dims = 2,4,6
In [107]: data = np.zeros((10,4),int)
In [108]: data[:,-1] = 1
In [112]: for i in range(3):
     ...:     data[:,i] = np.random.randint(0,dims[i],10)
In [113]: data
Out[113]:
array([[0, 2, 3, 1],
[0, 3, 4, 1],
[0, 0, 1, 1],
[0, 3, 0, 1],
[1, 1, 3, 1],
[1, 0, 2, 1],
[1, 1, 2, 1],
[0, 2, 5, 1],
[0, 1, 5, 1],
[0, 1, 2, 1]])
Does that meet your requirements? It's possible there are some duplicates. sparse.coo_matrix sums duplicates before it converts the array to dense for display, or to csr for calculations.
The corresponding dense array is:
In [130]: A=np.zeros(dims, int)
In [131]: for row in data:
     ...:     A[tuple(row[:3])] += row[-1]
In [132]: A
Out[132]:
array([[[0, 1, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 1],
[0, 0, 0, 1, 0, 1],
[1, 0, 0, 0, 1, 0]],
[[0, 0, 1, 0, 0, 0],
[0, 0, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]]])
(no duplicates in this case).
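The row-by-row loop can also be written as one vectorized call. The sketch below hard-codes the coordinates printed in Out[113] and uses np.add.at, which accumulates repeated coordinates instead of overwriting them, so it mirrors the duplicate-summing behaviour described above. The names coords and values are chosen here for illustration.

```python
import numpy as np

dims = (2, 4, 6)
# Coordinates copied from the first three columns of the data array above.
coords = np.array([[0, 2, 3], [0, 3, 4], [0, 0, 1], [0, 3, 0], [1, 1, 3],
                   [1, 0, 2], [1, 1, 2], [0, 2, 5], [0, 1, 5], [0, 1, 2]])
values = np.ones(len(coords), dtype=int)

A = np.zeros(dims, dtype=int)
# One index array per dimension; duplicates, if any, are summed.
np.add.at(A, tuple(coords.T), values)

# Summing over the first dimension gives a 2d (4, 6) view of the same data.
print(A.sum(axis=0))
```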
A 2d sparse matrix using a subset of this data is
In [118]: sparse.coo_matrix((data[:,3],(data[:,1],data[:,2])),(4,6)).toarray()
Out[118]:
array([[0, 1, 1, 0, 0, 0],
[0, 0, 2, 1, 0, 1],
[0, 0, 0, 1, 0, 1],
[1, 0, 0, 0, 1, 0]])
That's in effect the sum over the first dimension.
I'm assuming that
M[dim1,dim2,dim3,...] = 1.0
means the non-zero elements of the array must have a data value of 1.
Pandas has a sparse data series and data frame format. That allows for a non-zero 'fill' value. I don't know if the multi-index version can be thought of as higher than 2d or not. There have been a few SO questions about converting the Pandas sparse arrays to/from the scipy sparse.
Convert Pandas SparseDataframe to Scipy sparse csc_matrix
http://pandas-docs.github.io/pandas-docs-travis/sparse.html#interaction-with-scipy-sparse
Outer product in tensorflow
Yes, you can do this by taking advantage of the broadcast semantics of tensorflow. Reshape the first tensor to size 1xN and the second to size Mx1; when you multiply them, the result is broadcast to an MxN tensor holding all of the pairwise products.
You can play around with the same thing in numpy to see how it behaves in a simpler context, btw:
import numpy as np
a = np.array([1, 2, 3, 4, 5]).reshape([5,1])
b = np.array([6, 7, 8, 9, 10]).reshape([1,5])
a*b
How exactly you do it in tensorflow depends a bit on which axes you want to use and what semantics you want for the resulting multiply, but the general idea applies.
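The broadcasting trick extends to more than two vectors: give each vector its own axis and multiply. The NumPy sketch below illustrates this; broadcast_outer is a name made up here, and the same reshaping idea should carry over to tensorflow, though that is not shown.

```python
import numpy as np


def broadcast_outer(*vectors):
    """Outer product of any number of 1-D vectors via broadcasting."""
    n = len(vectors)
    result = np.ones((1,) * n)
    for axis, v in enumerate(vectors):
        # Vector number `axis` varies along its own axis; all others are size 1.
        shape = [1] * n
        shape[axis] = -1
        result = result * np.asarray(v).reshape(shape)
    return result


out = broadcast_outer([1, 2], [3, 4, 5], [6, 7])
print(out.shape)  # (2, 3, 2)
```

For two vectors this agrees with np.outer.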
Outer/tensor product in R
It will be hard to beat the performance of outer. This ends up doing a matrix multiplication which is done by the BLAS library. Calling outer repeatedly doesn't matter either, since the last call will dominate both speed and memory wise. For example, for vectors of length 100, the last call is at least 100x slower than the previous one...
Your best bet for performance here is to use a better BLAS library for R, since the default one isn't very good. On Linux, you can fairly easily configure R to use ATLAS BLAS. On Windows it is harder, but possible. See R for Windows FAQ.
# multiple outer
mouter <- function(x1, ...) {
  r <- x1
  for(vi in list(...)) r <- outer(r, vi)
  r
}
# Your example
d=3
x1 = 1:d
x2 = 1:d+3
x3 = 1:d+6
mouter(x1,x2,x3)
# Performance test
x <- runif(1e2)
system.time(mouter(x,x,x)) # 0 secs (less than 10 ms)
system.time(mouter(x,x,x,x)) # 0.5 secs / 0.35 secs (better BLAS)
I replaced my Windows Rblas.dll with the DYNAMIC_ARCH version of GOTO BLAS, which improved the time from 0.5 to 0.35 secs, as seen above.
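For comparison, the repeated-outer loop has a compact NumPy counterpart using functools.reduce with np.multiply.outer. This is a sketch of the same technique, not a benchmark, and nothing here measures BLAS performance. The size check at the end illustrates why the last call dominates: each extra vector of length n multiplies the element count by n.

```python
from functools import reduce

import numpy as np


def mouter(*vectors):
    # Fold the vectors left to right, one outer product at a time.
    return reduce(np.multiply.outer, vectors)


# Same inputs as the R example with d = 3.
x1, x2, x3 = np.arange(1, 4), np.arange(4, 7), np.arange(7, 10)
r = mouter(x1, x2, x3)
print(r.shape)  # (3, 3, 3)

# Element counts grow by a factor of len(x) per extra vector.
x = np.ones(10)
sizes = [mouter(*[x] * k).size for k in range(1, 5)]
print(sizes)  # [10, 100, 1000, 10000]
```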