How to Generalize Outer to N Dimensions

How to generalize outer to n dimensions?

This is one way: first use Vectorize and outer to define a function that creates an n-dimensional array in which each entry is the list of arguments to which the given function will be applied:

list_args <- Vectorize( function(a,b) c( as.list(a), as.list(b) ), 
                        SIMPLIFY = FALSE)

make_args_mtx <- function( alist ) {
  Reduce(function(x, y) outer(x, y, list_args), alist)
}

Now multi.outer just needs to invoke apply and do.call on this "args-matrix":

multi.outer <- function(f, ... ) {
  args <- make_args_mtx(list(...))
  apply(args, 1:length(dim(args)), function(a) do.call(f, a[[1]] ) )
}

Let's try this with an example function:

fun <- function(a,b,c) paste(a,b,c)

ans <- multi.outer(fun, LETTERS[1:2], c(3, 4, 5), letters[6:7] )

> ans
, , 1

     [,1]    [,2]    [,3]   
[1,] "A 3 f" "A 4 f" "A 5 f"
[2,] "B 3 f" "B 4 f" "B 5 f"

, , 2

     [,1]    [,2]    [,3]   
[1,] "A 3 g" "A 4 g" "A 5 g"
[2,] "B 3 g" "B 4 g" "B 5 g"
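For comparison, the same idea can be sketched in plain Python. This is only an illustrative analogue, not a translation of the R code: the name multi_outer and the nested-list result are assumptions made for the example, and indices are 0-based rather than 1-based.

```python
def multi_outer(f, *vectors):
    """Apply f to every combination of elements, one taken from each vector.

    The result is a nested list with one nesting level per input vector,
    so result[i][j][k] == f(vectors[0][i], vectors[1][j], vectors[2][k]).
    """
    def build(prefix, remaining):
        # No vectors left: prefix holds one element per vector, so call f.
        if not remaining:
            return f(*prefix)
        head, *tail = remaining
        # One list entry per element of the current vector.
        return [build(prefix + (x,), tail) for x in head]

    return build((), vectors)


def fun(a, b, c):
    return f"{a} {b} {c}"


ans = multi_outer(fun, ["A", "B"], [3, 4, 5], ["f", "g"])
print(ans[0][1][1])  # corresponds to ans[1, 2, 2] in the R output
```

The nesting depth follows the number of vectors passed in, so the same function covers two, three, or more dimensions.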

Creating nonsense words - using outer() with three+ dimensions in R

You could create all the different combinations with expand.grid:

apply(expand.grid(initial_consonants, vowels, final_consonants), 1, function(x)create_CVC_words(x[1], x[2], x[3]))

Does this do what you want?

How does inner product generalize to higher-dimensional arrays?

Suppose I have two conformable matrices and want to do an inner product.

      a ← 5 2 ⍴ ⍳ 10
      b ← 2 6 ⍴ ⍳ 10
      a
1  2
3  4
5  6
7  8
9 10
      b
1 2 3  4 5 6
7 8 9 10 1 2
      a +.= b
1 0 0 0 0 1
0 0 1 0 0 0
0 0 0 0 1 0
0 1 0 0 0 0
0 0 0 1 0 0
      a +.× b
15 18  21  24  7 10
31 38  45  52 19 26
47 58  69  80 31 42
63 78  93 108 43 58
79 98 117 136 55 74

What's important here is that the trailing dimension of a, that is, ¯1↑⍴a, matches the leading dimension of b, or 1↑⍴b. Similarly, the shape of the result is the catenation of the shapes of both arguments, less the trailing dimension of a and the leading dimension of b, or (¯1↓⍴a),1↓⍴b.

If I had higher-dimensional arrays instead, the same rules would apply: the trailing dimension of a must match the leading dimension of b, and so on.

The non-obvious generalisation to higher dimensions is that this operation is no different than an inner product of two matrices, provided you collapse the relevant dimensions.

      a ← 5 1 2 1 2 1 2 ⍴ ⍳ 40 
      b ← 2 3 4 5 ⍴ ⍳ 120

To collapse the dimensions, simply multiply together all but the last dimension of a, and all but the first dimension of b.

      a1 ← 20 2 ⍴ ⍳ 40
      b1 ← 2 60 ⍴ ⍳ 120

Do the operation

      r1 ← a1 +.× b1

Lastly, put the collapsed dimensions back.

      r ← 5 1 2 1 2 1 3 4 5 ⍴ r1

Try it!
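The collapse, multiply, restore steps can also be checked outside APL. The following is a minimal NumPy sketch, assuming row-major reshapes (which match APL's ⍴) and index origin 1 for ⍳; the cross-check against np.tensordot is an addition for illustration and is not part of the original answer.

```python
import numpy as np

# Same shapes and values as the APL arrays a and b.
a = np.arange(1, 41).reshape(5, 1, 2, 1, 2, 1, 2)
b = np.arange(1, 121).reshape(2, 3, 4, 5)

# Collapse all but the last axis of a, and all but the first axis of b.
a1 = a.reshape(-1, a.shape[-1])   # shape (20, 2)
b1 = b.reshape(b.shape[0], -1)    # shape (2, 60)

# Ordinary matrix product on the collapsed arrays.
r1 = a1 @ b1                      # shape (20, 60)

# Put the collapsed dimensions back.
r = r1.reshape(a.shape[:-1] + b.shape[1:])

# Contract the last axis of a with the first axis of b directly.
expected = np.tensordot(a, b, axes=1)
print(r.shape, np.array_equal(r, expected))
```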

Apply function with outer taking the columns of two matrices as the elements of interest

You are pretty close. As described in this related question, all you need is the Vectorize() function to convert your Fun() function into a vectorized version:

VecFun <- Vectorize( Fun )

Then you can simply do:

outer(d.cols, r.cols, VecFun )

E.g. if you define

Fun <- function(a,b) sum(a+b)

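and the matrices r and d are defined as follows (assuming, as the output suggests, that d.cols and r.cols are lists of their columns, e.g. split(d, col(d)) and split(r, col(r))):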

J <- 5
D <- 3
R <- 4

d <- matrix( 1:(J*D), J, D)
r <- matrix( 1:(J*R), J, R)

then you get this:

> outer(d.cols, r.cols, VecFun)

   1   2   3   4
1 30  55  80 105
2 55  80 105 130
3 80 105 130 155
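The same table can be reproduced with a small Python sketch. The helper outer_apply is a hypothetical name introduced for this example, and the columns are written out as plain lists on the assumption that d and r are filled column by column, as R's matrix() does.

```python
def outer_apply(xs, ys, f):
    """Apply f to every (x, y) pair; rows follow xs, columns follow ys."""
    return [[f(x, y) for y in ys] for x in xs]


J, D, R = 5, 3, 4

# Columns of matrix(1:(J*D), J, D) and matrix(1:(J*R), J, R):
# column k holds the values k*J + 1 .. (k + 1)*J.
d_cols = [list(range(k * J + 1, (k + 1) * J + 1)) for k in range(D)]
r_cols = [list(range(k * J + 1, (k + 1) * J + 1)) for k in range(R)]


def fun(a, b):
    # Elementwise sum of two columns, then the total, like sum(a+b) in R.
    return sum(x + y for x, y in zip(a, b))


result = outer_apply(d_cols, r_cols, fun)
for row in result:
    print(row)
```

Because outer_apply calls f once per pair, no vectorized wrapper is needed here; that per-pair calling is the behaviour Vectorize() provides on the R side.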

Is there a way to form sparse n-dimensional array in Python3?

In the spirit of coo format I could generate a 3d sparse array representation:

In [106]: dims = 2,4,6
In [107]: data = np.zeros((10,4),int)
In [108]: data[:,-1] = 1
In [112]: for i in range(3):
     ...:     data[:,i] = np.random.randint(0,dims[i],10)

In [113]: data
Out[113]: 
array([[0, 2, 3, 1],
       [0, 3, 4, 1],
       [0, 0, 1, 1],
       [0, 3, 0, 1],
       [1, 1, 3, 1],
       [1, 0, 2, 1],
       [1, 1, 2, 1],
       [0, 2, 5, 1],
       [0, 1, 5, 1],
       [0, 1, 2, 1]])

Does that meet your requirements? It's possible there are some duplicates. sparse.coo_matrix sums duplicates when it converts the array to dense for display, or to csr for calculations.

The corresponding dense array is:

In [130]: A=np.zeros(dims, int)
In [131]: for row in data:
     ...:     A[tuple(row[:3])] += row[-1]

In [132]: A
Out[132]: 
array([[[0, 1, 0, 0, 0, 0],
        [0, 0, 1, 0, 0, 1],
        [0, 0, 0, 1, 0, 1],
        [1, 0, 0, 0, 1, 0]],

       [[0, 0, 1, 0, 0, 0],
        [0, 0, 1, 1, 0, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0]]])

(no duplicates in this case).

A 2d sparse matrix using a subset of this data is

In [118]: sparse.coo_matrix((data[:,3],(data[:,1],data[:,2])),(4,6)).toarray()
Out[118]: 
array([[0, 1, 1, 0, 0, 0],
       [0, 0, 2, 1, 0, 1],
       [0, 0, 0, 1, 0, 1],
       [1, 0, 0, 0, 1, 0]])

That's in effect the sum over the first dimension.
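If scipy is not needed, the same coordinate idea can be sketched with a plain dictionary keyed by index tuples. This is an illustrative stand-in, not a scipy feature: the names to_sparse and sum_over_first_axis are made up for the example, and the records are copied from the data array shown above.

```python
from collections import defaultdict

dims = (2, 4, 6)

# (i, j, k, value) rows copied from the `data` array above.
records = [
    (0, 2, 3, 1), (0, 3, 4, 1), (0, 0, 1, 1), (0, 3, 0, 1), (1, 1, 3, 1),
    (1, 0, 2, 1), (1, 1, 2, 1), (0, 2, 5, 1), (0, 1, 5, 1), (0, 1, 2, 1),
]


def to_sparse(records):
    """Build {index_tuple: value}, summing duplicate coordinates."""
    sparse = defaultdict(int)
    for *index, value in records:
        sparse[tuple(index)] += value
    return dict(sparse)


def sum_over_first_axis(sparse):
    """Drop the first index and add up entries that now coincide."""
    out = defaultdict(int)
    for (_, *rest), value in sparse.items():
        out[tuple(rest)] += value
    return dict(out)


sparse3d = to_sparse(records)
sparse2d = sum_over_first_axis(sparse3d)
print(len(sparse3d), sparse2d[(1, 2)])
```

Only the non-zero entries are stored, so memory use grows with the number of records rather than with the product of dims.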

I'm assuming that

M[dim1,dim2,dim3,...] = 1.0

means the non-zero elements of the array must have a data value of 1.

Pandas has a sparse data series and data frame format. That allows for a non-zero 'fill' value. I don't know if the multi-index version can be thought of as higher than 2d or not. There have been a few SO questions about converting the Pandas sparse arrays to/from the scipy sparse.

Convert Pandas SparseDataframe to Scipy sparse csc_matrix

http://pandas-docs.github.io/pandas-docs-travis/sparse.html#interaction-with-scipy-sparse

Outer product in tensorflow

Yes, you can do this by taking advantage of TensorFlow's broadcasting semantics. Reshape the first tensor to 1xN and the second to Mx1; when you multiply them, the result is broadcast to an MxN tensor containing all of the pairwise products.

You can play around with the same thing in numpy to see how it behaves in a simpler context, btw:

import numpy as np

a = np.array([1, 2, 3, 4, 5]).reshape([5,1])   # column, shape (5, 1)
b = np.array([6, 7, 8, 9, 10]).reshape([1,5])  # row, shape (1, 5)
a*b                                            # broadcasts to shape (5, 5)

How exactly you do it in TensorFlow depends a bit on which axes you want to use and what semantics you want for the resulting multiply, but the general idea applies.
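The broadcasting trick extends to any number of vectors by giving each one its own axis. Below is a minimal NumPy sketch of that idea; outer_nd is a hypothetical helper written for this example, not a NumPy or TensorFlow function.

```python
import numpy as np


def outer_nd(*vectors):
    """N-way outer product by broadcasting.

    Vector number `axis` is reshaped to length 1 on every axis except its
    own, so the running product broadcasts to the full N-dimensional shape.
    """
    n = len(vectors)
    result = 1
    for axis, v in enumerate(vectors):
        shape = [1] * n
        shape[axis] = -1          # this vector varies along its own axis
        result = result * np.asarray(v).reshape(shape)
    return result


x, y, z = [1, 2], [3, 4, 5], [6, 7]
out = outer_nd(x, y, z)
print(out.shape)  # one axis per input vector
```

Each entry out[i, j, k] is x[i] * y[j] * z[k]; replacing the multiplication with another broadcasting operation gives the corresponding generalized outer.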

Outer/tensor product in R

It will be hard to beat the performance of outer. It ends up doing a matrix multiplication, which is carried out by the BLAS library. Calling outer repeatedly doesn't matter either, since the last call will dominate in both speed and memory. For example, for vectors of length 100, the last call is at least 100x slower than the previous one.
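A rough count shows where that factor comes from. It assumes every vector has the same length n and that the cost of one outer call is roughly proportional to the number of elements it produces; both are simplifying assumptions made for this estimate.

```latex
\[
  \text{size}_k = n^{k+1},
  \qquad
  \frac{\text{size}_k}{\text{size}_{k-1}} = \frac{n^{k+1}}{n^{k}} = n,
  \qquad
  \text{e.g. } n = 100,\ k = 3:\quad \frac{100^{4}}{100^{3}} = 100 .
\]
```

Here size_k is the number of elements produced by the k-th call to outer, so each call costs about n times as much as the one before it, and the total is dominated by the final call.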

Your best bet for performance here is to use a better BLAS library with R. The default one isn't very good. On Linux, you can fairly easily configure R to use ATLAS BLAS. On Windows it is harder, but possible. See R for Windows FAQ.

# multiple outer
mouter <- function(x1, ...) { 
    r <- x1
    for(vi in list(...)) r <- outer(r, vi)
    r
}

# Your example
d=3
x1 = 1:d
x2 = 1:d+3
x3 = 1:d+6 
mouter(x1,x2,x3)

# Performance test
x <- runif(1e2)
system.time(mouter(x,x,x))   # 0 secs (less than 10 ms)
system.time(mouter(x,x,x,x)) # 0.5 secs / 0.35 secs (better BLAS)

I replaced my Windows Rblas.dll with the DYNAMIC_ARCH version of GOTO BLAS at this place which improved the time from 0.5 to 0.35 secs as seen above.
