Implement MATLAB's im2col 'sliding' in Python
Approach #1
We could use some broadcasting
here to get all the indices of all those sliding windows in one go and thus with indexing achieve a vectorized solution
. This is inspired by Efficient Implementation of im2col and col2im
.
Here's the implementation -
def im2col_sliding_broadcasting(A, BSZ, stepsize=1):
# Parameters
M,N = A.shape
col_extent = N - BSZ[1] + 1
row_extent = M - BSZ[0] + 1
# Get Starting block indices
start_idx = np.arange(BSZ[0])[:,None]*N + np.arange(BSZ[1])
# Get offsetted indices across the height and width of input array
offset_idx = np.arange(row_extent)[:,None]*N + np.arange(col_extent)
# Get all actual indices & index into input array for final output
return np.take (A,start_idx.ravel()[:,None] + offset_idx.ravel()[::stepsize])
Approach #2
Using newly gained knowledge of NumPy array strides
that lets us create such sliding windows, we would have another efficient solution -
def im2col_sliding_strided(A, BSZ, stepsize=1):
# Parameters
m,n = A.shape
s0, s1 = A.strides
nrows = m-BSZ[0]+1
ncols = n-BSZ[1]+1
shp = BSZ[0],BSZ[1],nrows,ncols
strd = s0,s1,s0,s1
out_view = np.lib.stride_tricks.as_strided(A, shape=shp, strides=strd)
return out_view.reshape(BSZ[0]*BSZ[1],-1)[:,::stepsize]
Approach #3
The strided method listed in the previous approach has been incorporated into scikit-image
module for a less messier, like so -
from skimage.util import view_as_windows as viewW
def im2col_sliding_strided_v2(A, BSZ, stepsize=1):
return viewW(A, (BSZ[0],BSZ[1])).reshape(-1,BSZ[0]*BSZ[1]).T[:,::stepsize]
Sample runs -
In [106]: a # Input array
Out[106]:
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]])
In [107]: im2col_sliding_broadcasting(a, (2,3))
Out[107]:
array([[ 0, 1, 2, 5, 6, 7, 10, 11, 12],
[ 1, 2, 3, 6, 7, 8, 11, 12, 13],
[ 2, 3, 4, 7, 8, 9, 12, 13, 14],
[ 5, 6, 7, 10, 11, 12, 15, 16, 17],
[ 6, 7, 8, 11, 12, 13, 16, 17, 18],
[ 7, 8, 9, 12, 13, 14, 17, 18, 19]])
In [108]: im2col_sliding_broadcasting(a, (2,3), stepsize=2)
Out[108]:
array([[ 0, 2, 6, 10, 12],
[ 1, 3, 7, 11, 13],
[ 2, 4, 8, 12, 14],
[ 5, 7, 11, 15, 17],
[ 6, 8, 12, 16, 18],
[ 7, 9, 13, 17, 19]])
Runtime test
In [183]: a = np.random.randint(0,255,(1024,1024))
In [184]: %timeit im2col_sliding(img, (8,8), skip=1)
...: %timeit im2col_sliding_broadcasting(img, (8,8), stepsize=1)
...: %timeit im2col_sliding_strided(img, (8,8), stepsize=1)
...: %timeit im2col_sliding_strided_v2(img, (8,8), stepsize=1)
...:
1 loops, best of 3: 1.29 s per loop
1 loops, best of 3: 226 ms per loop
10 loops, best of 3: 84.5 ms per loop
10 loops, best of 3: 111 ms per loop
In [185]: %timeit im2col_sliding(img, (8,8), skip=4)
...: %timeit im2col_sliding_broadcasting(img, (8,8), stepsize=4)
...: %timeit im2col_sliding_strided(img, (8,8), stepsize=4)
...: %timeit im2col_sliding_strided_v2(img, (8,8), stepsize=4)
...:
1 loops, best of 3: 1.31 s per loop
10 loops, best of 3: 104 ms per loop
10 loops, best of 3: 84.4 ms per loop
10 loops, best of 3: 109 ms per loop
Around 16x
speedup there with the strided method over the original loopy version!
Implement MATLAB's col2im 'sliding' in Python
For im2col
'sliding'
version, we already have few efficient implementations as posted here
.
For col2im
'sliding'
version, it would just be a reshape away -
def col2im_sliding(B, block_size, image_size):
m,n = block_size
mm,nn = image_size
return B.reshape(nn-n+1,mm-m+1).T
# Or simply B.reshape(nn-n+1,-1).T
# Or B.reshape(mm-m+1,nn-n+1,order='F')
Test runs
MATLAB run -
>> arr = 1:15;
>> arr
arr =
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
>> col2im(arr, [3,3], [7,5],'sliding')
ans =
1 6 11 %// (7-3+1, 5-3+1) shaped array
2 7 12
3 8 13
4 9 14
5 10 15
NumPy/Python run -
In [116]: arr
Out[116]: array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
In [117]: col2im_sliding(arr, [3,3], [7,5])
Out[117]:
array([[ 1, 6, 11], # (7-3+1, 5-3+1) shaped array
[ 2, 7, 12],
[ 3, 8, 13],
[ 4, 9, 14],
[ 5, 10, 15]])
Setup sliding windows as columns (IM2COL from MATLAB) in multi-dimensional array - Python
We can leverage np.lib.stride_tricks.as_strided
based scikit-image's view_as_windows
to get sliding windows. More info on use of as_strided
based view_as_windows
.
from skimage.util.shape import view_as_windows
W1,W2 = 2,2 # window size
# create sliding windows along last two axes1
w = view_as_windows(arr,(1,1,W1,W2))[...,0,0,:,:]
# Merge the window axes (tha last two axes) and
# merge the axes along which those windows were created (3rd and 4th axes)
outshp = arr.shape[:-2] + (W1*W2,) + ((arr.shape[-2]-W1+1)*(arr.shape[-1]-W2+1),)
out = w.transpose(0,1,4,5,2,3).reshape(outshp)
The last step forces a copy. So, skip it if possible.
Efficient Implementation of `im2col` and `col2im`
I can only hope that Mathworks guys don't sue you or me or Stackoverflow
for that matter, trying to create vectorized implementations of their IP toolbox functions, as they have put price on that toolbox. But in any case, forgetting those issues, here are the implementations.
Replacement for im2col
with 'sliding'
option
I wasn't able to vectorize this until I sat down to write solution to another problem on Stackoverflow. So, I would strongly encourage to look into it too.
function out = im2col_sliding(A,blocksize)
nrows = blocksize(1);
ncols = blocksize(2);
%// Get sizes for later usages
[m,n] = size(A);
%// Start indices for each block
start_ind = reshape(bsxfun(@plus,[1:m-nrows+1]',[0:n-ncols]*m),[],1); %//'
%// Row indices
lin_row = permute(bsxfun(@plus,start_ind,[0:nrows-1])',[1 3 2]); %//'
%// Get linear indices based on row and col indices and get desired output
out = A(reshape(bsxfun(@plus,lin_row,[0:ncols-1]*m),nrows*ncols,[]));
return;
Replacement for im2col
with 'distinct'
option
function out = im2col_distinct(A,blocksize)
nrows = blocksize(1);
ncols = blocksize(2);
nele = nrows*ncols;
row_ext = mod(size(A,1),nrows);
col_ext = mod(size(A,2),ncols);
padrowlen = (row_ext~=0)*(nrows - row_ext);
padcollen = (col_ext~=0)*(ncols - col_ext);
A1 = zeros(size(A,1)+padrowlen,size(A,2)+padcollen);
A1(1:size(A,1),1:size(A,2)) = A;
t1 = reshape(A1,nrows,size(A1,1)/nrows,[]);
t2 = reshape(permute(t1,[1 3 2]),size(t1,1)*size(t1,3),[]);
t3 = permute(reshape(t2,nele,size(t2,1)/nele,[]),[1 3 2]);
out = reshape(t3,nele,[]);
return;
Some quick tests show that both these implementations particularly sliding
one for small to decent sized input data and distinct
for all datasizes perform much better than the in-built MATLAB function implementations in terms of runtime performance.
How to use
With in-built MATLAB function -
B = im2col(A,[nrows ncols],'sliding')
With our custom function -
B = im2col_sliding(A,[nrows ncols])
%// ------------------------------------
With in-built MATLAB function -
B = im2col(A,[nrows ncols],'distinct')
With our custom function -
B = im2col_distinct(A,[nrows ncols])
Overlapping sliding window over an image using blockproc or im2col?
OK, this is quite a complex question. I'll try and break this up into separate parts and will answer each question separately.
Question #1
blockproc
can be used to implement my function on a sliding window using theBorderSize
andTrimBorder
arguments.
B = blockproc(A,[64,64],fun,'BorderSize',[5,5], 'TrimBorder', 'false');
I realize that this creates a block of
[64 + 2*5, 64 + 2*5]
and applies the function@fun
on each block. But since I cannot go into my function@fun
in debugging to verify proper operation I cannot be sure this is what I need. Is my above code correct for what I need? I know that I get a concatenated result inB
but it should be on a overlapping sliding block. Will this achieve what I need?
After experimenting around with blockproc
, this is indeed correct where you can use it to get sliding neighbourhood processing working. However, you're going to need an additional flag, which is PadPartialBlocks
. The purpose of this flag is so that if you are extracting a block where you're at the outer edges of the image and you can't make a block of a specified size, this will zero-pad this partial block so that it conforms to the same size. Here's a small example to get this working with sliding windows. Supposing we had a matrix such that:
>> A = reshape(1:25,5,5)
A =
1 6 11 16 21
2 7 12 17 22
3 8 13 18 23
4 9 14 19 24
5 10 15 20 25
Let's say we wanted to take the average of each 3 x 3 overlapping neighbourhood in the matrix above, and zero-padding those elements that go beyond the borders of the matrix. You would do this with blockproc
:
B = blockproc(A, [1 1], @(x) mean(x.data(:)), 'BorderSize', [1 1], 'TrimBorder', false, 'PadPartialBlocks', true);
What's important to note is that the block size, which is 1 x 1 in this case and BorderSize
which is 1 x 1 as well are set differently than what you'd expect for a 3 x 3 block. To go into why this is the case, we need some further insight on how BorderSize
works. For a given centre of a block, BorderSize
allows you to capture values / pixels beyond the dimensions of the originally sized block. For those locations that go beyond the borders of the matrix, we would pad these locations as zero by default. BorderSize
allows us to capture 2M + 2N
pixels more, where M
and N
are the horizontal and vertical border size you want. This would allow us to capture M
more pixels both above and below the original block and N
more pixels to the left and right of the original block.
Therefore, for the value of 1 in A
, if the block size is 1 x 1, this means that the element consists of only 1, and if our BorderSize
was 1 x 1. This means our final block would be:
0 0 0
0 1 6
0 2 7
Because our block size is 1, the next block would be centred at 6, and we would get a 3 x 3 grid of pixels and so on. It is also important that TrimBorder
is set to false
so that we can keep those pixels that were originally captured upon expansion of the block. The default is set to true
. Finally, PadPartialBlocks
is true
to ensure that all blocks are the same size. When you run the above code, the result we get is:
B =
1.7778 4.3333 7.6667 11.0000 8.4444
3.0000 7.0000 12.0000 17.0000 13.0000
3.6667 8.0000 13.0000 18.0000 13.6667
4.3333 9.0000 14.0000 19.0000 14.3333
3.1111 6.3333 9.6667 13.0000 9.7778
You can verify that we get the same result using nlfilter
where we can apply the mean to 3 x 3 sliding neighbourhoods:
C = nlfilter(A, [3 3], @(x) mean(x(:)))
C =
1.7778 4.3333 7.6667 11.0000 8.4444
3.0000 7.0000 12.0000 17.0000 13.0000
3.6667 8.0000 13.0000 18.0000 13.6667
4.3333 9.0000 14.0000 19.0000 14.3333
3.1111 6.3333 9.6667 13.0000 9.7778
As such, if you want to properly use blockproc
for sliding operations, you need to be careful on how you set the block size and border size respectively. In this case, the general rule is to always set your block size to be 1 x 1, and allow BorderSize
to specify the size of each block you want. Specifically, for a block of size K x K
, you would set BorderSize
to be floor(K/2) x floor(K/2)
respectively. It would make things easy if K
was odd.
For example, if you wanted a 5 x 5
mean filtering operation on a sliding window basis, you would set BorderSize
to [2 2]
, as K = 5
and floor(K/2) = 2
. Therefore, you would do this:
B = blockproc(A, [1 1], @(x) mean(x.data(:)), 'BorderSize', [2 2], 'TrimBorder', false, 'PadPartialBlocks', true)
B =
2.5200 4.5600 7.2000 6.9600 6.1200
3.6000 6.4000 10.0000 9.6000 8.4000
4.8000 8.4000 13.0000 12.4000 10.8000
4.0800 7.0400 10.8000 10.2400 8.8800
3.2400 5.5200 8.4000 7.9200 6.8400
Replicating this with nlfilter
with a size of 5 x 5 also gives:
C = nlfilter(A, [5 5], @(x) mean(x(:)))
C =
2.5200 4.5600 7.2000 6.9600 6.1200
3.6000 6.4000 10.0000 9.6000 8.4000
4.8000 8.4000 13.0000 12.4000 10.8000
4.0800 7.0400 10.8000 10.2400 8.8800
3.2400 5.5200 8.4000 7.9200 6.8400
I was doing some timing tests, and it seems that blockproc
used in this context is faster than nlfilter
.
Question #2
The second is
im2col
.im2col(A,[m n],block_type)
will divide the block into m by n blocks and arrange them in columns, so each column is a block? If so, how is the overlapping controlled? And if each block is a column can I successfully apply thedct2
function on each column? Because I doubt it will take vectors as input?
You are correct in that im2col
transforms each pixel neighbourhood or block into a single column and the concatenation of these columns forms the output matrix. You can control whether the blocks are overlapping or are distinct by the block_type
parameter. Specify distinct
or sliding
(which is default) to control this. You can also control the size of each neighbourhood with m
and n
.
However, if it is your goal to apply dct2
with the output of im2col
, then you will not get what you desire. Specifically, dct2
takes the spatial location of each data point within your 2D data into account and is used as part of the transform. By transforming each pixel neighbourhood into a single column, the 2D spatial relationships that were originally there for each block are now gone. dct2
expects 2D spatial data, but you would be specifying 1D data instead. As such, im2col
is probably not what you're looking for. If I understand what you want correctly, you'll want to use blockproc
instead.
Hope this helps!
The im2col algorithm for ND input
Let me just start by saying that im2col
is only intended for 2D matrices. The fact that it sometimes worked (and by that I mean returned a result without throwing an error) is just a happy coincidence.
Now I took a peek at edit im2col.m
, and without studying the code too much, the first line of each of the distinct
and sliding
methods should give you an intuition of what's happening:
...
if strcmp(kind, 'distinct')
[m,n] = size(a);
...
elseif strcmp(kind,'sliding')
[ma,na] = size(a);
...
end
...
First recall that [s1,s2] = size(arr)
where arr
is a 3d array will collapse the size of 2nd and 3rd dimension into one size. Here's the relevant doc size
:
[d1,d2,d3,...,dn] = size(X)
returns the sizes of the dimensions of the arrayX
, provided the number of output argumentsn
equalsndims(X)
. Ifn < ndims(X)
,di
equals the size of the ith dimension ofX
for0<i<n
, butdn
equals the product of the sizes of the remaining dimensions ofX
, that is, dimensionsn
throughndims(X)
.
So basically for an array of size M-by-N-by-P
, the function instead thinks it's a matrix of size M-by-(N*P)
. Now MATLAB has some quirky indexing rules that lets you do things like:
>> x = reshape(1:4*3*2,4,3,2)
x(:,:,1) =
1 5 9
2 6 10
3 7 11
4 8 12
x(:,:,2) =
13 17 21
14 18 22
15 19 23
16 20 24
>> x(:,:)
ans =
1 5 9 13 17 21
2 6 10 14 18 22
3 7 11 15 19 23
4 8 12 16 20 24
which is what I think ended up happening here. Here is an example to confirm the behavior of im2col
on an RGB image:
% normal case (grayscale image)
>> M = magic(5);
>> B1 = im2col(M, [3 3], 'sliding');
% (RGB image)
>> MM = cat(3, M, M+50, M+100);
>> B2 = im2col(MM, [3 3], 'sliding');
>> B3 = im2col(reshape(MM, [5 5*3]), [3 3], 'sliding');
>> assert(isequal(B2,B3))
Note that B2
and B3
are equal, so basically think of the result of im2col
on an array arr = cat(3,R,G,B)
to be the same as that of arr = cat(2,R,G,B)
(concatenated horizontally).
Interestingly, you won't get so lucky with "distinct" blocks method:
>> B1 = im2col(M, [3 3], 'distinct') % works
% ..snip..
>> B2 = im2col(MM, [3 3], 'distinct') % errors
Subscripted assignment dimension mismatch.
Error in im2col (line 59)
aa(1:m,1:n) = a;
Now that we understand what was happening, let's think how to do this properly for 3D arrays.
In my opinion to implement im2col
for color images, I would just run it on each color channel separately (each being a 2d matrix), and concatenate the result along the third dimension. So something like this wrapper function:
function B = im2col_rgb(img, sz, varargin)
B = cell(1,size(img,3));
for i=1:size(img,3)
B{i} = im2col(img(:,:,i), sz, varargin{:});
end
B = cat(3, B{:});
end
Related Topics
How to Use Pip to Install a Package from a Private Github Repository
How to Separate the Functions of a Class into Multiple Files
Python Beautifulsoup Extract Text Between Element
In Python What Is a Global Statement
How to Read Hdf5 Files in Python
How to Use Brew Installed Python as the Default Python
How to Do Assignments in a List Comprehension
Catching an Exception While Using a Python 'With' Statement
Compulsory Usage of If _Name_=="_Main_" in Windows While Using Multiprocessing
Check List of Words in Another String
Generating Matplotlib Graphs Without a Running X Server
How to Prevent Python's Urllib(2) from Following a Redirect
Ssl.Sslerror: [Ssl: Certificate_Verify_Failed] Certificate Verify Failed (_Ssl.C:749)
In Python, How to Convert Seconds Since Epoch to a 'Datetime' Object
Connecting to Microsoft SQL Server Using Python
Subprocess.Popen() Error (No Such File or Directory) When Calling Command with Arguments as a String
If Range() Is a Generator in Python 3.3, Why How to Not Call Next() on a Range