Using DMA Memory Transfer in User-Space

  • Linux's DMA API doesn't cover memory-to-memory transfers. It is meant for moving data between devices and memory. See Documentation/DMA-API.txt for more details.

  • At the hardware level, the x86 DMA controller doesn't allow memory-to-memory transfers. It's been discussed here: DMA transfer RAM-to-RAM

  • Given that the memory bus is usually slower than the CPU, what benefit would there be in launching a kernel-driven memory copy? You'd still have to wait for the transfer to finish, and its duration would still be determined by the memory bandwidth, exactly as with a CPU-driven copy.

  • If your program's performance depends solely on memory-to-memory copy performance, it can probably be improved significantly by avoiding copies as much as possible, or by implementing a smarter procedure such as copy-on-write.
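
As one illustration of avoiding a copy altogether, a producer and a consumer can exchange buffer pointers instead of copying the data between two buffers. The following is only a minimal sketch; the struct double_buffer type and the double_buffer_flip() helper are made-up names for this example, and synchronization between threads is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Two equally sized buffers: readers use `front`, the producer fills `back` */
struct double_buffer {
    unsigned char *front;
    unsigned char *back;
    size_t size;
};

/*
 * Publish freshly produced data by exchanging the two pointers.
 * This is O(1) and moves no payload bytes over the memory bus,
 * unlike memcpy(front, back, size).
 */
static void double_buffer_flip(struct double_buffer *db)
{
    unsigned char *tmp;

    assert(db != NULL);
    tmp = db->front;
    db->front = db->back;
    db->back = tmp;
}
```

After a flip, code that reads through db->front sees the data that was just written through db->back, and no byte of the payload has been moved.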

DMA transfer from kernel to user space

You probably want to implement the mmap method of struct file_operations. Consider:

static int
sample_drv_mem_mmap(struct file *filep, struct vm_area_struct *vma)
{
    /*
     * Set your "dev" pointer here (the one you used
     * for dma_alloc_coherent() invocation)
     */
    struct device *dev;

    /*
     * Set DMA address here (the one you obtained with
     * dma_alloc_coherent() via its third argument)
     */
    dma_addr_t dma_addr;

    /* Set your DMA buffer size here */
    size_t dma_size;

    /* Physical page frame number to be derived from "dma_addr" */
    unsigned long pfn;

    /* Check the buffer size requested by the user */
    if (vma->vm_end - vma->vm_start > dma_size)
        return -EINVAL;

    /*
     * For the sake of simplicity, do not let the user specify an offset;
     * you may want to take care of that in later versions of your code
     */
    if (vma->vm_pgoff != 0)
        return -EINVAL;

    pfn = PHYS_PFN(dma_to_phys(dev, dma_addr));

    return remap_pfn_range(vma, vma->vm_start, pfn,
                           vma->vm_end - vma->vm_start,
                           vma->vm_page_prot);
}

/* ... */

static const struct file_operations sample_drv_fops = {
    /* ... */

    .mmap = sample_drv_mem_mmap,

    /* ... */
};

Long story short, the idea is to convert the DMA (bus) address you have to a kernel physical address and then use remap_pfn_range() to do the actual mapping between the kernel and the userland.

In the user application, one should invoke mmap() to request the mapping (instead of the read / write approach). For more information on that, please refer to man 2 mmap on your system.
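
On the user-space side, the request might look like the minimal sketch below. The helper name map_dma_buffer() and the device path you pass to it (say, a hypothetical /dev/sample_drv) are assumptions for illustration only. The length must not exceed the driver's DMA buffer size, and the offset is 0 because the handler above rejects anything else.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map `len` bytes of the buffer exported by the device node at `path`.
 * Returns the mapped address, or NULL on failure (errno is set by
 * open() or mmap()). `path` is whatever node your driver registers.
 */
static void *map_dma_buffer(const char *path, size_t len)
{
    void *addr;
    int fd;

    assert(path != NULL && len > 0);

    fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;

    /* Offset 0: the sample handler rejects a non-zero vm_pgoff */
    addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    /* The mapping remains valid after the descriptor is closed */
    close(fd);

    return addr == MAP_FAILED ? NULL : addr;
}
```

The caller reads and writes the returned pointer like ordinary memory and releases it with munmap(addr, len) when done.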

Linux kernel device driver to DMA from a device into user-space memory

I'm actually working on exactly the same thing right now and I'm going the ioctl() route. The general idea is for user space to allocate the buffer which will be used for the DMA transfer and an ioctl() will be used to pass the size and address of this buffer to the device driver. The driver will then use scatter-gather lists along with the streaming DMA API to transfer data directly to and from the device and user-space buffer.
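
To make the user-space half of this concrete, here is a rough sketch of how an application could describe its buffer and hand it to the driver. Everything named here is hypothetical: the struct dma_xfer_req layout, the SAMPLE_IOC_DMA_XFER request number and its 'x' magic are placeholders, and a real driver would define its own in a header shared with the application.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Hypothetical request layout shared between the application and the driver */
struct dma_xfer_req {
    uint64_t addr;  /* user-space buffer address */
    uint64_t size;  /* buffer size in bytes */
};

/* Hypothetical ioctl number; a real driver picks its own magic and index */
#define SAMPLE_IOC_DMA_XFER _IOW('x', 1, struct dma_xfer_req)

/* Describe the user buffer in the form the driver would receive it */
static struct dma_xfer_req make_dma_request(void *buf, size_t len)
{
    struct dma_xfer_req req;

    assert(buf != NULL && len > 0);
    req.addr = (uint64_t)(uintptr_t)buf;
    req.size = (uint64_t)len;
    return req;
}

/* Hand the request to the driver; returns 0, or -1 with errno set */
static int submit_dma_request(int fd, const struct dma_xfer_req *req)
{
    return ioctl(fd, SAMPLE_IOC_DMA_XFER, req);
}
```

Fixed-width fields keep the layout identical for 32-bit and 64-bit callers, which saves the driver from needing a separate compat path for this structure.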

The implementation strategy I'm using is that the ioctl() in the driver enters a loop that DMAs the userspace buffer in chunks of 256k (the hardware-imposed limit on how many scatter/gather entries it can handle). This is isolated inside a function that blocks until each transfer is complete (see below). When all bytes are transferred or the incremental transfer function returns an error, the ioctl() exits and returns to userspace.

Pseudo code for the ioctl()

/* Serialize all DMA transfers to/from the device */
if (mutex_lock_interruptible(&device_ptr->mtx))
    return -EINTR;

chunk_data = (unsigned long) user_space_addr;
while (*transferred < total_bytes && !ret) {
    chunk_bytes = total_bytes - *transferred;
    if (chunk_bytes > HW_DMA_MAX)
        chunk_bytes = HW_DMA_MAX; /* 256 KB limit imposed by my device */
    ret = transfer_chunk(device_ptr, chunk_data, chunk_bytes, transferred);
    chunk_data += chunk_bytes;
}

mutex_unlock(&device_ptr->mtx);
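
The chunking logic can be tried out in user space before any hardware is involved. The sketch below is only a simulation: fake_transfer_chunk() is a stand-in that pretends every transfer succeeds, count_chunks() is a made-up helper, and HW_DMA_MAX is assumed to be 256 KiB as in the loop above.

```c
#include <assert.h>
#include <stddef.h>

#define HW_DMA_MAX (256 * 1024) /* assumed 256 KiB per-transfer limit */

/* Stand-in for the driver's blocking transfer function (hypothetical) */
static int fake_transfer_chunk(unsigned long addr, size_t nbytes,
                               size_t *transferred)
{
    (void)addr;              /* a real version would program the device */
    *transferred += nbytes;  /* pretend the whole chunk was moved */
    return 0;
}

/*
 * Walk a buffer of `total` bytes in chunks of at most HW_DMA_MAX.
 * Returns the number of chunks issued, or the negative error code of
 * the first chunk that failed.
 */
static int count_chunks(size_t total)
{
    unsigned long addr = 0x10000; /* arbitrary start address for the demo */
    size_t transferred = 0;
    int chunks = 0;

    while (transferred < total) {
        size_t chunk = total - transferred;
        int ret;

        if (chunk > HW_DMA_MAX)
            chunk = HW_DMA_MAX;
        ret = fake_transfer_chunk(addr, chunk, &transferred);
        if (ret)
            return ret;
        addr += chunk;
        chunks++;
    }

    assert(transferred == total);
    return chunks;
}
```

For a 600 KiB buffer this issues three transfers: two full 256 KiB chunks followed by one of 88 KiB.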

Pseudo code for incremental transfer function:

/* Assuming the userspace pointer is passed as an unsigned long, */
/* calculate the first page, last page, and number of pages being transferred */

first_page = (udata & PAGE_MASK) >> PAGE_SHIFT;
last_page = ((udata + nbytes - 1) & PAGE_MASK) >> PAGE_SHIFT;
fp_offset = udata & ~PAGE_MASK; /* offset of the data within the first page */
npages = last_page - first_page + 1;

/* Ensure that all userspace pages are locked in memory for the */
/* duration of the DMA transfer */

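/* get_user_pages() signature of older kernels; check the prototype for your kernel version */
down_read(&current->mm->mmap_sem);
ret = get_user_pages(current,
                     current->mm,
                     udata,
                     npages,
                     is_writing_to_userspace,
                     0,
                     pages_array,
                     NULL);
up_read(&current->mm->mmap_sem);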

/* Map a scatter-gather list to point at the userspace pages */

/* first (carries all of nbytes when the range fits inside one page) */
sg_set_page(&sglist[0], pages_array[0],
            min_t(size_t, nbytes, PAGE_SIZE - fp_offset), fp_offset);

/* middle */
for (i = 1; i < npages - 1; i++)
    sg_set_page(&sglist[i], pages_array[i], PAGE_SIZE, 0);

/* last */
if (npages > 1) {
    sg_set_page(&sglist[npages - 1], pages_array[npages - 1],
                nbytes - (PAGE_SIZE - fp_offset) - ((npages - 2) * PAGE_SIZE), 0);
}

/* Do the hardware specific thing to give it the scatter-gather list
and tell it to start the DMA transfer */

/* Wait for the DMA transfer to complete */
ret = wait_event_interruptible_timeout(device_ptr->dma_wait,
                                       device_ptr->flag_dma_done, HZ * 2);

if (ret == 0) {
    /* DMA operation timed out */
    return -ETIMEDOUT;
} else if (ret == -ERESTARTSYS) {
    /* DMA operation interrupted by signal */
    return ret;
} else {
    /* DMA success */
    *transferred += nbytes;
    return 0;
}
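
The page arithmetic is easy to get wrong, so it can be worth checking in user space first. The sketch below mirrors the first/middle/last split used for the scatter-gather entries. It assumes a 4096-byte page (SKETCH_PAGE_SIZE stands in for the kernel's PAGE_SIZE), and span_pages() and seg_len() are made-up helper names. It also covers a range that fits inside a single page, where the only entry has to carry all of nbytes.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumed page size; the kernel uses PAGE_SIZE */

/* Number of pages touched by the byte range [udata, udata + nbytes) */
static size_t span_pages(unsigned long udata, size_t nbytes)
{
    assert(nbytes > 0);

    unsigned long first = udata / SKETCH_PAGE_SIZE;
    unsigned long last = (udata + nbytes - 1) / SKETCH_PAGE_SIZE;

    return last - first + 1;
}

/* Length in bytes of scatter-gather entry `i` (0-based) for the same range */
static size_t seg_len(unsigned long udata, size_t nbytes, size_t i)
{
    size_t npages = span_pages(udata, nbytes);
    size_t fp_offset = udata % SKETCH_PAGE_SIZE;
    size_t first_len = SKETCH_PAGE_SIZE - fp_offset;

    assert(i < npages);

    if (npages == 1)
        return nbytes;            /* whole range sits in one page */
    if (i == 0)
        return first_len;         /* first page: offset to end of page */
    if (i < npages - 1)
        return SKETCH_PAGE_SIZE;  /* middle pages are full */

    /* last page: whatever remains after the first and middle pages */
    return nbytes - first_len - (npages - 2) * SKETCH_PAGE_SIZE;
}
```

For udata = 0x1F00 and nbytes = 5000 the range spans three pages with entry lengths 256, 4096 and 648, which add up to 5000.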

The interrupt handler is exceptionally brief:

/* Do hardware specific thing to make the device happy */

/* Wake the thread waiting for this DMA operation to complete */
device_ptr->flag_dma_done = 1;
wake_up_interruptible(&device_ptr->dma_wait);

Please note that this is just a general approach. I've been working on this driver for the last few weeks and have yet to actually test it, so please don't treat this pseudo code as gospel, and be sure to double check all logic and parameters ;-).

Creating physical memory from user space to use for DMA transfers

If I'm understanding you correctly, you have a device driver that's behaving poorly, and you're trying to work around that by manually allocating physical RAM from userspace? Is there a reason you're not interested in fixing the driver instead?

This sounds like a very odd request, not something that would be considered a proper fix by most people. I suspect you'd get more help if you were working on the underlying driver problem.

(copied from comment above.)


