Where to Start Learning About Linux Dma/Device Drivers/Memory Allocation

Where to start learning about linux DMA / device drivers / memory allocation

On-line:

Anatomy of the Linux slab allocator

Understanding the Linux Virtual Memory Manager

Linux Device Drivers, Third Edition

The Linux Kernel Module Programming Guide

Writing device drivers in Linux: A brief tutorial

Books:

Linux Kernel Development (2nd Edition)

Essential Linux Device Drivers ( Only the first 4 - 5 chapters )

Useful Resources:

the Linux Cross Reference ( Searchable Kernel Source for all Kernels )

API changes in the 2.6 kernel series


dma_sync_single_for_device calls dma_sync_single_range_for_cpu a little further up in the file and this is the source documentation ( I assume that even though this is for arm the interface and behavior are the same ):

/**
380 * dma_sync_single_range_for_cpu
381 * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
382 * @handle: DMA address of buffer
383 * @offset: offset of region to start sync
384 * @size: size of region to sync
385 * @dir: DMA transfer direction (same as passed to dma_map_single)
386 *
387 * Make physical memory consistent for a single streaming mode DMA
388 * translation after a transfer.
389 *
390 * If you perform a dma_map_single() but wish to interrogate the
391 * buffer using the cpu, yet do not wish to teardown the PCI dma
392 * mapping, you must call this function before doing so. At the
393 * next point you give the PCI dma address back to the card, you
394 * must first the perform a dma_sync_for_device, and then the
395 * device again owns the buffer.
396 */

Where to learn more about low-level programming? e.g device drivers

I would recommend a microcontroller approach, msp430, avr, or one of the many arms. You can get an msp430 or one of the arm launchpads from ti for about 10-20 bucks, you can find stm32 based discovery boards for 10-20 bucks as well, the stm32f0 (cortex-m0) is simple, basic. The stm32f4 is loaded with stuff like caches and an fpu, but still a microcontroller.

I think the microcontroller approach will take you back to the roots, very basic stuff, the hardware doesnt have video usually or hard drives or anything, but you do learn how to master the tools (compiler, linker, etc) and how to read the manuals, etc.

You could just fast forward to linux capable hardware like the raspberry pi or beaglebone. At all levels but in particular the linux capable levels, in addition to the manuals for the video chips, usb chips, etc you also will want to use the linux source code to learn from. Understand that the linux drivers are often very generic and often contain code to drive a whole family or whole history of hardware from one vendor, so a bunch of the code may not match anything in the manual, but some will and that is where you will get over the hump of understanding what the manual is saying. Doesnt mean you have to re-write linux in order to use this information.

Start small take it one step at a time, pcie and video alone from scratch is a major task, if you start there you are quite likely to fail unless you already have the knowledge and experience (if you did you wouldnt be here asking).

An approach you can also take is bochs or something that emulates an old 8086 DOS system, and if you can find some good, old, dos/8086 based manuals you can poke at regisiters, cga mode, vga mode, etc. The disc controllers can be difficult still, etc...

There are plenty of arm, avr, msp430, mips, etc (and x86 but I wouldnt go there for this until later) instruction set simulators that would avoid the cost of hardware and also avoid the cost of damaging the hardware and having to buy more (it happens even with lots of experience), even better writing your own instruction set simulator...vba some nds simulators, working your way up to qemu.

Linux kernel device driver to DMA from a device into user-space memory

I'm actually working on exactly the same thing right now and I'm going the ioctl() route. The general idea is for user space to allocate the buffer which will be used for the DMA transfer and an ioctl() will be used to pass the size and address of this buffer to the device driver. The driver will then use scatter-gather lists along with the streaming DMA API to transfer data directly to and from the device and user-space buffer.

The implementation strategy I'm using is that the ioctl() in the driver enters a loop that DMA's the userspace buffer in chunks of 256k (which is the hardware imposed limit for how many scatter/gather entries it can handle). This is isolated inside a function that blocks until each transfer is complete (see below). When all bytes are transfered or the incremental transfer function returns an error the ioctl() exits and returns to userspace

Pseudo code for the ioctl()

/*serialize all DMA transfers to/from the device*/
if (mutex_lock_interruptible( &device_ptr->mtx ) )
return -EINTR;

chunk_data = (unsigned long) user_space_addr;
while( *transferred < total_bytes && !ret ) {
chunk_bytes = total_bytes - *transferred;
if (chunk_bytes > HW_DMA_MAX)
chunk_bytes = HW_DMA_MAX; /* 256kb limit imposed by my device */
ret = transfer_chunk(device_ptr, chunk_data, chunk_bytes, transferred);
chunk_data += chunk_bytes;
chunk_offset += chunk_bytes;
}

mutex_unlock(&device_ptr->mtx);

Pseudo code for incremental transfer function:

/*Assuming the userspace pointer is passed as an unsigned long, */
/*calculate the first,last, and number of pages being transferred via*/

first_page = (udata & PAGE_MASK) >> PAGE_SHIFT;
last_page = ((udata+nbytes-1) & PAGE_MASK) >> PAGE_SHIFT;
first_page_offset = udata & PAGE_MASK;
npages = last_page - first_page + 1;

/* Ensure that all userspace pages are locked in memory for the */
/* duration of the DMA transfer */

down_read(¤t->mm->mmap_sem);
ret = get_user_pages(current,
current->mm,
udata,
npages,
is_writing_to_userspace,
0,
&pages_array,
NULL);
up_read(¤t->mm->mmap_sem);

/* Map a scatter-gather list to point at the userspace pages */

/*first*/
sg_set_page(&sglist[0], pages_array[0], PAGE_SIZE - fp_offset, fp_offset);

/*middle*/
for(i=1; i < npages-1; i++)
sg_set_page(&sglist[i], pages_array[i], PAGE_SIZE, 0);

/*last*/
if (npages > 1) {
sg_set_page(&sglist[npages-1], pages_array[npages-1],
nbytes - (PAGE_SIZE - fp_offset) - ((npages-2)*PAGE_SIZE), 0);
}

/* Do the hardware specific thing to give it the scatter-gather list
and tell it to start the DMA transfer */

/* Wait for the DMA transfer to complete */
ret = wait_event_interruptible_timeout( &device_ptr->dma_wait,
&device_ptr->flag_dma_done, HZ*2 );

if (ret == 0)
/* DMA operation timed out */
else if (ret == -ERESTARTSYS )
/* DMA operation interrupted by signal */
else {
/* DMA success */
*transferred += nbytes;
return 0;
}

The interrupt handler is exceptionally brief:

/* Do hardware specific thing to make the device happy */

/* Wake the thread waiting for this DMA operation to complete */
device_ptr->flag_dma_done = 1;
wake_up_interruptible(device_ptr->dma_wait);

Please note that this is just a general approach, I've been working on this driver for the last few weeks and have yet to actually test it... So please, don't treat this pseudo code as gospel and be sure to double check all logic and parameters ;-).

Rx descriptor dma mapping in device driver what that actually means, Does it mean mapping of packets at physical NIC to struct object in memory

DMA accesses are translated by the IOMMU, which on Intel systems is described in the Intel® Virtualization Technology for Directed I/O (VT-d) specification.

The function dma_alloc_coherent allocates memory and introduces a mapping into the DMA page tables so that the memory is accessible to the device. The address returned is not the virtual or physical address of the memory, but is a I/O virtual address (IOVA), which the device can use to access memory. The IOVA is translated by the IOMMU into the physical memory address when the device performs DMA.

The IOMMU prevents any device from accessing physical memory that hasn't been mapped into that specific device's I/O address space.



Related Topics



Leave a reply



Submit