Using memory wisely can help improve driver performance. This post offers five tips for using memory efficiently in kernel-mode drivers for the Microsoft Windows family of operating systems.
1. Lay out data structures efficiently and reuse them when possible
When designing your driver, plan your memory allocations according to type of memory, size, and lifetime. Combine allocations of similar lifetimes, so that you can free unused memory as soon as it is no longer needed. Don’t mix structures of greatly different sizes in the same allocation unless you can be sure that they will be aligned appropriately.
Reuse structures instead of freeing them and later reallocating memory for other uses. Reusing structures avoids additional reallocations and can help prevent fragmentation of the memory pool.
Drivers often require additional memory while handling I/O requests. A driver might allocate a memory descriptor list (MDL) or internal buffer to use for a specific I/O request or might need to allocate an IRP to send to lower drivers. The size of these structures varies depending on the request. The size of an MDL, for example, depends on the size of the buffer it describes.
If your driver has a technique to limit I/O size or to split up a large I/O request, you could make the buffer a fixed size, thus fixing the size of the MDL and making the buffer reusable.
Keep in mind that all performance issues involve tuning and balance. As a general rule, you should optimize for the most frequent operations, and not for unusually large or small requests that rarely occur.
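As a rough illustration of the fixed-size idea, here is a user-mode sketch (not kernel code) that splits a large transfer into chunks of at most a fixed size and pushes every chunk through the same preallocated buffer. In a real driver the MDL describing that buffer could be reused the same way. MAX_CHUNK, chunk_count, count_bytes, and process_in_chunks are hypothetical names invented for this example, and the chunk size is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed transfer size; a real driver would derive this
   from its device's limits. */
#define MAX_CHUNK 4096u

/* Number of fixed-size chunks needed to cover `total` bytes. */
size_t chunk_count(size_t total)
{
    return (total + MAX_CHUNK - 1) / MAX_CHUNK;
}

/* Callback that consumes one chunk (stands in for issuing one I/O). */
typedef void (*chunk_fn)(const unsigned char *chunk, size_t len, void *ctx);

/* Example callback: adds each chunk's length to a size_t total. */
void count_bytes(const unsigned char *chunk, size_t len, void *ctx)
{
    (void)chunk;
    *(size_t *)ctx += len;
}

/* Feed `src` through one reusable buffer, one chunk at a time.
   `scratch` must hold at least MAX_CHUNK bytes.
   Returns the number of chunks issued. */
size_t process_in_chunks(const unsigned char *src, size_t total,
                         unsigned char *scratch, chunk_fn fn, void *ctx)
{
    size_t issued = 0;
    assert(src != NULL && scratch != NULL && fn != NULL);
    while (total > 0) {
        size_t len = total < MAX_CHUNK ? total : MAX_CHUNK;
        memcpy(scratch, src, len);  /* same buffer reused on every pass */
        fn(scratch, len, ctx);
        src += len;
        total -= len;
        issued++;
    }
    return issued;
}
```

Because no chunk ever exceeds MAX_CHUNK, the scratch buffer never has to grow, which is what makes it safe to allocate once and keep.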
2. Allocate nonpaged pool memory for long-term use at startup
Drivers normally use nonpaged pool memory for long-term I/O buffers. Because nonpaged pool becomes fragmented as the system runs, drivers should preallocate memory that they will require for long-term structures and deallocate it when the device is removed. For example, a driver that always performs DMA, creates several events, and uses a lookaside list should allocate memory for those objects at startup in a DriverEntry or AddDevice routine and free the memory as part of handling the device removal request.
The driver should not, however, preallocate excessively large blocks of memory (several megabytes, for example) and try to manage its own allocations within that block.
Appropriate memory allocation routines include ExAllocatePoolWithTag, ExAllocatePoolWithQuotaTag, ExAllocatePoolWithTagPriority and AllocateCommonBuffer (if the driver's device uses bus-master DMA or a system DMA controller's auto-initialize mode).
Drivers should use the tagged versions of the pool allocation routines instead of the nontagged versions, which are obsolete. WinDbg and numerous testing tools use the tags to track memory allocation. Tagging pool allocations can help you more easily find memory-related bugs.
3. Use memory economically
Nonpaged pool memory is a limited system resource. Drivers should allocate I/O buffers as economically as possible. In general, avoid calling the memory allocation support routines repeatedly to request allocations of less than PAGE_SIZE. If your driver normally uses several related structures together, consider bundling those structures into a single allocation. For example, the SCSI port driver bundles an IRP, a SCSI request block (SRB), and an MDL into a single allocation.
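The bundling idea can be sketched in user-mode C as follows. The three small structures are hypothetical stand-ins for structures that a driver uses together (they are not the real IRP, SRB, or MDL layouts), and malloc-family calls stand in for the pool allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for three structures a driver uses together. */
struct request_header { int opcode; unsigned length; };
struct transfer_state { unsigned long offset; unsigned retries; };
struct page_list      { size_t count; size_t pages[8]; };

/* Bundle: one allocation holds all three, so they share one lifetime
   and one free. The compiler pads between members to align each one. */
struct request_bundle {
    struct request_header header;
    struct transfer_state state;
    struct page_list      pages;
};

/* One allocator call instead of three; calloc zero-fills the bundle. */
struct request_bundle *bundle_alloc(void)
{
    return calloc(1, sizeof(struct request_bundle));
}

/* Fill in the header part of an already allocated bundle. */
void bundle_init(struct request_bundle *b, int opcode, unsigned length)
{
    assert(b != NULL);
    b->header.opcode = opcode;
    b->header.length = length;
}

/* One free releases all three structures. */
void bundle_free(struct request_bundle *b)
{
    free(b);
}
```

Compared with three separate small allocations, this makes one trip to the allocator and pays for one allocation's worth of bookkeeping.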
Drivers that use DMA are an exception. If a driver that performs DMA needs several one-page buffers, but the buffers need not be contiguous, it should call AllocateCommonBuffer once for each such buffer. This approach conserves contiguous address space and improves the chances that the memory allocations will succeed.
In addition, consider whether the memory allocation routines you plan to use round the allocation request up to the next page boundary.
If the driver requests fewer than PAGE_SIZE bytes, ExAllocatePoolWithTag allocates the number of bytes requested. If the driver requests PAGE_SIZE bytes or more, ExAllocatePoolWithTag allocates a page-aligned buffer that is an integral multiple of PAGE_SIZE bytes. Memory allocations of less than PAGE_SIZE do not cross page boundaries and are not necessarily page-aligned; instead, they are aligned on an 8-byte boundary on 32-bit systems and a 16-byte boundary on 64-bit systems.
AllocateCommonBuffer always allocates at least a page of memory. If the driver requests less than an integral multiple of PAGE_SIZE bytes, the remaining bytes on the last page are inaccessible to the driver.
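The rounding rules described above come down to simple arithmetic. The following user-mode sketch assumes a 4096-byte page and ignores any per-allocation bookkeeping the pool adds. SKETCH_PAGE_SIZE and the function names are hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size for illustration; kernel code would use PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/* Round `bytes` up to the next multiple of the page size. */
size_t round_up_to_page(size_t bytes)
{
    return (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE * SKETCH_PAGE_SIZE;
}

/* Pool rule from the text: below a page, the request is taken as is;
   at a page or more, whole pages are allocated. */
size_t pool_footprint(size_t bytes)
{
    return bytes < SKETCH_PAGE_SIZE ? bytes : round_up_to_page(bytes);
}

/* Common-buffer rule from the text: always whole pages, at least one. */
size_t common_buffer_footprint(size_t bytes)
{
    assert(bytes > 0);
    return round_up_to_page(bytes);
}

/* Bytes on the last page that the driver cannot use. */
size_t common_buffer_slack(size_t bytes)
{
    return common_buffer_footprint(bytes) - bytes;
}
```

Under these assumptions, a 5000-byte request occupies two pages either way, and a 100-byte common buffer still costs a full page.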
4. Use lookaside lists
Lookaside lists provide fixed-size, reusable buffers. They are designed for structures that a driver might need to allocate dynamically and in unpredictable numbers.
Lookaside lists can be allocated from paged pool or nonpaged pool. The driver defines the layout and contents of the entries in the list to suit its requirements, and the system maintains list status and adjusts the number of available entries according to demand.
A driver calls ExInitialize[N]PagedLookasideList to set up a lookaside list, ExAllocateFrom[N]PagedLookasideList to allocate an entry in the list, and ExFreeTo[N]PagedLookasideList to free an entry in the list. The head of the list must be allocated from nonpaged memory, even if the list entries themselves are in paged memory.
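To make the mechanism concrete, here is a simplified user-mode analogue of a lookaside list: a cache of fixed-size entries that are reused before falling back to the general allocator. It is a sketch only. It is not thread-safe, it uses a fixed depth cap where the system tunes the depth dynamically, and the lookaside_* names are hypothetical rather than the kernel routines named above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* While cached, an entry is linked through its own first bytes. */
struct free_node { struct free_node *next; };

struct lookaside {
    struct free_node *free_list; /* cached, currently unused entries */
    size_t entry_size;           /* fixed size of every entry */
    size_t depth;                /* entries currently cached */
    size_t max_depth;            /* cap before entries return to the heap */
};

void lookaside_init(struct lookaside *l, size_t entry_size, size_t max_depth)
{
    assert(l != NULL);
    /* An entry must be big enough to hold the link while it is cached. */
    l->entry_size = entry_size < sizeof(struct free_node)
                        ? sizeof(struct free_node) : entry_size;
    l->free_list = NULL;
    l->depth = 0;
    l->max_depth = max_depth;
}

void *lookaside_alloc(struct lookaside *l)
{
    if (l->free_list != NULL) {      /* fast path: reuse a cached entry */
        struct free_node *n = l->free_list;
        l->free_list = n->next;
        l->depth--;
        return n;
    }
    return malloc(l->entry_size);    /* slow path: general allocator */
}

void lookaside_free(struct lookaside *l, void *entry)
{
    if (entry == NULL)
        return;
    if (l->depth < l->max_depth) {   /* keep the entry for reuse */
        struct free_node *n = entry;
        n->next = l->free_list;
        l->free_list = n;
        l->depth++;
    } else {
        free(entry);                 /* cache is full */
    }
}

/* Release every cached entry, for example at device removal. */
void lookaside_destroy(struct lookaside *l)
{
    while (l->free_list != NULL) {
        struct free_node *n = l->free_list;
        l->free_list = n->next;
        free(n);
    }
    l->depth = 0;
}
```

Because every entry has the same size, a freed entry can satisfy the next request immediately, which avoids both the allocator call and the fragmentation that variable-size allocations cause.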
5. Avoid frequently mapping and unmapping the virtual address space
Frequently mapping and unmapping the virtual address space can decrease performance system-wide because it can result in frequent flushes of the translation lookaside buffer (TLB), a per-processor cache of virtual-to-physical address translations. Each entry in the TLB contains a page table entry (PTE).
Every time the system translates a virtual address that references a new page, it adds an entry to the TLB. Once the TLB is full, the system must drop an existing entry every time it must add a new entry. Subsequently, each time a caller remaps or unmaps the address space, thus changing a PTE, the system must interrupt all CPUs so that it can update any TLB entries that contain the PTE.
Internally, the I/O manager avoids this problem for the MDL in Irp->MdlAddress. The first time a kernel-mode component calls MmGetSystemAddressForMdlSafe, the I/O manager stores the system address in the MDL along with the corresponding physical address. When the IRP returns to the I/O manager after completion, the I/O manager unmaps the MDL. Thus, the I/O manager requires only a single mapping (and a single virtual-to-physical address translation) for each I/O request.
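The map-once pattern can be sketched in user-mode C as follows. The descriptor is a hypothetical structure that remembers its mapping, only loosely inspired by the behavior described above. It is not the real MDL layout, and expensive_map is a stand-in for the costly mapping step.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor that caches its mapping; not the real MDL. */
struct buffer_desc {
    void    *backing;    /* the memory being described */
    void    *mapped_va;  /* cached mapping, valid only if is_mapped */
    int      is_mapped;
    unsigned map_calls;  /* how many times the costly map step ran */
};

/* Stand-in for the expensive mapping operation. */
void *expensive_map(struct buffer_desc *d)
{
    d->map_calls++;
    return d->backing;
}

/* Map on first use, then hand back the cached address. */
void *get_mapped_address(struct buffer_desc *d)
{
    assert(d != NULL);
    if (!d->is_mapped) {
        d->mapped_va = expensive_map(d);
        d->is_mapped = 1;
    }
    return d->mapped_va;
}

/* Drop the mapping once, when the request is finished. */
void release_mapping(struct buffer_desc *d)
{
    assert(d != NULL);
    d->mapped_va = NULL;
    d->is_mapped = 0;
}
```

However many components ask for the address while the request is in flight, the costly step runs once, and the mapping is torn down once at the end.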