Is Rebasing DLLs (or Providing an Appropriate Default Load Address) Worth the Trouble?

I'd like to add an answer of my own, although the answers from Hans Passant and others already describe the tradeoffs pretty well.

After recently fiddling with DLL base addresses in our application, here is my conclusion:

I think that, unless you can prove otherwise, giving DLLs a non-default base address is an exercise in futility. That includes rebasing my own DLLs.

  • For the DLLs I control, in the average application each DLL is loaded into memory only once anyway, so the load on the paging file should be minimal. (But see Michael Burr's comment on another answer about Terminal Server environments.)

  • Giving DLLs a fixed base address by hand (without rebasing) actually increases address space fragmentation, because sooner or later the hand-picked addresses no longer match the DLLs they were chosen for. In our app, every DLL had a manually assigned base address (for unrelated legacy reasons, not to reduce fragmentation) and we did not use rebase.exe. This significantly increased address space fragmentation for us, because you really can't get this right manually.

  • Rebasing (via rebase.exe) is not cheap. It is another step in the build process that has to be maintained and checked, so it has to have some benefit.

  • A large application will always load some DLLs whose base address does not match, because of hook DLLs (e.g. antivirus) and because you don't rebase third-party DLLs (or at least I wouldn't).

  • If you're using a RAM disk for the paging file, you might actually be better off if loaded DLLs get paged out :-)

So to sum up, I think that rebasing isn't worth the trouble except for special cases like the system DLLs.
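
To illustrate why hand-assigned base addresses drift out of sync, here is a minimal sketch that checks a list of preferred base addresses and image sizes for overlaps. The module names, addresses, and sizes are made up for the example, and `find_collisions` is a hypothetical helper, not part of any Windows tool.

```python
from typing import NamedTuple


class Module(NamedTuple):
    name: str
    base: int  # preferred base address chosen by hand
    size: int  # size of the mapped image in bytes


def find_collisions(modules):
    """Return pairs of module names whose [base, base + size) ranges overlap."""
    ordered = sorted(modules, key=lambda m: m.base)
    collisions = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.base >= a.base + a.size:
                break  # sorted by base, so no later module can overlap `a`
            collisions.append((a.name, b.name))
    return collisions


# Hypothetical layout: core.dll has grown past the slot that was
# originally reserved for it and now spills into ui.dll's range.
modules = [
    Module("core.dll", 0x10000000, 0x00120000),
    Module("ui.dll", 0x10100000, 0x00080000),
    Module("net.dll", 0x10400000, 0x00040000),
]

print(find_collisions(modules))  # [('core.dll', 'ui.dll')]
```

A check like this only covers the DLLs you list yourself. Hook DLLs and third-party DLLs that end up in the process are not visible to it, which is the limitation described above.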


I'd like to add a historical piece that I found on Old New Thing: How did Windows 95 rebase DLLs? --

When a DLL needed to be rebased, Windows 95 would merely make a note
of the DLL's new base address, but wouldn't do much else. The real
work happened when the pages of the DLL ultimately got swapped in. The
raw page was swapped off the disk, then the fix-ups were applied on
the fly to the raw page, thereby relocating it. The fixed-up page was
then mapped into the process's address space and the program was
allowed to continue.

Looking at how this works (read the whole thing), I personally suspect that part of the "rebasing is evil" stance dates back to the olden days of Win9x and low memory conditions.


Look, now there's a non-historical piece on Old New Thing:

How important is it nowadays to ensure that all my DLLs have non-conflicting base addresses?


Back in the day, one of the things you were exhorted to do was rebase
your DLLs so that they all had nonoverlapping address ranges, thereby
avoiding the cost of runtime relocation. Is this still important
nowadays?

...

In the presence of ASLR, rebasing your DLLs has no effect because ASLR is going to ignore your base address anyway and relocate the DLL into a location of its pseudo-random choosing.

...

Conclusion: It doesn't hurt to rebase, just in case, but understand
that the payoff will be extremely rare. Build your DLL with
/DYNAMICBASE enabled (and with /HIGHENTROPYVA for good measure)
and let ASLR do the work of ensuring that no base address collision
occurs. That will cover pretty much all of the real-world scenarios.
If you happen to fall into one of the very rare cases where ASLR is
not available, then your program will still work. It just may run a
little slower due to the relocation penalty.

... ASLR actually does a better job of avoiding collisions than manual
rebasing, since ASLR can view the system as a whole, whereas manual
rebasing requires you to know all the DLLs that are loaded into your
process, and coordinating base addresses across multiple vendors is
generally not possible.
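
To check whether an existing DLL was actually built with these options, you can inspect its PE header. The following is a rough sketch, assuming the standard PE/COFF optional header layout as I understand it (DllCharacteristics at offset 70 of the optional header, flag values 0x0040 for dynamic base and 0x0020 for high-entropy VA). The function name `read_pe_info` is my own.

```python
import struct

# Flag values from the PE/COFF DllCharacteristics field.
IMAGE_DLLCHARACTERISTICS_HIGH_ENTROPY_VA = 0x0020
IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE = 0x0040


def read_pe_info(data: bytes) -> dict:
    """Extract the preferred image base and ASLR-related flags from PE bytes."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew: file offset of the PE signature, stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # Optional header follows the 4-byte signature and 20-byte COFF header.
    opt = pe_offset + 4 + 20
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic == 0x10B:    # PE32: 32-bit ImageBase at offset 28
        (image_base,) = struct.unpack_from("<I", data, opt + 28)
    elif magic == 0x20B:  # PE32+: 64-bit ImageBase at offset 24
        (image_base,) = struct.unpack_from("<Q", data, opt + 24)
    else:
        raise ValueError("unknown optional header magic")
    (flags,) = struct.unpack_from("<H", data, opt + 70)
    return {
        "image_base": image_base,
        "dynamic_base": bool(flags & IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE),
        "high_entropy_va": bool(flags & IMAGE_DLLCHARACTERISTICS_HIGH_ENTROPY_VA),
    }
```

Usage would be along the lines of `read_pe_info(open(path, "rb").read())`, where `path` points at the DLL in question. A result with `dynamic_base` set to False means ASLR will not relocate the image on its own, and the preferred base address matters again.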

.NET assemblies and DLL rebasing

The CLR loading mechanism uses LoadLibrary behind the scenes, which explains what you observe: two assemblies can't be loaded at the same address. What people usually want when they rebase a DLL is to avoid the performance hit of fix-ups, where absolute addresses and function calls have to be relocated to match the actual load address. The CLR does not have this problem for code, because MSIL is loaded on demand when you call a function in managed code. The MSIL then gets jitted and placed on a heap (a different one from the normal object heap, I believe), in the same manner as the CLR allocates and lays out new objects in your application. I am not sure about static data in the application, which is the second part of those fix-ups; I would need to read up on this.

Rebasing DLLs for debug purposes, anyone doing it?

Yes, it's still well worth rebasing your DLL so that it doesn't clash with other DLLs in the host process. I won't list the benefits, since you already do so in the links in your question!

Does ASLR cause slow loading of DLLs?

The short answer is no.

On a system without ASLR (e.g. XP), loading a DLL at a non-preferred address has several costs:

  1. The relocations section has to be parsed and fixups have to be applied to the entire image.
  2. The act of applying fixups causes copy-on-write faults, which are relatively expensive CPU-wise, and also forces pages to be read from disk even if they are not referenced by the app itself.
  3. Every process that loads the DLL at a non-preferred address gets a private copy of every page that is written to, leading to increased memory usage.

Items 2 and 3 are by far the biggest costs, and are the main reason why manually rebasing DLLs used to be necessary.
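
The costs listed above can be made concrete with a small simulation. This is a simplified sketch, not the actual loader: it assumes every fixup is a 32-bit absolute address (roughly what a HIGHLOW relocation describes), and the image contents, fixup offsets, and base addresses are invented for the example.

```python
import struct

PAGE_SIZE = 0x1000


def apply_fixups(image: bytearray, fixup_rvas, preferred_base, actual_base):
    """Add the load delta to each 32-bit absolute address in the image.

    Returns the set of page indices that were written to. Without ASLR,
    each of these pages would become a process-private copy.
    """
    delta = actual_base - preferred_base
    touched = set()
    if delta == 0:
        return touched  # loaded at the preferred address: nothing to patch
    for rva in fixup_rvas:
        (value,) = struct.unpack_from("<I", image, rva)
        struct.pack_into("<I", image, rva, (value + delta) & 0xFFFFFFFF)
        touched.add(rva // PAGE_SIZE)
        touched.add((rva + 3) // PAGE_SIZE)  # a fixup may straddle two pages
    return touched


# Hypothetical three-page image with two absolute pointers in it.
image = bytearray(3 * PAGE_SIZE)
struct.pack_into("<I", image, 0x10, 0x10000100)
struct.pack_into("<I", image, 0x2008, 0x10001000)

touched = apply_fixups(image, [0x10, 0x2008], 0x10000000, 0x20000000)
print(sorted(touched))  # [0, 2]
```

Two small pointer patches are enough to dirty two of the three pages here, even though the application itself has not touched either of them yet.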

With ASLR, fixups are applied transparently by the OS, making it look like the DLL was actually loaded at its preferred address. There are no copy-on-write faults, and no process-private pages are created. Also, fixups are applied only to the pages that are actually accessed by the app, rather than the entire image, which means no extra data is read from disk.
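
The idea of applying fixups only to pages that are actually accessed can be modelled roughly as follows. This is a toy model of the concept, not how the Windows memory manager is implemented. The class name, the 32-bit fixup format, and the assumption that no fixup straddles a page boundary are all mine.

```python
import struct
from collections import defaultdict

PAGE_SIZE = 0x1000


class LazyRelocatedImage:
    """Applies fixups to a page the first time that page is accessed."""

    def __init__(self, raw: bytes, fixup_rvas, preferred_base, actual_base):
        self._raw = raw
        self._delta = actual_base - preferred_base
        # Group fixups by page so a page-in only handles its own fixups.
        self._fixups_by_page = defaultdict(list)
        for rva in fixup_rvas:
            self._fixups_by_page[rva // PAGE_SIZE].append(rva)
        self._resident = {}

    @property
    def resident_pages(self):
        return set(self._resident)

    def page(self, index: int) -> bytes:
        if index not in self._resident:
            start = index * PAGE_SIZE
            data = bytearray(self._raw[start:start + PAGE_SIZE])
            for rva in self._fixups_by_page.get(index, []):
                off = rva - start
                (value,) = struct.unpack_from("<I", data, off)
                patched = (value + self._delta) & 0xFFFFFFFF
                struct.pack_into("<I", data, off, patched)
            self._resident[index] = bytes(data)
        return self._resident[index]

    def read_u32(self, rva: int) -> int:
        page = self.page(rva // PAGE_SIZE)
        return struct.unpack_from("<I", page, rva % PAGE_SIZE)[0]


# Hypothetical three-page image with pointers on pages 0 and 2.
raw = bytearray(3 * PAGE_SIZE)
struct.pack_into("<I", raw, 0x10, 0x10000100)
struct.pack_into("<I", raw, 0x2008, 0x10001000)

lazy = LazyRelocatedImage(bytes(raw), [0x10, 0x2008], 0x10000000, 0x20000000)
print(hex(lazy.read_u32(0x10)))     # 0x20000100
print(sorted(lazy.resident_pages))  # [0]
```

Only page 0 was read and relocated. Page 2 stays untouched until something actually references it, which is the "no extra data is read from disk" property described above.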

In addition to that, manual rebasing schemes can't prevent all base address conflicts (for example, DLLs from different vendors can conflict with each other, or an OS DLL could increase in size due to a hotfix and spill over into a range reserved for some other DLL, etc.). ASLR is a lot more efficient at dealing with these issues, so when looking at the system as a whole it can actually improve performance.


