Garbage collection in C# and Java
Usually, memory for a function's non-object variables is allocated on the stack; when the function finishes executing, its stack frame is cleared and that memory is freed.
For objects, memory is allocated on the heap (you will remember malloc() and free() in C). In Java and C#, the garbage collector does the job of free() for you, so you do not have to worry about it.
So even inside a function, an object held in a local variable is stored on the heap, not on the stack; only the reference to it is local. It is therefore not the same as int i. When the function completes, those objects go out of scope. You no longer have access to them, but their memory is not freed until the garbage collector runs and clears them.
How and when a garbage collector runs depends on the algorithm it uses. That algorithm may differ even between implementations of Java (e.g. Sun's JVM may use a different algorithm than another Java implementation).
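As a rough illustration of the point about scope versus reclamation, here is a minimal Java sketch. The class and method names (ScopeDemo, allocate, describe) are made up for this example. It uses a WeakReference to peek at an object after the method that created it has returned. What it prints depends on the JVM and collector in use, so no particular output is promised.

```java
import java.lang.ref.WeakReference;

final class ScopeDemo {
    // Returns only a weak reference, so nothing keeps the array strongly
    // reachable once this method has returned.
    static WeakReference<byte[]> allocate() {
        int i = 42;                      // primitive local: conceptually part of the stack frame
        byte[] buffer = new byte[1024];  // the array is a heap object; 'buffer' is just a reference
        buffer[0] = (byte) i;
        return new WeakReference<>(buffer);
    }

    static String describe(WeakReference<byte[]> ref) {
        return ref.get() != null ? "still in memory" : "reclaimed";
    }

    public static void main(String[] args) {
        WeakReference<byte[]> ref = allocate();
        // 'i' and 'buffer' went away with the stack frame, but the array
        // itself may well still be sitting on the heap.
        System.out.println("after return: " + describe(ref));

        System.gc(); // a request, not a command; the JVM is free to ignore it
        System.out.println("after GC request: " + describe(ref));
    }
}
```

A weak reference is used here only because it lets the program observe the object without keeping it alive; ordinary code would simply let the local variable go out of scope.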
What are the fundamental differences between garbage collection in C# and Java?
The advice you've been given is, broadly speaking, a load of hooey.
Both C# and Java have GCs that attempt to optimise the fast recovery of lots of small objects. They're designed to solve the same problem. They do it in slightly different ways, but as a user the technical differences in how you use them are minimal, even non-existent for the majority of users.
IDisposable is nothing to do with the GC as such. It's a standard way of naming methods that would otherwise be called close, destroy, dispose, etc., and often are called that in Java. There is a proposal for Java 7 to add something very similar to the using keyword that would call a similar close method.
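For what it's worth, Java 7 did ship such a construct: the try-with-resources statement, which calls close() on any AutoCloseable when the block exits. A minimal sketch follows; Resource and useAndRelease are hypothetical names invented for illustration. Note that the GC plays no part in it.

```java
final class CloseDemo {
    // Hypothetical resource type, for illustration only.
    static final class Resource implements AutoCloseable {
        private boolean closed = false;

        boolean isClosed() {
            return closed;
        }

        @Override
        public void close() {
            closed = true; // deterministic cleanup, independent of the GC
        }
    }

    static Resource useAndRelease() {
        Resource seen = null;
        try (Resource r = new Resource()) {
            seen = r;   // ... work with r here ...
        }               // r.close() runs here, whether or not the body threw
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("closed: " + useAndRelease().isClosed()); // prints "closed: true"
    }
}
```

The object is still ordinary garbage after close() has run; closing it only releases whatever it was holding, and its memory is reclaimed whenever the collector gets to it.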
"Destructors" in C# refers to finalizers - this was done deliberately to confuse C++ programmers. :) The CLR spec itself calls them finalizers, exactly as the JVM does.
There are many ways in which Java and C#/CLR differ (user value types, properties, generics and the whole family of related features known as LINQ), but the GC is one area where you can develop a substantial amount of software before you need to worry much about the difference between them.
Is a garbage collector (.net/java) an issue for real-time systems?
To be precise, garbage collectors are a problem for real-time systems. To be even more precise, it is possible to write real-time software in languages that have automatic memory management.
The Real-Time Specification for Java (RTSJ) describes one approach to achieving real-time behavior in Java. The idea behind RTSJ is very simple: do not use a heap. RTSJ provides for new varieties of Runnable objects that ensure threads do not access heap memory of any kind. Threads can either access scoped memory (nothing unusual here; values are destroyed when the scope is closed) or immortal memory (which exists throughout the application lifetime). Variables in immortal memory are overwritten time and again with new values.
Through the use of immortal memory, RTSJ ensures that threads do not access the heap, and more importantly, the system does not have a garbage collector that preempts execution of the program by the threads.
More details are available in the paper "Project Golden Gate: Towards Real-Time Java in Space Missions" published by JPL and Sun.
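The "allocate once, overwrite forever" idea behind immortal memory can be loosely imitated in plain Java. To be clear, the sketch below does not use the RTSJ API and gives no real-time guarantees; PreallocatedSamples is a made-up class that only shows the allocation pattern, with storage created up front and nothing allocated on the hot path.

```java
final class PreallocatedSamples {
    private final double[] slots;   // allocated once, lives as long as this object
    private int next = 0;           // index of the slot to overwrite next
    private int count = 0;          // number of slots holding valid data

    PreallocatedSamples(int capacity) {
        slots = new double[capacity];
    }

    // No 'new' on this path: the oldest value is simply overwritten.
    void record(double value) {
        slots[next] = value;
        next = (next + 1) % slots.length;
        if (count < slots.length) {
            count++;
        }
    }

    double average() {
        double sum = 0.0;
        for (int i = 0; i < count; i++) {
            sum += slots[i];
        }
        return count == 0 ? 0.0 : sum / count;
    }

    public static void main(String[] args) {
        PreallocatedSamples samples = new PreallocatedSamples(4);
        for (int i = 1; i <= 5; i++) {
            samples.record(i);   // the fifth call overwrites the first slot
        }
        System.out.println(samples.average()); // slots now hold 5, 2, 3, 4
    }
}
```

On an ordinary JVM this only reduces garbage; the collector still exists and can still pause the thread, which is exactly what RTSJ's no-heap threads are designed to avoid.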
Quantifying the Performance of Garbage Collection vs. Explicit Memory Management
You seem to be asking two things:
- have GCs improved since that research was performed, and
- can I use the conclusions of the paper as a formula to predict required memory.
The answer to the first is that there have been no major breakthroughs in GC algorithms that would invalidate the general conclusions:
- GC'ed memory management still requires significantly more virtual memory.
- If you try to constrain the heap size the GC performance drops significantly.
- If real memory is restricted, the GC'ed memory management approach results in substantially worse performance due to paging overheads.
However, the conclusions cannot really be used as a formula:
- The original study was done with JikesRVM rather than a Sun JVM.
- The Sun JVM's garbage collectors have improved in the ~5 years since the study.
- The study does not seem to take into account that Java data structures take more space than equivalent C++ data structures for reasons that are not GC related.
On the last point, I have seen a presentation by someone that talks about Java memory overheads. For instance, it found that the minimum representation size of a Java String is something like 48 bytes. (A String consists of two primitive objects; one an Object with 4 word-sized fields and the other an array with a minimum of 1 word of content. Each primitive object also has 3 or 4 words of overhead.) Java collection data structures similarly use far more memory than people realize.
These overheads are not GC-related per se. Rather they are direct and indirect consequences of design decisions in the Java language, JVM and class libraries. For example:
- Each Java primitive object header (see note 1) reserves one word for the object's "identity hashcode" value, and one or more words for representing the object lock.
- The representation of a String has to use a separate "array of characters" because of JVM limitations. Two of the three other fields are an attempt to make the substring operation less memory intensive.
- The Java collection types use a lot of memory because collection elements cannot be directly chained. So for example, the overheads of a (hypothetical) singly linked list collection class in Java would be 6 words per list element. By contrast an optimal C/C++ linked list (i.e. with each element having a "next" pointer) has an overhead of one word per list element.
1 - In fact, the overheads are less than this on average. The JVM only "inflates" a lock following use & contention, and similar tricks are used for the identity hashcode. The fixed overhead is only a few bits. However, these bits add up to a measurably larger object header ... which is the real point here.
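The "something like 48 bytes" figure for a String can be checked with back-of-the-envelope arithmetic from the numbers quoted above. The sketch below assumes 4-byte words, which the text does not state, and the class name is made up. Real JVMs differ in header size, alignment and String layout, so treat it as a sanity check rather than a measurement.

```java
final class StringOverheadEstimate {
    static final int STRING_FIELD_WORDS = 4;      // "an Object with 4 word-sized fields"
    static final int MIN_ARRAY_CONTENT_WORDS = 1; // "an array with a minimum of 1 word of content"

    // headerWords is the per-object overhead ("3 or 4 words" in the text).
    static int estimateBytes(int headerWords, int bytesPerWord) {
        int stringObjectWords = headerWords + STRING_FIELD_WORDS;
        int charArrayWords = headerWords + MIN_ARRAY_CONTENT_WORDS;
        return (stringObjectWords + charArrayWords) * bytesPerWord;
    }

    public static void main(String[] args) {
        int bytesPerWord = 4; // assumption: 32-bit words
        System.out.println(estimateBytes(3, bytesPerWord)); // 44
        System.out.println(estimateBytes(4, bytesPerWord)); // 52
    }
}
```

With 3 or 4 words of header per object the estimate comes out at 44 or 52 bytes, which brackets the quoted 48.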
Explicitly calling garbage collection in .NET
This depends on how you trigger the GC.
GCCollectionMode:
- Default: The default setting for this enumeration, which is currently Forced.
- Forced: Forces the garbage collection to occur immediately.
- Optimized: Allows the garbage collector to determine whether the current time is optimal to reclaim objects.
If you call the parameterless overload or pass GCCollectionMode.Default, it currently forces a GC, but in theory that behaviour may change in future versions of .NET.
If you pass GCCollectionMode.Forced, it forces an immediate GC.
If you pass GCCollectionMode.Optimized, it's only a hint. I don't know how seriously the runtime treats this hint.
So if you want to either force a GC or make sure that it's only a hint, use the Collect(int generation, GCCollectionMode mode) overload.
Is memory cleared before garbage collection?
Practically speaking, no, this doesn't happen. Overwriting memory you've just freed takes time, so there are performance penalties. "Secure" objects like SecureString are just wiping themselves, not relying on the GC.
More broadly, it depends very much on that particular implementation of that particular language. Every language that assumes the existence of a GC (like C#) specifies different rules about how and when garbage collection should happen.
To take your C# example, the C# specification does not require that objects be overwritten after being freed, and it doesn't forbid it either:
"Finally, at some time after the object becomes eligible for collection, the garbage collector frees the memory associated with that object." (§3.9, C# 5.0 Language Specification)
If the memory is later assigned to a reference type, you'll have a constructor that does your own custom initialization. If the memory is later assigned to a value type, it gets zeroed out before you can start reading from it:
"Initialization to default values is typically done by having the memory manager or garbage collector initialize memory to all-bits-zero before it is allocated for use. For this reason, it is convenient to use all-bits-zero to represent the null reference." (§5.2, C# 5.0 Language Specification)
Additionally, there are at least two implementations of C#, Microsoft's and Mono's, so just saying "C#" isn't specific enough. Each implementation might decide to overwrite memory (or not).
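The initialize-before-use rule quoted from the C# specification has a direct counterpart in Java, where fields and array elements are guaranteed to start at their default values. The sketch below (made-up names) shows what a program can observe: freshly allocated memory always reads as zero or null, whatever a previously freed object left behind. It says nothing about when, or whether, the freed memory itself gets wiped.

```java
import java.util.Arrays;

final class DefaultValuesDemo {
    // Hypothetical type used only to show default field values.
    static final class Point {
        int x;
        int y;
        Object tag;
    }

    static boolean freshMemoryLooksZeroed() {
        // Dirty a buffer, then drop the only reference to it.
        byte[] scratch = new byte[4096];
        Arrays.fill(scratch, (byte) 0x7F);
        scratch = null;

        // Whether or not this allocation reuses that memory, the language
        // guarantees every element reads as 0 before we write to it.
        byte[] fresh = new byte[4096];
        for (byte b : fresh) {
            if (b != 0) {
                return false;
            }
        }

        Point p = new Point();
        return p.x == 0 && p.y == 0 && p.tag == null;
    }

    public static void main(String[] args) {
        System.out.println(freshMemoryLooksZeroed()); // prints "true"
    }
}
```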