![C++ vs Java: UB vs Semantic Memory Leaks](http://ithare.com/wp-content/uploads/BB_part206_CvsJava_v1-1-640x427.png)
For a long while, quite a few people (mostly from academia and/or Java programming teams) faithfully believed in a horrible misperception along the lines of “Garbage-collected programs cannot possibly leak memory” (or, at the very least, along the lines of “it is fundamentally more difficult to have a memory leak in a garbage-collected program”, which the public readily translates into the former) [GC-FAQ][C2-GC]. This is in spite of the fact that memory leaks in Java were discussed at least as early as 1999 [Lycklama99], and are often discussed in about the same place as the misperception above [C2-MemoryLeaksGC].
However, the reality of {most|quite a few|some}1 real-world Java programs being horrible memory-eaters over time was knocking on the door more and more persistently, and by 2017 at least the opinion leaders had come to the understanding that [Sor17][Paraschiv17][Java8docs.MemLeaks][etc. etc. etc.]
there ARE memory leaks in Java
1 pick one depending on the camp you’re in, but don’t forget about Eclipse and OpenHAB
Syntactic vs Semantic Memory Leaks
The problem with the misperception above comes from a subtle difference between what is known as “syntactic memory leaks” and “semantic memory leaks” (named “loiterers” in [Lycklama99]). Sure, any half-decent garbage collector will ensure that unreachable objects are cleaned up2; however, while all unreachable objects are useless,
not all useless objects are unreachable
It is fairly common to call those objects which are unreachable but still present in the program, syntactic memory leaks, and those objects which are useless but still reachable, semantic memory leaks.
So far so good, but now we have to observe that, as an end-user of the program, I do not care about unreachability, not at all; what I do care about is the program not going into swap after half a day of use. And as practice shows, even with all the unreachable objects removed (i.e. even if there are no syntactic memory leaks), those semantic memory leaks can easily cause that dreaded swapping.
2 actually, it is “are eventually cleaned up”, but in the true spirit of being nice to those-already-suffering we will forget about this “eventually” word for the time being
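To make the “useless but still reachable” case a bit more tangible, here is a minimal sketch. The names LoitererDemo and history are made up for illustration, and the WeakReference is used only as a probe to see whether the object is still alive. It relies on one guarantee only: while a strong reference exists, the collector will not clear the weak one.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

class LoitererDemo {
    // Long-lived collection: everything stored here stays reachable.
    static final List<byte[]> history = new ArrayList<>();

    // Returns true if the buffer is still alive after a GC request.
    static boolean survivesGcWhileReferenced() {
        byte[] buffer = new byte[1_000_000];
        history.add(buffer);                 // stored, but never read again: useless
        WeakReference<byte[]> probe = new WeakReference<>(buffer);
        buffer = null;                       // the local reference is gone...
        System.gc();                         // ...and this is only a hint to the JVM
        return probe.get() != null;          // still strongly reachable via history
    }

    public static void main(String[] args) {
        System.out.println("useless but reachable, still alive: "
                + survivesGcWhileReferenced());
        history.clear(); // only now does the buffer become unreachable; it MAY be collected later
    }
}
```

Note that the sketch deliberately does not claim anything about when the buffer goes away after history.clear(); that part is up to the collector.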
Semantic Memory Leaks in Java
There are quite a few common scenarios in which memory leaks can appear in Java (see, for example, the classification in [Lycklama99]), but most of them3 boil down either to forgetting to remove a reference-to-an-item from some collection, or to forgetting to set a no-longer-needed reference to null. Indeed, if we keep something useless within a collection, or keep a reference to a no-longer-needed object without any chance of using this reference again – we do have a semantic memory leak.
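The “forgot to remove it from a collection” scenario can be sketched as follows. SessionRegistry and SessionDemo are hypothetical names invented for this illustration; the only thing being shown is that a long-lived map keeps every entry reachable until somebody removes it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry that lives as long as the program does.
class SessionRegistry {
    private final Map<Integer, byte[]> buffers = new HashMap<>();

    void open(int sessionId) {
        buffers.put(sessionId, new byte[1024]);
    }

    // Leaky variant: the session is over, but its entry stays in the map.
    void closeLeaky(int sessionId) {
        // forgot: buffers.remove(sessionId);
    }

    // Fixed variant: dropping the entry makes the buffer unreachable.
    void close(int sessionId) {
        buffers.remove(sessionId);
    }

    int retained() {
        return buffers.size();
    }
}

class SessionDemo {
    // Opens and closes n sessions; returns how many buffers are still held.
    static int retainedAfter(boolean leaky, int n) {
        SessionRegistry registry = new SessionRegistry();
        for (int id = 0; id < n; id++) {
            registry.open(id);
            if (leaky) {
                registry.closeLeaky(id);
            } else {
                registry.close(id);
            }
        }
        return registry.retained();
    }

    public static void main(String[] args) {
        System.out.println("leaky: " + retainedAfter(true, 100) + " buffers retained");
        System.out.println("fixed: " + retainedAfter(false, 100) + " buffers retained");
    }
}
```

With the leaky variant, the number of retained buffers grows with the number of sessions ever opened, which is exactly the kind of growth that goes unnoticed in a short test run.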
![BB_emotion_0009b.png](http://ithare.com/wp-content/uploads/BB_emotion_0009b.png)
One such example is an object with a reference held by the main() function. More generally – as soon as we have any kind of top-level loop – such as an event loop – then all the objects held for us by the event loop, including all the objects reachable via references coming from any of such objects, DO need their references null’ed manually to prevent such references from becoming semantic memory leaks.
3 save for JVM peculiarities or esoteric stuff such as ClassLoaders
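A rough sketch of the event-loop case, under the assumption of a toy loop that keeps the job being processed in a field (Job, EventLoop and EventLoopDemo are all hypothetical names, not taken from any real framework). Without the explicit null’ing, the last finished job, payload included, stays reachable for as long as the loop waits for the next event.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical unit of work carrying a large payload.
class Job {
    final byte[] payload = new byte[100_000];

    void run() {
        // do something with payload
    }
}

class EventLoop {
    private final Queue<Job> queue = new ArrayDeque<>();
    private final boolean clearAfterUse;
    private Job current; // field of a long-lived object: outlives each job

    EventLoop(boolean clearAfterUse) {
        this.clearAfterUse = clearAfterUse;
    }

    void post(Job job) {
        queue.add(job);
    }

    // Processes everything queued so far. A real loop would then block
    // waiting for the next event, possibly for hours.
    void drain() {
        while (!queue.isEmpty()) {
            current = queue.poll();
            current.run();
            if (clearAfterUse) {
                current = null; // the manual null'ing discussed above
            }
        }
    }

    boolean holdsFinishedJob() {
        return current != null;
    }
}

class EventLoopDemo {
    static boolean holdsJobAfterDrain(boolean clearAfterUse) {
        EventLoop loop = new EventLoop(clearAfterUse);
        loop.post(new Job());
        loop.drain();
        return loop.holdsFinishedJob();
    }

    public static void main(String[] args) {
        System.out.println("without null'ing, finished job retained: " + holdsJobAfterDrain(false));
        System.out.println("with null'ing, finished job retained: " + holdsJobAfterDrain(true));
    }
}
```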
What about C/C++?
So, in Java, to avoid semantic memory leaks, we DO need to write x = null; explicitly. But this is an exact equivalent of the explicit delete-and-null which we have to do in C/C++(!), where the nulling is there for a different reason (to avoid dangling pointers)!
Let’s compare the following three pieces of code:
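```cpp
//pre-C++11 C++
#include <stdint.h> // uint8_t
#include <cstddef>  // NULL

struct State {
    uint8_t* data;

    State() : data(NULL) {} // so that ~State() never deletes garbage

    void addData() {
        data = new uint8_t[1000000];
        //do something with data
    }
    void removeData() {
        delete [] data;
        data = NULL;//(*) nullptr is not available before C++11
    }
    ~State() {
        delete [] data; // deleting a null pointer is a no-op
    }
};
```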
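```cpp
//post-C++11 C++ (std::make_unique and the 1'000'000 digit separators need C++14)
#include <cstdint> // std::uint8_t
#include <memory>  // std::unique_ptr, std::make_unique

struct State {
    std::unique_ptr<std::uint8_t[]> data;

    void addData() {
        data = std::make_unique<std::uint8_t[]>(1'000'000);
        //do something with data
    }
    void removeData() {
        data.reset();//(*)
    }
};
```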
```java
//JAVA
class State {
    byte[] data;

    void addData() {
        data = new byte[1000000];
        //do something with data
    }
    void removeData() {
        data = null;//(*)
    }
}
```
From my current perspective, these three pieces of code are semantically identical (i.e. the only difference is in syntax, which TBH is not too different either).
Are They Really Identical? Well, Not Exactly…
In spite of these striking similarities between what can be seen as “safe and memory-leak-free code” under two supposedly-very-different-in-this-regard programming languages, there is still a major difference.
Specifically, if we forget to assign null to data in the line marked with (*) (or to call reset() for post-C++11 C++), the effects will be different:
- pre-C++11 C++ is the least forgiving: with the pointer not reset to null, data is left dangling, and the destructor (or a second call to removeData()) will delete it again. This double delete is Undefined Behavior, which in practice tends to show up as a crash or memory corruption.
- Java is significantly more lenient in this regard, and a forgotten data = null is punished only with a semantic memory leak.
  - OTOH, it is this lenience which leads to Java programs with semantic memory leaks being ubiquitous: a C++ program which crashes is an obvious bug which is much more likely to be fixed than a Java program with a semantic memory leak (among other things, memory leaks are often not obvious until somebody runs the program for many hours, which might be skipped in most of the routine testing).
  - Moreover, in Java there is a chance that an instance of some other class still refers to data even after we null’ed it here. From what I have seen, such hidden references are a major source of semantic memory leaks in complicated real-world Java programs.
- post-C++11 C++ behaves much more like Java in this regard.
  - It is still quite different from Java because C++’s unique_ptr<> is guaranteed to be the only owning reference to the data object. This, in turn, eliminates those Java-like hidden references, and greatly reduces the chances of us having a semantic memory leak. However, under C++ such a hidden (non-owning) reference will become a dangling pointer, causing once again the dreaded UB/crash/memory corruption <ouch! />.
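The “hidden reference” point about Java can be illustrated with a small sketch. State mirrors the Java class from the comparison above; Preview and HiddenReferenceDemo are hypothetical names made up for this example. Even though State dutifully null’s its own field, the array stays reachable through the other object until that one lets go as well.

```java
class State {
    byte[] data;

    void addData() {
        data = new byte[1_000_000];
    }

    void removeData() {
        data = null; // State itself does everything right
    }
}

// Hypothetical second class which grabbed a reference to the same array.
class Preview {
    private byte[] source;

    void attach(State state) {
        source = state.data;
    }

    void detach() {
        source = null;
    }

    boolean stillHoldsData() {
        return source != null;
    }
}

class HiddenReferenceDemo {
    // True if the array is still reachable via Preview after State dropped it.
    static boolean previewStillHoldsData(boolean detachToo) {
        State state = new State();
        state.addData();
        Preview preview = new Preview();
        preview.attach(state);

        state.removeData();     // State's reference is gone...
        if (detachToo) {
            preview.detach();   // ...and only now is the hidden one gone too
        }
        return preview.stillHoldsData();
    }

    public static void main(String[] args) {
        System.out.println("after removeData() only: " + previewStillHoldsData(false));
        System.out.println("after removeData() and detach(): " + previewStillHoldsData(true));
    }
}
```

In the C++ unique_ptr<> version, the counterpart of Preview could only hold a non-owning raw pointer, so the same mistake would not keep the memory alive; it would leave that pointer dangling instead.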
Summary
Attempting to summarize my ranting above:
- Code which can be considered ‘good’ memory-wise (is safe both from crashes and memory leaks) is strikingly similar under C++ and Java.
- Yes, contrary to what-lots-of-the-books tend to tell us, even when programming in Java we DO have to think about memory management (hey, one can argue that data = null IS manual memory management).
- However, IF we deviate from such ‘good’ code practices, different programming languages will punish us differently (in C++ it can be a crash or memory corruption, in Java it can be a semantic memory leak).
- OTOH, as memory leaks are not AS obvious as crashes, they have a tendency to survive longer (often MUCH longer). In other words, when moving from C++ to Java, we tend to trade A FEW crashes for A LOT of memory leaks; which BTW tends to be consistent with whatever personal experience / anecdotal evidence I have. I am not going to argue whether it is a good trade-off or not; what is IMNSHO more important is that the semantics of good code is about the same regardless of the Java/C++ choice. Dixi.
References
[GC-FAQ] “GC FAQ”
[C2-GC] “Garbage Collection”
[Lycklama99] Ed Lycklama, “Does Java™ …”
[C2-MemoryLeaksGC] “Memory Leak Using Garbage Collection”
[Sor17] Vladimir Sor, “Memory Leaks: Fallacies and Misconceptions”
[Paraschiv17] Eugen Paraschiv, “How Memory Leaks Happen in a Java Application”
[Java8docs.MemLeaks] “Debug a Memory Leak Using Java Flight Recorder”, Oracle
[etc. etc. etc.] Google for 'Java memory leak'
Acknowledgement
Cartoons by Sergey Gordeev