Tim Jansen's blog


2003/12/20
Resource management in garbage collecting languages
One of my favorite C++ features is resource management with stack-allocated objects. It can hardly get more convenient than writing
	{
		QMutexLocker m(myMutex);
	}
to protect a resource from concurrent access. Java even introduced a keyword (synchronized) to get the same effect, but it is only useful for thread synchronization. In C++ you can use the same mechanism for everything, from files to database transactions. Java gives you the choice between writing a try/finally block to release the resource explicitly and holding the resource until the object is finalized, which may keep it allocated indefinitely. It’s a common error to write something like this:
	void myNaiveMethod() throws Exception {
		FileWriter fw = new FileWriter("file.txt");
		doSomething();
		fw.write("Everything is fine");
		fw.close();
	}
The function looks ok, but if doSomething() or write() throws an exception, the FileWriter will not be closed. This is especially nasty in long-running systems, like servlets or JSPs in a web server, because the errors caused by code like this are hard to reproduce. They depend on whether the garbage collector has finalized the object before the resource is needed again. The right way to do this in Java is to write
	void myCorrectMethod() throws Exception {
		FileWriter fw = new FileWriter("file.txt");
		try {
			doSomething();
			fw.write("Everything is fine");
		}
		finally {
			fw.close();
		}
	}
Using try/finally is tedious, and the code becomes hard to read when you have two or more resources allocated, as you need to nest the try/finally blocks then. In other words, Java really sucks at resource management. That does not mean that C++’s mechanism is completely safe. It guarantees that the resource will be deallocated, but C++ makes it easy to shoot yourself in the foot:
	void myFunction() {
		QFile f("file.txt");
		writeSomething(&f);
	}
This will only work if writeSomething() does not keep the pointer to f. If writeSomething() does keep it, the pointer dangles as soon as myFunction() returns and you will have a nasty bug, especially when writeSomething() uses the pointer only in rare cases.
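The guarantee itself, that the resource is released however the scope is left, can be sketched with a small guard class. This is only an illustration: Transaction, doSomething(), transfer() and the events log are hypothetical names made up for this example, not part of Qt or any other library.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical log so the order of events can be inspected.
std::vector<std::string> events;

// Hypothetical transaction guard: rolls back in the destructor
// unless commit() was called first.
class Transaction {
public:
	Transaction() { events.push_back("begin"); }
	~Transaction() {
		if (!committed)
			events.push_back("rollback");
	}
	void commit() {
		committed = true;
		events.push_back("commit");
	}
	// Copying a guard would release the resource twice.
	Transaction(const Transaction&) = delete;
	Transaction& operator=(const Transaction&) = delete;
private:
	bool committed = false;
};

// Stand-in for work that may fail.
void doSomething(bool fail) {
	if (fail)
		throw std::runtime_error("doSomething failed");
}

// Returns true if the work succeeded. No cleanup code is needed for
// the transaction: its destructor runs on both exit paths.
bool transfer(bool fail) {
	try {
		Transaction t;
		doSomething(fail);
		t.commit();
		return true;
	} catch (const std::runtime_error&) {
		return false;
	}
}
```

Calling transfer(true) leaves "begin" and "rollback" in the log, because the stack is unwound before the catch block runs. The same class would work for a file or a lock; only the body of the destructor changes.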

C# came up with an interesting solution: the ‘using’ statement. A class that implements the IDisposable interface can be notified when a ‘using’ block is left:
	void myFunction() {
		using (StreamWriter sw = new StreamWriter("file.txt")) {
			writeSomething(sw);
		}
	}
After writeSomething() the system will invoke the Dispose() method of StreamWriter, which tells it to free the resource. This solution is better than try/finally, but still has problems:
  1. If you need to allocate several resources, you have the same nesting problem as in Java.
  2. If writeSomething() keeps the reference, you have lost. It is not as bad as with C++, because you have a better chance of getting a usable error notification, but the problem is not completely solved either.
  3. The syntax is not as short as C++’s. Actually it seems to be annoying enough that C# got another statement, ‘lock’, that does exactly the same thing as Java’s ‘synchronized’.
Problem 2 is the most difficult one. In an ideal world an error would be thrown if there is still a reference to the StreamWriter after the ‘using’ statement’s block. However, that is almost impossible with a garbage collector, unless you force a garbage collection immediately after leaving the ‘using’ block. This may be a nice thing for a debugging mode, but not for production code. Reference counters would also solve the problem quite easily, but they have problems of their own, such as reference cycles that are never freed. Combining both may be a solution though. (Does anyone know a system that combines both? I wouldn’t be surprised if it were faster than a pure GC, because of the better caching behaviour.)
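To illustrate why a reference counter makes the check easy, here is a minimal C++ sketch that emulates a checked ‘using’ block with std::shared_ptr. Writer, Keeper and usingBlock() are hypothetical names invented for this example, and the use_count() test is only reliable as long as no other thread touches the object.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for StreamWriter.
struct Writer {
	std::vector<std::string> lines;
};

// Hypothetical long-lived object that may or may not keep the writer.
struct Keeper {
	std::shared_ptr<Writer> kept;

	void writeSomething(const std::shared_ptr<Writer>& w, bool keep) {
		w->lines.push_back("Everything is fine");
		if (keep)
			kept = w; // the reference escapes the block
	}
};

// Emulates a checked 'using' block: once the body is done, the count
// must be back to 1, otherwise somebody kept a reference.
void usingBlock(Keeper& k, bool keep) {
	std::shared_ptr<Writer> w = std::make_shared<Writer>();
	k.writeSomething(w, keep);
	if (w.use_count() != 1)
		throw std::logic_error("writer is still referenced after the block");
}
```

usingBlock(k, false) returns normally, while usingBlock(k, true) throws right at the end of the block instead of failing at some later, unrelated point.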

Problems 1 and 3 are just syntax problems. Number 1 could be solved by making the ‘using’ statement use the current scope. When the current scope is left, the object will be notified:
	void copyFile() {
		using StreamReader src = new StreamReader("source.txt");
		using StreamWriter dest = new StreamWriter("destination.txt");
		copyStream(src, dest);
	}
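For comparison, this is the C++ behaviour that a scope-based ‘using’ would imitate: two resources in one scope, no nesting, released in reverse order of creation. The Stream class, the trace log and this copyStream() are hypothetical stand-ins written for this sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical log so the order of events can be inspected.
std::vector<std::string> trace;

// Hypothetical resource that records when it is opened and closed.
class Stream {
public:
	explicit Stream(const std::string& n) : name(n) {
		trace.push_back("open " + name);
	}
	~Stream() { trace.push_back("close " + name); }
	Stream(const Stream&) = delete;
	Stream& operator=(const Stream&) = delete;
private:
	std::string name;
};

// Stand-in for the real copy; it only records that it ran.
void copyStream(Stream&, Stream&) {
	trace.push_back("copy");
}

void copyFile() {
	Stream src("source.txt");
	Stream dest("destination.txt");
	copyStream(src, dest);
} // dest is closed first, then src
```

After one call to copyFile() the log reads: open source.txt, open destination.txt, copy, close destination.txt, close source.txt.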
Problem 3 is mainly caused by the redundant syntax of creating an object and assigning it to a local variable. Both Java and C# took the C++ syntax for heap-allocated object creation. This makes the code easy to understand for C++ programmers, but it does not make sense when you can only create reference types. So why not eliminate the redundant new and use the C++ auto-allocation syntax for initializing references? This would not only benefit the ‘using’ keyword, it would also shorten a lot of code:
	void copyFile() {
		using StreamReader src("source.txt");
		using StreamWriter dest("destination.txt");
		copyStream(src, dest);
	}
Now it is almost as convenient as C++ code. The last difference is the ‘using’ keyword. One possibility is to replace it with a keyword in the class declaration, so the class is always notified when its creation scope has been left. But this would be a bad idea, because it often makes sense to let a resource live longer than the function that created it. Another reason is that it would make the code harder to read: unlike in C++, there would be no visible difference between auto-allocated and heap-allocated objects. There is one problem with the syntax though: what happens when the developer, for whatever reason, modifies a variable that has been created with ‘using’?
	void copy2Files() {
		using StreamReader src("source1.txt");
		using StreamWriter dest("destination.txt");
		
		copyStream(src, dest);
		src = StreamReader("source2.txt");
		copyStream(src, dest);
	}
If a local variable is declared with ‘using’ and it is changed, the original object should be disposed. And when the function’s scope has been left, the new object should be disposed as well. As a final optimization, it would be possible to allow ‘using’ also as a modifier for an ‘anonymous’ constructor invocation:
	void main() {
		copyStream(using FileStream("source.txt"), using FileStream("destination.txt"));
	}
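The reassignment rule for copy2Files() described above (dispose the old object when the variable changes, dispose the new one when the scope is left) happens to be what std::unique_ptr already does in C++, so it can serve as a rough model. Reader, the trace log and this single-argument copyStream() are hypothetical names for this sketch only.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical log so the order of events can be inspected.
std::vector<std::string> trace;

// Hypothetical stand-in for StreamReader.
class Reader {
public:
	explicit Reader(const std::string& n) : name(n) {
		trace.push_back("open " + name);
	}
	~Reader() { trace.push_back("dispose " + name); }
private:
	std::string name;
};

// Stand-in for the real copy; it only records that it ran.
void copyStream(Reader&) {
	trace.push_back("copy");
}

void copy2Files() {
	std::unique_ptr<Reader> src(new Reader("source1.txt"));
	copyStream(*src);
	// The new reader is created first, then the old one is disposed.
	src.reset(new Reader("source2.txt"));
	copyStream(*src);
} // source2.txt is disposed here
```

One detail shows up in the log: source2.txt is opened before source1.txt is disposed, because the right-hand side is evaluated first. A language with a ‘using’ modifier would have to decide whether it wants the same order.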
I think a ‘using’ modifier would be a better solution for resource management than Java’s try/finally and C#’s ‘using’ statement. There’s another challenge left: it is often important to know whether a function will keep a pointer or not, in C++ even more than in Java and C#. Right now the best practice is to point out in the documentation whether a function will keep an object and for how long. What’s a good syntax to state this in the function prototype and ensure that the function will not keep a reference otherwise?


 

This blog is my dumping ground for thoughts and ideas about Eek. Someday Eek will be a programming language and system, somewhat comparable to Java in scope. It is my attempt to bring sanity to the world of computing.
At least I hope so. Right now it is far from being finished and I can't guarantee that it ever will be. I am still working on the specification, but I won't release anything before I got my first prototype running. The world does not need more vapourware and unusable beta-software. All publicly available information about Eek is contained in this blog. You can find the latest summary here.
This work is licensed under a Creative Commons License.