Tim Jansen's blog


2007/02/26
Final Post
It's not hard to guess after I haven't posted anything here for over a year, but I have stopped working on Eek. It's a bit hard for me to explain why, though. For one, I hardly worked on any code in my private time in 2006. Creating a usable Eek version would have required a huge time commitment that I wasn't prepared to make. Now I am back to working on some projects, but Eek has a very low priority for me.
Perhaps more importantly, I came to a point where I felt something was missing in Eek. My plan was to have a common data model for anything in Eek, EXML, and that's still one of the things I feel is missing in all other popular programming languages. But the language itself is missing a certain elegance and simplicity. And I didn't find a good concept to add it without sacrificing features that I consider important.
At this point, my favorite language from a conceptual point of view is JavaScript. I first used JavaScript in the mid-90s when it was brand new, and then hardly touched it till last year. I have always respected the simplicity of its prototype-based object model even when I was not using it, but I had no real reason to use it. This changed when I did some Ajax work last year, and I started to really like JavaScript. It just had the kind of consistent elegance that I wanted for Eek, but never found in the language itself. Thanks to libraries like Prototype and Mootools the often awkward APIs suddenly became usable. And the E4X extension adds the XML-processing capabilities that all other mainstream languages lack.
Now I have had a look at the specs for JavaScript / ECMAScript 4th edition, which adds a class model and optional static typing to JavaScript, and it could become the language of my dreams. Right now the only implementation is Adobe Flash's ActionScript 3, which I haven't tried yet, but I am really looking forward to seeing it in Mozilla.
The other language that I am currently playing with is Ruby. I looked at Ruby a long time ago, but never actually used it. My experiments are still at an early stage, but there are certain things that I like about it. It's conceptually not as simple as JavaScript, but the syntax is quite adorable and it's just nice to type. Possibly I will use it more often in the future. Unfortunately its lack of static type checks does not fit my programming style very well, and it seems to lack good XML support (whoever wrote that XML API it ships with has probably never really used XML in his/her life - the API has some strange features, like the definition of the Element#text attribute, which does not work with documents that contain comments).



2005/10/31
Property Constraints
I have written a lot about Eek's constraint system for methods, but I rarely even mentioned property constraints. Maybe that is because I was never really happy about the property constraint system. It looked really simple, but in reality it was full of quirks.
Until last week my plan was to allow "with" expressions for each property, just like method constraints. The "with" expressions are validated after setting the property. They would be executed in the object's context, thus with "this" pointing to the object. This approach looks simple and powerful, but unfortunately it is full of traps. One of the traps is making it possible for properties to depend on each other. Let's look at the following class:
class ThreeNumbers
 int a
  with a < b, a < c
 int b
  with a < b, b < c
 int c
  with a < c, b < c

 setAll(int newA, int newB, int newC)
  a = newA
  b = newB
  c = newC
end
Each property has two constraints which together specify that at any given time the values must satisfy a < b < c. The method setAll() is supposed to assign new values to the properties. So what's wrong with this? Let's look at the following code:
 ThreeNumbers o(1, 5, 10)
 o.setAll(6, 7, 10) // error!
This does not work: in the method setAll()'s first line the value "6" is assigned to "a", which will fail because it violates the constraint "a < b" ("6 < 5"). It turns out that making properties depend on each other is not that simple, because it becomes really hard to update them.
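The trap can be reproduced outside Eek. The following is a rough Python sketch, not Eek code: it assumes each setter re-checks the invariant immediately, collapses the two constraints per property into one combined a < b < c check, and rolls back a failed assignment. The names ConstraintError, _set and set_all are made up for the illustration.

```python
class ConstraintError(Exception):
    """Raised when an assignment would break a < b < c (hypothetical name)."""


class ThreeNumbers:
    """Python stand-in for the Eek class; every setter re-checks a < b < c."""

    def __init__(self, a, b, c):
        # Install the initial state in one step, then check it once.
        self._a, self._b, self._c = a, b, c
        self._check()

    def _check(self):
        if not (self._a < self._b < self._c):
            raise ConstraintError(
                f"need a < b < c, got {self._a}, {self._b}, {self._c}")

    def _set(self, slot, value):
        old = getattr(self, slot)
        setattr(self, slot, value)
        try:
            self._check()              # validated right after setting
        except ConstraintError:
            setattr(self, slot, old)   # roll back so the object stays valid
            raise

    a = property(lambda self: self._a, lambda self, v: self._set("_a", v))
    b = property(lambda self: self._b, lambda self, v: self._set("_b", v))
    c = property(lambda self: self._c, lambda self, v: self._set("_c", v))

    def set_all(self, new_a, new_b, new_c):
        # Three separate assignments, each checked on its own.
        self.a = new_a
        self.b = new_b
        self.c = new_c
```

With o = ThreeNumbers(1, 5, 10), the call o.set_all(6, 7, 10) raises on the very first assignment, even though the final state 6 < 7 < 10 would be valid.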
There are several solutions for the problem of interdependent properties. One approach is that the property constraints are only validated when the method finished. Unfortunately that would mean that the class's code needs to be capable of handling properties that violate their constraints for a limited amount of time, and then constraints wouldn't be very useful anymore.
A second solution, the one that I favoured in the last year, was to introduce a special syntax that allows setting several properties simultaneously and delays the checking of the constraints. This put a dent into Eek's complexity budget, but at least it made interdependent constraints usable.
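The Eek syntax for such a simultaneous assignment is not shown here, so the following Python sketch only illustrates the idea under assumptions: a hypothetical assign() method collects all new values, checks the combined result once, and commits only if the check passes.

```python
class ConstraintError(Exception):
    """Raised when the combined new state breaks a < b < c (hypothetical name)."""


class ThreeNumbers:
    def __init__(self, a, b, c):
        self._values = {}
        self.assign(a=a, b=b, c=c)

    def assign(self, **changes):
        """Hypothetical 'simultaneous set': only the combined result is checked."""
        unknown = set(changes) - {"a", "b", "c"}
        if unknown:
            raise AttributeError(f"unknown properties: {sorted(unknown)}")
        # Build the candidate state without touching the live object.
        candidate = {**self._values, **changes}
        if len(candidate) < 3:
            raise ConstraintError("a, b and c must all be set")
        if not (candidate["a"] < candidate["b"] < candidate["c"]):
            raise ConstraintError(f"need a < b < c, got {candidate}")
        self._values = candidate  # commit only after the check passed

    def as_tuple(self):
        return (self._values["a"], self._values["b"], self._values["c"])
```

Because the live object is never modified before the check, it cannot be observed in an invalid state, at the price of an extra assignment mechanism.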
Unfortunately it wasn't the only problem. The next one is that in order to allow constraints to access properties in a natural way, the value must be assigned to the property and then the constraint expression can be executed. This will, again, allow the property value to violate the constraint for a short time. Maybe there are three expressions, and only the last one fails. Then two expressions are executed on an instance that has an invalid state. And if a constraint fails, the old value needs to be restored.
Things get even worse with accessor methods: return values of getter methods cannot easily be constraint-checked if the constraints need to read the value themselves. An infinite recursion would be the result. The work-arounds made things so much more complex that I decided against constraint validation for reading properties, again reducing the usefulness of constraints.

So yesterday I rewrote property constraints in the specification and solved these problems by declaring the 'this' pointer to point to the constructor object (in Eek the constructor object is a singleton that contains the constructors and constants of a class). This drastically reduces the functionality of constraints, because properties cannot depend on each other anymore, but it also solves all the problems that I had until now. The constraint expression gets just a single argument, which has the name of the property, but the expression cannot access any members of the object. Here's an example (which is kind of pointless, because it's hard to show what you can NOT do):
int age 
 with age >= 0
The property constraint is executed before the new value is assigned to the property (or the setter method is executed), as well as after reading the value (or invoking the getter method). Thus it is always guaranteed that the property contains and returns a valid value. The check on reading is needed because it is possible that you set a valid value to a property, and by the time you read it the value has become invalid - this may happen when you use an external value in the constraint, such as a value that's provided by a singleton.
An alternative syntax that I originally considered was to call the new value in the constraint expression 'it', like the special variable that I use in anonymous closures. I am not sure whether that would be a good idea: it would make it more obvious that you cannot access the object's members, but it also feels more awkward. Anyway, I am quite happy with the new solution. Getting rid of 'this' makes property constraints so much simpler and less error-prone. The new solution also allows an important optimization compared to the old one, because now constraints can be used as a guarantee for a restricted value range, but that may be the topic of a future blog entry.
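For readers who want to play with the idea, here is a minimal Python sketch of the revised design, not Eek itself. It assumes a descriptor (hypothetically named Constrained) whose predicate receives only the value, is run before a write, and is run again after a read. The Person class and ConstraintError are likewise made up for the example.

```python
class ConstraintError(Exception):
    """Raised when a value fails its property constraint (hypothetical name)."""


class Constrained:
    """Descriptor whose predicate sees only the value, never the owning object."""

    def __init__(self, predicate):
        self.predicate = predicate

    def __set_name__(self, owner, name):
        self.name = name
        self.slot = "_" + name   # where the raw value is stored on the instance

    def _check(self, value):
        if not self.predicate(value):
            raise ConstraintError(f"{self.name}: invalid value {value!r}")
        return value

    def __set__(self, obj, value):
        # Validate BEFORE assigning, so the stored value is never invalid.
        setattr(obj, self.slot, self._check(value))

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        # Validate again AFTER reading, in case the predicate depends on
        # external state that changed since the write.
        return self._check(getattr(obj, self.slot))


class Person:
    age = Constrained(lambda age: age >= 0)

    def __init__(self, age):
        self.age = age
```

Since the predicate never touches the object, a rejected write needs no rollback: the old value was simply never replaced.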



Mix-ins
Several months ago I promised to write about Eek's mix-in implementation as soon as I had it in the specification. Well, I did finish the specification part, but was never in the right mood to blog about it. The reason for this is that Eek's mixins are actually quite boring. I just renamed interfaces to mixins and gave them the same syntax and capabilities that regular Eek classes have. Like Java interfaces they can still be used as reference types. This, and allowing mix-ins to have non-virtual properties, makes them quite different from many other mix-in (traits etc.) implementations, but I think it's the simplest and most powerful way to implement them in a statically typed language. I thought about many other options, like making 'virtual' the default for all mixin members (in classes the default is non-virtual) and so on, but eventually I came to the conclusion that consistency is more important than syntactic sugar.

So an Eek mixin could look like this (based on the Cedric Beust example, but with an extra method because Eek does not need accessors):
public mixin Namable
 String shortName
 String longName

 getBothNames(): String
  return "{shortName} ({longName})"
end
To import the mixin, the new "mix" keyword is used instead of "implements":
public class Employee mix Namable
end
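As a point of comparison only, the same shape can be approximated in Python, where a mixin is just a class that carries both state and behaviour. The snake_case names and the describe() helper are assumptions of this sketch, and Python checks none of this statically.

```python
class Namable:
    """Mixin with its own properties and a method, like the Eek example."""

    def __init__(self, short_name, long_name, **kwargs):
        super().__init__(**kwargs)   # cooperate with other bases, if any
        self.short_name = short_name
        self.long_name = long_name

    def get_both_names(self):
        return f"{self.short_name} ({self.long_name})"


class Employee(Namable):
    pass


def describe(thing: Namable) -> str:
    # The mixin doubles as a reference type: any Namable is accepted here.
    return thing.get_both_names()
```

Here describe(Employee("Tim", "Tim Jansen")) yields "Tim (Tim Jansen)".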


And, finally, I made one small but important change to the member modifiers: a member can be "private" and "virtual" at the same time. This has almost the same effect as "protected" in C++ and Java, with the difference that only those sub-classes that override the member can access it. I have never liked 'protected' in base classes, but for mixins it actually makes sense.



2005/09/18
C# 3.0 features pulled to pieces

Just found out on Slashdot that the details on C# 3.0 are out. This Word doc gives a nice overview of C# 3.0's features. I had already read about some of them and was a bit scared, because some sounded interesting. But now, after reading it, I realized that I had no reason to worry...
In order to understand the following comments you should have read the C# 3.0 overview.

26.1 Implicitly typed local variables The var keyword is a bit like Eek's any type. However, it is only intended to reduce the number of characters that you need to write. It does not offer late binding or any other of any's advantages. Instead it brings some new oddities that will confuse inexperienced programmers (like forbidding the assignment of null to a var), and increases the language's complexity budget. I would have opted against that feature.

26.2 Extension methods Extension methods are a feature that can be quite tempting: adding methods to a class that you did not write. It does not offer any semantic advantages, but can save you a few keystrokes. Sometimes I'd like to have that feature as well. So I thought a lot about it - and decided against it. There are two reasons against it: first of all, it's another odd syntax that increases the complexity budget and will confuse newbies, because it is a rarely needed feature. Second, it makes it harder to find a method's implementation. Without extension methods the implementation is easy to find: just look at the class. However, when you can add methods to a class at any place, the methods can be anywhere as well. Only when you look at all imported classes do you have a chance of finding the method's implementation. Or, admittedly, with a good IDE. But all this trouble only to make method invocations a bit more convenient? No thanks. (BTW I don't like AOP for similar reasons).

26.3 Lambda expressions Now this is a crazy feature. C# already has anonymous methods. Because the anonymous method syntax is quite verbose, they add a second mechanism? Especially one that's so complicated (because of the type inference and the somewhat odd syntax)? I really wonder why... Eek is much simpler, as described here. I have one closure feature that's similar to C#'s anonymous methods, and as a special case anonymous closures. The latter are even more compact than C#'s lambda expressions, but do not allow more than one input argument and one return value. I considered extending the syntax to more input arguments though; it would be possible.

26.4.1 Object initializers Once again a quite complicated and potentially confusing syntax. Eek has them as property initializers, but they belong to the constructor's arguments. To create an instance of the class

class Point
        int x, y
end
you'd only have to write
any a = Point(x: 0, y: 1)

26.4.2 Collection initializers That's a feature that does not need any language syntax support in Eek. A constructor with a variable number of arguments is sufficient. To initialize the list in the C# example, you would have to write

List[int] digits = Init(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
Init is an initializer class whose constructor takes an unlimited number of any values as arguments. For optimal performance the compiler should know the Init class and make some optimizations, of course, but there is no need to introduce a new syntax.

26.5 Anonymous types Anonymous types are another strange feature without much use in most cases, at least without duck-typing. I understand that they are required for the next feature, but they are still another blow for the complexity budget.

26.7 Query expressions The real surprise is the embedded query language. I also played with the idea of including XPath, but decided against it. No query language in the world is worth the complexity that it adds to the language. Obviously the C# designers did not agree - even though their query language does not even offer any new capabilities, but is just syntactic sugar for method invocations with delegate arguments.

Eek has the much simpler filter operator, which can do what C#'s where declaration does. With the following class declarations

class Customer
        String name, city
        List[Order] orders
end

class Order
        Customer customer
        Date date
end
you could select all customers from London with the method:
getAllLondonCustomers(List[Customer] l): List[Customer]
        return l.{it.city == "London"}
and get all orders from London in the year 2004 with the method
getAllLondonOrdersFrom2004(List[Customer] l): (List[Order] results())
        for Customer c in l.{it.city == "London"}
                results += c.orders.{it.date.year == 2004}
I don't think that this is worse than the C# example:
from c in customers
where c.City == "London"
from o in c.Orders
where o.OrderDate.Year == 2005
select new { c.Name, o.OrderID, o.Total }
The main difference is the result set. The C# example returns an anonymous type which can include elements from all involved classes, while Eek can return only orders. So in order to access the customer name you'd have to write order.customer.name.
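To make the comparison concrete without an Eek implementation, here is a rough Python rendering of the two Eek methods. List comprehensions stand in for the filter operator, and the dataclass definitions and snake_case function names are translations made up for this sketch.

```python
import datetime
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity comparison; avoids recursing through the cycle
class Customer:
    name: str
    city: str
    orders: list = field(default_factory=list)


@dataclass(eq=False)
class Order:
    customer: Customer
    date: datetime.date


def get_all_london_customers(customers):
    # Counterpart of: l.{it.city == "London"}
    return [c for c in customers if c.city == "London"]


def get_all_london_orders_from_2004(customers):
    results = []
    for c in get_all_london_customers(customers):
        # Counterpart of: c.orders.{it.date.year == 2004}
        results += [o for o in c.orders if o.date.year == 2004]
    return results
```

As in the Eek version, the result contains only orders, so the customer name is reached through order.customer.name.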




2005/08/19
Update
This is going to be another entry that I write just for the sake of writing something. There is not much to say right now. Progress has been rather slow in the last 2 months, and I blame it on the summer. However, I did start writing the interpreter for EDL in Java and am currently (slowly) working on a Java-native implementation of Eek's core library classes. Besides that, I am currently thinking about another major change in the Eek language, inspired by this Cedric Beust post. In short, the idea is to replace interfaces with mix-ins. There are some details that I would have to work out, but I am confident that I will make the changes to the spec in the next weeks and will then write about it here.



2005/07/11
Static Typing Where Possible, Dynamic Typing When Needed
Lambda the Ultimate points to a paper called "Static Typing Where Possible, Dynamic Typing When Needed" by Erik Meijer and Peter Drayton that I find somewhat scary. It describes many important parts of Eek's type system. But it also includes a few things I do not agree with, like the prototype inheritance and lazy evaluation sections.



2005/04/30
EDL
It's been three months since the last post, so I think I should write an update. The Eek language spec is mostly done, even though I still make some minor changes from time to time. Most of them remove restrictions from the language; for example, my last change was to allow generic types in delegate signatures. They are the result of the evolution of Eek's intermediate language, EDL.
My original idea of EDL was to have something like Java bytecode, just in EXML. Instead of having a list of binary-encoded commands, I would have a list of elements that describe the commands. The next step was to use the tree-like structure of EXML and make it more like an AST rather than a list of commands. Thus the arguments of a command are children of the command element, and the child arguments are commands themselves. I still wanted to have special commands for built-in types like integers, like Java does. Then I discovered that EXML is ideal for annotating the commands in the tree, which makes type inference relatively easy to do (among other optimizations). That shifted my goals somewhat, and I focussed on making EDL simple to parse and process. This also meant getting rid of all special cases, and the end result is a completely dynamically typed language that uses 13 different commands in a tree structure.
Making EDL dynamically typed may look odd, considering that I am a proponent of statically typed programming and Eek is probably even more statically typed than Java, but I believe that with a good compiler the resulting code will be no slower than the code of a typical compiler of statically typed code. The price is that EDL's design makes it more difficult to write a compiler with a performance that is comparable to a typical, 'stupid' compiler. But the reward is that EDL should make it easier to create a 'smart', faster compiler...
The translation from statically typed Eek code to EDL will look quite unusual.
For example, the following two methods will produce identical EDL code:
add1(int a, int b): int
        return a + b

add2(any a, any b):(any r)
        with a instanceof int, b instanceof int
        returns r instanceof int

        return a + b
The 'any' type is a late-binding reference type that I described here. In EDL all reference types are always equivalent to Eek's 'any'. The lines starting with 'with' and 'returns' are constraints that check the types of the input arguments and the return type. More on constraints in Eek here.
So why does add2() have the constraints? The constraints tell the type-inferring compiler that 'a' and 'b' are always integers. This allows the compiler to use the usual early-binding tricks. The compiler may also just remove the constraints, if all invocations of the method are guaranteed to use only integers. Or it may create two methods, one that keeps the constraints and one that doesn't.
Right now, EDL is only described on a couple of OneNote pages. Many details are still open and I need to write it down in a more formal description. When this is done, I am going to post some examples of real EDL code.
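The relationship between the two methods can be imitated in Python, with one caveat: Python never enforces the annotations on add1, so only add2 actually checks anything at run time. This is an illustrative sketch of the 'with' and 'returns' constraints as explicit checks, not EDL and not a description of what an Eek compiler emits.

```python
def add1(a: int, b: int) -> int:
    # Statically annotated version; Python itself does not enforce these types.
    return a + b


def add2(a, b):
    # 'with a instanceof int, b instanceof int' written as a runtime check
    if not (isinstance(a, int) and isinstance(b, int)):
        raise TypeError("a and b must be int")
    r = a + b
    # 'returns r instanceof int' written as a runtime check
    if not isinstance(r, int):
        raise TypeError("result must be int")
    return r
```

A compiler that can prove every caller passes integers could drop both checks, which leaves exactly the body of add1.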



 

This blog is my dumping ground for thoughts and ideas about Eek. Someday Eek will be a programming language and system, somewhat comparable to Java in scope. It is my attempt to bring sanity to the world of computing.
At least I hope so. Right now it is far from being finished and I can't guarantee that it ever will be. I am still working on the specification, but I won't release anything before I get my first prototype running. The world does not need more vapourware and unusable beta-software. All publicly available information about Eek is contained in this blog. You can find the latest summary here.
This work is licensed under a Creative Commons License.