<?xml version='1.0' encoding='UTF-8'?><rss xmlns:atom='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' version='2.0'><channel><atom:id>http://www.blogger.com/feeds/6927046/posts/full</atom:id><lastBuildDate>Thu, 21 Dec 2006 14:19:13 +0000</lastBuildDate><title>Tim Jansen's blog</title><description></description><link>http://www.tjansen.de/blogen/index.html</link><managingEditor>Tim Jansen</managingEditor><generator>Blogger</generator><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>15</openSearch:itemsPerPage><item><guid isPermaLink='false'>http://www.blogger.com/feeds/6927046/posts/full/117244722629525627</guid><pubDate>Sun, 25 Feb 2007 23:18:00 +0000</pubDate><atom:updated>2007-02-26T00:47:06.307+01:00</atom:updated><title>Final Post</title><description>&lt;div xmlns="http://www.w3.org/1999/xhtml"&gt;It's not hard to guess after I haven't posted anything here for over a year, but I have stopped working on Eek. It's a bit hard for me to explain why though. For one, I hardly worked on any code in my private time in 2006. Creating a usable Eek version would have required a huge time commitment that I wasn't prepared to make. Now I am back to working on some projects, but Eek has a very low priority for me.

Perhaps more importantly, I came to a point where I felt something was missing in Eek. My plan was to have a common data model for anything in Eek, EXML, and that's still one of the things I feel are missing in all other popular programming languages. But the language itself is missing a certain elegance and simplicity. And I didn't find a good concept to add it without sacrificing features that I consider important.

At this point, my favorite language from a conceptual point of view is JavaScript. I first used JavaScript in the mid-90s when it was brand new, and then hardly touched it till last year. I have always respected the simplicity of its prototype-based object model even when I was not using it, but I had no real reason to use it. This changed when I did some Ajax work last year, and I started to really like JavaScript. It just had the kind of consistent elegance that I wanted for Eek, but never found in the language itself. Thanks to libraries like Prototype and Mootools the often awkward APIs suddenly became usable. And the E4X extension adds the XML-processing capabilities that all other mainstream languages lack. Now I have had a look at the specs for JavaScript / ECMAScript 4th edition, which adds a class model and optional static typing to JavaScript, and it could become the language of my dreams. Right now the only implementation is Adobe Flash's ActionScript 3 which I haven't tried yet, but I am really looking forward to seeing it in Mozilla.

The other language that I am currently playing with is Ruby. I looked at Ruby a long time ago, but never actually used it. Now my experiments are still in an early stage, but there are certain things that I like about it. It's conceptually not as simple as JavaScript, but the syntax is quite adorable and it's just nice to type. Possibly I will use it more often in the future. Unfortunately its lack of static type checks does not fit my programming style very well, and it seems to lack good XML support (whoever wrote that XML API it ships with has probably never really used XML in his/her life - the API has some strange features, like the definition of the Element#text attribute which does not work with documents that contain comments).&lt;/div&gt;</description><link>http://www.tjansen.de/blogen/2007/02/final-post.html</link><author>Tim Jansen</author></item><item><guid isPermaLink='false'>http://www.blogger.com/feeds/6927046/posts/full/113078728745763847</guid><pubDate>Mon, 31 Oct 2005 19:19:00 +0000</pubDate><atom:updated>2005-10-31T20:34:47.533+01:00</atom:updated><title>Property Constraints</title><description>&lt;div xmlns="http://www.w3.org/1999/xhtml"&gt;I have written a lot about Eek's &lt;a href="http://www.tjansen.de/blogen/2004/02/argument-constraints-in-function.html"&gt;constraint system for methods&lt;/a&gt;, but I rarely even mentioned property constraints. Maybe that is because I was never really happy about the property constraint system. It looked really simple, but in reality it was full of quirks. &lt;br /&gt; 
Until last week my plan was to allow "with" expressions for each property, just like method constraints. The "with" expressions are validated after setting the property. They would be executed in the object's context, thus with "this" pointing to the object. This approach looks simple and powerful, but unfortunately it is full of traps. One of the traps is making it possible for properties to depend on each other. Let's look at the following class:
&lt;pre&gt;
class ThreeNumbers
 int a
  with a &amp;lt; b, a &amp;lt; c
 int b
  with a &amp;lt; b, b &amp;lt; c
 int c
  with a &amp;lt; c, b &amp;lt; c

 setAll(int newA, int newB, int newC)
  a = newA
  b = newB
  c = newC
end
&lt;/pre&gt;
Each property has two constraints which specify that at any given time the values must be &lt;i&gt;a &amp;lt; b &amp;lt; c&lt;/i&gt;. The method setAll() is supposed to assign new values to the properties. So what's wrong with this? Let's look at the following code:
&lt;pre&gt;
 ThreeNumbers o(1, 5, 10)
 o.setAll(6, 7, 10) // error!
&lt;/pre&gt;
This does not work: in the method setAll()'s first line the value "6" is assigned to "a", which will fail because it violates the constraint "a &amp;lt; b" ("6 &amp;lt; 5"). It turns out that making properties depend on each other is not that simple, because it becomes really hard to update them. &lt;br /&gt;
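The failure can be reproduced in a few lines of Python. This is only a rough analogue, not Eek: the class and method names mirror the example above, the three "with" clauses are folded into one hypothetical _check() helper, and restoring the old value on failure is an assumption about how a runtime would have to behave.

```python
class ThreeNumbers:
    """Python analogue of the Eek class: constraints are validated after each assignment."""

    def __init__(self, a, b, c):
        # Store the initial values directly, then validate once.
        self._a, self._b, self._c = a, b, c
        self._check()

    def _check(self):
        # Written with '>' only; this is the ordering "a below b below c".
        if not (self._b > self._a and self._c > self._b):
            raise ValueError(
                f"constraint violated: a={self._a}, b={self._b}, c={self._c}"
            )

    def _set(self, name, value):
        old = getattr(self, name)
        setattr(self, name, value)      # assign first ...
        try:
            self._check()               # ... then validate
        except ValueError:
            setattr(self, name, old)    # assumption: roll back on failure
            raise

    def set_all(self, new_a, new_b, new_c):
        # Each line is validated on its own, so the first one can already fail.
        self._set("_a", new_a)
        self._set("_b", new_b)
        self._set("_c", new_c)
```

Calling ThreeNumbers(1, 5, 10).set_all(6, 7, 10) raises on the very first assignment, even though the final state would have been valid.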
There are several solutions for the problem of interdependent properties. One approach is to validate the property constraints only when the method has finished. Unfortunately that would mean that the class's code needs to be capable of handling properties that violate their constraints for a limited amount of time, and then constraints wouldn't be very useful anymore.&lt;br /&gt;
A second solution, the one that I favoured in the last year, was to introduce a special syntax that allows setting several properties simultaneously and delays the checking of the constraints. This put a dent into Eek's complexity budget, but at least made them usable. &lt;br /&gt;
Unfortunately it wasn't the only problem. The next one is that in order to allow constraints to access properties in a natural way, the value must be assigned to the property and then the constraint expression can be executed. This will, again, allow the property value to violate the constraint for a short time. Maybe there are three expressions, and only the last one fails. Then two expressions are executed on an instance that has an invalid state. And if a constraint fails, the old value needs to be restored. &lt;br /&gt;
Things get even worse with accessor methods: return values of getter methods can not easily be constraint-checked if the constraints need to read the value themselves. An infinite recursion would be the result. The work-arounds made things so much more complex that I decided against constraint validation for reading properties, again reducing the usefulness of constraints.&lt;br /&gt;&lt;br /&gt;

So yesterday I rewrote property constraints in the specification and solved these problems by declaring the 'this' pointer to point to the constructor object (in Eek the constructor object is a singleton that contains the constructors and constants of a class). This will drastically reduce the functionality of constraints, because properties can not depend on each other anymore, but it also solves all the problems that I had until now. The constraint expression gets just a single argument, which has the name of the property, but the expression can not access any members of the object. Here's an example (which is kind of pointless, because it's hard to show what you can NOT do):
&lt;pre&gt;
int age 
 with age &gt;= 0
&lt;/pre&gt;
The property constraint is executed before the new value is assigned to the property (or the setter method is executed), as well as after reading the value (or invoking the getter method). Thus it is always guaranteed that the property contains and returns a valid value (it is possible that you set a valid value to a property, and by the time you read it the property became invalid - this may happen when you use an external value in the constraint, such as a value that's provided by a singleton).&lt;br /&gt;
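A minimal Python sketch of this model, assuming a hypothetical descriptor class named Constrained (nothing here is Eek's actual implementation): the constraint is a function of the candidate value only, it runs before the value is stored, and it runs again after the value is read.

```python
class Constrained:
    """Descriptor whose constraint sees only the value, never the owning object."""

    def __init__(self, constraint):
        self.constraint = constraint

    def __set_name__(self, owner, name):
        self.name = name
        self.slot = "_" + name

    def __set__(self, obj, value):
        # Checked BEFORE assignment, so an invalid value is never stored.
        if not self.constraint(value):
            raise ValueError(f"{self.name}: constraint rejected {value!r}")
        setattr(obj, self.slot, value)

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        value = getattr(obj, self.slot)
        # Checked again AFTER reading, in case the constraint depends on
        # external state that changed in the meantime.
        if not self.constraint(value):
            raise ValueError(f"{self.name}: stored value {value!r} is no longer valid")
        return value


class Person:
    # Rough equivalent of "int age with age >= 0".
    age = Constrained(lambda age: age >= 0)

    def __init__(self, age):
        self.age = age
```

Because the old value is never overwritten by an invalid one, there is nothing to roll back, and the constraint can not recurse into the getter.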
An alternative syntax that I originally considered was to call the new value in the constraint expression 'it', like the special variable that I use in anonymous closures. I am not sure whether that may be a good idea: it would make it more obvious that you can not access the object's members, but it also feels more awkward.

Anyway, I am quite happy with the new solution. Getting rid of 'this' makes property constraints so much simpler and less error prone. The new solution also allows an important optimization compared to the old one, because now constraints can be used as a guarantee for a restricted value range, but that may be the topic of a future blog entry.&lt;/div&gt;</description><link>http://www.tjansen.de/blogen/2005/10/property-constraints.html</link><author>Tim Jansen</author></item><item><guid isPermaLink='false'>http://www.blogger.com/feeds/6927046/posts/full/113078627462662911</guid><pubDate>Mon, 31 Oct 2005 19:08:00 +0000</pubDate><atom:updated>2005-10-31T20:17:54.626+01:00</atom:updated><title>Mix-ins</title><description>&lt;div xmlns="http://www.w3.org/1999/xhtml"&gt;Several months ago I promised to write about Eek's mix-in implementation as soon as I had it in the specification. Well, I did finish the specification part, but was never in the right mood to blog about it. The reason for this is that Eek's mixins are actually quite boring. I just renamed interfaces to mixins and gave them the same syntax and capabilities that regular Eek classes have. Like Java interfaces they can still be used as reference types. This, and allowing mix-ins to have non-virtual properties, makes them quite different from many other mix-in (traits etc) implementations, but I think it's the simplest and most powerful way to implement them in a statically typed language. I thought about many other options, like making 'virtual' the default for all mixin members (in classes the default is non-virtual) and so on, but eventually I came to the conclusion that consistency is more important than syntactic sugar. &lt;br /&gt;&lt;br /&gt;
So an Eek mixin could look like this (based on the &lt;a href="http://beust.com/weblog/archives/000312.html"&gt;Cedric Beust example&lt;/a&gt;, but with an extra method because Eek does not need accessors):&lt;pre&gt;
public mixin Namable
 String shortName
 String longName

 getBothNames(): String
  return "{shortName} ({longName})"
end
&lt;/pre&gt;
To import the mixin, the new "mix" keyword is used instead of "implements":&lt;pre&gt;
public class Employee mix Namable
end
&lt;/pre&gt;
&lt;br /&gt;&lt;br /&gt;
And, finally, I made one small but important change to the member modifiers: a member can be "private" and "virtual" at the same time. This has almost the same effect as "protected" in C++ and Java, with the difference that only those sub-classes that override the member can access it. I have never liked 'protected' in base classes, but for mixins it actually makes sense.&lt;/div&gt;</description><link>http://www.tjansen.de/blogen/2005/10/mix-ins.html</link><author>Tim Jansen</author></item><item><guid isPermaLink='false'>http://www.blogger.com/feeds/6927046/posts/full/108419704627984526</guid><pubDate>Tue, 23 Dec 2003 01:43:00 +0000</pubDate><atom:updated>2005-10-31T20:14:39.906+01:00</atom:updated><title>Property Syntax Revised</title><description>&lt;div xmlns="http://www.w3.org/1999/xhtml"&gt;Since I wrote the last &lt;a href="http://www.kdedevelopers.org/node/view/276"&gt;entry about properties&lt;/a&gt;, the comments and &lt;a href="http://groovy.codehaus.org/"&gt;Groovy&lt;/a&gt; changed my mind about the property syntax:
&lt;ul&gt;
&lt;li&gt;I think the accessor method syntax that panzi proposed is much better than my Java-like syntax or the C# syntax.
&lt;li&gt;If the language uses the &amp;#8216;virtual&amp;#8217; keyword for virtual methods, virtual properties (properties without associated member field) can not use &amp;#8216;virtual&amp;#8217; as a keyword. Otherwise it would not be possible to override the accessors in a sub-class. But the keyword is not needed anyway, because the new accessor syntax can unambiguously define a property. You just need to write one or both accessor methods. For the API documentation only one accessor method must be documented, and it should be documented like a field (and not like a function).
&lt;li&gt;Groovy has the simple and effective idea that all public field members are
properties. This removes the need for the &amp;#8216;property&amp;#8217; keyword and also the difference between properties and fields. Just add a regular member, and it is accessed using auto-generated accessor methods, which you can override.
&lt;li&gt;There&amp;#8217;s one drawback when properties are accessed like field-members: you can&amp;#8217;t control anymore whether you access the field directly, or using the accessor methods. This can only be avoided with a syntax extension, and I think the least painful is the following: a method can access the raw field member of the object class without the accessor methods by prefixing the name with a &amp;#8216;@&amp;#8217;. It is not allowed to use this kind of access for other instances or classes (thus only &amp;#8216;@field&amp;#8217; is allowed, but never &amp;#8216;ref.@field&amp;#8217;). &lt;br&gt;
In order to prevent errors, the accessors must not call themselves and 
thus the attempt to read the field without &amp;#8216;@&amp;#8217; would cause a compilation error.
&lt;/ul&gt;

Here is the example class with these changes:
&lt;pre&gt;
class SubString  {
        private int mEndIndex;

        /// Documentation!      
        public int beginIndex;

        /// Documentation!      
        public int length.get() const {
                return mEndIndex - @beginIndex;
        }

        public void length.set(int value) {
                mEndIndex = value + @beginIndex;
        }
};
&lt;/pre&gt;
It&amp;#8217;s short. I am not very happy about the &amp;#8216;@&amp;#8217; thing though.&lt;/div&gt;</description><link>http://www.tjansen.de/blogen/2003/12/property-syntax-revised.html</link><author>Tim Jansen</author></item><item><guid isPermaLink='false'>http://www.blogger.com/feeds/6927046/posts/full/108419712072324819</guid><pubDate>Sat, 27 Dec 2003 08:19:00 +0000</pubDate><atom:updated>2005-10-31T20:14:30.566+01:00</atom:updated><title>10 Things I Hate About XML</title><description>&lt;div xmlns="http://www.w3.org/1999/xhtml"&gt;&lt;ol&gt;
&lt;li&gt;DTDs and everything in the &amp;#60;!DOCTYPE&gt; tag are horrible. The syntax is cryptic, the allowed types are odd and the degree of complexity is very high (parameter entity references!). RelaxNG and even XML Schema are much better solutions, and the XML specification could be reduced by at least 75%.
&lt;li&gt;Entity references are not needed in a Unicode world (exceptions: the predefined entities and character references).
&lt;li&gt;Processing instructions are an odd and unstructured mechanism for meta-data about the XML and should not be needed anymore, because namespace&amp;#8217;d elements and attributes could achieve the same.
&lt;li&gt;CData sections may be somewhat useful when writing code by hand, but that does not compensate for the complexity that they add to document trees - without them there would be only one type of text.
&lt;li&gt;Different char sets. There&amp;#8217;s no real need to allow different charsets in XML, it just hurts interoperability. It should be at least restricted to the three UTF encodings, maybe even only one of them. Allowing charsets like &amp;#8216;latin1&amp;#8217; is useless if processors are not required to support them.
&lt;li&gt;The lack of rules for whitespace handling. Actually there would be a very simple and sane rule for whitespace handling (always return whitespace unless an element contains only elements and does not have xml:space="preserve" set), but the specs require the XML processor to return even the useless whitespace.
&lt;li&gt;The XML specification should set up rules that specify how to handle namespace&amp;#8217;d elements and attributes that are not supported by the application. Right now the schema defines how to handle them and the application will not get any support by the XML processor. Ideally the application should tell the XML parser which namespaces it supports, and the XML specification should define what the XML parser does with the rest.
&lt;li&gt;xml:lang is pretty useless without more rules for the XML processor. It would make sense if the XML parser could somehow  only deliver text in the desired language to the application, but without any useful function it just bloats the specification.
&lt;li&gt;XML Namespaces are probably the greatest invention in XML history, but they should be in the core specification. Otherwise the APIs are split into namespace-aware functions and those that ignore them. The main problem is that the &amp;#8216;:&amp;#8217; character has no special meaning in the core specification, so you can have well-formed XML with undefined prefixes, several colons in a single name and so on&amp;#8230;
&lt;li&gt;XML Schema should be deprecated in favour of &lt;a href="http://www.relaxng.org/"&gt;RelaxNG&lt;/a&gt;. I haven&amp;#8217;t seen a single person who would claim that XML Schema is better. People just use it because of the W3C label.
&lt;/ol&gt;&lt;/div&gt;</description><link>http://www.tjansen.de/blogen/2003/12/10-things-i-hate-about-xml.html</link><author>Tim Jansen</author></item><item><guid isPermaLink='false'>http://www.blogger.com/feeds/6927046/posts/full/108714908730220746</guid><pubDate>Sun, 13 Jun 2004 16:03:00 +0000</pubDate><atom:updated>2005-10-31T20:14:20.646+01:00</atom:updated><title>Generics with Instance Parameters for XML type safety</title><description>&lt;div xmlns="http://www.w3.org/1999/xhtml"&gt;Until this weekend I was going to allow only classes as parameters for &lt;a href="http://www.tjansen.de/blogen/2004/05/more-generics.html"&gt;generic classes&lt;/a&gt;. But, thinking more about the XML support, I am now leaning towards allowing values as well. The reason is that I want methods to be able to take a specific element as argument. For example, a method that takes only XSL's for-each element as argument should be declared as&lt;pre&gt;
class SomeXmlHandler
    namespace xslt "http://www.w3.org/1999/XSL/Transform"
    static handleForEach(Element&amp;lt;xslt:for-each&gt; forEachElement)
        // do something 
end
&lt;/pre&gt;
and possibly with an optimization for elements using only&lt;pre&gt;
class SomeXmlHandler
    namespace xslt "http://www.w3.org/1999/XSL/Transform"
    static handleForEach(&amp;lt;xslt:for-each&gt; forEachElement)
        // do something
end
&lt;/pre&gt;
to allow at least basic type-safe processing. This has some consequences: the syntax for generic class declaration needs to support this. I currently favour prefixing the class parameters with a 'class' keyword. My &lt;a href="http://www.tjansen.de/blogen/2004/05/more-generics.html"&gt;original example&lt;/a&gt; will then look like&lt;pre&gt;
class MyMap&amp;lt;class Object K, class Object V?&gt;
end
&lt;/pre&gt;
Instance parameters look like the old syntax, without 'class' keyword, and can be accessed like constant static properties. Methods can use the instance parameters, constant static properties and all literals for type parametrization in their signatures.

The bigger problem is that it is difficult to retrieve typed elements. In a traditional API this would not be possible without casting. E.g. code like &lt;pre&gt;
test(Element someElement)
        NodeList forEachList = someElement.getChildElements("for-each")
        handleForEach(forEachList[0])
&lt;/pre&gt;
can not compile, because the compiler does not know that forEachList contains only "for-each" elements. The NodeList needs to be typed as well. But even worse is that then the return value of getChildElements() needs to be based on the method's argument, which may not be known at compile time. 

There is a solution though, and it depends on the fact that most child access is done by operators with a literal QName as argument. Thus the compiler can know the QName at compile time, and it would be possible to use this information for type safety. It makes the operators a little bit special though. I usually hate magic, but this may be a good cause. The '..' operator of Element would have the signature&lt;pre&gt;
NodeList&amp;lt;name&gt; operator..(QName name)
&lt;/pre&gt;
Note that the class parameter is the parameter of the method. This magic should be limited to operators that take a QName as argument.

With instance parameters, the Element and NodeList signatures are
&lt;pre&gt;
class Element&amp;lt;QName element? = null&gt; extends Node
end
&lt;/pre&gt;
and
&lt;pre&gt;
class NodeList&amp;lt;QName element? = null&gt;
end
&lt;/pre&gt;
If parameters are null they allow any type.

Then the example could be written as&lt;pre&gt;
test(Element someElement)
        NodeList&amp;lt;xsl:for-each&gt; forEachList = someElement..xsl:for-each
        handleForEach(forEachList[0])
&lt;/pre&gt;&lt;/div&gt;</description><link>http://www.tjansen.de/blogen/2004/06/generics-with-instance-parameters-for.html</link><author>Tim Jansen</author></item><item><guid isPermaLink='false'>http://www.blogger.com/feeds/6927046/posts/full/108591528844119805</guid><pubDate>Sun, 30 May 2004 10:42:00 +0000</pubDate><atom:updated>2005-10-31T20:14:08.296+01:00</atom:updated><title>Nullable Types in C#</title><description>&lt;div xmlns="http://www.w3.org/1999/xhtml"&gt;I just read about &lt;a href="http://wesnerm.blogs.com/net_undocumented/2004/05/nullable_types.html"&gt;nullable types in C# 2.0&lt;/a&gt; and was quite surprised how similar it is to Eek's syntax. In C# you write &lt;pre&gt;int? a = 1;&lt;/pre&gt; and in Eek you write &lt;pre&gt;int a? = 1&lt;/pre&gt; to define a nullable variable 'a' with a default value of 1. Eek's implementation will be different though. C# differentiates between value types (like int) and reference types, and in C# 1.0 only reference types could be null. That's why they added the extra feature for value types (which is actually just &lt;a href="http://blogs.msdn.com/ericgu/archive/2004/05/27/143221.aspx"&gt;a short notation for a wrapper class&lt;/a&gt;). In Eek everything is an object, there are no value types, all variables are references, and references are not nullable unless the question mark '?' modifier is used. 

An interesting feature in C#'s syntax is the '??' operator. The statement &lt;pre&gt;int x = a ?? 1;&lt;/pre&gt; takes 'a' if 'a' is not null, and 1 otherwise. This is a nice shortcut, and I wonder whether I should provide something like this. Right now Eek's specs contain only two ways of eliminating nulls. Either the 'if/then/else' operator &lt;pre&gt;int x = if a then a else 1&lt;/pre&gt; that allows default values, or the 'any' conversion &lt;pre&gt;int x = any(a)&lt;/pre&gt;, which will fail at runtime if 'a' is null.
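For comparison, the two ways of eliminating nulls can be sketched in Python, with None standing in for null. The function names coalesce and any_value are hypothetical stand-ins for C#'s '??' and Eek's 'any'; they are not part of either language.

```python
def coalesce(a, default):
    """Like 'a ?? default': take a unless it is null, otherwise the default."""
    return a if a is not None else default


def any_value(a):
    """Like Eek's 'any(a)': pass a through, but fail at runtime if it is null."""
    if a is None:
        raise ValueError("value is null")
    return a
```

Note that coalesce(0, 1) returns 0, whereas Python's truthiness-based 'or' would give 1 for '0 or 1'. A null test and a boolean test are not the same thing, which matters for any attempt to reuse a boolean operator for this purpose.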

I am a little bit concerned about the number of Eek's operators. I want to keep the number as low as possible, to avoid Perl's line noise effects that occur when people are exposed to operators that they have never seen before. Right now Eek has all operators that Java has except the post- and pre- increment and decrement operators ('++' and '--'). The ternary operator 'x ? y : z' will be replaced by 'if x then y else z', because '?' and ':' are used for too many other things and this syntax allows several 'if's with a single 'else'. Additionally Eek has '..' and '.@' for accessing node descendants and attributes in XML trees, and the filter operator '.()'. These three are taken from &lt;a href="http://www.ecma-international.org/news/ECMA%20E4X%20Final%20Final%20Web.htm"&gt;E4X&lt;/a&gt;. And Eek has several extra literals, especially '{someblock}' for closures and 'prefix:name' for XML QNames could have a line-noise effect. So I would really hate to add another operator. On the other hand, I expect null elimination to be needed frequently.
&lt;b&gt;9 hours later...&lt;/b&gt;
I think I found a solution: I use the '||' operator. This isn't too far from the original behaviour of '||', because every nullable reference in Eek is implicitly convertible to bool anyway. If the right-hand operand is a non-nullable reference other than bool, the left-hand operand will be returned if it is not null, and otherwise the right-hand operand. Without the modification it would be forbidden to use any non-nullable right-hand operand except bool (and with bool it does not make any difference). Both operands must be compatible with the expected value for the expression, if there is one.
With the modified '||' operator the example can be written as &lt;pre&gt;int x = a || 1&lt;/pre&gt;.
&lt;b&gt;Two days later...&lt;/b&gt;
Ok, bad idea. The precedence of '||' is too low to be usable without parentheses.&lt;/div&gt;</description><link>http://www.tjansen.de/blogen/2004/05/nullable-types-in-c.html</link><author>Tim Jansen</author></item><item><guid isPermaLink='false'>http://www.blogger.com/feeds/6927046/posts/full/108893518535764243</guid><pubDate>Sun, 04 Jul 2004 09:37:00 +0000</pubDate><atom:updated>2005-10-31T20:13:54.633+01:00</atom:updated><title>Eek's Property Syntax</title><description>&lt;div xmlns="http://www.w3.org/1999/xhtml"&gt;This &lt;a href="http://beust.com/weblog/archives/000083.html"&gt;old entry&lt;/a&gt; in Cedric Beust's (recommended) blog describes the differences of the property syntax in Java, C# and Ruby. So let's do it for Eek. There is one thing to say though, Eek does not make any difference between properties and fields. Everything is the same, and called properties.

&lt;b&gt;Read-Write attributes&lt;/b&gt;
Eek knows two variants here. Either you can implement two accessor methods, or you write it like a Java field. In the first case the object does not have any memory associated with it, so you may want to have a private property to store the value. In the second solution, the object will have memory for the property, and accessors for reading and writing are implicitly created. You can still override them later without losing binary compatibility, but you don't have to write them as long as you don't need them. So here's the accessor method version:&lt;pre&gt;
public:
        String firstName.get()
                // some code

        firstName.set(String value)
                // some code
&lt;/pre&gt;
And this is the field syntax version:&lt;pre&gt;
public:
        String firstName
&lt;/pre&gt;
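Python properties offer a rough analogue of this "start with a field, add accessors later" idea. In the sketch below, PersonV1 and PersonV2 are hypothetical names for two versions of the same class; the strip() call in the setter is only there to show that custom code now runs on assignment.

```python
class PersonV1:
    """First version: a plain field, no accessor code written."""

    def __init__(self, first_name):
        self.first_name = first_name


class PersonV2:
    """Later version: same public name, now backed by accessor methods."""

    def __init__(self, first_name):
        self._first_name = first_name

    @property
    def first_name(self):
        return self._first_name

    @first_name.setter
    def first_name(self, value):
        # Custom logic added later; callers still write 'p.first_name = ...'.
        self._first_name = value.strip()
```

Code that reads or assigns first_name works unchanged against either version, which is the property that Eek aims to preserve at the binary level.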

&lt;b&gt;Read-Only Attributes&lt;/b&gt;
Read-only properties can only exist as accessor methods, so there is no short notation. A true read-only field-like property would not make any sense in Eek. And I don't know a good syntax to define a field-like property with different access permissions for reading and writing. So read-only properties can either be created by defining a 'get' method without a 'set' method, or by defining the accessor methods with different access permissions:&lt;pre&gt;
public:
        String firstName.get()
                // some code
private:
        firstName.set(String value)
                // some code
&lt;/pre&gt;&lt;/div&gt;</description><link>http://www.tjansen.de/blogen/2004/07/eeks-property-syntax.html</link><author>Tim Jansen</author></item><item><guid isPermaLink='false'>http://www.blogger.com/feeds/6927046/posts/full/109035295340781837</guid><pubDate>Tue, 20 Jul 2004 19:27:00 +0000</pubDate><atom:updated>2005-10-31T20:13:38.950+01:00</atom:updated><title>Eek Status Update</title><description>&lt;div xmlns="http://www.w3.org/1999/xhtml"&gt;&lt;a href="http://www.tjansen.de/blogen/2004/05/bootstrapping-compiler-and-runtime.html"&gt;Two months ago&lt;/a&gt; I predicted that I would need two months to finish the specification and, surprise, I lied :). I am almost through the regular features, roughly comparable to Java's feature set. But the specification is still missing &lt;ul&gt;&lt;li&gt;Generics/class parameters. I have a good idea what they will look like, as shown &lt;a href="http://www.tjansen.de/blogen/2004/06/generics-with-instance-parameters-for.html"&gt;here&lt;/a&gt;, but writing it down will take some time&lt;/li&gt;&lt;li&gt;Exceptions. No problem, but needs to be done.&lt;/li&gt;&lt;li&gt;Constraints. Since my &lt;a href="http://www.tjansen.de/blogen/2004/02/argument-constraints-in-function.html"&gt;entry in February&lt;/a&gt; nothing has changed, I will probably specify them exactly as described in the blog&lt;/li&gt;&lt;li&gt;Enum. Not a real problem.&lt;/li&gt;&lt;li&gt;Annotations (alias attributes in C#). Will be very simple, just a list of Elements in front of classes and class members.&lt;/li&gt;&lt;li&gt;Co-Routines/Generators. Should be easy.&lt;/li&gt;&lt;li&gt;Delegates and Events. Now they are a real problem, actually the problem number 1. I have many different ideas, but am not happy with any of them. They will probably not look like what I described in &lt;a href="http://www.tjansen.de/blogen/2004/01/combining-advantages-of-qt-signalslots.html"&gt;January&lt;/a&gt;&lt;/li&gt;&lt;li&gt;Closures. 
The basic idea is simple: code in curly braces ('{}') returns a delegate. But I don't know how to name arguments; this mostly depends on the missing delegate syntax.&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;</description><link>http://www.tjansen.de/blogen/2004/07/eek-status-update.html</link><author>Tim Jansen</author></item><item><guid isPermaLink='false'>http://www.blogger.com/feeds/6927046/posts/full/109127881953354664</guid><pubDate>Tue, 03 Aug 2004 22:31:00 +0000</pubDate><atom:updated>2005-10-31T20:13:27.603+01:00</atom:updated><title>EXML (Eek's XML)</title><description>&lt;div xmlns="http://www.w3.org/1999/xhtml"&gt;I am talking a lot about XML support in Eek, but I don't actually &lt;b&gt;mean&lt;/b&gt; XML as &lt;a href="http://www.w3.org/XML/"&gt;defined by the W3C&lt;/a&gt;. I am rather talking about a simplified data model that is (mostly) compatible with XML, based on the &lt;a href="http://www.tjansen.de/blogen/2003/12/10-things-i-hate-about-xml.html"&gt;10 Things I Hate About XML&lt;/a&gt; entry from last December. So in order to prevent unnecessary confusion, I decided to call it EXML (Eek's XML). 

&lt;b&gt;EXML Data Model&lt;/b&gt;
EXML is mainly a data model. Its purpose is similar to &lt;a href="http://www.w3.org/TR/xml-infoset/"&gt;XML Infosets&lt;/a&gt; or the &lt;a href="http://www.w3.org/TR/xpath-datamodel/"&gt;XPath 2.0 Data Model&lt;/a&gt;. The data model uses a class hierarchy to describe the entities.
Everything in EXML is a Node. But EXML has only two types of nodes: Elements and Values. 
Elements are like Elements in XML, they have a name and can contain a sequence of other nodes as children. Additionally an Element may have an unlimited number of unordered attributes, which are named values. Attributes are not Nodes themselves, only their Values are. 
The names of elements and attributes are QNames, thus they have a local name and an optional URI.
Value is an abstract class, the Value is typed and uses sub-classes like String, Int, Float, Date, Blob, QName and so on. In the Eek API the Value subclasses are Eek's core types, so the int object that you get when you write "int i = 0" inherits from Value and can be put directly in an EXML tree.
Only Elements know their parent. It is not possible to find out the parent of a Value. EXML does not know a document type. A document is a tree of Nodes, which must have exactly one root element. The root is simply an Element that has no parent. 
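The data model described above is small enough to sketch in Python. This is an illustration under assumptions, not Eek's API: the class layout and the append() helper are hypothetical, and ordinary Python values (str, int and so on) stand in for the Value subclasses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QName:
    """A local name plus an optional namespace URI."""
    local: str
    uri: Optional[str] = None


class Element:
    """A named node with unordered attributes and an ordered list of children."""

    def __init__(self, name, attributes=None):
        self.name = name                          # a QName
        self.attributes = dict(attributes or {})  # QName -> value, unordered
        self.children = []                        # Elements and plain values
        self.parent = None                        # None marks the root

    def append(self, node):
        # Only Elements record their parent; values stay parent-less.
        if isinstance(node, Element):
            node.parent = self
        self.children.append(node)
        return node
```

A tree is then built by appending Elements and plain values to a root Element; there is no separate document object.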

&lt;b&gt;EXML Serialization Format&lt;/b&gt;
Serializing EXML is easy: just return it as XML, UTF-8 encoded and without prolog (yes, that's legal XML). Values are printed in their canonical representation, as defined by &lt;a href="http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/"&gt;XML Schema Part 2&lt;/a&gt;. 
Reading it is a little bit more difficult. There are two problems. The first is whitespace. In XML documents whitespace is frequently used to increase the readability, especially for indenting. However, most of that whitespace is not significant at all, and it makes good XML applications more complicated and bad ones less reliable. So I feel that a solution is needed, and it goes like this: if an element contains only a value, the value's whitespace is significant. Otherwise, if the element contains mixed content (values and elements), all pure whitespace values are ignored, unless the 'xml:space' attribute is set to 'preserve'. 'xml:space' is part of the regular XML spec, but only serves as a hint. EXML makes its use mandatory.
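A small Python sketch of this whitespace rule, under the assumption that a parser hands over an element's children as a list in which text values are str objects and everything else is a child element. The function name and the 'preserve' flag (standing in for xml:space='preserve') are made up for the illustration.

```python
def strip_insignificant_whitespace(children, preserve=False):
    """Apply the EXML whitespace rule to the parsed children of one element.

    children: list whose items are str (text values) or any other object
              (child elements).
    preserve: True if xml:space='preserve' is in effect for this element.
    """
    only_a_value = len(children) == 1 and isinstance(children[0], str)
    if preserve or only_a_value:
        # A lone value keeps its whitespace; so does everything under 'preserve'.
        return list(children)
    # Mixed content: drop the values that consist of whitespace only.
    return [c for c in children
            if not (isinstance(c, str) and c.strip() == "")]
```

Text that contains anything besides whitespace is never touched, so " a " next to an element stays " a ".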
The second problem with reading is the representation of values. Unfortunately, for whatever reason, XML Schema Part 2 defines a single canonical representation for every type, but requires applications to read a variety of formats. For example the canonical representation of the number 5 as integer is '5', but with XML Schema the app must also read '+0005'. The EXML parser can't know whether the document's author just wanted to use a really complicated way to write 5, or whether the strange notation needs to be kept. So EXML considers all XML text to be Strings, unless it is a canonical representation of at least one other type. Then it takes the 'preferred' type. For example the preferred type for the number 5 is 'int', and not 'long'. At a later time EXML may get an advanced parser that uses a schema to convert text to the right type, but until then the user can only rely on having the right Value type when the source is an original EXML document with canonical representation for Values. 
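The "canonical or else String" rule can be sketched in Python as a round-trip check. This is a simplification that only knows integers, and it assumes that Python's str() of an int matches the canonical integer form; parse_value is a hypothetical name, not part of any EXML parser.

```python
def parse_value(text):
    """Return an int if text is the canonical form of an integer, else the str."""
    try:
        number = int(text)
    except ValueError:
        return text            # not a number at all: stays a String
    if str(number) == text:
        return number          # canonical form, so the preferred type is used
    return text                # readable as a number, but not canonical


print(repr(parse_value("5")))      # 5
print(repr(parse_value("+0005")))  # '+0005'
```

The round trip keeps '+0005' as a String, so writing the document back out does not silently change the author's notation.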


&lt;b&gt;Removed Features&lt;/b&gt;
EXML removes the following features from XML:&lt;ul&gt;&lt;li&gt;The &amp;lt;?xml?&gt; prolog at the beginning. There is only one encoding, and that is UTF-8. (If really needed, UTF-16 could be added, but that can be detected at the UTF level)&lt;/li&gt;&lt;li&gt;DTDs and &amp;lt;!DOCTYPE&gt; and all that crap that has long been replaced by schemas&lt;/li&gt;&lt;li&gt;Entities, except the built-ins (&amp;amp;lt;, &amp;amp;amp;, &amp;amp;quot; and &amp;amp;#number;)&lt;/li&gt;&lt;li&gt;Processing instructions&lt;/li&gt;&lt;li&gt;CDATA sections&lt;/li&gt;&lt;li&gt;xml:lang&lt;/li&gt;&lt;li&gt;Comments in the data model (but not in the serialization format, they are just &lt;b&gt;really&lt;/b&gt; ignored)&lt;/li&gt;&lt;/ul&gt;
I don't think that anybody who's using modern XML infrastructure would miss any of them. And they are what makes XML so complicated. In case someone needs them for obscure reasons, there should be an option that allows representing old stuff like processing instructions as elements in a special namespace. This way all the original XML features can be accessed while keeping the data model simple. But this would be completely optional, because it can break schemas and applications.

&lt;b&gt;XML Compatibility&lt;/b&gt;
All or at least most modern XML standards can still be used with EXML. I hope to be able to use it with SOAP and the other WS standards, XPath, XSLT, XQuery, RelaxNG and (if I can't avoid it) XML Schema. In some cases limitations may be needed. For example XPath will need minor modifications, because not all EXML nodes can retrieve their parent node - EXML Values cannot. But in general this should not be a problem.&lt;/div&gt;</description><link>http://www.tjansen.de/blogen/2004/08/exml-eeks-xml.html</link><author>Tim Jansen</author></item><item><guid isPermaLink='false'>http://www.blogger.com/feeds/6927046/posts/full/10918754003310606</guid><pubDate>Fri, 27 Aug 2004 18:36:00 +0000</pubDate><atom:updated>2005-10-31T20:13:16.603+01:00</atom:updated><title>'yield', generators and coroutines</title><description>&lt;div xmlns="http://www.w3.org/1999/xhtml"&gt;One of Eek's planned features is &lt;a href="http://en.wikipedia.org/wiki/Coroutine"&gt;coroutines&lt;/a&gt;. I have recently discovered &lt;a href="http://www.cerkit.com/cerkitBlog/PermaLink,guid,a8f15827-8d7f-42cd-b3bf-1c918aae26e9.aspx"&gt;this blog entry&lt;/a&gt; that shows the use of the 'yield' statement in C#, and it is quite unusual. C#'s 'yield' statement seems to be limited to implementing functions that return the IEnumerable type (C#'s iterator interface). This makes sense in many cases, as every 'yielding' method will return a sequence of values and the number of values may be limited. But it makes 'yield' less useful for other purposes, such as using it just to move state variables from the class into the method. Microsoft's implementation is also quite unusual and uses a state machine; more &lt;a href="http://blogs.msdn.com/matt_pietrek/archive/2004/07/26/197242.aspx"&gt;can be found here&lt;/a&gt;.

The following class shows a co-routine in Eek, the way I want to implement them. The class has a method that adds the argument to the sum of the previous arguments and returns it. 'yield' allows implementing it without any member properties:&lt;pre&gt;
class YieldTest
        int accumulate(int a)
                int sum = 0
                while true
                        sum += a
                        yield sum
end
&lt;/pre&gt;
This is the 'yield' that I want for Eek. It works exactly like 'return' and returns its argument, except that the next invocation on the same instance continues with the statement after 'yield' instead of restarting the method. In this case the 'while' loop simply goes into its next iteration after yielding the value. 
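For comparison, roughly the same behaviour can be sketched with a Python generator. Python's 'yield' is not the Eek 'yield' described here: a Python generator does not receive new arguments by being called again, so the sketch passes each argument in with send() and hides that behind a regular method. The wrapper class and the attribute name _co are assumptions of the sketch.

```python
def accumulate():
    """Generator-based co-routine: every value sent in is added to a running sum."""
    total = 0
    a = yield                # wait for the first argument
    while True:
        total += a
        a = yield total      # hand back the sum, then wait for the next argument


class YieldTest:
    def __init__(self):
        self._co = accumulate()   # the co-routine state lives in the instance
        next(self._co)            # run up to the first 'yield'

    def accumulate(self, a):
        return self._co.send(a)


t = YieldTest()
print(t.accumulate(1), t.accumulate(2), t.accumulate(3))  # 1 3 6
```

A second YieldTest object starts again at zero, because each instance holds its own generator.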

The method is a co-routine only because it contains 'yield'. There is no special declaration, as co-routines are just a special case of regular subroutines aka methods.

The coroutine ends when the method returns using the regular 'return' statement. This 'return' must always return a value, like a regular 'return'. If the method does not have any return values (or the return values have default values), it is also possible to end the method by executing the last statement. Once a coroutine has ended, it must not be invoked anymore. Any attempt will throw an exception. 

The state of the coroutine is stored in the instance, if the coroutine is an instance method; in the class, if the coroutine is a static method; and in the delegate, if the coroutine is a closure. So you can have several 'accumulate()' methods running simultaneously, as long as they run on different instances.

Co-routines need, as far as I can see, two limitations. The first one is that it must not be allowed to have a 'yield' in a block with a 'finally' section or in a 'using' statement. This is not possible, because they are supposed to guarantee a cleanup, but there is no way to guarantee it. The cleanup cannot be deferred until the co-routine continues after the 'yield', because there is no guarantee that the co-routine is ever called again. And cleaning up after every 'yield' does not make any sense.

The second limitation is that recursion must not be allowed. Recursion in a co-routine would be quite a mess and I would not know how to implement it. Unfortunately, this restriction means that a co-routine must maintain a flag to prevent recursion. This flag needs to be checked when the co-routine is entered to find out whether it is already running. If yes, an exception is thrown. Otherwise the flag is set, the co-routine is executed and then the flag is cleared. This solution is easy, but causes a performance impact that makes co-routines slower than regular methods. In some cases it may be possible for the compiler to eliminate the check though.
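The check / set / execute / clear sequence can be sketched in Python like this. GuardedCoroutine and ReentrancyError are hypothetical names; the wrapped 'step' function stands in for the body of a co-routine and receives the wrapper itself, so that a recursive call can be attempted.

```python
class ReentrancyError(RuntimeError):
    """Raised when a co-routine is entered while it is already running."""


class GuardedCoroutine:
    def __init__(self, step):
        self._step = step
        self._running = False

    def __call__(self, *args):
        if self._running:                  # check the flag on entry
            raise ReentrancyError("co-routine is already running")
        self._running = True               # set it
        try:
            return self._step(self, *args)
        finally:
            self._running = False          # clear it, also after an exception


def good_step(co, n):
    return n + 1


def bad_step(co, n):
    return co(n)                           # tries to re-enter itself


print(GuardedCoroutine(good_step)(1))      # 2
```

Calling GuardedCoroutine(bad_step)(1) raises ReentrancyError. The extra test and two assignments on every call are the overhead that the text refers to.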

Finally, there is one open point remaining: multi-threading. I am not sure what should happen when two threads try to enter a co-routine simultaneously. Because it may mess up the heap, it's probably necessary to prevent this from happening. Unfortunately this requires every co-routine to be locked with a mutex, making it even slower.

To get back to the C# implementation, it has one advantage: it makes it easy to implement an IEnumerable, which is probably a very common case. But I expect that with some simple magic and the help of closures, Eek will still be able to return iterators quite easily. This is what an iterator-returning method could look like:&lt;pre&gt;
class YieldGeneratorTest
        Iterator&amp;lt;int&gt; countTo5()
                // Generator is an Iterator implementation that takes a coroutine closure.
                return Generator({
                        yield 1
                        yield 2
                        yield 3
                        yield 4
                        return 5
                })
end
&lt;/pre&gt;
I think that example is acceptable, and still cleaner than implementing the Iterator directly like C# does.&lt;/div&gt;</description><link>http://www.tjansen.de/blogen/2004/08/yield-generators-and-coroutines.html</link><author>Tim Jansen</author></item><item><guid isPermaLink='false'>http://www.blogger.com/feeds/6927046/posts/full/109407120112806460</guid><pubDate>Fri, 03 Sep 2004 10:18:00 +0000</pubDate><atom:updated>2005-10-31T20:13:05.730+01:00</atom:updated><title>OOP without static/class members</title><description>&lt;div xmlns="http://www.w3.org/1999/xhtml"&gt;In the last month, the progress on Eek has been really slow. But I am about to make a major change in the language specification. Until now Eek supported static method and property members in classes, like all other OOP languages that I know. I never liked having two different kinds of members, and it also makes the specification more complicated. 

This week I found an equivalent feature that allows me to get rid of static members and is even simpler: 'singleton' classes. 'singleton' is a special kind of class and is declared exactly like a 'class' or 'interface', just with the 'singleton' keyword. The difference between a 'class' and a 'singleton' is that there is never more than one instance of the singleton. You can retrieve a reference to the instance by using the class name like a variable in an expression. Thus the syntax for accessing a singleton's member is exactly like using a static member. Singletons can have only one (optional) constructor which must not have any arguments. When the instance is requested for the first time, this constructor is called, so it also replaces the need for class constructors. The following code shows a simple counter as singleton:&lt;pre&gt;
singleton Counter
        int value

        int increase()
                return value += 1
end
&lt;/pre&gt;
The members are accessed like static members in Java:&lt;pre&gt;
Counter.increase()
Console.println("The counter value is now {Counter.value}.")
&lt;/pre&gt;
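The lazy creation of the single instance can be sketched in Python as follows. Python has no 'singleton' keyword, so the sketch uses an explicit instance() accessor, which is an assumption of this illustration; in Eek the bare class name would play that role.

```python
class Counter:
    _instance = None              # the one and only instance, created on demand

    def __init__(self):           # the argument-less constructor
        self.value = 0

    @classmethod
    def instance(cls):
        # The constructor runs when the instance is requested for the first time.
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

    def increase(self):
        self.value += 1
        return self.value


Counter.instance().increase()
print(f"The counter value is now {Counter.instance().value}.")
```

Every call to instance() returns the same object, so the counter state is shared across the whole program just like a static member would be.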

Singletons can do everything that static members can do. The next example shows how to implement a simple database of books with static members:&lt;pre&gt;
class Book
        int isbn
        String name

        static Book(int isbn, String name)
                if bookDb.contains(isbn)
                        return bookDb[isbn]
                Book b(isbn: isbn, name: name)
                bookDb[isbn] = b
                return b
private:
        static Map&amp;lt;int, Book&gt; bookDb()
end
&lt;/pre&gt;
The example has two static elements: a static constructor and a static property 'bookDb'. Static constructors will still exist with singletons; I will just rename them to 'factory'. They have the invocation syntax of regular constructors, but work like static methods. 

Now here is the example with a 'singleton' and using the 'factory' keyword:&lt;pre&gt;
class Book
        int isbn
        String name

        factory Book(int isbn, String name)
                if Db.books.contains(isbn)
                        return Db.books[isbn]
                Book b(isbn: isbn, name: name)
                Db.books[isbn] = b
                return b
private:
        singleton Db
                Map&amp;lt;int, Book&gt; books()
end
&lt;/pre&gt;
The code is a little bit longer, because the singleton members are prefixed with the class name. If this should really become a problem, I can still allow accessing members of inner class singletons without this qualification. But right now I hope that it is not necessary.
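The factory idea, returning an already registered object instead of always creating a new one, looks like this as a Python sketch. The method name create and the class-level dictionary _books are assumptions; the dictionary stands in for the private 'Db' singleton, and unlike an Eek factory the method is not invoked with constructor syntax.

```python
class Book:
    _books = {}                       # plays the role of Db.books

    def __init__(self, isbn, name):
        self.isbn = isbn
        self.name = name

    @classmethod
    def create(cls, isbn, name):
        """Factory: return the Book registered for this ISBN, creating it once."""
        if isbn not in cls._books:
            cls._books[isbn] = cls(isbn, name)
        return cls._books[isbn]


first = Book.create(42, "Eek in a Nutshell")
second = Book.create(42, "Some Other Title")
print(first is second)                # True
```

As in the Eek example, a second request with a known ISBN returns the existing Book and ignores the name that was passed in.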

The 'factory' also has a problem: it does not have an instance, and thus 'this' would not be valid. This is exactly what I wanted to avoid when removing static methods. So I need a small trick: the factory methods of a class will be put into an (anonymous) dummy singleton inner class. So 'this' will be a valid object, just a useless one.

I have been thinking about the 'singleton' concept since Monday, and I am quite happy with it. It will take some work to get it into the specification, but I hope that it will simplify the specs in some places.&lt;/div&gt;</description><link>http://www.tjansen.de/blogen/2004/09/oop-without-staticclass-members.html</link><author>Tim Jansen</author></item><item><guid isPermaLink='false'>http://www.blogger.com/feeds/6927046/posts/full/109439746529312585</guid><pubDate>Sun, 05 Sep 2004 15:10:00 +0000</pubDate><atom:updated>2005-10-31T20:12:56.366+01:00</atom:updated><title>Transition to Mediawiki</title><description>&lt;div xmlns="http://www.w3.org/1999/xhtml"&gt;I have decided to move the Eek documentation from &lt;a href="http://moin.sourceforge.net/"&gt;MoinMoin&lt;/a&gt; to &lt;a href="http://wikipedia.sourceforge.net/"&gt;Mediawiki&lt;/a&gt;. I was always frustrated by MoinMoin's lack of features, and as I have written many &lt;a href="http://www.wikipedia.org"&gt;Wikipedia&lt;/a&gt; contributions in the last few weeks, I got so used to Mediawiki's syntax that I do not want to do without it any longer. The installation was easier than expected, and now I have a fresh Mediawiki running. The next step will be to port the MoinMoin content to the other syntax, get rid of the &lt;a href="http://en.wikipedia.org/wiki/CamelCase"&gt;CamelCase&lt;/a&gt; and so on. 
When doing this revision, I am also going to get rid of static members and add generics/class parameters.&lt;/div&gt;</description><link>http://www.tjansen.de/blogen/2004/09/transition-to-mediawiki.html</link><author>Tim Jansen</author></item><item><guid isPermaLink='false'>http://www.blogger.com/feeds/6927046/posts/full/109492203653244449</guid><pubDate>Sat, 11 Sep 2004 16:47:00 +0000</pubDate><atom:updated>2005-10-31T20:12:47.790+01:00</atom:updated><title>Simplifying Access Controls</title><description>&lt;div xmlns="http://www.w3.org/1999/xhtml"&gt;After getting rid of 'static' members, I am currently looking for other C++/Java legacies that I could get rid of. And I found access controls. Until today, Eek used a C++ label syntax with C++'s 'private/protected/public' modes and additionally 'protected api' and 'public api'. The new modes were needed to make a member accessible to other &lt;a href="http://en.wikipedia.org/wiki/C_Sharp_programming_language#Code_libraries"&gt;assemblies&lt;/a&gt;. This became much too complicated, but because I think that the 'api' modes are important, I decided to drop the rarely used 'protected' modes. 'protected' can be useful sometimes, but most of the time it is just bad API design that tries to combine extensibility and real functionality into a single interface. The remaining modes were 'private', 'public' and 'api'. Because 'public' was already the default, I switched from a C++ label system to a Java modifier system and got rid of 'public' as well. 
A member without a modifier is 'public', and the remaining access control modifiers are 'private' and 'api'.&lt;/div&gt;</description><link>http://www.tjansen.de/blogen/2004/09/simplifying-access-controls.html</link><author>Tim Jansen</author></item><item><guid isPermaLink='false'>http://www.blogger.com/feeds/6927046/posts/full/110346269759720582</guid><pubDate>Sun, 19 Dec 2004 12:57:00 +0000</pubDate><atom:updated>2005-10-31T20:12:32.480+01:00</atom:updated><title>Progress...</title><description>&lt;div xmlns="http://www.w3.org/1999/xhtml"&gt;Everything is going well and I am confident that I will finish the specification of the language this year. Right now I would rather work on the spec than blog about the changes, as there are quite a lot of them. But here are a few examples of the new syntax.

&lt;b&gt;Method Syntax&lt;/b&gt;
I have changed the method syntax from C/Java-style to Pascal-like signatures. This solved most of my problems with the delegate syntax and is more logical anyway. Why should the name of a method be between the output and the input types? Here are two ways of writing a method with the new syntax:&lt;pre&gt;
add(int a, int b): int
        return a + b

add(int a, int b): (int r)
        r = a + b

&lt;/pre&gt;

&lt;b&gt;Delegates&lt;/b&gt;
Now the syntax of a delegate type becomes easy, just use the 'delegate' keyword instead of the method name. I have abandoned my &lt;a href="http://www.tjansen.de/blogen/2004/01/combining-advantages-of-qt-signalslots.html"&gt;original idea&lt;/a&gt; which required giving delegates names - it's often hardly possible to find a good name for a delegate. But if the programmer wants it, she is free to create an alias for a delegate, which does exactly the same.&lt;pre&gt;
delegate(int,int):int myDelegate = add
&lt;/pre&gt;

&lt;b&gt;Closures&lt;/b&gt;
I have two closure variants, one for embedded expressions and one for multiline closures. Both can be declared either without argument specification (for closures that use up to one argument) or with full argument specification:&lt;pre&gt;
// Embedded closure, full prototype
delegate(int,int):int myDelegate1 = closure(int x,int y):int {x + y}

// Multiline closure, full prototype
delegate(int,int):int myDelegate2 = closure(int x,int y):int {{
        return x + y
}}

// Embedded closure, simplified ('it' is a special variable 
// that contains the first argument)
delegate(int):int myDelegate3 = { it * it }

// Multiline closure, simplified ('it' is a special variable 
// that contains the first argument)
delegate(int):int myDelegate4 = {{
         return it * it 
}}
&lt;/pre&gt;

&lt;b&gt;Generics&lt;/b&gt;
The generics syntax looks a bit like a mixture of Java's and C++'s, but it works more like Java's. Both classes and methods can contain generic variables. Let's start with a class:&lt;pre&gt;
class Stack&amp;lt;Object T&gt;
        private Array&amp;lt;T&gt; mArray

        push(T obj)
                // ...

        pop(): T
                // ...
end
&lt;/pre&gt;
Here's a method:&lt;pre&gt;
add&amp;lt;Number T&gt;(T a, T b): T
        return a + b
&lt;/pre&gt;
and here a delegate and a closure:&lt;pre&gt;
delegate&amp;lt;Number T&gt;(T,T):T myGenericDelegate = closure&amp;lt;Number T&gt;(T a,T b):T { a + b } 
&lt;/pre&gt;
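For readers who know Python better than Java or C++, the same three shapes (a generic class, a generic method, and a generic callable type) can be sketched with the typing module. This is only an analogy under stated assumptions: Python checks these annotations with external tools rather than at runtime, and the constrained type variable N is a stand-in for the 'Number' bound used above.

```python
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")                  # unconstrained, like 'Object T'
N = TypeVar("N", int, float)      # constrained, standing in for 'Number T'


class Stack(Generic[T]):
    def __init__(self) -> None:
        self._items: List[T] = []

    def push(self, obj: T) -> None:
        self._items.append(obj)

    def pop(self) -> T:
        return self._items.pop()


def add(a: N, b: N) -> N:
    return a + b


# The counterpart of a typed delegate holding a closure.
my_delegate: Callable[[int, int], int] = lambda a, b: a + b

s = Stack[int]()
s.push(1)
s.push(2)
print(s.pop(), add(1.5, 2.5), my_delegate(3, 4))  # 2 4.0 7
```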

&lt;b&gt;Events&lt;/b&gt;
And finally the event syntax, which is exactly the same as the &lt;a href="http://www.tjansen.de/blogen/2004/01/combining-advantages-of-qt-signalslots.html"&gt;syntax that I favored a year ago&lt;/a&gt;, just with the new method signature:&lt;pre&gt;
        event pointerClicked(int x, int y, int button)
&lt;/pre&gt;&lt;/div&gt;</description><link>http://www.tjansen.de/blogen/2004/12/progress.html</link><author>Tim Jansen</author></item></channel></rss>
