Monday, September 6, 2010

HyperJS Episode 5 - The Prototype Strikes Back

Prototypes and this Are Just Awesome -- in JavaScript

Last time, I discussed the fourth step on the road to JavaScript in C# in my HyperJS Episode 4 - A New Hope post. It covers the creation of a JSObject that returns undefined for members not yet defined, the skeleton implementation of the String and Boolean classes as functions, including scripts using "using", and the unit tests that prove it's working. All of the projects I talk about in this series are available on my GitHub repos: http://github.com/tchype.

In part 5, we follow the typical story arc and hit the dark second act. Now that I have some basics and a general framework prototype in place, let's see what happens with functions, prototypes, and this.

HyperJS Part 5: Functions Should Be Objects, Dammit!

Notice that in String and Boolean, toString and valueOf are both instance methods. I would like them to be attached to the prototype instead of being recreated for each instance. I then want to be able to change the implementation or add more members to the prototype and see the existing objects take on that functionality. Let's see what that would take...

Prototype Instances

Even though I had solved delegating property access/method calls to parent objects with HyperHypo, there's something distinctly different about actual JavaScript's prototypes. A prototype is an object that hangs off the constructor function for a type (its prototype property), and in JavaScript you can dynamically add properties and methods to it! These end up being similar to static members on a type, but with a BIG difference: references to "this" inside of those functions resolve at execution time to the instance the function was called on (the object the constructor function created when it was called with new), even though the function itself resides on the prototype (static)!

Here's a JavaScript example, based on the Person class I defined in my 3rd post. I had defined:

  • private instance variables _firstName and _lastName
  • public instance getter and setter methods getFirstName(), getLastName(), setFirstName(), and setLastName()
  • and a toString() method that concatenated the _firstName and _lastName private variables together with a space in between.

There's really no reason we need to have the toString() method be an instance method on Person. In the example as coded, it does use the closure of the instance constructor to access the private variables, but we could create one static-like function that can be shared across all instances of Person:

<script type="text/javascript">
  function Person(firstName, lastName) {
    var _firstName = firstName;
    var _lastName = lastName;

    this.getFirstName = function() { return _firstName; };
    this.getLastName = function() { return _lastName; };
    this.setFirstName = function(value) { _firstName = value; };
    this.setLastName = function(value) { _lastName = value; };
  }
  // Outside the constructor function definition 
  Person.prototype.toString = function() { 
    return this.getFirstName() + " " + this.getLastName();
  };
  ...
  var me = new Person("Tony", "Heupel");
  var singer = new Person();
  singer.setFirstName("Paul");
  singer.setLastName("Hewson");
  alert(me.toString());     // Outputs "Tony Heupel"
  alert(singer); // toString() called automatically and outputs "Paul Hewson" (but should have made it say Bono, somehow)
</script>

Again, even though toString() is now only one function that is shared across all instances of Person, the magic of "this" allows it to act like an instance function that works against the public members of a Person instance and appear as if it were an instance function. Some key things about JavaScript prototypes:

  1. Prototype itself is actually the "static member" in that it is a single object of type Object (unless you specify otherwise) that applies to all objects of that type
  2. When you add members to prototype, you're actually adding them to an instance of an object that is assigned to prototype, and that then allows all instances of the object type it is attached to to appear to have that function--even instances of objects already created previously!
  3. In addition to adding or changing existing members on a prototype of an object, you can also totally replace the prototype object instance of another class (by default, all have a prototype of type Object, but you can change that at any time to create a "subclass" of another object)
  4. Again, the magic of "this" cannot be overstressed...
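Here's a minimal sketch of the first three points in plain JavaScript. Animal and Dog are made-up types for illustration only; they are not part of the Person example or of HyperJS. One detail worth noticing: in real JavaScript, an object created before the prototype is replaced keeps pointing at the old prototype object.

```javascript
// Hypothetical Animal/Dog types, used only for illustration
function Animal(name) { this.name = name; }
function Dog(name) { this.name = name; }

var rex = new Dog("Rex");   // created BEFORE any prototype changes

// (2) Adding a member to the prototype reaches existing instances too
Dog.prototype.speak = function () { return this.name + " says woof"; };
var speakResult = rex.speak();                    // "Rex says woof"

// (1) The prototype is one shared object, not a per-instance copy
var fido = new Dog("Fido");
var sharesFunction = (rex.speak === fido.speak);  // true

// (3) Replacing the prototype object wholesale to "subclass" Animal
Animal.prototype.describe = function () { return "animal named " + this.name; };
Dog.prototype = new Animal();
Dog.prototype.constructor = Dog;

var spot = new Dog("Spot");
var describeResult = spot.describe();             // "animal named Spot"

// rex was created earlier and still delegates to the old prototype object
var rexHasDescribe = (typeof rex.describe === "function");  // false
```

In every case the function body is written once, and "this" picks up whichever instance the call was made on.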

Even though HyperHypo supports a Prototype member, it just points to an instance of an object. This works fine in many cases, but what if you want your class to inherit from another class that is not JSObject? You would simply assign a new value to the Prototype member, right? Not quite...Prototype is an instance member and cannot be made static or all instances of JSObject would share the same prototype. And if you just change it on one instance, then the others do not see the new functionality either. So, I made three key changes:

  1. JSObject has a JSTypeName: This is just a string that in regular class-oriented programming could be replaced with this.GetType().Name, but in this style just needs to be set when a new instance is created.
  2. The JS instance has a Prototypes dictionary: To ensure that all new instances of a particular class share the same Prototype object, the first time an object of that type is created, it creates its prototype and adds it to the global object's Prototypes dictionary with a key of the JSTypeName value; any subsequent creation of an object of that type will use the same Prototype object from the dictionary to ensure that we can add members to the prototype and all instances of that type get the functionality.
  3. JSObject gets notified when a prototype for a type is assigned a new object instance: To inherit from something different, we could--on the fly--say something like JS.cs.Prototypes[JS.cs.String(null).JSTypeName] = someNewObjectType;. While this would change the base object for any new instances created, it doesn't update any old ones, since Prototype is a reference to an object, not a getter that looks at the global Prototypes dictionary. Rather than overriding HyperHypo's Prototype property getter and setter, I simply had every JSObject get notified when this changes and it sees if the change was to the prototype key for that object's JSTypeName.
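The three changes above can be sketched roughly as follows. This is written in JavaScript purely to keep the idea compact; the actual HyperJS code is C#, and every name here (registry, createObject, setPrototype, lookup) is hypothetical rather than taken from the project.

```javascript
// Hypothetical sketch of a type-name-keyed prototype registry with
// change notification. None of these names are HyperJS APIs.
var registry = {
  prototypes: {},   // type name -> shared prototype object
  listeners: []     // callbacks run when a type's prototype is replaced
};

function createObject(typeName) {
  // The first object of a type creates the shared prototype; later ones reuse it
  if (!Object.prototype.hasOwnProperty.call(registry.prototypes, typeName)) {
    registry.prototypes[typeName] = {};
  }
  var obj = { typeName: typeName, members: {}, proto: registry.prototypes[typeName] };
  // Each object listens for its own type's prototype being swapped out
  registry.listeners.push(function (changedType, newProto) {
    if (changedType === obj.typeName) { obj.proto = newProto; }
  });
  return obj;
}

function setPrototype(typeName, newProto) {
  registry.prototypes[typeName] = newProto;
  registry.listeners.forEach(function (notify) { notify(typeName, newProto); });
}

function lookup(obj, name) {
  // Own members win; otherwise fall back to the shared prototype
  if (Object.prototype.hasOwnProperty.call(obj.members, name)) { return obj.members[name]; }
  return obj.proto[name];
}

var a = createObject("String");
var b = createObject("String");
a.proto.shout = "HEY";                         // add to the shared prototype
var bSeesShout = lookup(b, "shout");           // "HEY"

setPrototype("String", { skunkWorks: "skunky!" });
var aSeesSkunk = lookup(a, "skunkWorks");      // "skunky!"
var aStillShouts = lookup(a, "shout");         // undefined
```

The notification step is what lets objects created earlier pick up a replaced prototype, since each one holds a plain reference rather than looking the prototype up in the dictionary on every access.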

How Well Does That Work?

I just so happen to have some unit tests that help prove what this looks like and how well it works:

[TestMethod]
public void PrototypeFun()
{
    dynamic s = JS.cs.NewString("hello");
    dynamic thing = new JSObject();

    thing.Prototype.bizbuzz = new Func<string, string>((name) => "hello, " + name);
    s.Prototype.foobar = new Func<bool>(() => true);  // Check it doesn't inadvertently set ALL Prototypes from root object to have foobar

    // bizbuzz method available on all objects
    Assert.AreEqual("hello, tony", s.bizbuzz("tony"));
    Assert.AreEqual("hello, tony", thing.bizbuzz("tony"));

    // foobar set on string prototype, but not exposed to object
    Assert.IsTrue(s.foobar()); 
    Assert.IsFalse(thing.foobar); // Feature detection -- not set on object prototype


    Assert.AreEqual(3, thing.Count); //foobar should not show up on JSObject's prototype....
    Assert.AreEqual(4, s.Count); // foobar and bizbuzz are available to previously created string instances
            
    // Create new string prototype and set it
    dynamic newPrototype = new JSObject();
    newPrototype.skunkWorks = "skunky!";
    s.SetPrototype(newPrototype);  // skunkWorks now available on string

    Assert.AreEqual(3, thing.Count);  // Updating string's prototype shouldn't mess with Object
    Assert.IsFalse(thing.skunkWorks); // Feature detection - make sure skunkWorks not on object
    Assert.AreEqual(4, s.Count); // toString, valueOf, skunkWorks, and bizbuzz (no foobar)
    Assert.AreEqual("skunky!", s.skunkWorks); // Can access it through string instance previously created
    Assert.IsFalse(s.foobar);  // Feature detection - since prototype was changed, no longer there!
}

As you can see in the tests above, you can add members to prototypes and have "sub objects" pick up the functionality, in very basic cases. It turns out that this approach fails in the most critical case: adding methods to the prototype that act as instance methods! The other cases that work properly are:

  • You are adding a member property to the prototype
  • You are adding a function that is actually a static function and does not require any knowledge of an instance (e.g., Date.parse())

How to Implement Instance-Like Methods on Prototypes

How interesting that one of the things I thought was solved early on in HyperDictionary is actually coming back to be a serious issue. Next, I went down the route of trying to figure out how to implement instance-like methods on the prototype for String and Boolean. In both cases, the only obvious thing to do was to create a static method that takes a "this" or "self" as the first argument and add it to the prototype, and then create instance methods that pass "self" into the prototype method. This works at first but is dumb. Most of these functions are small, so creating an instance wrapper and a prototype version is just a waste of space. Additionally, you still can't just add a method to the prototype that looks like an instance method, and adding new instance wrappers to existing object instances is dumb as well.
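To make the wrapper approach concrete, here's a rough sketch of its shape. It's in JavaScript for brevity and uses made-up names (stringPrototype, makeString); the real attempt was in C#, so treat this as an illustration of the pattern and its drawback rather than the actual code.

```javascript
// Hypothetical names throughout; a sketch of the pattern, not HyperJS code
var stringPrototype = {
  // "Static" version: the instance arrives as an explicit first argument
  toString: function (self) { return self.value; }
};

function makeString(value) {
  var instance = { value: value };
  // Instance wrapper: exists only to forward itself to the shared function
  instance.toString = function () { return stringPrototype.toString(instance); };
  return instance;
}

var s = makeString("hello");
var wrapped = s.toString();                       // "hello"

// A method added to the prototype later never reaches the existing instance,
// because nobody created a wrapper for it
stringPrototype.shout = function (self) { return self.value.toUpperCase() + "!"; };
var hasShout = (typeof s.shout === "function");   // false
var manual = stringPrototype.shout(s);            // "HELLO!", but only with an explicit self
```

Every shared function needs a matching wrapper on every instance, which is exactly the duplication complained about above.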

Sure, I could hook into my Prototype update mechanism, but that was already a hack around the fact that functions aren't objects and you can't attach properties to them. I even investigated trying to subclass a delegate (System.MulticastDelegate) in a JSObject, but I knew that would fail since you have to be a special language service/implementor in order to be able to do anything to System.Delegate or System.MulticastDelegate. If only I had the flexible "this" available to me...

A Look at call() and apply()

One of the ways you can control the "this" scoping in JavaScript is to use the call() or apply() functions on a function object. They essentially let you call the function, but specify the "this" of said function:

// Assume Person has a saySomething method on its prototype:
Person.prototype.saySomething = function() {
  // Gather every argument passed in, however the function was invoked
  var wordsToSay = Array.prototype.slice.call(arguments);
  return this.getFirstName() + " said '" + wordsToSay.join(' ') + "'";
};

var p = new Person("Tony", "Heupel");
// The lines below are the same as calling p.saySomething("Hello", "There");
// and both return "Tony said 'Hello There'"
Person.prototype.saySomething.call(p, "Hello", "There");
Person.prototype.saySomething.apply(p, ["Hello", "There"]);

Perhaps the answer lies somewhere in there. Maybe I could update or subclass JSObject to make a JSFunction: a dynamic object that has a call() and/or apply() method that calls the function. But then how do I make the syntax of calling any function (not just a wrapped constructor function) look like o.foobar("biz"), reasonably flexibly and without requiring a ton of work from the object/function author?
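As a thought experiment only, the JSFunction idea might look something like the sketch below. Nothing like this exists in HyperJS; the wrapper shape, the invoke helper, and the convention that the wrapped body takes self as its first parameter are all assumptions made for illustration.

```javascript
// Hypothetical JSFunction wrapper; name and shape are assumptions
function JSFunction(body) {
  // body is written "static style": it receives self as its first parameter
  this.body = body;
}
JSFunction.prototype.call = function (self) {
  var rest = Array.prototype.slice.call(arguments, 1);
  return this.body.apply(null, [self].concat(rest));
};
JSFunction.prototype.apply = function (self, args) {
  return this.body.apply(null, [self].concat(args || []));
};

// Hypothetical helper: look the member up on a prototype object and
// run it against a specific receiver
function invoke(obj, proto, name) {
  var rest = Array.prototype.slice.call(arguments, 3);
  return proto[name].apply(obj, rest);
}

var proto = {
  foobar: new JSFunction(function (self, word) { return self.name + ":" + word; })
};
var o = { name: "o" };

var viaCall = proto.foobar.call(o, "biz");          // "o:biz"
var viaApply = proto.foobar.apply(o, ["biz"]);      // "o:biz"
var viaInvoke = invoke(o, proto, "foobar", "biz");  // "o:biz"
```

The wrapper gives control over "this", but the caller still has to write invoke(o, proto, "foobar", "biz") instead of o.foobar("biz"), which is the syntax gap in question.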

Hello, Python?

Then, something suddenly struck me: oh, crap. This is all ALMOST EXACTLY how Python works! Python objects, including classes (yes, classes are objects themselves), instances of classes, and functions (yes, functions are first-class objects as well) all have a Dictionary of name/value pairs (object.__dict__) at each level (instance, class, metaclass, etc.), you can use indexers to set attribute names and then access them with dot-notation, functions are objects with a __call__ attribute and methods take self as the first argument, and so on...

The most disappointing thing: OK, I could do this, but then I'll need an interpreter/compiler in order to get the syntax to stay reasonably similar to real JavaScript, and I'm trying to avoid that like crazy. Additionally, Python already runs on every platform--including the .NET CLR/DLR--and has a full ecosystem and 20 years of work behind it. ARGH!

A Reasonable Prototype Outcome

When I started the actual HyperJS portion of this work, I was very specific in my goals that I thought would prove it to be a good idea or doomed from the outset. Heading down this path is definitely a possibility and I am proud that I set up my goals such that I could find this early--within 2 weeks of starting out on this crazy journey in my spare time.

I'm not personally willing to head down this path any further at this time. This is the dark "second act" of the HyperJS story. It may never emerge into the triumphant "third act" (Return of the HyperJS Jedi, or something); or maybe, because it's Open Source and on GitHub, someone will find it and want to keep going with it before I decide to. That would be awesome if this project has a real useful future that I'm holding up by not pursuing it for now...

In the meantime, I'll probably dive back into Python and Ruby (I found the Pickaxe book at Goodwill for $2, like new!) for established solutions, and Node.js and Jaxer for up-and-comer use of JavaScript on the server. That just seems like a better use of my time (and my wife will be happy that I'm finally done being "almost done" with all of this "spare time" investigation).

Thanks for going on this ride with me. I hope it was well worth the read, a good brain-stretcher, and maybe planted some seeds of ideas in your head. Remember, even though HyperJS has stalled, HyperCore still has some really cool stuff in it that I think is useful. If you find it useful and plan to use it, let me know!