Empowered data
This proposal tries not to add new entities to the language; it only takes what is already there and reuses it mercilessly. The number of abstractions is also lowered, since some of them can be implemented with existing ones, with minimal changes. The result is a more compact, smaller, more uniform and much more powerful language. ;-) Forward-compatible.
Motivation.
Blocks of code enclosed in curly braces were of two natures in ES3 and ES5: there were code blocks, containing a sequence of instructions to perform, and there was the object literal, which contained a recipe for building a structured piece of data.
ES.next introduced some powerful additions to the object literal, introduced new uses for it (the .{...} and <| {...} operations) and brought in a new type of {...} block - the class block. The class block borrows many new features of the object literal, but is itself something in between.
The feeling that having more types of {...} source code constructs brings more confusion led to thoughts about their nature and their similarities. This proposal takes this train of thought to the extreme by proposing only two types of "brave new curly block" constructs with a strict role split - the imperative one for control flow and the declarative one for data structures - while building on the similarities between them, radically empowering the declarative one in the process, and not losing forward compatibility (by this I mean that old-style constructs work in the new proposal). Classes are made one case of the declarative construct with class-specific extensions, changing the existing class syntax very slightly and not losing any semantics.
Curly blocks are similar. Reuse for much power with few features.
A simple code block:
{
  x = 4;
  receiver.f(x);
  function g() {
    do { nothing; } while (false);
  }
  x++;
  const prop = 5;
  if (x>5) {
    process.exit();
  }
  x--
}
The basic structure of this block is: there are simple statements, which are terminated by a semicolon (ignoring general semicolon insertion here). The last simple statement in a code block does not need a semicolon, though it can have one. These statements include assignment and function call.
Then there are structured statements, which do not need to be ended with a semicolon (do-while is a nasty exception), since they are ended with a sub-block. These are if/else, while loops, function declarations etc.
Not that this is a correct explanation of code block structure (for example, I ignore cases where if/else/while/... sub-statements are simple statements, not sub-blocks; for now, let us assume there are always sub-blocks). I also intentionally dismissed variable declarations, since they are not needed for this topic and would make things a little more complicated (look at that const line as an assignment ;-) ).
Now, for the simple object literal (with ES.next extensions):
{
  x: 4,
  g() {
    do { nothing; } while (false);
  }
  y: { foo: "bar" },
  get prop () { return 5; }
  z: 0
}
The basic structure of this block is: there are simple productions, which are terminated by a comma. The last simple production in a literal block does not need a comma, though it can have one. These productions are property initializations.
Then there are enhanced productions, which do not need to be ended with a comma, since they are ended with a sub-block. These are get, set and method declarations.
Even though the range of possible building elements of the object literal is smaller than that of the code block, the similarities can be seen pretty well. There is an undoubted similarity between x = 4; and x: 4, - not only syntactic, but semantic, too. There is a strong syntactic similarity between the declaration of function g in the code block and of method g in the literal. Semantically they are also pretty similar, though not as much as in the previous case.
The previous examples showed that there are formally (simple-simple, structured-enhanced), syntactically and functionally similar pairs of constructs between the code block and the object literal. These elements are, more or less, about the same thing. The difference between them is given by the context: assignment and function declaration perform actions (they are imperative), while field specification and method specification produce data (they are declarative).
It can be said, with a lot of grains of salt, that a code block is "(ordered) collection of imperative elements, simple, semicolon delimited, as well as structured, undelimited" and an object literal is "(unordered) collection of declarative elements, simple, comma delimited, as well as structured, undelimited", but matching elements appear in both. This strawman is about completing this element similarity, mainly by drawing on useful code elements and bringing their counterparts to the data domain.
1. if & Co. Conditional data structures.
The first idea to borrow from the code domain is the if statement - in this case, not a statement, but a data production. You have surely been in a situation where you were writing an object literal and wanted to have a field or two only when a specific condition is met. The solution nowadays is either to leave it out and add it afterwards with an if statement in the code (which is not correct: a conditional data field was wanted, not a conditional action that assigns to that field) or to put the field in with the ?: or && operators, so the field has a null value in the case where it should not be there at all.
Why not have something like this?
{
  x: 4,
  g() {
    do { nothing; } while (false);
  }
  y: { foo: "bar" },
  if (bar > cowboy) { jar: ["whiskey"], wall: ["bottle", "bottle"] }
  get prop () { return 5; }
  z: 0
}
The data-domain if, in accordance with its code counterpart, is a structured element that does not need a comma at the end, since it ends with a sub-block. But the data-if governs a data-block. The curly block that is guarded by the data-if should be a normal data-block, which is included if the condition is met and is not included when the condition is not met.
Of course we can have the if/else if/else combination, like in { name: name, if (age > 60) { retired: true } else if (age < 18) { minor: true } else { workplace: company } age: age }.
If the if/else could only govern (data) blocks, it would not be a true counterpart of the code-if. To be true, it should take both simple elements ended by a comma and blocks into its syntax, so this should be possible, too: { name: name, if (age > 60) retired: true, else if (age < 18) minor: true, else workplace: company, age: age }.
To not create inconsistencies, I would allow this syntax as well. Be as true to the code-if as possible. In one line this may look inferior, but when indented, it can look like either of the following:
{
name: name,
if (age > 60) retired: true,
else if (age < 18) minor: true,
else workplace: company,
age: age
}
or
{
name: name,
if (age > 60)
retired: true,
else if (age < 18)
minor: true,
else
workplace: company,
age: age
}
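For comparison, the closest thing available without the proposed syntax is to spread a conditionally chosen fragment into the literal. The sketch below is only an emulation in present-day JavaScript, not the proposed data-if; the function name describePerson and its parameters are made up for illustration.

```javascript
// Emulation only: object spread stands in for the proposed data-if.
// describePerson, name, age and company are hypothetical names.
function describePerson(name, age, company) {
  return {
    name: name,
    // if (age > 60) retired: true, else if (age < 18) minor: true, else workplace: company,
    ...(age > 60 ? { retired: true }
      : age < 18 ? { minor: true }
      : { workplace: company }),
    age: age
  };
}

console.log(describePerson("Ann", 65, "Acme"));  // has retired, no workplace
console.log(describePerson("Bob", 30, "Acme"));  // has workplace, no retired
```

As in the proposal, the fields that do not apply are absent from the result rather than present with an empty value.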
Another conditional that can readily be adopted into the data domain is switch. Its fall-through, implicit-block, break-finished semantics is a bit unwieldy for a one-liner, like { name: name, switch (role) { case "manager": canSeeReports: true, case "admin": accessToServerRoom: true, break, case "developer": accessToLibrary: true, default: needsTask: true } if (boss) reportsTo: boss }, but again, formatting helps. Frankly, switch is not used that often in code and it won't be used that much in data either, but sometimes it is really helpful. For the sake of completeness, it should be in the data as well.
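To make the fall-through behaviour of the one-liner concrete, here is a rough emulation in present-day JavaScript: an ordinary switch fills a fragment, which is then spread into the literal. The helper names rolePermissions and describeEmployee are invented for this sketch and are not part of the proposal.

```javascript
// Emulation only: an ordinary switch builds the fragment that the proposed
// data-switch would contribute. Helper names are hypothetical.
function rolePermissions(role) {
  const fragment = {};
  switch (role) {
    case "manager":
      fragment.canSeeReports = true;
      // falls through
    case "admin":
      fragment.accessToServerRoom = true;
      break;
    case "developer":
      fragment.accessToLibrary = true;
      // falls through
    default:
      fragment.needsTask = true;
  }
  return fragment;
}

function describeEmployee(name, role, boss) {
  return {
    name: name,
    ...rolePermissions(role),
    // if (boss) reportsTo: boss
    ...(boss ? { reportsTo: boss } : {})
  };
}

console.log(describeEmployee("Eve", "manager"));
console.log(describeEmployee("Dan", "developer", "Eve"));
```

A manager thus gets both canSeeReports and accessToServerRoom, while a developer gets accessToLibrary and, through the fall-through into default, needsTask.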
2. f(x, y). Data-production macros.
Code has the function call amongst its "simple" building blocks. It allows one to define a little piece of code in one place and issue it later in many other places, possibly parametrized. Why not have something like that in data, too? What about these data productions?
{
name("Doe", "John"),
people.counter(),
position: "manager",
salary: 100000
}
{
name("White Daemon", "Jinx Perry"),
dogs.counter(),
race: "cavalier King Charles spaniel",
colors: [ white, brown ]
}
What are name(...) and repository.counter() - function calls? Not exactly. In code they would be calls to functions or methods that perform some imperative sequence of actions. In data, such an element "invokes" a named data production, which is just like a function or a method, but its block is declarative. Otherwise, they are defined the same way as functions or methods, except that the % character is used as a modifier, analogously to the * modifier of generators:
function% name(surname, givenNames) {
fullname: (locale == "hu" || locale == "jp") ?
surname+" "+givenNames : givenNames+" "+surname,
catalogName: surname+", "+givenNames,
givenNames: givenNames,
surname: surname
}
class Repository {
...
%counter() { id: this.maxId++, creationDate: Date.now() }
...
}
I call % functions and methods data-production macros. They are not in fact true functions - the semantics of dogs.counter() is to include id: dogs.maxId++, creationDate: Date.now() in the object literal. The semantics is this for a reason: so that implementors can optimize it to any level they see fit. It is "just" an inclusion of a parameterized, ready-made data production.
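A rough feel for the inclusion semantics can be had in present-day JavaScript by modelling a macro as a function that returns a fragment and spreading that fragment into the literal. This is only an approximation: nameFields (standing in for the name macro), the value of locale and the initial maxId are assumptions of this sketch.

```javascript
// Emulation only: a "macro" is modelled as a function returning a fragment
// that the caller spreads into its own literal.
const locale = "en";  // assumed value; the text only refers to a locale variable

// Stand-in for the proposed function% name(surname, givenNames) { ... }
function nameFields(surname, givenNames) {
  return {
    fullname: (locale === "hu" || locale === "jp")
      ? surname + " " + givenNames
      : givenNames + " " + surname,
    catalogName: surname + ", " + givenNames,
    givenNames: givenNames,
    surname: surname
  };
}

class Repository {
  constructor() { this.maxId = 0; }  // assumed starting value
  // Stand-in for the proposed %counter() { ... }
  counter() { return { id: this.maxId++, creationDate: Date.now() }; }
}

const people = new Repository();

const employee = {
  ...nameFields("Doe", "John"),
  ...people.counter(),
  position: "manager",
  salary: 100000
};

console.log(employee.fullname, employee.id);  // John Doe 0
```

Unlike the proposed macros, this version really allocates an intermediate object for every fragment, and spread copies plain values only, so it says nothing about how an implementation would optimize the real thing.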
On the other hand, the dynamics of true functions / methods and easy interoperability with code must be present; macros must be as flexible as code functions are. For this, I'd propose these rules:
- a macro is a first-class object that is accessible by its property name for reading and writing (if not made const etc.)
- you can create a macro object inline by function% (args) { macro body }
- typeof macro is "function", it has no [[Construct]], and the behaviour of [[Call]] is deliberately undefined (to allow implementors the freedom to use it as they see fit)
- issuing a non-macro object with typeof "function" from inside a data block as a macro results in throwing a TypeError
Since the behaviour of [[Call]] is implementation-specific, how do you reuse a macro from inside code? Simply: obj.{ macro(...) }. This is the officially recommended (and only supported) way of reusing a macro directly from code.
And yes, you can have recursion with macros. You are encouraged to.
One more note: macros can be even more powerful if they cleverly use the [expr]: expr data production, which is part of the ES.next-enhanced object literal. The word is "cleverly"; it can be colossally abused. You have been warned.
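As an illustration of recursion combined with [expr]: expr, here is a sketch in present-day JavaScript where the macro is again modelled as a function returning a fragment. The name squares and the chosen bounds are hypothetical.

```javascript
// Emulation only: squares is a hypothetical recursive "macro" modelled as a
// function returning a fragment; [expr]: expr is today's computed property.
function squares(min, max) {
  return min > max
    ? {}
    : { [min]: min * min, ...squares(min + 1, max) };
}

const table = {
  operation: "square",
  min: 1,
  max: 4,
  ...squares(1, 4)
};

console.log(table[3]);  // 9
```

No variable is assigned and no loop is written; the recursion alone drives the production.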
No loops, no variables. "Functional" object production.
There are two paths for continuing the approach above. One is to adopt everything possible, however imperative, from the code side to the data side, so we can have variables and loops on the data side as well and can issue something like this:
{
operation: "square",
min: 1,
max: 10,
for (var i = this.min; i <= this.max; i++) { [i]: i*i }
}
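The other path is to keep truly imperative things out of data blocks (if is conditional but descriptive; a macro, even if powerful through recursion, is less imperative than a loop and a variable). When you need them, you can just as well do it in plain code. After all, code is better for imperative things: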
var result = {
operation: "square",
min: 1,
max: 10
}
for (var i = result.min; i <= result.max; i++) result.{ [i]: i*i };
The loop body could simply be result[i] = i*i; the .{ ... } form is used here for purposes of genericity: you can issue loops in code but still use all of the power of enhanced descriptive blocks using the .{ data-production... } construct.
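The same split between an imperative loop and a declarative fragment can be approximated today with Object.assign standing in for the proposed .{ ... } operator. This is a sketch of the idea only, not of the operator's eventual semantics.

```javascript
// Emulation only: Object.assign stands in for the proposed result.{ ... }.
const result = {
  operation: "square",
  min: 1,
  max: 10
};

for (let i = result.min; i <= result.max; i++) {
  // result.{ [i]: i*i }
  Object.assign(result, { [i]: i * i });
}

console.log(result[10]);  // 100
```

Object.assign copies by plain assignment, so it is a fair stand-in only for simple data fields like these; getters in the fragment would be read once rather than transferred.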
If "side-effect" imperative things like variables and, consequently, loops were excluded from data-production blocks (and nothing else imperative in nature is added later; everything that would be added would be "side-effect-free" and non-imperative), we will end up with a thing I'd call "functional data production". I think it is a desirable trait of a data production.
By "functional" I now mean the trait that is inherent to code in functional languages - if issued, with parameters, it produces a value from them, but this value production has no side effects. The most prominent of these side effects is setting the value of a variable. One may also call this "stateless". Data production should be stateless; imperative code is the one that should be stateful.
Being stateless allows doing things that are typical for functional code (mainly various behind-the-scenes optimizations, but also some proofs of correctness) for the data-production blocks. Of course, the data production is not stateless in the strict sense - the values are computed by stateful code expressions, and [expr]: expr can bring expressions into keys as well - but avoiding variables and loops still makes data production less stateful. Since data production is a descriptive thing, one almost naturally expects it to be sort of "producing a value" instead of "starting a process of manufacturing a value". Though I cannot give a convincing case for this, I believe it is a Good Thing (tm) to let the data production be stateless. In the long run it will bear fruit.
Parsing: ambiguities; syntax as opt-in philosophy.
This and lots of similar extensions are sooner or later questioned because of parsing problems. For example:
{
  if (typeof window === "undefined") { server: true }
  else { browser: true }
}
The condensed example of this phenomenon is:
{}.f()
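At the beginning of a statement this is parsed as an empty code block followed by .f(), not as a call of the f method of the {} object.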
This untreatable ambiguity may seem to doom any proposal like this one. But it does not. Even plain {} does not work there, and we got used to putting parentheses around it whenever it appears at the beginning of an expression statement (it is not so common to start an expression statement with an object literal, but when it happens, a dot almost always follows and produces an early syntax error). So this is an annoying but already known phenomenon, and we have learned to live with it. The bottom line is that it is orthogonal to this proposal.
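The existing behaviour is easy to check in any current engine. The sketch below uses new Function purely as a way to ask the parser whether a piece of source is accepted; the helper name parses is made up for this example.

```javascript
// At the start of a statement, `{` opens a block, not an object literal.
// new Function only has to parse its body here; a SyntaxError means the
// source was rejected by the parser.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

console.log(parses("{}.toString()"));    // false: empty block followed by ".toString()"
console.log(parses("({}).toString()"));  // true: parentheses force an expression

// A block that merely looks like a literal: `a:` is a label, `1` a statement.
console.log(eval("{ a: 1 }"));           // 1
console.log(eval("({ a: 1 })").a);       // 1, this time read from a real object
```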
One possible parsing problem is the combination of method declaration (f(args) {body}) with macro invocation (f(args)). But hopefully there will not be a problem, because the latter needs a comma delimiter unless it is last in the block.
One paragraph for the "syntax as opt-in" mindset, which seems to be part of ES.next. Conditionals and/or macro calls inside a data production block are to be treated as ES.next syntax and, consequently, are opt-in. The same is the case for function% and %-prefixed method names.
The question of the scope of opt-in is still debated, but overall, this proposal seems to favor program-wide opt-in. It needs the review of others to see the full consequences for "syntax as opt-in" if this proposal is considered. It brings some (non-breaking) changes to basic ECMAScript matter, that is, to the object literal. Also, if there were parsing guesses based on a block containing if, switch or a function call, they are invalidated.
3. Class is glorified declaration of prototype.
No offense meant. One of the motivations behind all this was the fact that the class block was neither imperative nor declarative but (at least syntactically) something of both, together with the wish to have only two kinds of {...} - imperative (with all its consequences and common functionality all over) and declarative (ditto). And as I see it (I hope I am not alone), a class is a way to describe the prototype (and the constructor at the same time, but that is already nicely integrated). So, taking the example from the class proposal (comments shortened; private changed to @ use, see below),
class Monster {
// The contextual keyword "constructor" ... defines the body
// of the class’s constructor function.
constructor(name, health) {
public name = name;
@health = health;
}
// An identifier followed by an argument list and body defines a method.
attack(target) {
log('The monster attacks ' + target);
}
// The contextual keyword "get" followed by an identifier and
// a curly body defines a getter in the same way that "get"
// defines one in an object literal.
get isAlive() {
return @health > 0;
}
// Likewise, "set" can be used to define setters.
set health(value) {
if (value < 0) {
throw new Error('Health must be non-negative.')
}
@health = value
}
// After a "public" modifier, an identifier ... declares a prototype
// property and initializes it
public numAttacks = 0;
// After a "public" modifier, the keyword "const" followed by an identifier
// and an initializer declares a constant prototype property.
public const attackMessage = 'The monster hits you!';
}
becomes, under this proposal:
class Monster {
// A method defined with name "constructor" is processed specially:
// it _has_ [[Construct]] and is made a constructor of this class.
// If not explicitly defined, an empty one is provided.
constructor(name, health) {
public name = name;
@health = health;
}
// A method, as in every object literal.
attack(target) {
log('The monster attacks ' + target);
}
// A getter, as in every object literal.
get isAlive() {
return @health > 0;
}
// A setter, as in every object literal.
set health(value) {
if (value < 0) {
throw new Error('Health must be non-negative.')
}
@health = value
}
// A property definition, as in every object literal.
numAttacks: 0,
// A "const" property definition, as in every object literal.
// (syntax of const property production is not yet agreed upon,
// just use any one which is selected in the end)
attackMessage := 'The monster hits you!'
}
Note on private removal: it seems the private keyword will be removed in favor of the foo.@bar syntax for accessing foo's property with private name bar. I am embracing this syntax in this document.
Apart from the different comments, which just show that the different implementation provides semantically the same result, the class code is nearly identical. Gone is the (superfluous) public keyword in the context of the prototype; I'd say it could go from the constructor method as well (this.name = name; works fine and does not create any exceptional situations for constructor/non-constructor).
If you look at it, the class block really only did (declaratively) describe the prototype. So let us make class Clazz [extends Superclazz] an operator on the generic data-production block, which creates the class machinery from it and returns the constructor function. It can be de facto desugared to something like:
var _proto = (Superclazz || Object).prototype <| {
... the class body ...
};
if (!_proto.constructor) { _proto.{ constructor() {} } }
var _ctr = _proto.constructor;
__allowConstruct__(_ctr);
_ctr.prototype = _proto;
return _ctr;
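The desugaring can be tried out in present-day JavaScript with a small helper. Everything here is a sketch under assumptions: makeClass is a hypothetical name, Object.create plus descriptor copying stands in for <|, and because today's shorthand methods cannot be used with new, the constructor is supplied as a function expression in place of __allowConstruct__. The private @health is replaced by an ordinary property.

```javascript
// Hypothetical helper emulating the proposed class operator.
function makeClass(Superclazz, body) {
  // (Superclazz || Object).prototype <| { ... the class body ... }
  const proto = Object.create((Superclazz || Object).prototype);
  Object.defineProperties(proto, Object.getOwnPropertyDescriptors(body));

  // Own-property check: every prototype already inherits some constructor.
  if (!Object.prototype.hasOwnProperty.call(proto, "constructor")) {
    Object.defineProperty(proto, "constructor", {
      value: function () {},
      writable: true,
      configurable: true,
      enumerable: false
    });
  }
  const ctr = proto.constructor;
  ctr.prototype = proto;
  return ctr;
}

const Monster = makeClass(null, {
  // A function expression, since shorthand methods have no [[Construct]] today.
  constructor: function (name, health) {
    this.name = name;
    this.health = health;
  },
  attack(target) { return "The monster attacks " + target; },
  get isAlive() { return this.health > 0; },
  numAttacks: 0
});

const orc = new Monster("Orc", 3);
console.log(orc.isAlive, orc.numAttacks, orc.attack("you"));
```

The getter, the method and the numAttacks default all live on the prototype, which is the point the text makes: the class body is just a description of the prototype.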
The effect of __allowConstruct__ will be inherent, not issued afterwards. The pros are clearly visible: fewer kinds of abstraction and no need to manage making features in class and object literal work consistently (a class declaration is an object literal, so everything works automatically).
There are some open issues, definitely. The class proposal continues with this:
class Monster {
// "static" places the property on the constructor.
static allMonsters = [];
// "public" declares on the prototype.
public numAttacks = 0;
// Although "public" is not required for prototype methods,
// "static" is required for constructor methods
static numMonsters() { return Monster.allMonsters.length; }
}
With a data-production block as the class body, this would look like:
class Monster {
// "static" places the property on the constructor.
static allMonsters: [],
// plain declares on the prototype.
numAttacks: 0,
// "static" is required for constructor methods
static numMonsters() { return Monster.allMonsters.length; }
}
This requires allowing the static keyword in a data-production block if it is used in the context of the class operator. Yes, an exception, but a pretty clear one. We can live with it. The question appears: "What about static in macros?", which is not really easy to answer. One possibility may be to allow it (and any use of static) and throw an error if it is not (directly or by inclusion) happening inside the class operator.
To end this paragraph more positively: if you define the class block to be a data-production block, you can make the language more cohesive and have features reused instead of coordinated, which should be a plus. Adoption should also be less fearful, because you do not need any "class magic"; you are simply "declaring the structure of a prototype" (while constructor and static are taken care of by the class operator for you).
Classes + macros = free trait-based composition.
Obvious sexy freebie. Put traits into macros (you can create middlemen with other macros importing and gluing some of them), and then use them in class production.
function% Pointish() {
get r() { return Math.sqrt(this.x*this.x + this.y*this.y); }
get phi() { return ... this.x }
set r(newR) { ... }
set phi(newPhi) { ... }
}
function% Circlish() {
get area() { return Math.PI*this.radius*this.radius; }
get diameter() { return 2*this.radius; }
get circumference() { return 2*Math.PI*this.radius; }
}
function% Translatable() {
translate(dx, dy) { this.x += dx; this.y += dy; }
}
function% Rotatable() {
rotate(angle) { this.phi += angle; }
grow(quotient) { this.r *= quotient; }
}
class BasicPoint {
constructor(x, y) {
this.x = x;
this.y = y;
}
Pointish()
}
class Vector extends BasicPoint {
constructor(x, y) { super(x, y); }
Rotatable(),
Translatable()
}
class Circle extends BasicPoint {
constructor(x, y, radius) {
super(x, y);
this.radius = radius;
}
Circlish(),
Translatable(),
grow(quotient) { this.radius *= quotient; }
}
//etc
That's it.
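Something close to this composition can be sketched in present-day JavaScript by letting each trait be a function that returns a fragment and copying the fragment's property descriptors onto a class prototype. The helper name include is an assumption of this sketch, and only two of the traits are shown.

```javascript
// Emulation only: each trait "macro" is a function returning a fragment.
// include is a hypothetical helper that copies property descriptors, so
// getters and setters are transferred instead of being evaluated.
function include(target, ...fragments) {
  for (const fragment of fragments) {
    Object.defineProperties(target, Object.getOwnPropertyDescriptors(fragment));
  }
  return target;
}

function Circlish() {
  return {
    get area() { return Math.PI * this.radius * this.radius; },
    get diameter() { return 2 * this.radius; },
    get circumference() { return 2 * Math.PI * this.radius; }
  };
}

function Translatable() {
  return {
    translate(dx, dy) { this.x += dx; this.y += dy; }
  };
}

class BasicPoint {
  constructor(x, y) { this.x = x; this.y = y; }
}

class Circle extends BasicPoint {
  constructor(x, y, radius) { super(x, y); this.radius = radius; }
  grow(quotient) { this.radius *= quotient; }
}
include(Circle.prototype, Circlish(), Translatable());

const c = new Circle(0, 0, 2);
c.translate(3, 4);
c.grow(2);
console.log(c.x, c.y, c.diameter);  // 3 4 8
```

Copying descriptors matters here: spreading a fragment with getters would evaluate them once against the fragment itself instead of installing them on the prototype.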
Known problems, open questions.
- What if I want to include a trait in a class or sub-data in an object, but do not want to call a macro, which must be evaluated? For performance reasons, there should be some kind of direct import there.
  - Do not worry and use parameterless macros. Premature optimization is the root of all evil. Leave the evil to the compiler. If your macro does not rely on side effects, its invocation can be considerably optimized by the ECMAScript engine itself, down to one if from the PIC (polymorphic inline cache) and then inlining it. If you make the macro const, even that if can probably be eliminated.
Bonus: arrays and generators.
This is just a bonus idea, which sprang up from including code-like features in data. The array literal has not been enhanced in any way yet. But that is natural - in an array literal you rarely need to have some elements optional and some not. So no ifs here. As for the counterpart of a macro, let's postpone it for a while.
Arrays have a "listish", "linear" feel. So do loops. ;-) But loops are not the right addition to the data production - they use variables and are very code-like. But there is another element which is "listish" and "linear" - a generator. So, if you have defined some generator with
function* fib (upTo) {
...
}
you can include the values it generates in an array literal:
[ "fibonacci", 10, *fib(10) ]
You can use the [ *gen(args) ] form to have an intrinsic toArray. And generators are essentially macros of the array world, with a grain of salt.
And I think the [ ..., *foo, ...] syntax could work for any iterable thing. Why only generators?
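In present-day JavaScript the spread syntax (written ... rather than *) plays this role in array literals, and it does accept any iterable. The generator body below is an assumption, since the original elides it: here upTo is taken to be an upper bound on the yielded values.

```javascript
// fib is a hypothetical generator: the source elides its body, so this one
// assumes "upTo" is an upper bound on the yielded values.
function* fib(upTo) {
  let [a, b] = [0, 1];
  while (a <= upTo) {
    yield a;
    [a, b] = [b, a + b];
  }
}

const list = ["fibonacci", 10, ...fib(10)];
console.log(list);  // [ 'fibonacci', 10, 0, 1, 1, 2, 3, 5, 8 ]

// Spread accepts any iterable, not only generators.
const mixed = [...new Set([1, 1, 2]), ..."ab"];
console.log(mixed);  // [ 1, 2, 'a', 'b' ]
```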
Thanks for patience, Herby