treehugger.scala is a library to write Scala source code programmatically. It’s also an implementation of Scala abstract syntax tree based on Reflection API.
libraryDependencies ++= Seq(
"com.eed3si9n" %% "treehugger" % "0.1.3"
)
resolvers ++= Seq(
"sonatype" at "https://oss.sonatype.org/content/groups/public"
)
treehugger lets you treat Scala code as data (case classes), and data as code (treehugger DSL) so you can code while you code.
For example, here’s treehugger DSL for Hello World:
import treehugger.forest._
import definitions._
import treehuggerDSL._
val tree: Tree = Predef_println APPLY LIT("Hello, world!")
println(tree)
println(treeToString(tree))
The above code prints two lines:
Apply(Ident(println),List(Literal(Constant(Hello, world!))))
println("Hello, world!")
The first line shows the structure of the returned tree, and the second line is the Scala source code represented by the tree. Because it is responsible for formatting of the string, your code generating code would be more clear using treehugger.
Let’s omit the import statements, and println statements from here. Here’s how treehugger DSL defines classes:
object sym {
val IntQueue = RootClass.newClass("IntQueue")
val BasicIntQueue = RootClass.newClass("BasicIntQueue")
val buf = BasicIntQueue.newValue("buf")
}
CLASSDEF(sym.IntQueue) withFlags(Flags.ABSTRACT) := BLOCK(
DEF("get", IntClass),
DEF("put") withParams(PARAM("x", IntClass))
)
CLASSDEF(sym.BasicIntQueue) withParents(sym.IntQueue) := BLOCK(
VAL(buf) withFlags(Flags.PRIVATE) :=
NEW(ArrayBufferClass TYPE_OF IntClass),
DEF("get", IntClass) := REF(sym.buf) DOT "remove" APPLY(),
DEF("put") withParams(PARAM("x", IntClass)) := BLOCK(
REF(buf) INFIX("+=") APPLY REF("x")
)
)
These print as:
abstract class IntQueue {
def get(): Int
def put(x: Int)
}
class BasicIntQueue extends IntQueue {
private val buf = new scala.collection.mutable.ArrayBuffer[Int]
def get: Int = buf.remove()
def put(x: Int) {
buf += x
}
}
Notice some of the symbols were defined upfront so we could avoid typing in string names.
To be clear where the credit is due, the majority of treehugger’s code is borrowed from the Scala compiler (scalac) written by Martin Odersky, Paul Phillips, and others. treehugger just took it further to fit its needs.
Let’s go over the basic compiler theory, so we have some kind of foundation even if it’s overly simplified. A compiler is a program that translates a source code in a language to another, often machine code such as Java bytecode.
The compilers have been written so many times that there are names to each phases of a typical compiler.
Scanning phase, or Lexical Analysis, is responsible for throwing out the white space and comments, and recognizing literals, identifiers, and keywords as tokens. At this point, there is no structural check, except for making sure that the comments and string quotations match up.
Parsing phase, or Syntactic Analysis, is responsible for constructing an abstract syntax tree (AST) from the linear sequence of tokens according to the rules of the grammar of the language.
Typing phase, or Semantic Analysis, is responsible for adding the symbol table to the syntax tree, by associating variables and references with their definitions, and performing type checking.
Analysis and Optimization phase are specific to the language implementation. Some of the common optimizations are inlining and dead code elimination.
Finally, during Code Generation phase the target language is generated either from AST or from an intermediate structure.
Because our end goal is to get Scala source code, we reverse the flow of scalac to generate the code starting from AST.
Scala 2.10 adds Reflection API, which allows the user peak into AST for a given code. It also includes code that turns AST into Scala source code. All we need now is to generate an AST.
As part of scalac, there’s also a trait called TreeDSL, which can describe AST in code-like way. With both combined seems as though we have everything we need.
Unfortunately that is not the case. Because scalac is focused towards Java bytecode generation, the parser expands syntax sugars such as for loops and infix method applications. In other words, AST contains one of map, flatMap, foreach method in place of a for loop. TreeDSL too is missing a few things, for example, an ability to define classes or objects.
treehugger forks scalac’s code base and extends the AST and TreeDSL so all legal Scala code can be described using the DSL (the only exception is XML literals).
Having the scalac lineage comes with the baggage of having to deal with its data structures.
A Symbol represents an entry in the symbol table, such as values, classes, and type aliases. A compiler would eventually bind all identifiers to a symbol during Typing phase. However, since we are interested in generating the code, the use of symbols are often optional in treehugger.
For example, a unit method declaration may be defined as:
DEF(sym.get)
or:
DEF("get")
Both would yield the same source code at the end. There are several places where the use of symbols are recommended. First, use a built-in symbol if there is one available. Second, consider defining a symbol for repeated reference to a class or a method.
A new symbol may be defined off of an existing symbol as follows:
object sym {
val BasicIntQueue = RootClass.newClass("BasicIntQueue")
val buf = BasicIntQueue.newValue("buf")
val A = ArrowAssocClass.newTypeParameter("A")
val arrow = ArrowAssocClass.newMethod("->")
val B = arrow.newTypeParameter("B")
val T = BasicIntQueue.newAliasType("T")
}
Defining symbols for every identifiers would double the size of the code, and would likely make the experience of writing it cumbersome.
A Type represents a Scala type. Symbols of class ClassSymbols and TypeSymbols can automatically be promoted to a Type and many of the built-in symbols represent built-in classes.
VAL("foo", IntClass)
In the above code, IntClass is a symbol, but it is automatically promoted to a Type. Similarly, a String can also be promoted to a Type.
VAL("foo", "Int")
Types can also be created using type-level expressions and built-in type constructors:
TYPE_ARRAY(StringClass)
TYPE_REF(REF("board") DOT "Coord")
A Tree represents a node in a Scala AST. It could be a simple expression such as a literal, or a combination of other trees.
An expression in treehugger DSL eventually evaluates to a Tree value.
treehugger DSL is an expanded version of TreeDSL in scalac to build Scala AST in code-like fashion. It is able to write all legal Scala expressions except for XML literals.
There are several conventions used throughout this document.
VAL(sym|"x")
In the above, sym|"x" denotes that either a Symbol or a String is accepted as a parameter to VAL(...).
TYPE_LIST(typ)
In the above, typ indicates that TYPE_LIST(...) accepts a Type.
DEF("x"|sym) withTypeParams(TYPEVAR(...), ...)
In the above, TYPEVAR(...), ... denotes that withTypeParams(...) can accept either a vararg list or an Iterable of TYPEVAR(...).
(DEF("get"|sym, [typ])
[withParams(PARAM("x"|sym, typ|"C")[ := arg], ...)]*
[withTypeParams(TYPEVAR(...), ...)]).tree
In the above, [typ] indicates that typ is optional; and [withParams(PARAM("x"|sym, typ|"C")[ := arg], ...)]* indicates that the withParams clause is optional and may be repeated.
import treehugger.forest._
import definitions._
import treehuggerDSL._
val tree: Tree = Predef_println APPLY LIT("Hello, world!")
println(treeToString(tree))
The entire treehugger system is bundled up as treehugger.Forest class. The package object for treehugger defines an instance of Forest called forest for convenience. Under the forest, definitions object defines built-in symbols and treehuggerDSL object defines the DSL.
In the above code, object sym defines optional symbols. By wrapping in sym we can avoid conflicting with the real println function. Then, the line defining val tree: Tree is an example of treehugger DSL.
Finally, forest defines treeToString method to convert AST into a String:
def treeToString(args: Any*): String
treesToString takes a vararg of Any, and pretty prints Tree as Scala source code and everything else using toString with a new line in between.
We will begin looking into treehugger DSL from the literals and comments since they are the simplest forms of Tree. Also having these allows us to gradually intoroduce more complex forms off of others.
Literals are the basic foundation of treehugger DSL. Numeric literals, String, and symbols are written by wrapping a Scala literal with LIT():
LIT(1) // Int
LIT(1L) // Long
LIT(1.23) // Double
LIT(1.23F) // Float
LIT('H') // Char
LIT("H") // String
LIT('Sym) // scala.Symbol
Boolean literals, (), and null are written as follows:
TRUE // true
FALSE // false
NULL // null
UNIT // ()
Single line comments are added to an arbitrary tree as follows:
tree withComment("comments here", ...)
For example,
LIT(2) withComment("comments are useful",
"only if they provide more info than the code")
print as
// comments are useful
// only if they provide more info than the code
2
A definition, such as a function defnition, introduces names that are bound to some expression, block, or a type.
def foo: Int = 1
On the other hand, an abstract declaration introduces only the names and their types. It can be a part of an abstract class definition.
def foo: Int
Abstract value declarations are written by wrapping the name and the type with VAL(...):
(VAL("foo", IntClass): Tree)
or in general:
VAL(sym|"x", typ|"C").tree
where sym is a symbol and typ is a type of the value. In the above code, sym|"x" denotes that either a symbol or String is accepted as the name of the value, and type|"C" denotes that either a type or String is accepted as the type of the value.
Calling tree method creates a Tree without a right-hand side (rhs) expression. Because there’s an implicit conversion, forcing VAL(...) as a Tree automatically calls tree method.
Value definitions are written by appending right-hand side tree after := as follows:
VAL("foo", IntClass) := LIT(0)
This prints as:
val foo: Int = 0
Like Scala, the type annotation can be omitted when rhs is provided:
VAL("foo") := LIT(0)
In addition, a symbol could be used to define a value instead of using String names.
object sym {
val foo = RootClass.newValue("foo")
}
VAL(sym.foo) := LIT(0)
For a larger code base, using symbols makes the code more readable.
The general form of constant value definitions are:
VAL(sym|"x", [typ]) := rhs
Lazy value declarations are written using LAZYVAL(...):
(LAZYVAL("foo", IntClass): Tree)
and lazy value definitions are written as:
LAZYVAL("foo", IntClass) := LIT(0)
Final value definitions are written by appending withFlags(...) after VAL(...):
VAL("foo", IntClass) withFlags(Flags.FINAL) := LIT(0)
There is another form of value definitions called pattern values, but it will be discussed later.
Abstract variable declarations are written using VAR(...):
(VAR("foo", IntClass): Tree)
or in general:
VAR(sym|"x", typ|"C").tree
Value definitions are written as:
VAR("foo", IntClass) := LIT(0)
as a special form of initializing the variable with the default value of the type, WILDCARD can be used as follows:
VAR("foo", IntClass) := WILDCARD
This prints as:
var foo: Int = _
Abstract type declarations are written using TYPEVAR(sym|"T"). Optionally, lower bounds and higher bounds can be specified:
(TYPEVAR("T"): Tree)
(TYPEVAR("T") LOWER(IntClass): Tree)
(TYPEVAR(sym.T) UPPER(ComparableClass TYPE_OF sym.T): Tree)
or in general:
(TYPEVAR(sym|"T") [LOWER(lo|"C")] [UPPER(hi|"D")]).tree
where lo denotes an optional lower bound type and hi upper bound type.
Type alias definitions are written by appending := and right-hand side Type. The alias may accompany type parameters given by withTypeParams(TYPEVAR(...), ...):
TYPEVAR("IntList") := TYPE_LIST(IntClass)
TYPEVAR("Two") withTypeParams(TYPEVAR(sym.A)) := TYPE_TUPLE(sym.A, sym.A)
or in general:
TYPEVAR(sym|"T") [withTypeParams(TYPEVAR(typ1), ...)] := typ2
Variance annotations for the type parameters are set using COVARIANT(...) and CONTRAVARIANT(...):
val A = RootClass.newTypeParameter("A")
(TYPEVAR("M") withTypeParams(TYPEVAR(COVARIANT(A))): Tree)
(TYPEVAR("M") withTypeParams(TYPEVAR(CONTRAVARIANT(A))): Tree)
The above examples print as:
type M[+A]
type M[-A]
Abstract function declarations are written using DEF(...). Optionally, parameter lists can be specified using withParams(PARAM(...), ...) and type parameters can be specified using withTypeParams(TYPEVAR(...), ...):
(DEF("get", IntClass): Tree)
(DEF("put") withParams(PARAM("x", IntClass)): Tree)
(DEF("compare", BooleanClass)
withTypeParams(TYPEVAR(sym.T))
withParams(PARAM("a", sym.T) := LIT(0))
withParams(PARAM("b", sym.T) := LIT(0)): Tree)
The above examples print as:
def get: Int
def put(x: Int)
def compare[T](a: T = 0)(b: T = 0): Boolean
In genral:
(DEF("get"|sym, [typ])
[withParams(PARAM("x"|sym, typ|"C")[ := arg], ...)]*
[withTypeParams(TYPEVAR(...), ...)]).tree
Function definitions are written by appending right-hand side tree after := as follows:
DEF("get", IntClass) := LIT(0)
This prints as:
def get: Int = 0
Like Scala, the result type can be omitted as follows:
DEF("get") := LIT(0)
Procedure definitions are written by omitting the result type and using BLOCK(tree, ...) for the rhs:
DEF("write") withParams(PARAM("str", StringClass)) := BLOCK(
LIT(0)
)
This prints as:
def write(str: String) {
0
}
Note withParams(...) clause may be added multiple times, each forming a parameter list:
(DEF("compare")
withTypeParams(TYPEVAR(sym.T))
withParams(PARAM("a", sym.T))
withParams(PARAM("b", sym.T))) := FALSE
Similar to VAL(...), PARAM(...) can be followed by := and rhs to specify the default argument:
(DEF("compare")
withTypeParams(TYPEVAR(sym.T))
withParams(PARAM("a", sym.T) := LIT(0))
withParams(PARAM("b", sym.T) := LIT(0))) := FALSE
By-name parameters are written using TYPE_BYNAME(typ) as follows:
(DEF("whileLoop")
withParams(PARAM("cond", TYPE_BYNAME(BooleanClass)))
withParams(PARAM("stat", TYPE_BYNAME(UnitClass))): Tree)
This prints as:
def whileLoop(cond: => Boolean)(stat: => Unit)
Repeated parameters are written using STAR(typ) as follows:
(DEF("sum", IntClass)
withParams(PARAM("args", STAR(IntClass))): Tree)
This prints as:
def sum(args: Int*): Int
Import clauses are written using IMPORT(...). Optionally, import selectors may be specified:
IMPORT(MutablePackage)
IMPORT("scala.collection.mutable")
IMPORT(MutablePackage, "_")
IMPORT(MutablePackage, "Map", "Set")
IMPORT(MutablePackage, RENAME("Map") ==> "MutableMap")
The above examples print as:
import scala.collection.mutable
import scala.collection.mutable
import scala.collection.mutable._
import scala.collection.mutable.{Map, Set}
import scala.collection.mutable.{Map => MutableMap}
In general:
IMPORT(sym|"x.y.z", ["X" | RENAME("X") ==> "Y"]*)
The only odd thing is ==> operator used for RENAME(...). Because => is already taken by Scala, treehugger DSL uses ==> instead.
Classes and objects are both defined in terms of templates, whose body is represented using BLOCK(...) in treehugger DSL.
In general, treehugger DSL uses BLOCK(...) wherever curly braces ({}) appear in Scala. BLOCK(...) accepts vararg of trees, such as class member definitions and expressions:
val IntQueue: ClassSymbol = RootClass.newClass("IntQueue")
CLASSDEF(IntQueue) withFlags(Flags.ABSTRACT) := BLOCK(
DEF("get", IntClass),
DEF("put") withParams(PARAM("x", IntClass))
)
This example prints as:
abstract class IntQueue {
def get: Int
def put(x: Int)
}
Class definitions are written using CLASSDEF(...):
(CLASSDEF("C"): Tree)
CLASSDEF("C") := BLOCK(
VAL("x") := LIT(0)
)
The above examples print as:
class C
class C {
val x = 0
}
The general form of the first example is:
CLASSDEF(sym|"C").empty
As with value and function names, CLASSDEF can accept either a symbol or a String.
Primary constructor parameters are written using withParams(...) similar to function definitions. Except, it could use VAL(...) and VAR(...) in addition to PARAM(...):
CLASSDEF("C")
withParams(PARAM("x", IntClass),
VAL("y", StringClass),
VAR("z", TYPE_LIST(StringClass))) := BLOCK(
DEF("hi") := LIT("hi")
)
This prints as:
class C(x: Int, val y: String, var z: List[String]) {
def hi = "hi"
}
To define classes by extending super classes use withParents(tree|typ|"T"):
CLASSDEF("C") withParents("B") := BLOCK(
DEF("x") := LIT(0)
)
This prints as:
class C extends B {
def x = 0
}
To define self type annotations use withSelf(sym|"self", [typ1, ...]):
CLASSDEF("C") withSelf("self", "T1", "T2") := BLOCK(
VAL("x") := REF("self")
)
This prints as:
class C { self: T1 with T2 =>
val x = self
}
To define field values before supertype constructor is called add early definitions using withEarlyDefs(tree, ...):
CLASSDEF("C") withEarlyDefs(
VAL("name") := LIT("Bob")
) withParents("B") := BLOCK(
LIT(0)
)
This prints as:
class C extends {
val name = "Bob"
} with B {
0
}
Many of the Tree objects has Modifier field, which adds extra attribute about the tree such as access level and mutability. For both classes and their members modifier flags can be given using withFlags(...), which takes a PRIVATEWITHIN or vararg of Long.
To define class members with access modifiers use Flags.PRIVATE, Flags.PROTECTED, and PRIVATEWITHIN("scope"):
CLASSDEF("C") := BLOCK(
DEF("x") withFlags(Flags.PRIVATE) := LIT(0)
DEF("y") withFlags(Flags.PROTECTED) := LIT(0)
DEF("z") withFlags(PRIVATEWITHIN("this")) := LIT(0)
)
This prints as:
class C {
private def x = 0
protected def y = 0
private[this] def z = 0
}
To override class members use Flags.OVERRIDE:
CLASSDEF("C") withParents("B") := BLOCK(
DEF("x") withFlags(Flags.OVERRIDE) := LIT(0)
)
This prints as:
class C extends B {
override def x = 0
}
To prohibit overriding by subclasses class members are marked final using Flags.FINAL:
CLASSDEF("C") := BLOCK(
DEF("x") withFlags(Flags.FINAL) := LIT(0)
)
This prints as:
class C {
final def x = 0
}
To define abstract classes use Flags.ABSTRACT on CLASSDEF(...):
val IntQueue: ClassSymbol = RootClass.newClass("IntQueue")
CLASSDEF(IntQueue) withFlags(Flags.ABSTRACT) := BLOCK(
DEF("get", IntClass)
DEF("put") withParams(PARAM("x", IntClass))
)
This prints as:
abstract class IntQueue {
def get: Int
def put(x: Int)
}
To define final classes, which prohibits extension, use Flags.FINAL on CLASSDEF(...):
(CLASSDEF("C") withFlags(Flags.FINAL): Tree)
This prints as:
final class C
To define sealed classes use Flags.SEALED on CLASSDEF(...):
CLASSDEF("Animal") withFlags(Flags.ABSTRACT, Flags.SEALED)
This prints as:
sealed abstract class Animal
To define private constructors use Flags.PRIVATE with withCtorFlags(...) on CLASSDEF(...):
(CLASSDEF("C") withCtorFlags(Flags.PRIVATE)
withParams(PARAM("x", IntClass)): Tree)
This prints as:
class C private (x: Int)
Polymorphic classes are defined using withTypeParams(...):
(CLASSDEF("Queue") withTypeParams(TYPEVAR("T"))
withParams(VAL("leading", "T"), VAL("trailing", "T")): Tree)
This prints as:
class Queue[T](val leading: List[T], val trailing: List[T])
Case classes are defined using CASECLASSDEF(...):
object sym {
val Expr = RootClass.newClass("Expr")
val Var = RootClass.newClass("Var")
}
(CLASSDEF(sym.Expr) withFlags(Flags.SEALED, Flags.ABSTRACT): Tree)
(CASECLASSDEF(sym.Var)
withParams(VAL("name", StringClass)) withParents(Expr): Tree)
These print as:
sealed abtract class Expr
case class Var(name: String) extends Expr
Traits are defined using TRAITDEF(...):
TRAITDEF("Philosophical") := BLOCK(
DEF("philosophize") := BLOCK(
LIT(0)
)
)
(CLASSDEF("Animal"): Tree)
(TRAITDEF("HasLegs"): Tree)
CLASSDEF("Frog")
withParents("Animal", "HasLegs", "Philosophical") := BLOCK(
DEF(Any_toString) withFlags(Flags.OVERRIDE) := LIT("green")
)
These print as:
trait Philosophical {
def philosophize {
0
}
}
class Animal
trait HasLegs
class Fogs extends Animal with HasLegs with Philosophical {
override toString = "green"
}
Object definitions are written using OBJECTDEF(...):
OBJECTDEF("Main") withParents("App") := BLOCK(
LIT(0)
)
This prints as:
object Main extends App {
0
}
Case object definitions are written using CASEOBJECTDEF(...):
(CASEOBJECTDEF("C"): Tree) // case object C
Now that we’ve covered classes and objects, we should look into expressions in depth.
As we’ve seen earlier, literals are written as follows:
LIT(1) // 1
LIT(1L) // 1L
LIT(1.23) // 1.23
LIT(1.23F) // 1.23F
LIT('H') // 'H'
LIT("H") // "H"
LIT('Sym) // 'Sym
TRUE // true
FALSE // false
NULL // null
UNIT // ()
Simple names are written using REF(sym|"x") to refer to values and methods that immediately available in the current scope:
object sym {
val x = RootClass.newValue("x")
val y = RootClass.newValue("y")
}
REF("x") // x
REF(sym.x) // x
To refer to other values and methods, selections are written by calling DOT(sym|"y") either on a symbol or on a REF(sym|"x"). This returns an intermediate structure that can turn into a Tree by calling tree method or by implicit conversion:
(sym.x DOT sym.y).Tree // x.y
(sym.x DOT "y": Tree) // x.y
(REF("x") DOT "y": Tree) // x.y
References to this are written using THIS or THIS(sym|"C"):
THIS // this
THIS(sym.Address) // Address.this
References to super are written using SUPER or SUPER(sym|"C"). This also returns an intermediate structure that can turn into a Tree by calling tree method or via implicit conversion:
(SUPER: Tree) // super
(SUPER("C"): Tree) // C.super
To add type parameter to super, call APPLYTYPE(sym|"T"):
SUPER APPLYTYPE "T" // super[T]
There are two ways to write function applications. The first way is calling APPLY(arg, ...) on a symbol or a tree. Here, arg denotes a Tree:
Predef_println APPLY LIT("Hello, world!")
(REF("x") DOT "y") APPLY (LIT(0), LIT(1))
These print as:
println("Hello, world!")
x.y(0, 1)
The second way is to apply arg, ... on intermediate structure returned by DOT(sym|"y"):
(REF("x") DOT "y")(LIT(0), LIT(1))
This prints as:
x.y(0, 1)
To pass sequence into a vararg parameter, use SEQARG(arg):
THIS APPLY (SEQARG(REF("list")))
This prints as:
this((list: _*))
To pass named arguments into a function, specify the parameter using REF(sym|"x") as follows:
REF("put") APPLY (REF("x") := LIT(0))
This prints as:
put(x = 0)
Partially applied functions are written by calling APPLY with PARTIALLY:
REF("put") APPLY PARTIALLY
This prints as:
put _
Note this is different from APPLYing WILDCARD since PARTIALLY applies to the entire parameter list.
Type applications are written by calling APPLYTYPE(typ|"C", ...) on a symbol or a tree:
REF("put") APPLYTYPE(IntClass) APPLY(LIT(0))
This prints as:
put[Int](0)
There are three ways to write tuple expressions. The most general form is TUPLE(tree, ...):
TUPLE() // ()
TUPLE(REF("x")) // (x)
TUPLE(LIT(0), LIT(1)) // (0, 1)
The second way is to use UNIT literal:
UNIT // ()
Finally, PAREN(tree, ...) can also be used to write a tuple expression:
PAREN(REF("x")) // (x)
Semantically speaking the actual scala.Tuplen are formed only when two or more arguments are passed, but as a syntactic expression, PAREN is just an alias to TUPLE.
Instance creations are written using NEW(path|typ|"C"):
NEW(sym.T) // new T
NEW("C") // new C
NEW(REF("B") DOT "C") // new B.C
Optionally, arguments may be passed into the constructor using NEW(path|typ|"C", arg, ...):
NEW("C", LIT(0), LIT(1)) // new C(0, 1)
Anonymous classes are created by passing ANONDEF(parent|"C", ...) into NEW(...):
NEW(ANONDEF() := BLOCK(
DEF("name") := LIT("Robert")
))
NEW(ANONDEF("C") := BLOCK(
DEF("name") := LIT("Robert")
))
NEW(ANONDEF("C") withEarlyDefs(
VAL("name") := LIT("Robert")
))
These examples print as:
new {
def name = "Robert"
}
new C {
def name = "Robert"
}
new {
val name = "Robert"
} with C
Blocks are written using BLOCK(tree, ...):
BLOCK(
VAL("x") := LIT(0),
REF("x")
)
This prints as:
{
val x = 1
x
}
Expressions can be constructed from operands and operators.
Prefix operations are written using one of PLUS(tree), MINUS(tree), NOT(tree), or TILDE(tree):
PLUS(LIT(1)) // +(1)
MINUS(LIT(1)) // -(1)
NOT(FALSE) // !(false)
TILDE(LIT(1)) // ~(1)
Postfix operations are written by calling POSFIX(sym|"x") on a tree or a symbol:
LIT(1) POSTFIX(Any_toString) // 1 toString
Infix operations are written using INFIX(sym|"op") as follows:
LIT(1) INFIX("+") APPLY(LIT(2))
This prints as:
1 + 2
Alternatively, INFIX(sym|"op", arg, ...) can be called with righ-hand side arguments:
LIT(1) INFIX("+", LIT(2)) // 1 + 2
Finally, for commonly used operations treehugger DSL defines some built-in operators, which will be covered later:
LIT(1) INT_+ LIT(2) // 1 + 2
Typed expressions are written using withType(typ|"C"):
LIT(0) withType(LongClass) // (0: Long)
Annotated expressions are written using withAnnots(annot, ...):
REF("e") withAnnots(ANNOT(UncheckedClass))
This prints as:
(e: @unchecked)
Annotations are covered later in details.
Assignments are written using :=:
REF("x") := LIT(0) // x = 0
If expressions are written using IF(...) THEN(...) ELSE(...)|ENDIF as follows:
(IF (REF("x") ANY_== REF("y")) THEN REF("x")
ELSE LIT(0))
IF (REF("sunny")) THEN (Predef_println APPLY LIT("Hi!")) ENDIF
These examples print as:
if (x == y) x
else 0
if (sunny) println("Hi!")
While loops are written using WHILE(...) DO BLOCK(...):
WHILE(TRUE) DO BLOCK(
Predef_println APPLY LIT("Hello")
)
This prints as:
```scala while (true) { println(“Hello”) }
Do loops are written by calling DO_WHILE(...) on a block:
BLOCK(
Predef_println APPLY LIT("Hello")
) DO_WHILE(TRUE)
This prints as:
do {
println("Hello")
} while (true)
For expressions are written using FOR(enum, ...) DO tree as follows:
FOR(VALFROM("i") := LIT(0) INT_TO LIT(2)) DO (
Predef_println APPLY LIT("Hello"))
FOR(
VALFROM("i") := LIT(0) INT_TO LIT(2),
IF(REF("x") INT_< LIT(10))
) DO (Predef_println APPLY LIT("Hello"))
FOR(
VALFROM("i") := LIT(0) INT_TO LIT(2),
VAL("x") := REF("i")
) DO BLOCK(
Predef_println APPLY LIT("Hello")
)
These examples print as:
for (i <- 1 to 2)
println("Hello")
for {
i <- 0 to 2
if x < 10
} println("Hello")
for {
i <- 0 to 2
x = i
} {
println("Hello")
}
FOR(...) takes vararg of enumerators which are constructed using VALFROM(sym|"x") := tree, IF(tree), or VAL(sym|"x") := tree.
For comprehensions are written using FOR(enum, ...) YEILD tree:
FOR(VALFROM("i") := LIT(0) INT_TO LIT(2)) YIELD (
REF("i"))
This prints as:
for (i <- 0 to 2)
yield i
Return exressions are written using RETURN(tree):
RETURN(LIT(0)) // return 0
Exceptions are thrown using THROW(sym|c, [tree|"message"]):
THROW(IllegalArgumentExceptionClass)
THROW(IllegalArgumentExceptionClass, "oh no")
THROW(IllegalArgumentExceptionClass, REF("x"))
These examples print as:
throw new IllegalArgumentException()
throw new IllegalArgumentException("oh no")
throw new IllegalArgumentException(x.toString)
Exceptions are caught using TRY(stat, ...) CATCH(CASE(pat), ...) [ENDTRY|FINALLY(...)]. CASE(...) accepts a pattern matching expression, which is shown briefly here, and covered in detail later:
(TRY (REF("something") APPLY LIT(0))
CATCH (
CASE(WILDCARD) ==> (Predef_println APPLY LIT("error"))
) ENDTRY)
(TRY (REF("something") APPLY LIT(0))
CATCH (
CASE(WILDCARD) ==> (Predef_println APPLY LIT("error"))
) FINALLY(Predef_println APPLY LIT("finally")))
In the above examples, WILDCARD is a pattern expression that matches to anything. The examples print as:
try {
something(0)
} catch {
case _ => println("error")
}
try {
something(0)
} catch {
case _ => println("error")
} finally println("finally")
Anonymous functions are written using LAMBDA(PARAM(...), ...) ===> rhs:
LAMBDA(PARAM("x")) ==> (REF("x") INT_+ LIT(1))
LAMBDA(PARAM("x", IntClass)) ==> BLOCK(
REF("x") INT_+ LIT(1))
These examples print as:
x => x + 1
{ (x: Int) =>
x + 1
}
As the second example shows, when the right-hand side tree is a block, the entire anonymous function is wrapped in a block, which makes it convenient to be used with higher-order functions.
The parameter list of anonymous functions may be WILDCARD:
LAMBDA(PARAM(WILDCARD)) ==> (REF("x") INT_+ LIT(1))
This prints as:
_ => x + 1
In addition, an anonymous functions are formed when an expression contains WILDCARD.
WILDCARD INT_+ WILDCARD // _ + _
Use TYPE_REF(tree|sym|"C") to explicitly convert symbols, names, and trees into types:
(VAL("pos", TYPE_REF(REF("board") DOT "Coord")): Tree)
This prints as:
val pos: board.Coord
Applied types are written by calling TYPE_OF(typ|"C", ...) on a type:
REF("x") withType(ListClass TYPE_OF IntClass)
This prints as:
(x: List[Int])
treehugger DSL provides built-in type constructors, which will be covered more later:
REF("x") withType(TYPE_LIST(IntClass))
REF("y") withType(TYPE_TUPLE(IntClass, IntClass))
REF("z") withType(IntClass TYPE_=> IntClass)
These examples print as:
(x: List[Int])
(y: (Int, Int))
(z: Int => Int)
Refined types are written by calling TYPE_WITH (typ|"C", ...) on a type:
(VAL("x", TYPE_REF("A") TYPE_WITH "B"): Tree)
This prints as:
val x: A with B
Singleton type are written using TYPE_SINGLETON(tree):
(VAL("x", TYPE_SINGLETON(THIS)): Tree)
This prints as:
val x: this.type
Structural types are written using TYPE_STRUCT(tree, ...):
REF("x") withType(TYPE_STRUCT(
DEF("close", UnitClass)
))
This prints as:
(x: ({ def close: Unit }))
Type projections are written by calling TYPE_# (typ|"C") on a type:
REF("x") withType(TYPE_STRUCT(
TYPEVAR("L") withTypeParams(TYPEVAR("A")) :=
REF("Const") APPLYTYPE ("M", "A")
) TYPE_#("L"))
This prints as:
(foo: ({ type L[A] = Const[M, A] })#L)
Existential types are written by calling TYPE_FORSOME(tree, ...) on a type:
(DEF("foo")
withParams(PARAM("arg", TYPE_LIST(
TYPE_REF(REF("x") DOT "T")) TYPE_FORSOME(
VAL("x", "Outer")
)))).tree
This prints as:
def foo(arg: List[x.T] forSome { val x: Outer })
Implicit values are written using withFlags(Flags.IMPLICIT):
(DEF("intToRational") withFlags(Flags.IMPLICIT)
withParams(PARAM("x", IntClass)) := NEW("Rational", REF("x")))
This print as:
implicit def intToRational(x: Int) = new Rational(x)
Implicit parameters are also written using withFlags(Flags.IMPLICIT):
(DEF("greet")
withParams(PARAM("name", StringClass))
withParams(PARAM("config", "Config")
withFlags(Flags.IMPLICIT)) := BLOCK(
Predef_println APPLY(REF("config") APPLY REF("name"))
))
This prints as:
def greet(name: String)(implicit config: Config) {
println(config(name))
}
View bounds are written by calling VIEWBOUNDS(typ|"T") on TYPEVAR(...):
(DEF("maxList", "T")
withTypeParams(TYPEVAR("T") VIEWBOUNDS TYPE_ORDERED("T"))
withParams(PARAM("elements", TYPE_LIST("T"))): Tree)
This prints as:
def maxList[T <% Ordered[T]](elements: List[T]): T
Context bounds are written by calling CONTEXTBOUNDS(typ|"T") on TYPEVAR(...):
(DEF("put", UnitClass)
withTypeParams(TYPEVAR(sym.A) CONTEXTBOUNDS FullManifestClass)
withParams(PARAM("x", sym.A)): Tree)
This prints as:
def put[A : Manifest](x: A): Unit
Pattern matching is written using tree MATCH(CASE(...), ...):
DEF("maxList", "T")
withTypeParams(TYPEVAR("T") VIEWBOUNDS TYPE_ORDERED("T"))
withParams(PARAM("elements", TYPE_LIST("T"))) :=
REF("elements") MATCH (
CASE(ListClass UNAPPLY()) ==>
THROW(IllegalArgumentExceptionClass, "empty list!"),
CASE(ListClass UNAPPLY(ID("x"))) ==> REF("x"),
CASE(ID("x") UNLIST_:: ID("rest")) ==> BLOCK(
VAL("maxRest") := REF("maxList") APPLY(REF("rest")),
IF(REF("x") INT_> REF("maxRest")) THEN REF("x")
ELSE REF("maxRest")
)
)
This prints as:
def maxList[T <% Ordered[T]](elements: List[T]): T =
elements match {
case List() =>
throw new IllegalArgumentException("empty list!")
case List(x) => x
case x :: rest => {
val maxRest = maxList(rest)
if (x > maxRest) x
else maxRest
}
}
Let’s look into the pattern expressions.
Variable patterns are written as ID(sym|"x") and WILDCARD. They both match any value:
ID("x") // x
WILDCARD // _
Typed patterns are written as pat withType(typ|"C"):
ID("x") withType(IntClass) // (x: Int)
WILDCARD withType(IntClass) // (_: Int)
Pattern binders are written as pat withBinder(sym|"x"). Binders are used to give names to patterns:
WILDCARD withBinder("x") // (x @ _)
Literal patterns are written using LIT(...):
LIT("X") // "X"
Stable identifier patterns are written using BACKQUOTED(sym|"x"):
BACKQUOTED("x") // `x`
Constructor patterns are written by calling UNAPPLY(pattern, ...) to a symbol or a tree:
REF("Address") UNAPPLY(WILDCARD, WILDCARD, WILDCARD)
This prints as:
Address(_, _, _)
Tuple patterns are written using TUPLE(pattern, ...):
TUPLE(LIT(0), LIT(1)) // (0, 1)
Sequence wildcards are written using SEQ_WILDCARD withBinder(sym|"x"):
REF("C") UNAPPLY(SEQ_WILDCARD withBinder("xs"))
This prints as:
C((xs @ _*))
Infix operations patterns are written by calling INFIX(op) UNAPPLY(pat, ...) to a symbol or a tree:
LIT(0) INFIX(ConsClass) UNAPPLY (NIL)
This prints as:
0 :: Nil
Because this pattern appears frequently, treehugger DSL provides a built-in constructor UNLIST_:::
LIT(0) UNLIST_:: NIL // 0 :: Nil
This works because an infix operation pattern p op (q, …) is a shorthand for the constructor pattern op(p, q, …).
Pattern alternatives are written as pat1 OR_PATTERN pat2:
LIT("New York") OR_PATTERN LIT("Paris")
This prints as:
"New York" | "Paris"
Pattern matching expressions are written by calling MATCH(CASE(pattern), ...) on a symbol or a tree:
REF("x") MATCH(
CASE (LIT(0) OR_PATTERN LIT(1)) ==> TRUE,
CASE (WILDCARD) ==> FALSE
)
This prints as:
x match {
case 0 | 1 => true
case _ => false
}
Just as we saw in anonymous functions, treehugger DSL uses ==> to denote Scala’s =>.
Optinally, a guard clause may be added to the case clause using IF(...):
REF("x") MATCH(
CASE (ID("x"),
IF(REF("x") INT_< LIT(10))) ==> TRUE,
CASE (WILDCARD) ==> FALSE
)
This prints as:
x match {
case x if x < 10 => true
case _ => false
}
Case sequence functions are defined by listing CASE(...) clauses in a BLOCK(...):
BLOCK(
CASE(SOME(ID("x"))) ==> REF("x"),
CASE(NONE) ==> LIT(0)
)
This prints as:
{
case Some(x) => x
case None => 0
}
Pattern values are defined by placing a pattern in VAL(...) or VAR(...):
VAL(REF("Address") UNAPPLY
(ID("name"), ID("street"), ID("city"))) := REF("x")
VAR(SOME(ID("y"))) := SOME(LIT(1))
These examples print as:
val Address(name, street, city) = x
var Some(y) = Some(1)
Top-level definitions consists of compilation units, packagings, and package objects.
Compilation units are written by calling inPackage(sym|"p") or withoutPackage on BLOCK(...):
BLOCK(
OBJECTDEF("M")
) inPackage("p")
This prints as:
package p
object M
BLOCK(
OBJECTDEF("M1"),
OBJECTDEF("M2")
) withoutPackage
This prints as:
object M1
object M2
Packagings are written using PACKAGE(sym|"p") as follows:
PACKAGE("p") := BLOCK(
OBJECTDEF("M")
)
This prints as:
package p {
object M
}
Package objects are written using PACKAGEOBJECTDEF(...):
PACKAGEOBJECTDEF("p") := BLOCK(
OBJECTDEF("M")
)
This prints as:
package object p {
object M
}
Annotations are applied to definitions, declarations, types, or expressions.
Annotations are applied to declarations using withAnnots(annot, ...). This takes vararg of AnnotationInfo, which is created using ANNOT(typ|"C", arg, ...).
CLASSDEF("C")
withAnnots(ANNOT(SerializableAttr)) := BLOCK(
DEF("get", IntClass) := LIT(0)
)
This prints as:
@serializable class C {
def get: Int = 0
}
Type annotations are written by calling withAnnots(annot, ...) on TYPEVAR(...).
val annot = ANNOT(SpecializedClass, REF(IntClass))
TRAITDEF("Function0")
withTypeParams(TYPEVAR("T") withAnnots(annot)) := BLOCK(
DEF("apply", "T")
)
This prints as:
trait Function0[@specialized(Int) T] {
def apply: T
}
Annotated expressions are written as tree withAnnots(annot, ...):
REF("e") withAnnots(ANNOT(UncheckedClass))
This prints as:
(e: @unchecked)
treehugger DSL provides some built-in functions for commonly used functions.
Here are the operators that form infix application tree when called on a symbol or a tree.
lhs OR rhs // ||
lhs AND rhs // &&
lhs ANY_== rhs // ==
lhs ANY_!= rhs // !=
lhs ANY_-> rhs // ->
lhs OBJ_EQ rhs // eq
lhs OBJ_NE rhs // ne
lhs INT_| rhs // |
lhs INT_& rhs // &
lhs INT_< rhs // <
lhs INT_> rhs // >
lhs INT_<= rhs // <=
lhs INT_>= rhs // >=
lhs INT_+ rhs // +
lhs INT_- rhs // -
lhs INT_* rhs // *
lhs INT_/ rhs // /
lhs INT_TO rhs // TO
lhs LIST_:: rhs // ::
lhs LIST_::: rhs // :::
lhs SEQ_++ rhs // ++
lhs SEQ_/: rhs // /:
lhs SEQ_:\ rhs // :\
When the right-hand side expression is an anonymous function block or a case sequence function, the collection operators form an infix application tree. Otherwise, they form a selection or an application tree:
REF("list") MAP LAMBDA(PARAM("x")) ==> BLOCK(
REF("x") INT_+ LIT(1)
)
REF("list") MAP (WILDCARD INT_+ LIT(1))
These print as:
list map { x =>
x + 1
}
foo.map(_ + 1)
Here are the operators:
lhs FOREACH rhs // foreach
lhs MAP rhs // map
lhs FLATMAP rhs // flatMap
lhs COLLECT rhs // collect
lhs FIND rhs // find
lhs SLICE (from, to) // slice
lhs TAKE rhs // take
lhs DROP rhs // drop
lhs TAKEWHILE rhs // take
lhs DROPWHILE rhs // drop
lhs FILTER rhs // filter
lhs WITHFILTER rhs // withFilter
lhs FILTERNOT rhs // filterNot
lhs SPLITAT rhs // splitAt
lhs SPAN rhs // span
lhs PARTITION rhs // partition
lhs GROUPBY rhs // groupBy
lhs FORALL rhs // forall
lhs EXISTS rhs // exists
lhs COUNT rhs // count
lhs FOLDLEFT rhs // foldLeft
lhs FOLDRIGHT rhs // foldRight
lhs REDUCELEFT rhs // reduceLeft
lhs REDUCERIGHT rhs // reduceRight
Here are built-in constructors and values:
LIST(arg, ...) // List
NIL // Nil
pat1 UNLIST_:: pat2 // ::
SOME(arg, ...) // Some
NONE // None
RIGHT(arg) // Right
LEFT(arg) // Left
ARRAY(arg, ...) // Array
SEQ(arg, ...) // Seq
VECTOR(arg, ...) // Vector
MAKE_MAP(k ANY_-> v, ...) // Map
Here are built-in type constructors:
TYPE_ITERATOR(typ) // Iterator[A]
TYPE_TUPLE(typ, ...) // Tuple2[A, B, ...]
TYPE_ARRAY(typ) // Array[A]
TYPE_LIST(typ) // List[A]
TYPE_SEQ(typ) // Seq[A]
TYPE_VECTOR(typ) // Vector[A]
TYPE_MAP(typ1, typ2) // Map[A, B]
TYPE_OPTION(typ) // Option[A]
TYPE_EITHER(typ1, typ2) // Either[A, B]
TYPE_RIGHT(typ1, typ2) // Right[A, B]
TYPE_LEFT(typ1, typ2) // Left[A, B]
TYPE_SOME(typ) // Some[A]
TYPE_ORDERED(typ) // Ordered[A]
typ1 TYPE_=> typ2 // Function1[A, R]
TYPE_FUNCTION(typ, ..., r) // Function1[A, R]
TYPE_=:=(typ1, typ2) // =:=[A, B]
TYPE_<:<(typ1, typ2) // <:<[A, B]
TYPE_<%<(typ1, typ2) // <%<[A, B]