treehugger.scala

treehugger.scala is a library to write Scala source code programmatically. It’s also an implementation of Scala abstract syntax tree based on Reflection API.

Setup

libraryDependencies ++= Seq(
  "com.eed3si9n" %% "treehugger" % "0.1.3"
)

resolvers ++= Seq(
  "sonatype" at "https://oss.sonatype.org/content/groups/public"
)

I heard you like code, so…

treehugger lets you treat Scala code as data (case classes), and data as code (treehugger DSL) so you can code while you code.

For example, here’s treehugger DSL for Hello World:

import treehugger.forest._
import definitions._
import treehuggerDSL._

val tree: Tree = Predef_println APPLY LIT("Hello, world!")

println(tree)
println(treeToString(tree))

The above code prints two lines:

Apply(Ident(println),List(Literal(Constant(Hello, world!))))
println("Hello, world!")

The first line shows the structure of the returned tree, and the second line is the Scala source code represented by the tree. Because it is responsible for formatting of the string, your code generating code would be more clear using treehugger.

Classy example

Let’s omit the import statements, and println statements from here. Here’s how treehugger DSL defines classes:

object sym {
  val IntQueue = RootClass.newClass("IntQueue")
  val BasicIntQueue = RootClass.newClass("BasicIntQueue")
  val buf = BasicIntQueue.newValue("buf")
}

CLASSDEF(sym.IntQueue) withFlags(Flags.ABSTRACT) := BLOCK(
  DEF("get", IntClass),
  DEF("put") withParams(PARAM("x", IntClass))
)

CLASSDEF(sym.BasicIntQueue) withParents(sym.IntQueue) := BLOCK(
  VAL(buf) withFlags(Flags.PRIVATE) :=
    NEW(ArrayBufferClass TYPE_OF IntClass),
  DEF("get", IntClass) := REF(sym.buf) DOT "remove" APPLY(),
  DEF("put") withParams(PARAM("x", IntClass)) := BLOCK(
    REF(buf) INFIX("+=") APPLY REF("x")
  )
)

These print as:

abstract class IntQueue {
  def get(): Int
  def put(x: Int)
}

class BasicIntQueue extends IntQueue {
  private val buf = new scala.collection.mutable.ArrayBuffer[Int]
  def get: Int = buf.remove()
  def put(x: Int) {
    buf += x
  }
}

Notice some of the symbols were defined upfront so we could avoid typing in string names.

Thanks scalac!

To be clear where the credit is due, the majority of treehugger’s code is borrowed from the Scala compiler (scalac) written by Martin Odersky, Paul Phillips, and others. treehugger just took it further to fit its needs.

Compilers 101

Let’s go over the basic compiler theory, so we have some kind of foundation even if it’s overly simplified. A compiler is a program that translates a source code in a language to another, often machine code such as Java bytecode.

The compilers have been written so many times that there are names to each phases of a typical compiler.

Scan

Scanning phase, or Lexical Analysis, is responsible for throwing out the white space and comments, and recognizing literals, identifiers, and keywords as tokens. At this point, there is no structural check, except for making sure that the comments and string quotations match up.

Parse

Parsing phase, or Syntactic Analysis, is responsible for constructing an abstract syntax tree (AST) from the linear sequence of tokens according to the rules of the grammar of the language.

Semantic Analysis

Typing phase, or Semantic Analysis, is responsible for adding the symbol table to the syntax tree, by associating variables and references with their definitions, and performing type checking.

Backend

Analysis and Optimization phase are specific to the language implementation. Some of the common optimizations are inlining and dead code elimination.

Finally, during Code Generation phase the target language is generated either from AST or from an intermediate structure.

Turning the table

Because our end goal is to get Scala source code, we reverse the flow of scalac to generate the code starting from AST.

Scala 2.10 adds Reflection API, which allows the user peak into AST for a given code. It also includes code that turns AST into Scala source code. All we need now is to generate an AST.

As part of scalac, there’s also a trait called TreeDSL, which can describe AST in code-like way. With both combined seems as though we have everything we need.

Unfortunately that is not the case. Because scalac is focused towards Java bytecode generation, the parser expands syntax sugars such as for loops and infix method applications. In other words, AST contains one of map, flatMap, foreach method in place of a for loop. TreeDSL too is missing a few things, for example, an ability to define classes or objects.

treehugger forks scalac’s code base and extends the AST and TreeDSL so all legal Scala code can be described using the DSL (the only exception is XML literals).

Symbols, Types, and Trees

Having the scalac lineage comes with the baggage of having to deal with its data structures.

Symbol

A Symbol represents an entry in the symbol table, such as values, classes, and type aliases. A compiler would eventually bind all identifiers to a symbol during Typing phase. However, since we are interested in generating the code, the use of symbols are often optional in treehugger.

For example, a unit method declaration may be defined as:

DEF(sym.get)

or:

DEF("get")

Both would yield the same source code at the end. There are several places where the use of symbols are recommended. First, use a built-in symbol if there is one available. Second, consider defining a symbol for repeated reference to a class or a method.

A new symbol may be defined off of an existing symbol as follows:

object sym {
  val BasicIntQueue = RootClass.newClass("BasicIntQueue")
  val buf = BasicIntQueue.newValue("buf")  
  val A = ArrowAssocClass.newTypeParameter("A")
  val arrow = ArrowAssocClass.newMethod("->")
  val B = arrow.newTypeParameter("B")
  val T = BasicIntQueue.newAliasType("T")
}

Defining symbols for every identifiers would double the size of the code, and would likely make the experience of writing it cumbersome.

Type

A Type represents a Scala type. Symbols of class ClassSymbols and TypeSymbols can automatically be promoted to a Type and many of the built-in symbols represent built-in classes.

VAL("foo", IntClass)

In the above code, IntClass is a symbol, but it is automatically promoted to a Type. Similarly, a String can also be promoted to a Type.

VAL("foo", "Int")

Types can also be created using type-level expressions and built-in type constructors:

TYPE_ARRAY(StringClass)
TYPE_REF(REF("board") DOT "Coord")

Tree

A Tree represents a node in a Scala AST. It could be a simple expression such as a literal, or a combination of other trees.

An expression in treehugger DSL eventually evaluates to a Tree value.

treehugger DSL

treehugger DSL is an expanded version of TreeDSL in scalac to build Scala AST in code-like fashion. It is able to write all legal Scala expressions except for XML literals.

Conventions used in examples

There are several conventions used throughout this document.

VAL(sym|"x")

In the above, sym|"x" denotes that either a Symbol or a String is accepted as a parameter to VAL(...).

TYPE_LIST(typ)

In the above, typ indicates that TYPE_LIST(...) accepts a Type.

DEF("x"|sym) withTypeParams(TYPEVAR(...), ...)

In the above, TYPEVAR(...), ... denotes that withTypeParams(...) can accept either a vararg list or an Iterable of TYPEVAR(...).

(DEF("get"|sym, [typ])
  [withParams(PARAM("x"|sym, typ|"C")[ := arg], ...)]*
  [withTypeParams(TYPEVAR(...), ...)]).tree

In the above, [typ] indicates that typ is optional; and [withParams(PARAM("x"|sym, typ|"C")[ := arg], ...)]* indicates that the withParams clause is optional and may be repeated.

Forest

import treehugger.forest._
import definitions._
import treehuggerDSL._

val tree: Tree = Predef_println APPLY LIT("Hello, world!")

println(treeToString(tree))

The entire treehugger system is bundled up as treehugger.Forest class. The package object for treehugger defines an instance of Forest called forest for convenience. Under the forest, definitions object defines built-in symbols and treehuggerDSL object defines the DSL.

In the above code, object sym defines optional symbols. By wrapping in sym we can avoid conflicting with the real println function. Then, the line defining val tree: Tree is an example of treehugger DSL.

Finally, forest defines treeToString method to convert AST into a String:

def treeToString(args: Any*): String

treesToString takes a vararg of Any, and pretty prints Tree as Scala source code and everything else using toString with a new line in between.

Literals and Comments

We will begin looking into treehugger DSL from the literals and comments since they are the simplest forms of Tree. Also having these allows us to gradually intoroduce more complex forms off of others.

Literals

Literals are the basic foundation of treehugger DSL. Numeric literals, String, and symbols are written by wrapping a Scala literal with LIT():

LIT(1)     // Int
LIT(1L)    // Long
LIT(1.23)  // Double
LIT(1.23F) // Float
LIT('H')   // Char
LIT("H")   // String
LIT('Sym)  // scala.Symbol

Boolean literals, (), and null are written as follows:

TRUE       // true
FALSE      // false
NULL       // null
UNIT       // ()

Comments

Single line comments are added to an arbitrary tree as follows:

tree withComment("comments here", ...)

For example,

LIT(2) withComment("comments are useful",
  "only if they provide more info than the code")

print as

// comments are useful
// only if they provide more info than the code
2

Basic Declarations and Definitions

A definition, such as a function defnition, introduces names that are bound to some expression, block, or a type.

def foo: Int = 1

On the other hand, an abstract declaration introduces only the names and their types. It can be a part of an abstract class definition.

def foo: Int

Values

Abstract value declarations

Abstract value declarations are written by wrapping the name and the type with VAL(...):

(VAL("foo", IntClass): Tree)

or in general:

VAL(sym|"x", typ|"C").tree

where sym is a symbol and typ is a type of the value. In the above code, sym|"x" denotes that either a symbol or String is accepted as the name of the value, and type|"C" denotes that either a type or String is accepted as the type of the value.

Calling tree method creates a Tree without a right-hand side (rhs) expression. Because there’s an implicit conversion, forcing VAL(...) as a Tree automatically calls tree method.

Value definitions

Value definitions are written by appending right-hand side tree after := as follows:

VAL("foo", IntClass) := LIT(0)

This prints as:

val foo: Int = 0

Like Scala, the type annotation can be omitted when rhs is provided:

VAL("foo") := LIT(0)

In addition, a symbol could be used to define a value instead of using String names.

object sym {
  val foo = RootClass.newValue("foo")
}

VAL(sym.foo) := LIT(0)

For a larger code base, using symbols makes the code more readable.

The general form of constant value definitions are:

VAL(sym|"x", [typ]) := rhs

Lazy values

Lazy value declarations are written using LAZYVAL(...):

(LAZYVAL("foo", IntClass): Tree)

and lazy value definitions are written as:

LAZYVAL("foo", IntClass) := LIT(0)

Final values

Final value definitions are written by appending withFlags(...) after VAL(...):

VAL("foo", IntClass) withFlags(Flags.FINAL) := LIT(0)

Pattern values

There is another form of value definitions called pattern values, but it will be discussed later.

Variables

Abstract variable declarations

Abstract variable declarations are written using VAR(...):

(VAR("foo", IntClass): Tree)

or in general:

VAR(sym|"x", typ|"C").tree

Variable definitions

Value definitions are written as:

VAR("foo", IntClass) := LIT(0)

as a special form of initializing the variable with the default value of the type, WILDCARD can be used as follows:

VAR("foo", IntClass) := WILDCARD

This prints as:

var foo: Int = _

Type members

Abstract type declarations

Abstract type declarations are written using TYPEVAR(sym|"T"). Optionally, lower bounds and higher bounds can be specified:

(TYPEVAR("T"): Tree)
(TYPEVAR("T") LOWER(IntClass): Tree)

(TYPEVAR(sym.T) UPPER(ComparableClass TYPE_OF sym.T): Tree)

or in general:

(TYPEVAR(sym|"T") [LOWER(lo|"C")] [UPPER(hi|"D")]).tree

where lo denotes an optional lower bound type and hi upper bound type.

Type alias definitions

Type alias definitions are written by appending := and right-hand side Type. The alias may accompany type parameters given by withTypeParams(TYPEVAR(...), ...):

TYPEVAR("IntList") := TYPE_LIST(IntClass)
TYPEVAR("Two") withTypeParams(TYPEVAR(sym.A)) := TYPE_TUPLE(sym.A, sym.A)

or in general:

TYPEVAR(sym|"T") [withTypeParams(TYPEVAR(typ1), ...)] := typ2

Variance annotations

Variance annotations for the type parameters are set using COVARIANT(...) and CONTRAVARIANT(...):

val A = RootClass.newTypeParameter("A")
(TYPEVAR("M") withTypeParams(TYPEVAR(COVARIANT(A))): Tree)
(TYPEVAR("M") withTypeParams(TYPEVAR(CONTRAVARIANT(A))): Tree)

The above examples print as:

type M[+A]
type M[-A]

Functions

Abstract function declarations

Abstract function declarations are written using DEF(...). Optionally, parameter lists can be specified using withParams(PARAM(...), ...) and type parameters can be specified using withTypeParams(TYPEVAR(...), ...):

(DEF("get", IntClass): Tree)
(DEF("put") withParams(PARAM("x", IntClass)): Tree)
(DEF("compare", BooleanClass)
  withTypeParams(TYPEVAR(sym.T))
  withParams(PARAM("a", sym.T) := LIT(0))
  withParams(PARAM("b", sym.T) := LIT(0)): Tree)

The above examples print as:

def get: Int
def put(x: Int)
def compare[T](a: T = 0)(b: T = 0): Boolean

In genral:

(DEF("get"|sym, [typ])
  [withParams(PARAM("x"|sym, typ|"C")[ := arg], ...)]*
  [withTypeParams(TYPEVAR(...), ...)]).tree

Function definition

Function definitions are written by appending right-hand side tree after := as follows:

DEF("get", IntClass) := LIT(0)

This prints as:

def get: Int = 0

Like Scala, the result type can be omitted as follows:

DEF("get") := LIT(0)

Procedures

Procedure definitions are written by omitting the result type and using BLOCK(tree, ...) for the rhs:

DEF("write") withParams(PARAM("str", StringClass)) := BLOCK(
  LIT(0)
)

This prints as:

def write(str: String) {
  0
}

Currying

Note withParams(...) clause may be added multiple times, each forming a parameter list:

(DEF("compare")
  withTypeParams(TYPEVAR(sym.T))
  withParams(PARAM("a", sym.T))
  withParams(PARAM("b", sym.T))) := FALSE

Default parameters

Similar to VAL(...), PARAM(...) can be followed by := and rhs to specify the default argument:

(DEF("compare")
  withTypeParams(TYPEVAR(sym.T))
  withParams(PARAM("a", sym.T) := LIT(0))
  withParams(PARAM("b", sym.T) := LIT(0))) := FALSE

By-name parameters

By-name parameters are written using TYPE_BYNAME(typ) as follows:

(DEF("whileLoop")
  withParams(PARAM("cond", TYPE_BYNAME(BooleanClass)))
  withParams(PARAM("stat", TYPE_BYNAME(UnitClass))): Tree)

This prints as:

def whileLoop(cond: => Boolean)(stat: => Unit)

Repeated parameters

Repeated parameters are written using STAR(typ) as follows:

(DEF("sum", IntClass)
  withParams(PARAM("args", STAR(IntClass))): Tree)

This prints as:

def sum(args: Int*): Int

Import Clauses

Import clauses are written using IMPORT(...). Optionally, import selectors may be specified:

IMPORT(MutablePackage)
IMPORT("scala.collection.mutable")
IMPORT(MutablePackage, "_")
IMPORT(MutablePackage, "Map", "Set")
IMPORT(MutablePackage, RENAME("Map") ==> "MutableMap")

The above examples print as:

import scala.collection.mutable
import scala.collection.mutable
import scala.collection.mutable._
import scala.collection.mutable.{Map, Set}
import scala.collection.mutable.{Map => MutableMap}

In general:

IMPORT(sym|"x.y.z", ["X" | RENAME("X") ==> "Y"]*)

The only odd thing is ==> operator used for RENAME(...). Because => is already taken by Scala, treehugger DSL uses ==> instead.

Classes and Objects

Templates

Classes and objects are both defined in terms of templates, whose body is represented using BLOCK(...) in treehugger DSL.

In general, treehugger DSL uses BLOCK(...) wherever curly braces ({}) appear in Scala. BLOCK(...) accepts vararg of trees, such as class member definitions and expressions:

val IntQueue: ClassSymbol = RootClass.newClass("IntQueue")
 
CLASSDEF(IntQueue) withFlags(Flags.ABSTRACT) := BLOCK(
  DEF("get", IntClass),
  DEF("put") withParams(PARAM("x", IntClass))
)

This example prints as:

abstract class IntQueue {
  def get: Int
  def put(x: Int)
}

Classes

Class definitions

Class definitions are written using CLASSDEF(...):

(CLASSDEF("C"): Tree)
CLASSDEF("C") := BLOCK(
  VAL("x") := LIT(0)
)

The above examples print as:

class C
class C {
  val x = 0
}

The general form of the first example is:

CLASSDEF(sym|"C").empty

As with value and function names, CLASSDEF can accept either a symbol or a String.

Constructor parameters

Primary constructor parameters are written using withParams(...) similar to function definitions. Except, it could use VAL(...) and VAR(...) in addition to PARAM(...):

CLASSDEF("C")
  withParams(PARAM("x", IntClass),
    VAL("y", StringClass),
    VAR("z", TYPE_LIST(StringClass))) := BLOCK(
  DEF("hi") := LIT("hi") 
)

This prints as:

class C(x: Int, val y: String, var z: List[String]) {
  def hi = "hi"
}

Extending classes

To define classes by extending super classes use withParents(tree|typ|"T"):

CLASSDEF("C") withParents("B") := BLOCK(
  DEF("x") := LIT(0)
)

This prints as:

class C extends B {
  def x = 0
}

Self type annotations

To define self type annotations use withSelf(sym|"self", [typ1, ...]):

CLASSDEF("C") withSelf("self", "T1", "T2") := BLOCK(
  VAL("x") := REF("self")
)

This prints as:

class C { self: T1 with T2 => 
  val x = self
}

Early denifitions

To define field values before supertype constructor is called add early definitions using withEarlyDefs(tree, ...):

CLASSDEF("C") withEarlyDefs(
  VAL("name") := LIT("Bob")
) withParents("B") := BLOCK(
  LIT(0)
)

This prints as:

class C extends {
  val name = "Bob"
} with B {
  0
}

Modifiers

Many of the Tree objects has Modifier field, which adds extra attribute about the tree such as access level and mutability. For both classes and their members modifier flags can be given using withFlags(...), which takes a PRIVATEWITHIN or vararg of Long.

Access modifiers

To define class members with access modifiers use Flags.PRIVATE, Flags.PROTECTED, and PRIVATEWITHIN("scope"):

CLASSDEF("C") := BLOCK(
  DEF("x") withFlags(Flags.PRIVATE) := LIT(0)
  DEF("y") withFlags(Flags.PROTECTED) := LIT(0)
  DEF("z") withFlags(PRIVATEWITHIN("this")) := LIT(0) 
)

This prints as:

class C {
  private def x = 0
  protected def y = 0
  private[this] def z = 0
}

Overriding

To override class members use Flags.OVERRIDE:

CLASSDEF("C") withParents("B") := BLOCK(
  DEF("x") withFlags(Flags.OVERRIDE) := LIT(0)
)

This prints as:

class C extends B {
  override def x = 0
}

Final members

To prohibit overriding by subclasses class members are marked final using Flags.FINAL:

CLASSDEF("C") := BLOCK(
  DEF("x") withFlags(Flags.FINAL) := LIT(0)
)

This prints as:

class C {
  final def x = 0
}

Abstract classes

To define abstract classes use Flags.ABSTRACT on CLASSDEF(...):

val IntQueue: ClassSymbol = RootClass.newClass("IntQueue")
 
CLASSDEF(IntQueue) withFlags(Flags.ABSTRACT) := BLOCK(
  DEF("get", IntClass)
  DEF("put") withParams(PARAM("x", IntClass))
)

This prints as:

abstract class IntQueue {
  def get: Int
  def put(x: Int)
}

Final classes

To define final classes, which prohibits extension, use Flags.FINAL on CLASSDEF(...):

(CLASSDEF("C") withFlags(Flags.FINAL): Tree)

This prints as:

final class C

Sealed classes

To define sealed classes use Flags.SEALED on CLASSDEF(...):

CLASSDEF("Animal") withFlags(Flags.ABSTRACT, Flags.SEALED)

This prints as:

sealed abstract class Animal

Private constructors

To define private constructors use Flags.PRIVATE with withCtorFlags(...) on CLASSDEF(...):

(CLASSDEF("C") withCtorFlags(Flags.PRIVATE)
  withParams(PARAM("x", IntClass)): Tree)

This prints as:

class C private (x: Int)

Polymorphic Classes

Polymorphic classes are defined using withTypeParams(...):

(CLASSDEF("Queue") withTypeParams(TYPEVAR("T"))
  withParams(VAL("leading", "T"), VAL("trailing", "T")): Tree)

This prints as:

class Queue[T](val leading: List[T], val trailing: List[T])

Case Classes

Case classes are defined using CASECLASSDEF(...):

object sym {
  val Expr = RootClass.newClass("Expr")
  val Var  = RootClass.newClass("Var")
}

(CLASSDEF(sym.Expr) withFlags(Flags.SEALED, Flags.ABSTRACT): Tree)
(CASECLASSDEF(sym.Var)
  withParams(VAL("name", StringClass)) withParents(Expr): Tree)

These print as:

sealed abtract class Expr
case class Var(name: String) extends Expr

Traits

Traits are defined using TRAITDEF(...):

TRAITDEF("Philosophical") := BLOCK(
  DEF("philosophize") := BLOCK(
    LIT(0)
  ) 
)
(CLASSDEF("Animal"): Tree)
(TRAITDEF("HasLegs"): Tree)
CLASSDEF("Frog")
    withParents("Animal", "HasLegs", "Philosophical") := BLOCK(
  DEF(Any_toString) withFlags(Flags.OVERRIDE) := LIT("green")
)

These print as:

trait Philosophical {
  def philosophize {
    0
  }
}

class Animal

trait HasLegs

class Fogs extends Animal with HasLegs with Philosophical {
  override toString = "green"
}

Object Definitions

Object definitions are written using OBJECTDEF(...):

OBJECTDEF("Main") withParents("App") := BLOCK(
  LIT(0)
)

This prints as:

object Main extends App {
  0
}

Case object definitions

Case object definitions are written using CASEOBJECTDEF(...):

(CASEOBJECTDEF("C"): Tree)  // case object C

Expressions

Now that we’ve covered classes and objects, we should look into expressions in depth.

Basic Expressions

Literals

As we’ve seen earlier, literals are written as follows:

LIT(1)                       // 1
LIT(1L)                      // 1L
LIT(1.23)                    // 1.23
LIT(1.23F)                   // 1.23F
LIT('H')                     // 'H'
LIT("H")                     // "H"
LIT('Sym)                    // 'Sym
TRUE                         // true
FALSE                        // false
NULL                         // null
UNIT                         // ()

Simple Names

Simple names are written using REF(sym|"x") to refer to values and methods that immediately available in the current scope:

object sym {
  val x = RootClass.newValue("x")
  val y = RootClass.newValue("y") 
}

REF("x")                     // x
REF(sym.x)                   // x

Selection

To refer to other values and methods, selections are written by calling DOT(sym|"y") either on a symbol or on a REF(sym|"x"). This returns an intermediate structure that can turn into a Tree by calling tree method or by implicit conversion:

(sym.x DOT sym.y).Tree       // x.y
(sym.x DOT "y": Tree)        // x.y
(REF("x") DOT "y": Tree)     // x.y

This

References to this are written using THIS or THIS(sym|"C"):

THIS                         // this
THIS(sym.Address)            // Address.this

Super

References to super are written using SUPER or SUPER(sym|"C"). This also returns an intermediate structure that can turn into a Tree by calling tree method or via implicit conversion:

(SUPER: Tree)                // super
(SUPER("C"): Tree)           // C.super

To add type parameter to super, call APPLYTYPE(sym|"T"):

SUPER APPLYTYPE "T"          // super[T]

Function Applications

There are two ways to write function applications. The first way is calling APPLY(arg, ...) on a symbol or a tree. Here, arg denotes a Tree:

Predef_println APPLY LIT("Hello, world!")
(REF("x") DOT "y") APPLY (LIT(0), LIT(1))

These print as:

println("Hello, world!")
x.y(0, 1)

Function application on selections

The second way is to apply arg, ... on intermediate structure returned by DOT(sym|"y"):

(REF("x") DOT "y")(LIT(0), LIT(1))

This prints as:

x.y(0, 1)

Sequence arguments

To pass sequence into a vararg parameter, use SEQARG(arg):

THIS APPLY (SEQARG(REF("list")))

This prints as:

this((list: _*))

Named arguments

To pass named arguments into a function, specify the parameter using REF(sym|"x") as follows:

REF("put") APPLY (REF("x") := LIT(0))

This prints as:

put(x = 0)

Partially applied functions

Partially applied functions are written by calling APPLY with PARTIALLY:

REF("put") APPLY PARTIALLY

This prints as:

put _

Note this is different from APPLYing WILDCARD since PARTIALLY applies to the entire parameter list.

Type Applications

Type applications are written by calling APPLYTYPE(typ|"C", ...) on a symbol or a tree:

REF("put") APPLYTYPE(IntClass) APPLY(LIT(0))

This prints as:

put[Int](0)

Tuples and Parentheses

There are three ways to write tuple expressions. The most general form is TUPLE(tree, ...):

TUPLE()                      // ()
TUPLE(REF("x"))              // (x)
TUPLE(LIT(0), LIT(1))        // (0, 1)

The second way is to use UNIT literal:

UNIT                         // ()

Finally, PAREN(tree, ...) can also be used to write a tuple expression:

PAREN(REF("x"))              // (x)

Semantically speaking the actual scala.Tuplen are formed only when two or more arguments are passed, but as a syntactic expression, PAREN is just an alias to TUPLE.

Instance Creation Expressions

Constructor invocations

Instance creations are written using NEW(path|typ|"C"):

NEW(sym.T)                   // new T
NEW("C")                     // new C
NEW(REF("B") DOT "C")        // new B.C

Optionally, arguments may be passed into the constructor using NEW(path|typ|"C", arg, ...):

NEW("C", LIT(0), LIT(1))     // new C(0, 1)

Anonymous classes

Anonymous classes are created by passing ANONDEF(parent|"C", ...) into NEW(...):

NEW(ANONDEF() := BLOCK(
  DEF("name") := LIT("Robert")
))
NEW(ANONDEF("C") := BLOCK(
  DEF("name") := LIT("Robert")
))
NEW(ANONDEF("C") withEarlyDefs(
  VAL("name") := LIT("Robert")
))

These examples print as:

new {
  def name = "Robert"
}
new C {
  def name = "Robert"
}
new {
  val name = "Robert"
} with C

Blocks

Blocks are written using BLOCK(tree, ...):

BLOCK(
  VAL("x") := LIT(0),
  REF("x")
)

This prints as:

{
  val x = 1
  x
}

Prefix, Infix, and Postfix Operations

Expressions can be constructed from operands and operators.

Prefix operations

Prefix operations are written using one of PLUS(tree), MINUS(tree), NOT(tree), or TILDE(tree):

PLUS(LIT(1))                 // +(1)
MINUS(LIT(1))                // -(1)
NOT(FALSE)                   // !(false)
TILDE(LIT(1))                // ~(1)

Postfix operations

Postfix operations are written by calling POSFIX(sym|"x") on a tree or a symbol:

LIT(1) POSTFIX(Any_toString) // 1 toString

Infix operations

Infix operations are written using INFIX(sym|"op") as follows:

LIT(1) INFIX("+") APPLY(LIT(2))

This prints as:

1 + 2

Alternatively, INFIX(sym|"op", arg, ...) can be called with righ-hand side arguments:

LIT(1) INFIX("+", LIT(2))    // 1 + 2

Finally, for commonly used operations treehugger DSL defines some built-in operators, which will be covered later:

LIT(1) INT_+ LIT(2)          // 1 + 2

Types, Annotations, and Assignments

Typed expressions

Typed expressions are written using withType(typ|"C"):

LIT(0) withType(LongClass)   // (0: Long)

Annotated expressions

Annotated expressions are written using withAnnots(annot, ...):

REF("e") withAnnots(ANNOT(UncheckedClass))

This prints as:

(e: @unchecked)

Annotations are covered later in details.

Assignments

Assignments are written using :=:

REF("x") := LIT(0)           // x = 0 

If Expressions

If expressions are written using IF(...) THEN(...) ELSE(...)|ENDIF as follows:

(IF (REF("x") ANY_== REF("y")) THEN REF("x")
ELSE LIT(0))

IF (REF("sunny")) THEN (Predef_println APPLY LIT("Hi!")) ENDIF

These examples print as:

if (x == y) x
else 0

if (sunny) println("Hi!")

While loops

While loops are written using WHILE(...) DO BLOCK(...):

WHILE(TRUE) DO BLOCK(
  Predef_println APPLY LIT("Hello")
)

This prints as:

```scala while (true) { println(“Hello”) }

Do loops

Do loops are written by calling DO_WHILE(...) on a block:

BLOCK(
  Predef_println APPLY LIT("Hello")
) DO_WHILE(TRUE)

This prints as:

do {
  println("Hello")
} while (true)

For expressions

For expressions are written using FOR(enum, ...) DO tree as follows:

FOR(VALFROM("i") := LIT(0) INT_TO LIT(2)) DO (
  Predef_println APPLY LIT("Hello"))

FOR(
  VALFROM("i") := LIT(0) INT_TO LIT(2),
  IF(REF("x") INT_< LIT(10))
) DO (Predef_println APPLY LIT("Hello"))

FOR(
  VALFROM("i") := LIT(0) INT_TO LIT(2),
  VAL("x") := REF("i")
) DO BLOCK(
  Predef_println APPLY LIT("Hello")
)

These examples print as:

for (i <- 1 to 2)
  println("Hello")

for {
  i <- 0 to 2
  if x < 10
} println("Hello")

for {
  i <- 0 to 2
  x = i
} {
  println("Hello")
}

FOR(...) takes vararg of enumerators which are constructed using VALFROM(sym|"x") := tree, IF(tree), or VAL(sym|"x") := tree.

For comprehension

For comprehensions are written using FOR(enum, ...) YEILD tree:

FOR(VALFROM("i") := LIT(0) INT_TO LIT(2)) YIELD (
  REF("i"))

This prints as:

for (i <- 0 to 2)
  yield i

Return Expressions and Exception Handling

Return expression

Return exressions are written using RETURN(tree):

RETURN(LIT(0))               // return 0

Throwing exceptions

Exceptions are thrown using THROW(sym|c, [tree|"message"]):

THROW(IllegalArgumentExceptionClass)
THROW(IllegalArgumentExceptionClass, "oh no")
THROW(IllegalArgumentExceptionClass, REF("x"))

These examples print as:

throw new IllegalArgumentException()
throw new IllegalArgumentException("oh no")
throw new IllegalArgumentException(x.toString)

Catching exceptions

Exceptions are caught using TRY(stat, ...) CATCH(CASE(pat), ...) [ENDTRY|FINALLY(...)]. CASE(...) accepts a pattern matching expression, which is shown briefly here, and covered in detail later:

(TRY (REF("something") APPLY LIT(0))
CATCH (
  CASE(WILDCARD) ==> (Predef_println APPLY LIT("error"))
) ENDTRY)

(TRY (REF("something") APPLY LIT(0))
CATCH (
  CASE(WILDCARD) ==> (Predef_println APPLY LIT("error"))
) FINALLY(Predef_println APPLY LIT("finally")))

In the above examples, WILDCARD is a pattern expression that matches to anything. The examples print as:

try {
  something(0)
} catch {
  case _ => println("error")
}

try {
  something(0)
} catch {
  case _ => println("error")
} finally println("finally")

Anonymous Functions

Anonymous functions are written using LAMBDA(PARAM(...), ...) ===> rhs:

LAMBDA(PARAM("x")) ==> (REF("x") INT_+ LIT(1))

LAMBDA(PARAM("x", IntClass)) ==> BLOCK(
  REF("x") INT_+ LIT(1))

These examples print as:

x => x + 1

{ (x: Int) =>
  x + 1
}

As the second example shows, when the right-hand side tree is a block, the entire anonymous function is wrapped in a block, which makes it convenient to be used with higher-order functions.

Wildcard parameter

The parameter list of anonymous functions may be WILDCARD:

LAMBDA(PARAM(WILDCARD)) ==> (REF("x") INT_+ LIT(1))

This prints as:

_ => x + 1

Placeholder syntax

In addition, an anonymous functions are formed when an expression contains WILDCARD.

WILDCARD INT_+ WILDCARD      // _ + _

Type-Level Expressions

Type references

Use TYPE_REF(tree|sym|"C") to explicitly convert symbols, names, and trees into types:

(VAL("pos", TYPE_REF(REF("board") DOT "Coord")): Tree)

This prints as:

val pos: board.Coord

Applied types

Applied types are written by calling TYPE_OF(typ|"C", ...) on a type:

REF("x") withType(ListClass TYPE_OF IntClass)

This prints as:

(x: List[Int])

Type constructors

treehugger DSL provides built-in type constructors, which will be covered more later:

REF("x") withType(TYPE_LIST(IntClass))
REF("y") withType(TYPE_TUPLE(IntClass, IntClass))
REF("z") withType(IntClass TYPE_=> IntClass)

These examples print as:

(x: List[Int])
(y: (Int, Int))
(z: Int => Int)

Refined types

Refined types are written by calling TYPE_WITH (typ|"C", ...) on a type:

(VAL("x", TYPE_REF("A") TYPE_WITH "B"): Tree)

This prints as:

val x: A with B

Singleton types

Singleton type are written using TYPE_SINGLETON(tree):

(VAL("x", TYPE_SINGLETON(THIS)): Tree)

This prints as:

val x: this.type

Structural types

Structural types are written using TYPE_STRUCT(tree, ...):

REF("x") withType(TYPE_STRUCT(
  DEF("close", UnitClass)
))

This prints as:

(x: ({ def close: Unit }))

Type projections

Type projections are written by calling TYPE_# (typ|"C") on a type:

REF("x") withType(TYPE_STRUCT(
  TYPEVAR("L") withTypeParams(TYPEVAR("A")) :=
    REF("Const") APPLYTYPE ("M", "A")
) TYPE_#("L"))

This prints as:

(foo: ({ type L[A] = Const[M, A] })#L)

Existential types

Existential types are written by calling TYPE_FORSOME(tree, ...) on a type:

(DEF("foo")
  withParams(PARAM("arg", TYPE_LIST(
    TYPE_REF(REF("x") DOT "T")) TYPE_FORSOME(
    VAL("x", "Outer")
)))).tree

This prints as:

def foo(arg: List[x.T] forSome { val x: Outer })

Implicits

Implicit modifier

Implicit values are written using withFlags(Flags.IMPLICIT):

(DEF("intToRational") withFlags(Flags.IMPLICIT)
  withParams(PARAM("x", IntClass)) := NEW("Rational", REF("x")))

This print as:

implicit def intToRational(x: Int) = new Rational(x)

Implicit parameter

Implicit parameters are also written using withFlags(Flags.IMPLICIT):

(DEF("greet")
    withParams(PARAM("name", StringClass))
    withParams(PARAM("config", "Config")
      withFlags(Flags.IMPLICIT)) := BLOCK(
  Predef_println APPLY(REF("config") APPLY REF("name"))
))

This prints as:

def greet(name: String)(implicit config: Config) {
  println(config(name))
}

View bounds

View bounds are written by calling VIEWBOUNDS(typ|"T") on TYPEVAR(...):

(DEF("maxList", "T")
  withTypeParams(TYPEVAR("T") VIEWBOUNDS TYPE_ORDERED("T"))
  withParams(PARAM("elements", TYPE_LIST("T"))): Tree)

This prints as:

def maxList[T <% Ordered[T]](elements: List[T]): T

Context bounds

Context bounds are written by calling CONTEXTBOUNDS(typ|"T") on TYPEVAR(...):

(DEF("put", UnitClass)
  withTypeParams(TYPEVAR(sym.A) CONTEXTBOUNDS FullManifestClass)
  withParams(PARAM("x", sym.A)): Tree)

This prints as:

def put[A : Manifest](x: A): Unit

Pattern Matching

Pattern matching is written using tree MATCH(CASE(...), ...):

DEF("maxList", "T")
    withTypeParams(TYPEVAR("T") VIEWBOUNDS TYPE_ORDERED("T"))
    withParams(PARAM("elements", TYPE_LIST("T"))) :=
  REF("elements") MATCH (
    CASE(ListClass UNAPPLY()) ==>
      THROW(IllegalArgumentExceptionClass, "empty list!"),
    CASE(ListClass UNAPPLY(ID("x"))) ==> REF("x"),
    CASE(ID("x") UNLIST_:: ID("rest")) ==> BLOCK(
      VAL("maxRest") := REF("maxList") APPLY(REF("rest")),
      IF(REF("x") INT_> REF("maxRest")) THEN REF("x")
      ELSE REF("maxRest") 
    )
  )

This prints as:

def maxList[T <% Ordered[T]](elements: List[T]): T =
  elements match {
    case List() =>
      throw new IllegalArgumentException("empty list!")
    case List(x) => x
    case x :: rest => {
      val maxRest = maxList(rest)
      if (x > maxRest) x
      else maxRest
    }
  }

Patterns

Let’s look into the pattern expressions.

Variable patterns

Variable patterns are written as ID(sym|"x") and WILDCARD. They both match any value:

ID("x")                      // x
WILDCARD                     // _

Typed patterns

Typed patterns are written as pat withType(typ|"C"):

ID("x") withType(IntClass)   // (x: Int)
WILDCARD withType(IntClass)  // (_: Int)

Pattern binders

Pattern binders are written as pat withBinder(sym|"x"). Binders are used to give names to patterns:

WILDCARD withBinder("x")     // (x @ _)

Literal patterns

Literal patterns are written using LIT(...):

LIT("X")                     // "X"

Stable identifier patterns

Stable identifier patterns are written using BACKQUOTED(sym|"x"):

BACKQUOTED("x")              // `x`

Constructor patterns

Constructor patterns are written by calling UNAPPLY(pattern, ...) to a symbol or a tree:

REF("Address") UNAPPLY(WILDCARD, WILDCARD, WILDCARD)

This prints as:

Address(_, _, _)

Tuple patterns

Tuple patterns are written using TUPLE(pattern, ...):

TUPLE(LIT(0), LIT(1))        // (0, 1)

Pattern sequences

Sequence wildcards are written using SEQ_WILDCARD withBinder(sym|"x"):

REF("C") UNAPPLY(SEQ_WILDCARD withBinder("xs"))

This prints as:

C((xs @ _*))

Infix operation patterns

Infix operations patterns are written by calling INFIX(op) UNAPPLY(pat, ...) to a symbol or a tree:

LIT(0) INFIX(ConsClass) UNAPPLY (NIL)

This prints as:

0 :: Nil

Because this pattern appears frequently, treehugger DSL provides a built-in constructor UNLIST_:::

LIT(0) UNLIST_:: NIL         // 0 :: Nil

This works because an infix operation pattern p op (q, …) is a shorthand for the constructor pattern op(p, q, …).

Pattern alternatives

Pattern alternatives are written as pat1 OR_PATTERN pat2:

LIT("New York") OR_PATTERN LIT("Paris")

This prints as:

"New York" | "Paris"

Pattern Matching Expressions

Pattern matching expressions are written by calling MATCH(CASE(pattern), ...) on a symbol or a tree:

REF("x") MATCH(
  CASE (LIT(0) OR_PATTERN LIT(1)) ==> TRUE,
  CASE (WILDCARD) ==> FALSE
)

This prints as:

x match {
  case 0 | 1 => true
  case _ => false
}

Just as we saw in anonymous functions, treehugger DSL uses ==> to denote Scala’s =>.

Guarded patterns

Optinally, a guard clause may be added to the case clause using IF(...):

REF("x") MATCH(
  CASE (ID("x"),
    IF(REF("x") INT_< LIT(10))) ==> TRUE,
  CASE (WILDCARD) ==> FALSE
)

This prints as:

x match {
  case x if x < 10 => true
  case _ => false
}

Case Sequence Functions

Case sequence functions are defined by listing CASE(...) clauses in a BLOCK(...):

BLOCK(
  CASE(SOME(ID("x"))) ==> REF("x"),
  CASE(NONE) ==> LIT(0)
)

This prints as:

{
  case Some(x) => x
  case None => 0
}

Pattern Values

Pattern values are defined by placing a pattern in VAL(...) or VAR(...):

VAL(REF("Address") UNAPPLY
  (ID("name"), ID("street"), ID("city"))) := REF("x")
VAR(SOME(ID("y"))) := SOME(LIT(1))

These examples print as:

val Address(name, street, city) = x
var Some(y) = Some(1)

Top-Level Definitions

Top-level definitions consists of compilation units, packagings, and package objects.

Compilation Units

Compilation units are written by calling inPackage(sym|"p") or withoutPackage on BLOCK(...):

BLOCK(
  OBJECTDEF("M")
) inPackage("p")

This prints as:

package p

object M
BLOCK(
  OBJECTDEF("M1"),
  OBJECTDEF("M2")
) withoutPackage

This prints as:

object M1

object M2

Packaging

Packagings are written using PACKAGE(sym|"p") as follows:

PACKAGE("p") := BLOCK(
  OBJECTDEF("M")
)

This prints as:

package p {
  object M
}

Package Objects

Package objects are written using PACKAGEOBJECTDEF(...):

PACKAGEOBJECTDEF("p") := BLOCK(
  OBJECTDEF("M")
)

This prints as:

package object p {
  object M 
}

Annotations

Annotations are applied to definitions, declarations, types, or expressions.

Declaration annotations

Annotations are applied to declarations using withAnnots(annot, ...). This takes vararg of AnnotationInfo, which is created using ANNOT(typ|"C", arg, ...).

CLASSDEF("C")
    withAnnots(ANNOT(SerializableAttr)) := BLOCK(
  DEF("get", IntClass) := LIT(0)
)

This prints as:

@serializable class C {
  def get: Int = 0
}

Type annotations

Type annotations are written by calling withAnnots(annot, ...) on TYPEVAR(...).

val annot = ANNOT(SpecializedClass, REF(IntClass))

TRAITDEF("Function0")
    withTypeParams(TYPEVAR("T") withAnnots(annot)) := BLOCK(
  DEF("apply", "T")
)

This prints as:

trait Function0[@specialized(Int) T] {
  def apply: T
}

Annotated expressions

Annotated expressions are written as tree withAnnots(annot, ...):

REF("e") withAnnots(ANNOT(UncheckedClass))

This prints as:

(e: @unchecked)

Standard Library

treehugger DSL provides some built-in functions for commonly used functions.

Value Class Operators

Here are the operators that form infix application tree when called on a symbol or a tree.

Boolean operators

lhs OR rhs                   // ||
lhs AND rhs                  // &&

Any operators

lhs ANY_== rhs               // ==
lhs ANY_!= rhs               // !=
lhs ANY_-> rhs               // ->

AnyRef operators

lhs OBJ_EQ rhs               // eq
lhs OBJ_NE rhs               // ne

Numeric operators

lhs INT_| rhs                // |
lhs INT_& rhs                // &
lhs INT_< rhs                // <
lhs INT_> rhs                // >
lhs INT_<= rhs               // <=
lhs INT_>= rhs               // >=
lhs INT_+ rhs                // +
lhs INT_- rhs                // -
lhs INT_* rhs                // *
lhs INT_/ rhs                // /
lhs INT_TO rhs               // TO

Collection Class Operators

List operators

lhs LIST_:: rhs              // ::
lhs LIST_::: rhs             // :::
lhs SEQ_++ rhs               // ++
lhs SEQ_/: rhs               // /:
lhs SEQ_:\ rhs               // :\

Collection operators

When the right-hand side expression is an anonymous function block or a case sequence function, the collection operators form an infix application tree. Otherwise, they form a selection or an application tree:

REF("list") MAP LAMBDA(PARAM("x")) ==> BLOCK(
  REF("x") INT_+ LIT(1)
)

REF("list") MAP (WILDCARD INT_+ LIT(1))

These print as:

list map { x =>
  x + 1
}

foo.map(_ + 1)

Here are the operators:

lhs FOREACH rhs              // foreach
lhs MAP rhs                  // map
lhs FLATMAP rhs              // flatMap
lhs COLLECT rhs              // collect
lhs FIND rhs                 // find
lhs SLICE (from, to)         // slice
lhs TAKE rhs                 // take
lhs DROP rhs                 // drop
lhs TAKEWHILE rhs            // take
lhs DROPWHILE rhs            // drop
lhs FILTER rhs               // filter
lhs WITHFILTER rhs           // withFilter
lhs FILTERNOT rhs            // filterNot
lhs SPLITAT rhs              // splitAt
lhs SPAN rhs                 // span
lhs PARTITION rhs            // partition
lhs GROUPBY rhs              // groupBy
lhs FORALL rhs               // forall
lhs EXISTS rhs               // exists
lhs COUNT rhs                // count
lhs FOLDLEFT rhs             // foldLeft
lhs FOLDRIGHT rhs            // foldRight
lhs REDUCELEFT rhs           // reduceLeft
lhs REDUCERIGHT rhs          // reduceRight

Collection Constructors

Here are built-in constructors and values:

LIST(arg, ...)               // List
NIL                          // Nil
pat1 UNLIST_:: pat2          // ::
SOME(arg, ...)               // Some
NONE                         // None
RIGHT(arg)                   // Right
LEFT(arg)                    // Left
ARRAY(arg, ...)              // Array
SEQ(arg, ...)                // Seq
VECTOR(arg, ...)             // Vector
MAKE_MAP(k ANY_-> v, ...)    // Map

Type Constructors

Here are built-in type constructors:

TYPE_ITERATOR(typ)           // Iterator[A]
TYPE_TUPLE(typ, ...)         // Tuple2[A, B, ...]
TYPE_ARRAY(typ)              // Array[A]
TYPE_LIST(typ)               // List[A]
TYPE_SEQ(typ)                // Seq[A]
TYPE_VECTOR(typ)             // Vector[A]
TYPE_MAP(typ1, typ2)         // Map[A, B]
TYPE_OPTION(typ)             // Option[A]
TYPE_EITHER(typ1, typ2)      // Either[A, B]
TYPE_RIGHT(typ1, typ2)       // Right[A, B]
TYPE_LEFT(typ1, typ2)        // Left[A, B]
TYPE_SOME(typ)               // Some[A]
TYPE_ORDERED(typ)            // Ordered[A]
typ1 TYPE_=> typ2            // Function1[A, R]
TYPE_FUNCTION(typ, ..., r)   // Function1[A, R]
TYPE_=:=(typ1, typ2)         // =:=[A, B]
TYPE_<:<(typ1, typ2)         // <:<[A, B]
TYPE_<%<(typ1, typ2)         // <%<[A, B]