Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Dada Language Specification

This document specifies the Dada programming language. It serves as the authoritative reference for language behavior and implementation requirements.

Purpose

The specification defines the syntax, semantics, and behavior of Dada programs. It is intended for:

  • Language implementers
  • Tool developers
  • Advanced users seeking precise language details

Relationship to RFCs

This specification incorporates accepted RFCs. Non-normative references to RFCs provide historical context and rationale for design decisions.

Conventions

This chapter describes the conventions used throughout this specification.

Paragraph References

Specification paragraphs use MyST directive syntax with the {spec} directive:

<div class="spec-paragraph" id="conventions.paragraph-references.local-name">
<div class="spec-label"><a href="#conventions.paragraph-references.local-name" class="spec-label-link">conventions.paragraph-references.local-name</a> <span class="spec-rfc-badge">rfc123</span></div>
<div class="spec-content">

Paragraph content.

</div>
</div>

ID Resolution

Paragraph IDs are resolved automatically from context:

  1. File path: syntax/string-literals.md contributes prefix syntax.string-literals
  2. Section headings: ## Escape Sequences contributes segment escape-sequences
  3. Local name: The name in the :::{spec} directive (e.g., invalid)

These combine to form the full ID: syntax.string-literals.escape-sequences.invalid

The local name is optional. A directive with only tags uses the heading context as its ID:

## Type

<div class="spec-paragraph" id="conventions.type">
<div class="spec-label"><a href="#conventions.type" class="spec-label-link">conventions.type</a> <span class="spec-rfc-badge">rfc0001</span> <span class="spec-rfc-badge spec-rfc-unimpl">unimpl</span></div>
<div class="spec-content">

String literals have type `my String`.

</div>
</div>

This paragraph’s ID is syntax.string-literals.type (file prefix + heading).

Inline Sub-paragraphs

List items within a :::{spec} block can be marked as individually referenceable sub-paragraphs using the {spec}`name` syntax:

<div class="spec-paragraph" id="conventions.type.inline-sub-paragraphs">
<div class="spec-label"><a href="#conventions.type.inline-sub-paragraphs" class="spec-label-link">conventions.type.inline-sub-paragraphs</a> <span class="spec-rfc-badge">rfc0001</span> <span class="spec-rfc-badge spec-rfc-unimpl">unimpl</span></div>
<div class="spec-content">

There are multiple forms of string literals:

* <span id="conventions.type.inline-sub-paragraphs.quoted" class="spec-sub-paragraph"><a href="#conventions.type.inline-sub-paragraphs.quoted" class="spec-sub-label">.quoted</a></span> Single-quoted string literals begin with `"` and end with `"`.
* <span id="conventions.type.inline-sub-paragraphs.triple-quoted" class="spec-sub-paragraph"><a href="#conventions.type.inline-sub-paragraphs.triple-quoted" class="spec-sub-label">.triple-quoted</a></span> Triple-quoted string literals begin with `"""` and end with `"""`.

</div>
</div>

Under ## Delimiters in syntax/string-literals.md, this creates:

  • syntax.string-literals.delimiters (parent paragraph)
  • syntax.string-literals.delimiters.quoted (sub-paragraph)
  • syntax.string-literals.delimiters.triple-quoted (sub-paragraph)

Each sub-paragraph gets its own linkable anchor in the rendered output.

RFC and Status Annotations

Paragraphs include tags after the optional local name:

<div class="spec-paragraph" id="conventions.type.rfc-and-status-annotations.local-name">
<div class="spec-label"><a href="#conventions.type.rfc-and-status-annotations.local-name" class="spec-label-link">conventions.type.rfc-and-status-annotations.local-name</a> <span class="spec-rfc-badge">rfc123</span> <span class="spec-rfc-badge spec-rfc-unimpl">unimpl</span></div>
<div class="spec-content">

Content added by RFC 123, not yet implemented.

</div>
</div>

Available tags:

  • rfcN — content added or modified by RFC N
  • !rfcN — content deleted by RFC N
  • unimpl — specified but not yet implemented

Multiple tags can be combined: :::{spec} local-name rfc123 rfc456 unimpl

Test Annotations

Tests reference spec paragraphs using #:spec comments with the fully-qualified ID:

#:spec syntax.string-literals.delimiters.quoted

These labels serve multiple purposes:

  • Cross-referencing within the specification
  • Linking from RFC documents
  • Test validation via #:spec annotations in .dada test files

Identifiers use semantic names rather than numbers to remain stable as the specification evolves.

EBNF Notation

This specification uses Extended Backus-Naur Form (EBNF) to describe syntax. Standard EBNF operators apply:

  • A* — zero or more repetitions of A
  • A+ — one or more repetitions of A
  • A? — optional A
  • A | B — A or B
  • `keyword` — a literal terminal
  • ε — the empty production

In addition, this specification uses the following shorthand for comma-separated lists with optional trailing commas:

  • A,* — zero or more comma-separated occurrences of A
  • A,+ — one or more comma-separated occurrences of A

Normative Language

This specification uses the following terms to indicate requirements:

  • must: An absolute requirement
  • must not: An absolute prohibition
  • should: A strong recommendation
  • should not: A strong recommendation against
  • may: An optional feature or behavior

Syntax

This chapter describes the lexical structure and grammar of Dada programs.

Lexical Structure

This chapter specifies the lexical structure of Dada programs. A Dada source file is a sequence of Unicode characters, which the lexer converts into a sequence of tokens.

Source Encoding

Dada source files are encoded as UTF-8.

Tokens

The lexer produces a sequence of tokens:


Token ::= Identifier
          | Keyword
          | Literal
          | Operator
          | Delimiter

A token Token is one of the following kinds:

Each token records whether it was preceded by whitespace, a newline, or a comment. This information is used by the parser but does not produce separate tokens.

Whitespace and Comments

Whitespace

Whitespace characters (spaces, tabs, and other Unicode whitespace excluding newlines) separate tokens but are otherwise not significant.

Newline characters (\n) are tracked by the lexer. Whether a token is preceded by a newline may affect how the parser interprets certain constructs.

Comments

A comment begins with # and extends to the end of the line.

The content of a comment, including the leading #, is ignored by the lexer. A comment implies a newline for the purpose of preceding-whitespace tracking.

Identifier definition

An identifier Identifier begins with a Unicode alphabetic character or underscore (_), followed by zero or more Unicode alphanumeric characters or underscores, provided it is not a keyword Keyword:


Identifier ::= (Alphabetic | _) (Alphanumeric | _)*    (not a Keyword)

Keyword definition

The following words are reserved as keywords:


Keyword ::= as
            | async
            | await
            | class
            | else
            | enum
            | export
            | false
            | fn
            | give
            | given
            | if
            | is
            | let
            | match
            | mod
            | mut
            | my
            | our
            | perm
            | pub
            | ref
            | return
            | self
            | share
            | shared
            | struct
            | true
            | type
            | unsafe
            | use
            | where

Operator definition

The following single characters are recognized as operator tokens:


Operator ::= + | - | * | / | % | = | !
           | < | > | & | | | : | , | . | ; | ?

Multi-character operators such as &&, ||, ==, <=, >=, and -> are formed by the parser from adjacent operator tokens.

Delimiter definition

A delimited token contains a matched pair of brackets and their contents:


Delimiter ::= ( Token* ) | [ Token* ] | { Token* }

Delimiters must be balanced. An opening delimiter without a matching closing delimiter is an error.

The lexer tracks delimiter nesting. Content between matching delimiters is treated as a unit, which enables deferred parsing of function bodies and other nested structures.

Literal definition

IntegerLiteral definition

An integer literal IntegerLiteral is a sequence of one or more ASCII decimal digits (09), optionally separated by underscores (_) that do not affect the value:


IntegerLiteral ::= Digit (_? Digit)*
Digit ::= 0 | 1 | ... | 9

BooleanLiteral definition

The keywords true and false are boolean literals:


BooleanLiteral ::= true | false

StringLiteral definition

Lexical Errors

Characters that do not begin a valid token are accumulated and reported as a single error spanning the invalid sequence.

Items

This chapter specifies the top-level items that can appear in a Dada source file.

Source Files

A Dada source file defines a module. The module name is derived from the file name. A source file contains zero or more items, optionally followed by zero or more statements:


SourceFile ::= Item* Statement*

If a source file contains top-level statements, they are wrapped in an implicit async fn main() function.

An item Item is one of the following:


Item ::= Function
         | Class
         | Struct
         | UseDeclaration

Visibility definition

Items and fields may have a visibility modifier. Without a modifier, the item is private to the enclosing module.


Visibility ::= pub
               | export
               | ε

Function definition

A function Function is declared with the fn keyword, optionally preceded by effect keywords and followed by a name, optional generic parameters, parameters, optional return type, optional where clause, and a body or semicolon:


Function ::= Visibility Effect* fn Identifier GenericParameters?
             ( Parameters ) ReturnType? WhereClause? FunctionBody

Effect definition

Effect keywords may appear in any order before fn:


Effect ::= async
           | unsafe

Parameters definition

Function parameters are enclosed in parentheses and separated by commas:


Parameters ::= FunctionInput,*
FunctionInput ::= SelfParameter | Parameter

A function may have a self parameter as its first parameter, optionally preceded by a permission keyword, which makes it a method:


SelfParameter ::= PermissionKeyword? self

Each non-self parameter has the form name: Type. A parameter may be preceded by mut to declare a mutable binding:


Parameter ::= mut? Identifier : Type

FunctionBody definition

A function may have a body, which is a block enclosed in curly braces. If no body is present, the function has no definition.


FunctionBody ::= Block | ε

ReturnType definition

A function may declare a return type with -> followed by a Type after the parameters.

ReturnType ::= -> Type

GenericParameters definition

A function may declare generic parameters in square brackets after the name:


GenericParameters ::= [ GenericParameter,* ]
GenericParameter ::= type Identifier | perm Identifier

WhereClause definition

A function may have a where clause after the return type that constrains its generic parameters:


WhereClause ::= where WhereConstraint,+
WhereConstraint ::= Type is WhereKind (+ WhereKind)*
WhereKind ::= ref
              | mut
              | shared
              | unique
              | owned
              | lent

Class definition

A class Class is declared with the class keyword. Classes have reference semantics.


Class ::= Visibility class Identifier GenericParameters?
          ConstructorFields? WhereClause? ClassBody?

ConstructorFields definition

A class may declare constructor fields in parentheses after the name:


ConstructorFields ::= ( Field,* )

ClassBody definition

A class body enclosed in curly braces may contain field declarations and method definitions:


ClassBody ::= { ClassMember* }
ClassMember ::= Field
                | Method

Method definition

A method Method is a function declared inside a class or struct body:


Method ::= Function

Field definition

A field declaration Field has the form:


Field ::= Visibility mut? Identifier : Type

Generics and Where Clauses

Classes support generic parameters and where clauses with the same syntax as functions.

Struct definition

A struct Struct is declared with the struct keyword. The syntax is identical to Class but structs have value semantics.


Struct ::= Visibility struct Identifier GenericParameters?
           ConstructorFields? WhereClause? ClassBody?

UseDeclaration definition

A use declaration UseDeclaration imports a name from another crate, optionally renaming it with as:


UseDeclaration ::= use Path (as Identifier)?
Path ::= Identifier (. Identifier)*

Statements

This chapter specifies the statement syntax of Dada.

Block definition

A block Block is a sequence of zero or more statements enclosed in curly braces:


Block ::= { Statement* }

A block evaluates to the value of its last expression, if the last statement is an expression statement.

Statement definition

A statement Statement is one of the following:


Statement ::= LetStatement
              | ExprStatement

LetStatement definition

A let statement LetStatement introduces a new variable binding:


LetStatement ::= let mut? Identifier (: Type)? (= Expr)?

A let statement may include a type annotation: let name: Type = value.

A let statement may use mut to declare a mutable binding: let mut name = value.

The initializer (= value) is optional. A variable may be declared without an initial value.

ExprStatement definition

An expression statement ExprStatement is an expression followed by a newline or end of block:


ExprStatement ::= Expr

Expressions

This chapter specifies the expression syntax of Dada.

Expr definition

An expression Expr is parsed using precedence climbing. From lowest to highest precedence:


Expr ::= AssignExpr

AssignExpr definition

The assignment operator = assigns a value to a place expression. It has the lowest precedence among binary operators:


AssignExpr ::= OrExpr

OrExpr definition

The logical OR operator || performs short-circuit boolean logic:


OrExpr ::= AndExpr

AndExpr definition

The logical AND operator && performs short-circuit boolean logic:


AndExpr ::= CompareExpr

CompareExpr definition

The comparison operators compare two values and produce a boolean result:


CompareExpr ::= AddExpr

CompareOp ::= == | < | > | <= | >=

AddExpr definition

The additive operators perform addition and subtraction:


AddExpr ::= MulExpr

MulExpr definition

The multiplicative operators perform multiplication and division:


MulExpr ::= UnaryExpr

UnaryExpr definition

A unary expression applies a prefix operator to a postfix expression:


UnaryExpr ::= UnaryOp* PostfixExpr
UnaryOp ::= ! | -
  • .not ! performs logical negation.
  • .negate - performs arithmetic negation.

Newline Sensitivity

A binary operator must appear on the same line as its left operand. An operator on a new line begins a new expression or is interpreted as a prefix operator.

PostfixExpr definition

A postfix expression applies zero or more postfix operators to a primary expression:


PostfixExpr ::= PrimaryExpr PostfixOp*

PostfixOp definition

A postfix operator PostfixOp is one of the following:


PostfixOp ::= FieldAccess
              | Call
              | Await
              | PermissionOp

FieldAccess definition

A field access FieldAccess uses dot notation to access a field or name a method:


FieldAccess ::= . Identifier

Call definition

A function or method call Call follows an expression with parenthesized arguments separated by commas. The opening parenthesis must appear on the same line as the callee:


Call ::= ( Expr,* )

Await definition

The .await postfix operator awaits the result of a future:


Await ::= . await

PermissionOp definition

A permission operation PermissionOp requests specific permissions on a value:


PermissionOp ::= give
                 | share
                 | lease
                 | ref

PrimaryExpr definition

A primary expression PrimaryExpr is one of the following:


PrimaryExpr ::= Literal
                | identifier
                | self
                | IfExpr
                | ReturnExpr
                | ConstructorExpr
                | paren-expr
                | block-expr

IfExpr definition

An if expression IfExpr evaluates a condition and executes a block:


IfExpr ::= if Expr Block (else if Expr Block)* (else Block)?

An if expression may have an else clause.

Multiple conditions may be chained with else if.

ReturnExpr definition

A return expression ReturnExpr exits the enclosing function, optionally with a value. The value, if present, must appear on the same line as return:


ReturnExpr ::= return Expr?

ConstructorExpr definition

A constructor expression ConstructorExpr creates a new instance of a class or struct. The opening brace must appear on the same line as the type name:


ConstructorExpr ::= Identifier { ConstructorField,* }
ConstructorField ::= Identifier : Expr

Types and Permissions

This chapter specifies the syntax for types and permissions in Dada.

Types

Named Types

A type may be a simple name: String, i32, bool.

A type may be a dotted path: module.Type.

Generic Application

A type may be applied to generic arguments in square brackets: Vec[String], Pair[i32, bool].

Permission-Qualified Types

A type may be preceded by a permission to form a permission-qualified type: my String, ref Point, mut Vec[i32].

Permissions

The following permission keywords are available:

  • .my my — exclusive ownership.
  • .our our — shared ownership.
  • .ref ref — immutable reference.
  • .mut mut — mutable reference.
  • .given given — a permission supplied by the caller.

Place Lists

The permissions ref, mut, and given may include a place list in square brackets specifying which places they refer to: ref[x, y], mut[self], given[p].

The place list is optional. When omitted, the permission applies without place restrictions.

Generic Declarations

In Type Position

A generic type parameter is declared as type T.

Ambiguity

A single identifier in a generic position is ambiguous between a type and a permission. The ambiguity is resolved during type checking, not parsing.

Literals

This chapter describes literal expressions in Dada.

Numeric Literals

See Integer Literals for lexical syntax.

Numeric type inference and overflow behavior to be specified.

Boolean Literals

See Boolean Literals for lexical syntax.

String Literals

See String Literals for detailed specification.

String Literals

This chapter specifies string literal syntax in Dada.

Delimiters

There are multiple forms of string literals:

  • .quoted Single-quoted string literals begin with a " and end with a ".
  • .triple-quoted Triple-quoted string literals begin with a """ and end with a """.

The syntax """ is interpreted as the start of a triple-quoted string literal and not a single-quoted string literal followed by the start of another single-quoted string literal.

A triple-quoted string literal cannot contain three consecutive unescaped double-quote characters.

Type

String literals have type my String.

Escape Sequences

String literals support the following escape sequences:

The \" escape sequence is not needed in triple-quoted strings, since embedded double quotes do not terminate the string.

A \ followed by a character not listed above is an error.

Interpolation

String literals may contain interpolation expressions delimited by curly braces ({ and }). Any valid Dada expression may appear inside the braces.

Literal brace characters are produced by the \{ and \} escape sequences.

The lexer tracks brace nesting depth, so that braces within interpolated expressions (e.g., block expressions, struct literals) do not prematurely terminate the interpolation.

Quotes inside interpolated expressions do not terminate the enclosing string literal.

Interpolated expressions are evaluated at runtime in the enclosing scope.

Interpolated expressions are evaluated left-to-right.

Each interpolated expression must produce a value that can be converted to a string. This is checked at compile time.

The permission system applies normally to interpolated expressions.

Multiline Strings

A string literal that begins with a newline immediately after the opening quote (either " or """) is a multiline string literal with automatic indentation handling.

The leading newline immediately after the opening quote is removed.

The trailing newline immediately before the closing quote is removed, along with any whitespace on the final line.

The common whitespace prefix across all non-empty lines is removed from the start of each line.

Escape sequences are part of the string content, not whitespace. They are not affected by leading/trailing stripping or dedenting.

A string literal beginning with "\ followed by a newline disables automatic dedenting. The string preserves its content exactly as written, including the leading newline and all indentation.

String Conversion

Interpolated expressions must produce values that can be converted to strings. The exact conversion mechanism is not yet defined and depends on Dada’s trait/interface system.

Implementation Notes

A string literal with no interpolation expressions can be compiled as a simple string constant with no runtime overhead.