Dada Language Specification
This document specifies the Dada programming language. It serves as the authoritative reference for language behavior and implementation requirements.
Purpose
The specification defines the syntax, semantics, and behavior of Dada programs. It is intended for:
- Language implementers
- Tool developers
- Advanced users seeking precise language details
Relationship to RFCs
This specification incorporates accepted RFCs. Non-normative references to RFCs provide historical context and rationale for design decisions.
Conventions
This chapter describes the conventions used throughout this specification.
Paragraph References
Specification paragraphs use MyST directive syntax with the {spec} directive:
<div class="spec-paragraph" id="conventions.paragraph-references.local-name">
<div class="spec-label"><a href="#conventions.paragraph-references.local-name" class="spec-label-link">conventions.paragraph-references.local-name</a> <span class="spec-rfc-badge">rfc123</span></div>
<div class="spec-content">
Paragraph content.
</div>
</div>
ID Resolution
Paragraph IDs are resolved automatically from context:
- File path:
syntax/string-literals.mdcontributes prefixsyntax.string-literals - Section headings:
## Escape Sequencescontributes segmentescape-sequences - Local name: The name in the
:::{spec}directive (e.g.,invalid)
These combine to form the full ID: syntax.string-literals.escape-sequences.invalid
The local name is optional. A directive with only tags uses the heading context as its ID:
## Type
<div class="spec-paragraph" id="conventions.type">
<div class="spec-label"><a href="#conventions.type" class="spec-label-link">conventions.type</a> <span class="spec-rfc-badge">rfc0001</span> <span class="spec-rfc-badge spec-rfc-unimpl">unimpl</span></div>
<div class="spec-content">
String literals have type `my String`.
</div>
</div>
This paragraph’s ID is syntax.string-literals.type (file prefix + heading).
Inline Sub-paragraphs
List items within a :::{spec} block can be marked as individually referenceable
sub-paragraphs using the {spec}`name` syntax:
<div class="spec-paragraph" id="conventions.type.inline-sub-paragraphs">
<div class="spec-label"><a href="#conventions.type.inline-sub-paragraphs" class="spec-label-link">conventions.type.inline-sub-paragraphs</a> <span class="spec-rfc-badge">rfc0001</span> <span class="spec-rfc-badge spec-rfc-unimpl">unimpl</span></div>
<div class="spec-content">
There are multiple forms of string literals:
* <span id="conventions.type.inline-sub-paragraphs.quoted" class="spec-sub-paragraph"><a href="#conventions.type.inline-sub-paragraphs.quoted" class="spec-sub-label">.quoted</a></span> Single-quoted string literals begin with `"` and end with `"`.
* <span id="conventions.type.inline-sub-paragraphs.triple-quoted" class="spec-sub-paragraph"><a href="#conventions.type.inline-sub-paragraphs.triple-quoted" class="spec-sub-label">.triple-quoted</a></span> Triple-quoted string literals begin with `"""` and end with `"""`.
</div>
</div>
Under ## Delimiters in syntax/string-literals.md, this creates:
syntax.string-literals.delimiters(parent paragraph)syntax.string-literals.delimiters.quoted(sub-paragraph)syntax.string-literals.delimiters.triple-quoted(sub-paragraph)
Each sub-paragraph gets its own linkable anchor in the rendered output.
RFC and Status Annotations
Paragraphs include tags after the optional local name:
<div class="spec-paragraph" id="conventions.type.rfc-and-status-annotations.local-name">
<div class="spec-label"><a href="#conventions.type.rfc-and-status-annotations.local-name" class="spec-label-link">conventions.type.rfc-and-status-annotations.local-name</a> <span class="spec-rfc-badge">rfc123</span> <span class="spec-rfc-badge spec-rfc-unimpl">unimpl</span></div>
<div class="spec-content">
Content added by RFC 123, not yet implemented.
</div>
</div>
Available tags:
rfcN— content added or modified by RFC N!rfcN— content deleted by RFC Nunimpl— specified but not yet implemented
Multiple tags can be combined: :::{spec} local-name rfc123 rfc456 unimpl
Test Annotations
Tests reference spec paragraphs using #:spec comments with the fully-qualified ID:
#:spec syntax.string-literals.delimiters.quoted
These labels serve multiple purposes:
- Cross-referencing within the specification
- Linking from RFC documents
- Test validation via
#:specannotations in.dadatest files
Identifiers use semantic names rather than numbers to remain stable as the specification evolves.
EBNF Notation
This specification uses Extended Backus-Naur Form (EBNF) to describe syntax. Standard EBNF operators apply:
A*— zero or more repetitions of AA+— one or more repetitions of AA?— optional AA | B— A or B`keyword`— a literal terminalε— the empty production
In addition, this specification uses the following shorthand for comma-separated lists with optional trailing commas:
A,*— zero or more comma-separated occurrences of AA,+— one or more comma-separated occurrences of A
Normative Language
This specification uses the following terms to indicate requirements:
- must: An absolute requirement
- must not: An absolute prohibition
- should: A strong recommendation
- should not: A strong recommendation against
- may: An optional feature or behavior
Syntax
This chapter describes the lexical structure and grammar of Dada programs.
Lexical Structure
This chapter specifies the lexical structure of Dada programs. A Dada source file is a sequence of Unicode characters, which the lexer converts into a sequence of tokens.
Source Encoding
Dada source files are encoded as UTF-8.
Tokens
The lexer produces a sequence of tokens:
Token ::= Identifier
| Keyword
| Literal
| Operator
| Delimiter
A token Token is one of the following kinds:
Each token records whether it was preceded by whitespace, a newline, or a comment. This information is used by the parser but does not produce separate tokens.
Whitespace and Comments
Whitespace
Whitespace characters (spaces, tabs, and other Unicode whitespace excluding newlines) separate tokens but are otherwise not significant.
Newline characters (\n) are tracked by the lexer.
Whether a token is preceded by a newline
may affect how the parser interprets certain constructs.
Comments
A comment begins with # and extends to the end of the line.
The content of a comment, including the leading #, is ignored by the lexer.
A comment implies a newline for the purpose of preceding-whitespace tracking.
Identifier definition
An identifier Identifier begins with a Unicode alphabetic character or underscore (_),
followed by zero or more Unicode alphanumeric characters or underscores,
provided it is not a keyword Keyword:
Identifier ::= (Alphabetic | _) (Alphanumeric | _)* (not a Keyword)
Identifiers are case-sensitive.
Keyword definition
The following words are reserved as keywords:
Keyword ::= as
| async
| await
| class
| else
| enum
| export
| false
| fn
| give
| given
| if
| is
| let
| match
| mod
| mut
| my
| our
| perm
| pub
| ref
| return
| self
| share
| shared
| struct
| true
| type
| unsafe
| use
| where
Operator definition
The following single characters are recognized as operator tokens:
Operator ::= + | - | * | / | % | = | !
| < | > | & | | | : | , | . | ; | ?
- .plus
+ - .minus
- - .star
* - .slash
/ - .percent
% - .equals
= - .bang
! - .less-than
< - .greater-than
> - .ampersand
& - .pipe
| - .colon
: - .comma
, - .dot
. - .semicolon
; - .question
?
Multi-character operators such as &&, ||, ==, <=, >=, and ->
are formed by the parser from adjacent operator tokens.
Delimiter definition
A delimited token contains a matched pair of brackets and their contents:
Delimiter ::= ( Token* ) | [ Token* ] | { Token* }
- .parentheses Parentheses:
(and). - .square-brackets Square brackets:
[and]. - .curly-braces Curly braces:
{and}.
Delimiters must be balanced. An opening delimiter without a matching closing delimiter is an error.
The lexer tracks delimiter nesting. Content between matching delimiters is treated as a unit, which enables deferred parsing of function bodies and other nested structures.
Literal definition
A literal Literal is one of the following:
Literal ::= IntegerLiteral
| BooleanLiteral
| StringLiteral
IntegerLiteral definition
An integer literal IntegerLiteral is a sequence of one or more ASCII decimal digits (0–9),
optionally separated by underscores (_) that do not affect the value:
IntegerLiteral ::= Digit (_? Digit)*
Digit ::= 0 | 1 | ... | 9
BooleanLiteral definition
The keywords true and false are boolean literals:
BooleanLiteral ::= true | false
StringLiteral definition
String literal syntax is specified in String Literals.
Lexical Errors
Characters that do not begin a valid token are accumulated and reported as a single error spanning the invalid sequence.
Items
This chapter specifies the top-level items that can appear in a Dada source file.
Source Files
A Dada source file defines a module. The module name is derived from the file name. A source file contains zero or more items, optionally followed by zero or more statements:
SourceFile ::= Item* Statement*
If a source file contains top-level statements,
they are wrapped in an implicit async fn main() function.
An item Item is one of the following:
Item ::= Function
| Class
| Struct
| UseDeclaration
Visibility definition
Items and fields may have a visibility modifier. Without a modifier, the item is private to the enclosing module.
Visibility ::= pub
| export
| ε
Function definition
A function Function is declared with the fn keyword,
optionally preceded by effect keywords
and followed by a name, optional generic parameters,
parameters, optional return type, optional where clause,
and a body or semicolon:
Function ::= Visibility Effect* fn Identifier GenericParameters?
( Parameters ) ReturnType? WhereClause? FunctionBody
Effect definition
Effect keywords may appear in any order before fn:
Effect ::= async
| unsafe
Parameters definition
Function parameters are enclosed in parentheses and separated by commas:
Parameters ::= FunctionInput,*
FunctionInput ::= SelfParameter | Parameter
A function may have a self parameter as its first parameter,
optionally preceded by a permission keyword,
which makes it a method:
SelfParameter ::= PermissionKeyword? self
Each non-self parameter has the form name: Type.
A parameter may be preceded by mut
to declare a mutable binding:
Parameter ::= mut? Identifier : Type
FunctionBody definition
A function may have a body, which is a block enclosed in curly braces. If no body is present, the function has no definition.
FunctionBody ::= Block | ε
ReturnType definition
A function may declare a return type with -> followed by a Type after the parameters.
ReturnType ::= -> Type
GenericParameters definition
A function may declare generic parameters in square brackets after the name:
GenericParameters ::= [ GenericParameter,* ]
GenericParameter ::= type Identifier | perm Identifier
- .type-parameters A type parameter
typefollowed by a name:type T. - .permission-parameters A permission parameter
permfollowed by a name:perm P.
WhereClause definition
A function may have a where clause after the return type
that constrains its generic parameters:
WhereClause ::= where WhereConstraint,+
WhereConstraint ::= Type is WhereKind (+ WhereKind)*
WhereKind ::= ref
| mut
| shared
| unique
| owned
| lent
Class definition
A class Class is declared with the class keyword.
Classes have reference semantics.
Class ::= Visibility class Identifier GenericParameters?
ConstructorFields? WhereClause? ClassBody?
ConstructorFields definition
A class may declare constructor fields in parentheses after the name:
ConstructorFields ::= ( Field,* )
ClassBody definition
Method definition
Field definition
A field declaration Field has the form:
Field ::= Visibility mut? Identifier : Type
Generics and Where Clauses
Classes support generic parameters and where clauses with the same syntax as functions.
Struct definition
A struct Struct is declared with the struct keyword.
The syntax is identical to Class but structs have value semantics.
Struct ::= Visibility struct Identifier GenericParameters?
ConstructorFields? WhereClause? ClassBody?
UseDeclaration definition
A use declaration UseDeclaration imports a name from another crate,
optionally renaming it with as:
UseDeclaration ::= use Path (as Identifier)?
Path ::= Identifier (. Identifier)*
Statements
This chapter specifies the statement syntax of Dada.
Block definition
A block evaluates to the value of its last expression, if the last statement is an expression statement.
Statement definition
A statement Statement is one of the following:
Statement ::= LetStatement
| ExprStatement
LetStatement definition
A let statement LetStatement introduces a new variable binding:
LetStatement ::= let mut? Identifier (: Type)? (= Expr)?
A let statement may include a type annotation: let name: Type = value.
A let statement may use mut to declare a mutable binding:
let mut name = value.
The initializer (= value) is optional.
A variable may be declared without an initial value.
ExprStatement definition
An expression statement ExprStatement is an expression
followed by a newline or end of block:
ExprStatement ::= Expr
Expressions
This chapter specifies the expression syntax of Dada.
Expr definition
An expression Expr is parsed using precedence climbing.
From lowest to highest precedence:
Expr ::= AssignExpr
AssignExpr definition
The assignment operator = assigns a value to a place expression.
It has the lowest precedence among binary operators:
AssignExpr ::= OrExpr
OrExpr definition
AndExpr definition
The logical AND operator && performs short-circuit boolean logic:
AndExpr ::= CompareExpr
CompareExpr definition
The comparison operators compare two values and produce a boolean result:
CompareExpr ::= AddExpr
CompareOp ::= == | < | > | <= | >=
AddExpr definition
MulExpr definition
UnaryExpr definition
A unary expression applies a prefix operator to a postfix expression:
UnaryExpr ::= UnaryOp* PostfixExpr
UnaryOp ::= ! | -
Newline Sensitivity
A binary operator must appear on the same line as its left operand. An operator on a new line begins a new expression or is interpreted as a prefix operator.
PostfixExpr definition
A postfix expression applies zero or more postfix operators to a primary expression:
PostfixExpr ::= PrimaryExpr PostfixOp*
PostfixOp definition
A postfix operator PostfixOp is one of the following:
PostfixOp ::= FieldAccess
| Call
| Await
| PermissionOp
FieldAccess definition
A field access FieldAccess uses dot notation to access a field or name a method:
FieldAccess ::= . Identifier
Call definition
Await definition
The .await postfix operator awaits the result of a future:
Await ::= . await
PermissionOp definition
A permission operation PermissionOp requests specific permissions on a value:
PermissionOp ::= give
| share
| lease
| ref
PrimaryExpr definition
A primary expression PrimaryExpr is one of the following:
PrimaryExpr ::= Literal
| identifier
| self
| IfExpr
| ReturnExpr
| ConstructorExpr
| paren-expr
| block-expr
IfExpr definition
An if expression may have an else clause.
Multiple conditions may be chained with else if.
ReturnExpr definition
A return expression ReturnExpr exits the enclosing function,
optionally with a value.
The value, if present, must appear on the same line as return:
ReturnExpr ::= return Expr?
ConstructorExpr definition
A constructor expression ConstructorExpr creates a new instance
of a class or struct.
The opening brace must appear on the same line as the type name:
ConstructorExpr ::= Identifier { ConstructorField,* }
ConstructorField ::= Identifier : Expr
Types and Permissions
This chapter specifies the syntax for types and permissions in Dada.
Types
Named Types
A type may be a simple name: String, i32, bool.
A type may be a dotted path: module.Type.
Generic Application
A type may be applied to generic arguments in square brackets:
Vec[String], Pair[i32, bool].
Permission-Qualified Types
A type may be preceded by a permission to form a permission-qualified type:
my String, ref Point, mut Vec[i32].
Permissions
Place Lists
The permissions ref, mut, and given may include a place list
in square brackets specifying which places they refer to:
ref[x, y], mut[self], given[p].
The place list is optional. When omitted, the permission applies without place restrictions.
Generic Declarations
In Type Position
A generic type parameter is declared as type T.
A generic permission parameter is declared as perm P.
Ambiguity
A single identifier in a generic position is ambiguous between a type and a permission. The ambiguity is resolved during type checking, not parsing.
Literals
This chapter describes literal expressions in Dada.
Numeric Literals
See Integer Literals for lexical syntax.
Numeric type inference and overflow behavior to be specified.
Boolean Literals
See Boolean Literals for lexical syntax.
String Literals
See String Literals for detailed specification.
String Literals
This chapter specifies string literal syntax in Dada.
Delimiters
There are multiple forms of string literals:
- .quoted Single-quoted string literals begin with a
"and end with a". - .triple-quoted Triple-quoted string literals begin with a
"""and end with a""".
The syntax """ is interpreted as the start of a triple-quoted string literal
and not a single-quoted string literal followed by the start of another single-quoted string literal.
A triple-quoted string literal cannot contain three consecutive unescaped double-quote characters.
Type
String literals have type my String.
Escape Sequences
String literals support the following escape sequences:
- .backslash
\\produces a literal backslash. - .double-quote
\"produces a literal double quote. - .newline
\nproduces a newline. - .carriage-return
\rproduces a carriage return. - .tab
\tproduces a tab. - .open-brace
\{produces a literal{. - .close-brace
\}produces a literal}.
The \" escape sequence is not needed in triple-quoted strings,
since embedded double quotes do not terminate the string.
A \ followed by a character not listed above is an error.
Interpolation
String literals may contain interpolation expressions
delimited by curly braces ({ and }).
Any valid Dada expression may appear inside the braces.
Literal brace characters are produced by the \{ and \} escape sequences.
The lexer tracks brace nesting depth, so that braces within interpolated expressions (e.g., block expressions, struct literals) do not prematurely terminate the interpolation.
Quotes inside interpolated expressions do not terminate the enclosing string literal.
Interpolated expressions are evaluated at runtime in the enclosing scope.
Interpolated expressions are evaluated left-to-right.
Each interpolated expression must produce a value that can be converted to a string. This is checked at compile time.
The permission system applies normally to interpolated expressions.
Multiline Strings
A string literal that begins with a newline immediately after the opening quote
(either " or """) is a multiline string literal
with automatic indentation handling.
The leading newline immediately after the opening quote is removed.
The trailing newline immediately before the closing quote is removed, along with any whitespace on the final line.
The common whitespace prefix across all non-empty lines is removed from the start of each line.
Escape sequences are part of the string content, not whitespace. They are not affected by leading/trailing stripping or dedenting.
A string literal beginning with "\ followed by a newline
disables automatic dedenting.
The string preserves its content exactly as written,
including the leading newline and all indentation.
String Conversion
Interpolated expressions must produce values that can be converted to strings. The exact conversion mechanism is not yet defined and depends on Dada’s trait/interface system.
Implementation Notes
A string literal with no interpolation expressions can be compiled as a simple string constant with no runtime overhead.