Let's take our first steps into programming with smoλ (pronounced like "small" but with "o" instead of "a"). The language simplifies a lot of traditional programming concepts while keeping the ability to write very fast yet safe code. Some level of control is sacrificed in the process, but this means that you do not need to worry too much about details.
Our first program prints a message; it is a tradition to print "Hello world" as
a first example in all programming manuals. To set things up,
download the smol executable
from the language's latest release.
Place it alongside the std/ directory and add both the containing folder
and a C/C++ compiler (e.g., GCC) to your system PATH. Create a file named main.s
with the text below. Finally, open a terminal in the same folder and run
smol main.s.
// main.s
@include std.core
service main()
    print("Hello world!")
> smol main.s
codegen --workers 1 ██████████ 7/7 files 26ms
compile --back gcc --runtime std/runtime/auto.h 139ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
Hello world!
A bunch of messages appeared! So let's go through them in order.
First, smoλ is a compiled language, meaning that it creates executable programs containing machine instructions. To this end, it first parses your program and transforms it to a simpler form during the codegen phase. Then, that representation is turned into binary machine code using an external backend. Turning programs into machine code is broadly known as compiling them. Finally, the generated executable runs.
The last two bullet points come from running the program. They are printed by the language's default runtime, that is, the instructions embedded in the generated executable for working with the operating system. The default runtime prints a link to the language's repository so that you can report bugs (and add a star!) and then automatically selects whether the application should be single-threaded or multi-threaded.
A quick preview of the source code of our first program: //
turns the remainder of the line into a comment that is ignored.
@include std.core
adds basic functionality, like print. Then
service main() is the actual program.
Why it's called a service is a mystery that will be addressed later.
The contents of the service need to be indented one tab (or four spaces) to the right.
--back [compiler] The compiler backend that is used to compile the intermediate
C code representation produced by smoλ. The default is
the highly robust gcc, but you may want to use another compiler installed on
your system, or something like tcc (tinycc) for very fast compilation during prototyping.
You can also use a C++ compiler to allow unsafe injection of code from that language too; everything
has been configured to work with C99 or later, as well as C++11 or later.
--runtime [name] Determines how the compilation outcome will make
use of the target platform's capabilities. This may change, for example, the memory allocation
strategy for embedded devices, some of which require custom implementations of heap allocation
or require custom management of one huge preallocated memory segment. Such changes are controlled
via runtime files, which are then picked up by the standard library or other smoλ code.
Another affected characteristic is whether services are treated as parallel co-routines or
eagerly executed. The runtime's name is either a path to a .h file or the name of such a file
in the std/runtime/ directory. The default name is auto, corresponding to
std/runtime/auto.h, which chooses between an eager and a co-routine implementation
of services depending on their number. Two more runtimes are provided out of the box:
threads, which contains a co-routine implementation of services, and eager,
which calls services eagerly.
--task [name] Controls what the compiler actually does with the input code.
The default task is run, which produces and runs an executable. You can set the following options:
--workers [number] The number of threads that can be involved in
the type system resolution. This only affects compilation. The default is a single worker.
A variable is a named box that holds a value. You give it a name, then
put a value in it with the pattern variable_name = value.
In programming, we say that you assign a value to a variable.
Names cannot start with numbers or contain
spaces or special symbols other than the underscore _.
Names also cannot coincide with existing variables or other operations.
In smoλ, two consecutive underscores are not allowed either, because the language
uses the combination for some internal workings. Some valid variable names are x, employee,
my_property, _temp_computation, MyDataStructure, var123.
Below is an example where we assign a constant
text known during program creation (cstr)
to a variable.
// main.s
@include std.core
service main()
    greeting = "Hello world!"
    print(greeting)
> smol main.s
codegen --workers 1 ██████████ 7/7 files 26ms
compile --back gcc --runtime std/runtime/auto.h 139ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
Hello world!
Numbers: Numbers are values, too. There are three kinds you will use often:
u64 - whole numbers without a sign (0, 1, 2, ...). The default when writing 2 or 42.
i64 - whole numbers with a sign (-1, +0, +1, ...), obtained by transforming u64 values.
f64 - numbers with a decimal point. The default when writing 2.0 or 3.14.
These are known as unsigned integers, signed integers, and float numbers, respectively.
Notice that the type mnemonics combine the first letters of those types with the number 64.
This is to let experienced programmers know that 64 bits are used to represent the numbers under the hood
(there is some historical baggage concerning the C language on why programmers would not easily trust
us if we did not explicitly promise a number).
Given that many bits, unsigned integers can represent numbers from 0 up to 2^64-1
and signed ones can represent numbers from -2^63 up to 2^63-1.
Floats follow the IEEE 754 standard, which is typically accurate to 15-17 significant digits.
For safety, you cannot mix operations between different types of numbers. For example, you cannot subtract
a float from 0 but only from 0.0.
However, you can convert between number formats with value.type().
You would be surprised how many bugs are prevented by requiring only compatible numbers. Below are some examples of numeric operations.
Invalid float results can be detected with is_inf or is_nan,
which can be made accessible with @include std.math.
The lack of native error checking, which follows from the standard, results in performant code.
@include std.core
service main()
    print(1+2)
    print(2/3) // unsigned integer division
    print(2.0/3.0) // float division
    minus_one = 0.i64()-1.i64()
    print(minus_one)
> smol main.s
codegen --workers 1 ██████████ 7/7 files 24ms
compile --back gcc --runtime std/runtime/auto.h 69ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
3
0
0.666667
-1
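For reference, the exact bounds of 64-bit integers and the guaranteed decimal precision of 64-bit floats can be computed with a short, language-neutral sketch. It is written in Python rather than smoλ, and the constant names are illustrative assumptions, not smoλ identifiers.

```python
import sys

# Bounds of 64-bit integers (names are illustrative, not smoλ identifiers).
U64_MAX = 2**64 - 1   # largest unsigned 64-bit value
I64_MIN = -(2**63)    # smallest signed 64-bit value
I64_MAX = 2**63 - 1   # largest signed 64-bit value

# Decimal digits that always survive a round trip through a 64-bit IEEE 754 float.
F64_SAFE_DIGITS = sys.float_info.dig

# Integer division truncates, which is why 2/3 printed 0 for unsigned integers.
TRUNCATED = 2 // 3

print(U64_MAX, I64_MIN, I64_MAX, F64_SAFE_DIGITS, TRUNCATED)
```

One bit of a signed integer encodes the sign, which is why its positive range is half that of an unsigned integer.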
Mutable variables: After assigning to a variable, its value cannot normally change. To allow changes, declare it as mutable
by placing @mut before its first declaration/assignment.
Variables are immutable (that is, not mutable) by default to avoid many logic bugs. Always look out
for what might have changed if something is mutable.
// main.s
@include std.core
service main()
    @mut name = "Ada"
    print(name)
    name = "Lovelace" // allowed because we marked it mutable before
    print(name)
> smol main.s
codegen --workers 1 ██████████ 7/7 files 24ms
compile --back gcc --runtime std/runtime/auto.h 69ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
Ada
Lovelace
Sometimes we only want certain lines to run if a tested condition is true.
This is done with an if block.
The word if starts the block, then comes a condition,
then the code that should run if the test passes. Blocks are indented one more tab
(or four more spaces) to the right.
// main.s
@include std.core
service main()
    if true
        print("this always runs")
        print("still in the if block")
    print("back in main now")
> smol main.s
codegen --workers 1 ██████████ 7/7 files 24ms
compile --back gcc --runtime std/runtime/auto.h 69ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
this always runs
still in the if block
back in main now
The @serial mode ignores any indentation
and makes code blocks end either with return value,
which stops the function by returning a value (this will be covered below),
or with then final_expression, which runs
the block's last expression. This mode can be used for serializing programs with minimal
storage overhead.
Notably, then can be skipped when it can be inferred;
typically, use it for the last branch of conditions and in loops.
Booleans:
Above, the tested condition is just the value true, so the message will always be printed.
If you changed the condition to false, the inside would be skipped. These two values
(true and false) are
known as boolean ones, or bool for short.
More often, the test contains numerical or other comparisons that evaluate to a boolean value.
For example, 2 < 3 checks whether two is less than three.
// main.s
@include std.core
service main()
    if 2<3
        print("yes, two is smaller")
> smol main.s
codegen --workers 1 ██████████ 7/7 files 24ms
compile --back gcc --runtime std/runtime/auto.h 69ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
yes, two is smaller
There are several comparison operators you can use; some of these are defined for data other than numbers too:
== equal to
!= not equal to
< less than
<= less than or equal
> greater than
>= greater than or equal
We can also use elif (else if)
and else branches to cover alternatives.
Each branch is tried in order, until one runs. The rest are skipped. Here is an example:
@include std.core
service main()
    x = 5
    if x>0
        print("positive")
    elif x<0
        print("negative")
    else
        print("zero")
> smol main.s
codegen --workers 1 ██████████ 7/7 files 24ms
compile --back gcc --runtime std/runtime/auto.h 69ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
positive
A loop repeats a block while a condition is true. Syntactically, it
starts with while followed by a condition
and the block's contents. Similarly to conditions, loop contents reside in a further
indented code block.
If a variable changes inside the loop, it needs to be declared mutable at its first assignment.
Otherwise, the language would complain.
// main.s
@include std.core
service main()
    @mut i = 0
    while i<5
        print(i)
        i = i+1
> smol main.s
codegen --workers 1 ██████████ 7/7 files 24ms
compile --back gcc --runtime std/runtime/auto.h 69ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
0
1
2
3
4
The next snippet shows a pattern for looping through a range of unsigned integers from 0 up to 4.
You could skip the 0 argument for further simplicity. More details will be presented later, but it
would be remiss not to mention this pattern here, as it prevents accidental bugs. Broadly, the next function
progresses the range while tracking values by assigning them to the mutable variable i.
It also returns a boolean value on whether the loop should continue. The mutable variable is updated in every iteration.
Notice the . before the while, which is how the
range is transferred to next. With this pattern, you do not need to manually handle the increment, which
you might forget about or which could be complicated. Similar patterns let you, for example, automate the process of reading
from files.
// main.s
@include std.core
service main()
    range(0, 5).while next(@mut u64 i)
        print(i)
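To make the mechanics of this pattern concrete, here is a rough, language-neutral sketch in Python of a range whose next operation stores the current value in a mutable slot and reports whether the loop should continue. The Counter and Range classes and the collect helper are hypothetical stand-ins for illustration; they are not how smoλ implements range or next.

```python
class Counter:
    """Mutable slot standing in for the `@mut u64 i` variable."""
    def __init__(self):
        self.value = 0


class Range:
    """Hypothetical stand-in for range(start, end); not smoλ's implementation."""
    def __init__(self, start, end):
        self.current = start
        self.end = end

    def next(self, slot):
        # Report whether another value exists and, if so, store it in the slot.
        if self.current >= self.end:
            return False
        slot.value = self.current
        self.current += 1
        return True


def collect(start, end):
    r, i, seen = Range(start, end), Counter(), []
    while r.next(i):  # mirrors `range(0, 5).while next(@mut u64 i)`
        seen.append(i.value)
    return seen


print(collect(0, 5))
```

Because the increment lives inside next, the loop body cannot forget it, which is the kind of bug the pattern is meant to prevent.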
Sometimes, you may want to conditionally assign a value to
a variable. If the variable is mutable, you can place a different assignment within
each branch, though this requires mutability (which is not as safe as immutable assignment)
and a previously set default value. As an alternative, smoλ provides
the algorithm control flow, which starts a code block and
captures values obtained from internal statements of the form return value.
Returns immediately end the code block, even if they occur within internally defined blocks.
Below is an example, which evaluates to one of several possibilities.
All returns must provide the same type of value. Even if
there are nested conditions, all returns yield a value back to the algorithm.
@include std.core
service main()
    x = 0.0-2.0
    sign = algorithm
        if x>0.0
            return "positive"
        if x<0.0
            return "negative"
        return "zero"
    print(sign)
> smol main.s
codegen --workers 1 ██████████ 7/7 files 24ms
compile --back gcc --runtime std/runtime/auto.h 69ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
negative
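For readers coming from other languages, a loose analogy (sketched in Python, not smoλ) is a small helper function whose body plays the role of the algorithm block: the first return reached ends the block and supplies the value that gets assigned. The name sign_of is hypothetical.

```python
def sign_of(x):
    # The function body plays the role of the `algorithm` block:
    # the first `return` reached ends it and supplies the value.
    if x > 0.0:
        return "positive"
    if x < 0.0:
        return "negative"
    return "zero"


sign = sign_of(-2.0)  # assigned once, so no mutable variable is needed
print(sign)
```

The analogy is imperfect: unlike a helper function, the block needs no name or arguments because it sees the surrounding variables directly.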
Breaking away from loops:
The algorithm structure can be used as a means of
breaking away from loops. Below is an example, where some commands are merged in the same line
for conciseness. The same types should be yielded by all return
values. Do note that the ok value provided by std.core
matches no returns. As a side note, the "algorithm" keyword is deliberately long so that it is
easy to spot and match with subsequent returns.
// main.s
@include std.core
service main()
    @mut i = 0
    limit = 5
    algorithm
        while true
            print(i)
            if i==limit
                return ok
            i = i+1
    print("ended while")
    print(i)
> smol main.s
codegen --workers 1 ██████████ 7/7 files 24ms
compile --back gcc --runtime std/runtime/auto.h 80ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
0
1
2
3
4
5
ended while
5
You can name a block of code and call it with inputs to obtain none, one, or multiple outputs. The named block is called a function and its inputs are called arguments. There are two kinds of functions:
def - A function with no calling cost. Delegates error and resource handling to its caller.
service - Safely handles errors and resources, including resource freeing on failure.
In simple programs, you will mostly declare def functions
and let them freely fail. Smoλ's philosophy is not to hopelessly try to recover from every failure state,
but to gracefully exit a group of dependent computations and try again. This strikes a balance between error-handling
code that pollutes the codebase and recovering from impactful failures. Services are more complex in that they run
independently of each other, and potentially asynchronously.
Regardless of the type of function, you can declare arguments
as comma-separated pairs of types and names (each type is followed by its name, separated by a space).
Types are needed so that the function knows what inputs to expect.
For example, f64 x denotes an argument that is a float named x.
Arguments may also be nameless (consist of only the type), but more on this later.
There may be some additional notation before types too. This is described below.
// main.s
@include std.core
def affine(f64 x, f64 y, f64 z)
    return (x+y)*z
service main()
    result = affine(1.0, 2.0, 3.0)
    print(result)
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
9.000000
Mutability:
Inputs are passed "by value" (without affecting the call site) unless you explicitly allow changes.
Place @mut before argument names to declare that
the variable passed as an argument may be modified inside the function - and hence
must already be mutable. This also makes the argument variable internally mutable.
Importantly, services do not accept mutable arguments. The pattern there is to have one service
control the creation process of data, and share the outcome with other services.
Immutability by default is a contract of function inputs (not outputs); functions declare that
they will never alter values, but what they return can be assigned to mutable variables.
Below is an example.
@include std.core
def increment(@mut u64 x)
    x = x + 1
service main()
    @mut n = 10
    increment(n)
    print(n)
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
11
Similarly to mutable variables declared within blocks of code,
mutable arguments make prospective changes easy to spot.
Conversely, if you do not see @mut,
nothing changes.
Functions as arguments: You can use functions as arguments to help disambiguate between similarly-named alternatives. In that case, simply skip the variable name. This is useful for choosing a behavior without passing a dummy value.
@include std.core
def zero(f64)
    return 0.0
def zero(u64)
    return 0
service main()
    a = zero(f64)
    b = zero(u64)
    print(a)
    print(b)
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
0.000000
0
Currying:
The dot notation first_argument.function_call(other_arguments)
sends the value on the left as the first argument of a function.
This reads left-to-right and can be chained. Here is an example:
// main.s
@include std.core
def triple(f64 x)
    return x*3.0
service main()
    2
    .f64()
    .triple()
    .print()
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
6.000000
The same dot notation also works with loops provided by the standard library (like range) so you can write readable iterations.
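As a language-neutral illustration of how such a chain reads, the Python sketch below shows that feeding a value through functions from left to right gives the same result as the equivalent nested call. The pipe helper is a hypothetical name introduced only for this sketch; smoλ needs no such helper because the dot notation is built in.

```python
def triple(x):
    return x * 3.0


def pipe(value, *functions):
    # Feed `value` to each function in turn, left to right.
    for function in functions:
        value = function(value)
    return value


chained = pipe(2, float, triple)  # same order as 2.f64().triple()
nested = triple(float(2))         # the equivalent nested call
print(chained, nested)
```

The chained form lists the steps in the order they happen, whereas the nested form has to be read inside out.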
Reduction:
It is often desirable to iteratively apply a function to several values. This concept is known as reduction,
and is applied with a pattern like add(@all 1 2 (2+1)*2). To understand how this works,
first note that each expression within add is evaluated separately, thus resulting
in the evaluation of add(@all 1 2 6).
Then, the called function is applied to the first two
arguments and replaces them with the result, yielding the under-the-hood representation
add(@all 3 6). This process is repeated until all arguments are used and
9 is yielded. Here is an example that also showcases the usage of printin to print multiple strings
in the same line:
// main.s
@include std.core
service main()
    @on Heap.dynamic()
    print(add(@all 1 2 3))
    name = "there"
    printin(@all "Hi "name"!\n") // does not use any memory
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
6
Hi there!
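The folding steps of add(@all 1 2 (2+1)*2) can be traced with a small Python sketch. This is a language-neutral illustration of reduction, not smoλ's implementation; the variable names are assumptions made for the sketch.

```python
from functools import reduce
from operator import add

# Arguments after each expression has been evaluated: 1, 2 and (2+1)*2 == 6.
arguments = [1, 2, (2 + 1) * 2]

# Record each intermediate argument list to make the folding visible.
steps = [list(arguments)]
remaining = list(arguments)
while len(remaining) > 1:
    # Replace the first two arguments with the result of applying the function.
    remaining = [add(remaining[0], remaining[1])] + remaining[2:]
    steps.append(list(remaining))

total = reduce(add, arguments)  # the standard-library shortcut for the same fold
print(steps)
print(total)
```

The recorded steps are the argument lists [1, 2, 6], [3, 6], and [9], matching the walkthrough of the pattern.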
You can return from a function early with return value. This works provided that there is no
algorithm environment capturing the returned value.
// main.s
@include std.core
def abs(f64 x)
    if x<0.0
        return 0.0-x
    return x
service main()
    x = 0.0-1.0
    print(abs(x))
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
1.000000
You can return several named values at once and then access them by their variable's name with
dot notation. For example, if the outcome of return x,y
is stored into a variable p, individual values can be accessed via p.x or p.y.
But p still represents the sequence of values x,y.
This is visually different from currying in that there are no function call parentheses. Values returned by
return x+1,y+1 would not be retrievable by name, however, as the additions have not been stored
in named variables. This is also fine, and you may do it for convenience in some scenarios.
Finally, @args is a shorthand that returns all inputs if placed
at the beginning of a return statement. Below is an example.
// main.s
@include std.core
def Point(f64 x, f64 y)
    return @args
def moved(Point p, f64 dx, f64 dy)
    nx = p.x + dx
    ny = p.y + dy
    return Point(nx, ny)
service main()
    @mut p = Point(1.0, 2.0)
    print(p.x)
    print(p.y)
    p = p.moved(3.0, 4.0)
    print(p.x)
    print(p.y)
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
1.000000
2.000000
4.000000
6.000000
If a function returns only one value, use that directly without an extra field name.
To help you write secure code that does not arbitrarily access mutable fields, smoλ normally
prevents you from accessing or setting them. To be able to do so, add the @access
notation at the very start of the respective arguments. Below is an example where, if the notation were removed, even immutable versions
of a type would not allow viewing certain fields.
// main.s
@include std.core
def Point(f64 _x, f64 _y)
    @mut x = _x
    @mut y = _y
    return x,y
def print(@access Point p)
    // p is not mutable but you still need `@access`
    // to look at p.x, p.y because their original
    // declaration was mutable
    printin(@all p.x "," p.y "\n")
service main()
    @mut p = Point(1.0, 2.0)
    // print(p.x) // not allowed
    print(p)
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
1.000000,2.000000
Now is the right time to talk about services and their philosophy: they are basically
functions that run in parallel threads
(in truth, with a co-routine model, but this is too advanced
for this tutorial) and, when something is wrong (including in one of their called functions),
they complain and stop their work. When they stop, they also safely release any resources so that
you do not have memory leaks or other similar issues.
Anywhere, call fail("message") to stop the current service.
The caller can check result.err.bool().
If you don't check but try to use a value, the error will bubble up until it reaches a place that does.
Your code waits for called services to conclude their computations only when their values are used.
// main.s
@include std.core
service divide(f64 x, f64 y)
    if y==0.0
        fail("Division by zero")
    return x/y
service main()
    r = divide(1.0, 0.0)
    if r.err.bool()
        print("Could not compute.")
    else
        print(r)
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
Division by zero
Could not compute.
This approach keeps the happy path simple; you try running the service and, if it fails, decide what the next step is (ask again, use a default, stop).
Sometimes we want to treat several different types as if they were "the same kind of thing."
For example, an unsigned integer, a signed integer, and a float are all numbers with similar operations defined.
In other words, we can write code that reads largely the same for all three types of numbers.
In smoλ, you can avoid repeating that code by defining a union.
A union groups together several types under a single name, and you can define
functions that work with the union instead of each type separately. Its definition looks like
an assignment, where the right-hand-side types are separated by or.
union Number = f64 or u64 or i64
This definition means that Number can represent
u64,
f64, or i64.
Once the union is defined, you can write
functions that take a Number and automatically work with whichever
form it has. For example, the standard library has printing and arithmetic operator overloads,
so you can call print on a Number and it "just works."
Unions are lexically scoped within functions; the same name always represents the same type.
Thus, in the example below both numbers must be of the same type.
But you can add a union as part of another one to transfer all types under a different name.
// main.s
@include std.core // defines Number
def add1mul(Number a, Number b)
    // convert 1 to the primitive type that Number resolved to
    one = 1.Number()
    return (a+one)*(b+one)
service main()
    print(add1mul(1,2))
    print(add1mul(1.0, 2.0))
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
6
6.000000
Unions are secure as, behind the scenes, functions check if they can work for all variations of union members to report errors early.
Unions let you overload function names (like print) by dynamically
adapting them to the type. However, only the declared member types are allowed.
They also quickly create zero-cost overloaded variations of functions with the same code.
Think of unions as a way to say:
"I don't care if this is an i64 or an f64;
as long as it's a Number, I can use the overloaded operations
to write the same code conceptually."
In most languages, functions freely call each other, or even themselves. This is useful for quickly expressing complex programs but comes at a cost: every call stacks on top of the previous one. At worst, you can create a circular recursion in which you run out of memory for tracking call inputs and returns (a "stack overflow"). In an effort to protect against such unconstrained behavior, smoλ functions can only look at previously declared or imported code.
To still enable similar computations where functions call each other arbitrarily, the language uses a trick called trampolining. Think of it like a ball bouncing back and forth between two or more functions: each one decides what the next step is, and we loop until the process finishes. The trick is that you, the programmer, are the one keeping track of the ball. Therefore, you can notice immediately if things are getting too complex. This way, recursion does not pile up memory; it just reuses a single loop.
Below is an example that demonstrates how to do this using a tag primitive that
we have not addressed so far. Values of that type are created by writing
@tag name. Obtaining a tag this way is considered the highest-priority operator.
Tags behave like a number with a given value, but can
later be resolved into function calls with the @dynamic operator exemplified below.
In the example, the tag primitive type lets us reference
@tag ping or @tag pong within our tracked state.
Importantly, the state is defined before either of those functions is defined.
Then, the @dynamic instruction chooses to call a function
among a list of specified alternatives (ping or pong here) depending on the next tag value. For example, one could call ping per the over-engineered pattern:
p = @tag ping
@dynamic(ping) p(n, "ping")
Additional values packed into p, as in the example, serve as arguments. So, when the example calls
@dynamic(ping,pong) pending("next - ") it resolves to either
ping(pending.n, "next - ") or pong(pending.n, "next - "),
depending on the value of pending.func.
If something unexpected happens, such as another tag occurring, the current service fails safely, without memory leaks or crashes.
Type safety is still checked during compilation.
// main.s
@include std.core
def state(tag func, u64 n)
    return @args
def ping(u64 n, cstr message)
    printin(message)
    print("ping")
    return @tag pong.state(n)
def pong(u64 n, cstr message)
    if n == 0
        return @tag done.state(u64)
    printin(message)
    print("pong")
    return @tag ping.state(n-1)
service main()
    @mut pending = @tag pong.state(2)
    while @tag done!=pending.func
        pending = @dynamic(ping,pong) pending("next - ")
> smol main.s
codegen --workers 1 ██████████ 7/7 files 22ms
compile --back gcc --runtime std/runtime/auto.h 66ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
next - pong
next - ping
next - pong
next - ping
To sum up, in the example above ping and pong call each other back and forth.
Instead of diving deeper into the stack, they just return a next state marked with a tag
(@tag ping, @tag pong, or @tag done).
The main loop keeps track of which step is next, and runs it until it reaches the @tag done tag.
So whenever you would normally write a recursive function, you can instead write it
as a loop with tags. It feels a bit like building a tiny "interpreter"
for your recursive logic that is both safe and performant. Do note that
@dynamic may inject code from all functions it
handles, so do not overuse it, to avoid creating exceptionally large executables.
// main.s
@include std.core
def fib_state(tag func, u64 n, u64 a, u64 b)
    return @args
def fib_start(u64 n)
    // initialize Fibonacci sequence with first two numbers
    return @tag fib_step.fib_state(n, 0, 1)
def fib_step(u64 n, u64 a, u64 b)
    if n == 0
        // next is equivalent to return fib_state(@tag done, u64, a, b)
        return @tag done.fib_state(u64, a, b)
    return @tag fib_step.fib_state(n-1, b, a+b)
service main()
    n = 10
    // fib_start already returns the first state, so it is called directly
    @mut pending = fib_start(n)
    while @tag done != pending.func
        pending = @dynamic(fib_step) pending()
    // when done, print result (a is the n-th Fibonacci number, counting from 0)
    print("Fibonacci result:")
    print(pending.a)
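The same trampolining idea can be traced in a language-neutral Python sketch, where a dictionary plays the role that @dynamic plays in smoλ. The tuple layout, the DISPATCH table, and the run helper are assumptions made for this illustration, not part of smoλ. Starting from the pair (0, 1), after n bounces a holds F(n) and b holds F(n+1), counting from F(0)=0.

```python
def fib_start(n):
    # Initial state: the next step plus its arguments (n, a, b).
    return ("fib_step", n, 0, 1)


def fib_step(n, a, b):
    if n == 0:
        return ("done", n, a, b)
    return ("fib_step", n - 1, b, a + b)


# Table standing in for @dynamic: tag name -> function to call.
DISPATCH = {"fib_step": fib_step}


def run(n):
    pending = fib_start(n)
    while pending[0] != "done":
        tag, *args = pending
        pending = DISPATCH[tag](*args)  # one bounce; the call stack never deepens
    return pending


_, _, a, b = run(10)
print(a, b)
```

Since each step returns to the loop before the next one starts, run(5000) completes without hitting Python's recursion limit, which a directly recursive version with that many nested calls would exceed by default.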
When writing code that leverages union data types, you will often find yourself at a point where some kind of specialization or edge case should be handled differently for some alternative. For example, there may be a non-copying, and hence faster, implementation. As an alternative to rewriting the same code with minor differences each time, smoλ lets you conditionally compile segments of a function. Related choices are handled during compilation and do not incur any runtime overhead. (By contrast, dynamic calls add actual checks during runtime.)
Compile a function only for certain argument types: One might think that this could be as simple as
declaring proper arguments, but more than one combination could be allowed and checked with a type comparison
only later on. In this case, use the syntax case ... qed, where type
errors in any enclosed code just prevent the function variation from being compiled instead of creating an
actual compilation error. Below is an example from the standard library, which transforms the argument _k
to a standardized type before checking it against the kind of data held by a hash index.
This lets the find function operate only on hashable data types that match the type of data
handled by the hash index.
def find(@access @mut Hash self, Hashable _k, @mut u64 idx)
    k = _k.to_hash_base()
    case k.is(Hash.to_hash_base()) qed // reject different types between k and hash base
    if k.is_zero()
        return true
    pos = hash(k, self.size)
    range(0, self.size)
    .while next(@mut u64 i)
        idx = (pos+i).mod(self.size)
        if idx.bool() and self.entries[idx].is_zero().not() and self.entries[idx]==k
            return true
    return false
Retrieve union resolution: You can also retrieve the exact type a union has resolved to within a type's arguments.
Do this by employing the :: operator, as in the
example below. Function argument types are lexically scoped, so NumberType
remains the same between _x,_y and within the function body of Point.
Thus, Point::NumberType retrieves the kind of number
currently in use by the point's data structure, so as to convert 1 to the respective format.
There are three variations of this structure, depending
on the type, and thus three variations of inc too.
// main.s
@include std.core
union NumberType = i64 or u64 or f64 // replicates std.core::Number for completeness
def Point(NumberType _x, NumberType _y)
    @mut x = _x
    @mut y = _y
    return x,y
def inc(@access @mut Point p)
    p.x = p.x+Point::NumberType(1)
    p.y = p.y+Point::NumberType(1)
service main()
    value = 1.f64()
    @access p = Point(value, value)
    p.inc()
    print(p.x)
At this point, you may be wondering how smoλ can run operations that consume an indeterminate amount of memory. For example, you might want to combine strings or manage lists of arbitrary sizes.
This is where buffers and
@include std.mem come into play.
We start by touching on concepts of the latter, and explain
buffers later. Broadly, there are two memory
"devices" exposed by smoλ:
the Stack and the Heap. These correspond to the small but fast
memory your operating system uses to run your program, and the full extent of available memory.
Both devices provide memory allocation capabilities, and also let you allocate
memory either dynamically as needed or as very fast arenas.
Memory follows the same pattern as
all resources that smoλ uses: it is automatically freed when it is no longer in use.
This is achieved with some minor restrictions, which the compiler reports when they are violated. For example,
you can declare memory (for instance, to use with @on
contexts) only outside of loops that would otherwise leak the memory in each iteration.
The benefit is that there is no runtime overhead or bottleneck for checking when memory is no longer needed and freeing it. Memory is released back to the operating system when functions end, without hidden runtime checks. That is, memory management is a zero-cost abstraction.
Arenas: Arenas are pre-allocated memory regions (either on the heap or the stack) that cannot grow. They are typically made to consume a safe amount of memory and ensure that programs can successfully run. This often comes with some wastage if you cannot anticipate their size. New allocations on arenas are lightweight (they are simple memory address additions) and therefore execute quickly. In smoλ, arenas that run out of space cause services to fail, which is always done safely and without leaking resources.
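To make the "simple memory address additions" concrete, here is a minimal Python sketch of a bump allocator. It is only an analogy and not how smoλ implements arenas: the names Arena and ArenaFull are hypothetical, a bytearray stands in for the pre-allocated region, and a Python exception stands in for a failing service.

```python
class ArenaFull(Exception):
    """Raised when an allocation does not fit in the remaining space."""


class Arena:
    """Fixed-size region that hands out consecutive slices of one bytearray."""

    def __init__(self, capacity: int) -> None:
        self.storage = bytearray(capacity)  # pre-allocated once, never grows
        self.offset = 0                     # start of the unused part

    def allocate(self, size: int) -> memoryview:
        remaining = len(self.storage) - self.offset
        if size > remaining:
            raise ArenaFull(f"requested {size} bytes, {remaining} left")
        start = self.offset
        self.offset += size                 # the whole cost of an allocation
        return memoryview(self.storage)[start:start + size]


arena = Arena(16)
first = arena.allocate(8)
second = arena.allocate(8)
first[:] = b"abcdefgh"                      # write into the first slice
try:
    arena.allocate(1)                       # no space left in the region
except ArenaFull as error:
    print("allocation failed:", error)
```

An allocation is one bounds check plus one addition, and the region is given back as a whole, which is what makes arenas fast but wasteful when their size is overestimated.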
The next example shows how to work with arenas. The @on
statement automatically adds the arena as a first argument to all operations inside the current code block.
// main.s
@include std.core
@include std.mem
@include std.vec
service main()
    @mut memory = Stack.allocate(1.MB()).arena() // converts allocation of 1MB to arena
    @on memory // automatically apply as first argument
    v = vector(5) // calls vector(memory,5)
    print(v.len())
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
5
If you switch to Heap instead of Stack, the same code
would allocate on random access memory. The details of how heap
memory is managed may be modified for different systems by passing
a different runtime to the compiler.
In the above snippet, memory is declared mutable so that its contents can be modified. However, temporary values that have not yet been assigned to any symbol are already mutable. You can also return a value from that block, which makes the following equivalent syntax possible. Remember that the arena's consumed memory is released only when the vector is no longer needed.
// main.s
@include std.core
@include std.mem
@include std.vec
service main()
    @on Stack.allocate(1024).arena()
    v = vector(5)
    @on ok // empty @on context for safety
    print(v.len())
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
5
Dynamic memory:
You can also choose to maintain a collection of dynamically allocated memory segments,
for example by calling @on Heap.dynamic() instead
of creating an arena. If several segments are attached to the same dynamic
allocation, they are released together, which means that all of them remain allocated as
long as any one of them is in use. Alternatively, you can make a new allocation each time you need some
dynamic memory.
Smoλ's standard library provides three main string types:
cstr (a compile-time constant string enclosed in quotation marks, null-terminated),
nstr (null-terminated with length information), and
str (a string segment). For most purposes, you can convert the other two to
str. That said, null-termination is required for the
syscalls of most operating systems, for example in std.os.
Most operations on strings are zero-cost abstractions over simple arithmetic, as happens
for example when retrieving substrings. The most heavyweight operations are copying and
concatenation, both of which require an arena or dynamic memory to store the result. Below
is an example, where it is important to note that concatenation yields an nstr
by default, which can be directly reduced to a simpler str if needed.
Similarly, conversion of cstr to the other types is lightweight.
// main.s
@include std.core
@include std.mem
def Segment(new, str value)
    // return all inputs
    // function returns are tuples of named elements
    return @args
def Segment(String _value)
    // convert from many string types
    value = _value.str()
    return new.Segment(value)
def combine(Segment[] segments)
    @mut combined = "".str() // mutable string with known size
    @on Stack.allocate(1024).arena() // automatically use as argument if needed (for string operations)
    segments.len().range()
        .while next(@mut u64 i)
            combined = add(@all combined segments[i].value " ").str()
    return combined
service main()
    segments = Segment[] // buffer
        .push("I think.".Segment())
        .push("Therefore I am.".Segment())
    segments.combine().print()
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
I think. Therefore I am.
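The difference in cost between substrings and concatenation can be illustrated with a small Python sketch. It models a string segment as a shared byte store plus an offset and a length; the StrView class and the concat helper are hypothetical names introduced here, and nothing in the sketch reflects how smoλ's str is actually laid out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StrView:
    """A string segment: a shared byte store plus an offset and a length."""
    data: bytes
    start: int
    length: int

    def substring(self, begin: int, end: int) -> "StrView":
        # Only the offset and the length change; no bytes are copied.
        if not 0 <= begin <= end <= self.length:
            raise IndexError("substring out of range")
        return StrView(self.data, self.start + begin, end - begin)

    def text(self) -> str:
        return self.data[self.start:self.start + self.length].decode()


def concat(a: StrView, b: StrView) -> StrView:
    # Concatenation is the expensive case: the result needs new storage.
    joined = a.text().encode() + b.text().encode()
    return StrView(joined, 0, len(joined))


whole = StrView(b"I think. Therefore I am.", 0, 24)
first = whole.substring(0, 8)       # shares the bytes of `whole`
rest = whole.substring(9, 24)       # shares the bytes of `whole`
print(first.text(), "|", rest.text())
print(concat(first, rest).text())
```

Taking a substring is arithmetic on two integers, whereas concat has to obtain fresh storage for the result, which is the role the arena plays in the smoλ example above.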
Buffers represent collections of data that grow dynamically.
The simplest syntax is type[], which produces
a resizable array of that type. This is only allowed if the type is a new one.
You can grow buffers or modify their elements only as long as they remain mutable. For example,
you can create a buffer with v = u64[].push(1).push(2),
which is immutable from then on.
// main.s
@include std.core
def data(u64 id, u64[] values)
    return @args
service main()
    @mut vals = u64[]
        .push(1)
        .push(2)
    p = data[].push(data(10, vals))
    print(p[0].id)
    print(p[0].values[0])
    print(vals[0])
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
10
1
1
Custom buffer allocation: By default, buffers use dynamically sized Heap memory to store their data. However, you can
also create them on memory allocations provided by the standard library; all types of memory
accept memory.allocate(size) for setting aside a fixed-size region, within
which buffers can grow. Below is an example. Note that data can be pushed onto buffers during
their creation, while they are still temporary values (and hence mutable), and the result can then be
assigned to an immutable variable.
// main.s
@include std.core
service main()
    vals = u64[Heap.allocate(1024)] // 1024 max buffer size
        .push(1)
        .push(2)
    print(vals[0])
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
1
Buffers preserve mutability rules: only if a buffer itself is mutable can its contents be changed or extended. That said, you can assign immutable buffers to mutable ones and alter the contents of the latter. Strings can also be stored in buffers or arrays given that they reside in the same memory surface. Below is an example.
// main.s
@include std.core
@include std.mem
service main()
    boxes = str[]
        .push("buffer start".str())
        .push("buffer end".str())
    b = boxes[0]
    print(b)
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
buffer start
Here we pushed two strings into a string array.
Notice the explicit conversion to the correct string version using .str().
Buffers can be returned from services normally. In this case, their
memory is released by the returning service.
In the example below, samples constructs and returns a buffer.
The caller can then access its elements with buf[index].
@include std.core
service samples()
    buf = u64[]
        .push(42)
        .push(10)
    return buf
service main()
    buf = samples()
    print(buf[0])
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
42
You can share the same data between multiple buffers, even if those are mutable.
Changes made through one buffer are then visible through the others. You can even use an allocated
memory region to declare a char[] buffer
and then use the same region to allocate strings. Safety is preserved because an allocated
memory region accepts only one kind of data; the compiler reports an error if
it finds a violation of this contract in how data are generated.
As a result, shared data may be modified from elsewhere, but they are always safe to read
and write. Below is an example:
@include std.core
@include std.mem
service main()
    @mut memory = Heap.allocate(1.KB())
    @mut buf = char[memory]
        .push("H".str().first)
        .push("i".str().first)
        .push("!".str().first)
    s = memory.str(3)
    print(s)
    // now modify the buffer - it modifies the string
    buf[2] = "#".str().first
    print(s)
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
Hi!
Hi#
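The same aliasing effect can be reproduced in Python with two views over one byte region. This is only an analogy for the example above: a bytearray stands in for the allocated memory, and memoryview objects stand in for the char[] buffer and the string.

```python
memory = bytearray(1024)            # stands in for the allocated region
buf = memoryview(memory)            # byte buffer over the region
buf[0:3] = b"Hi!"

s = memoryview(memory)[0:3]         # a 3-byte "string" over the same region
before = bytes(s).decode()

buf[2] = ord("#")                   # modify through the buffer ...
after = bytes(s).decode()           # ... and the string view sees the change

print(before)
print(after)
```

Both views read and write the same bytes, so no copy is made and no view can become invalid while the region exists.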
// main.s
@include std.core
@include std.mem
@include std.file :: File
def file_stats(
    new,
    u64 lines,
    u64 chars
)
    return @args
def print(file_stats stats)
    printin(@all stats.lines " lines, " stats.chars " bytes\n")
def file_reader(
    String path,
    @mut ContiguousMemory memory
)
    @mut stat_lines = 0
    @mut stat_chars = 0
    @mut file = ReadFile.open(path)
    endl = "\n".str().first
    @on memory.arena()
    while file.next_line(@mut str line)
        stat_lines = stat_lines + 1
        stat_chars = stat_chars + line.len()
        printin(@all "| " line "\n")
    fstats = new.file_stats(stat_lines, stat_chars)
    return fstats, memory
service main()
    @mut memory = Stack.allocate(1.MB())
    @access @mut stats = file_reader("README.md", memory)
    print(stats.fstats)
Sometimes, you may want to preemptively release resources, such as memory or open files. Releasing is interpreted differently depending on the resource type, but in general it indicates that whatever is associated with the variable should stop encumbering the computer as soon as possible. Recall that resources no longer in use are otherwise released at the end of functions or services.
In general, you can employ the syntax
@release variable to release all
resources associated with the variable. The compiler will report an
error if you accidentally reuse one of those resources.
Beware that resources could be shared across unforeseen places, so you may
need to restructure your code before you can release only one of them.
For example, if you store multiple strings on the same arena, releasing
one of the strings invalidates the others too.
Below is an example that opens a new console terminal, and uses
@release to preemptively close it.
// main.s
@include std.core
@include std.file
@include std.mem
service main()
    // 2 bytes of memory to read a null-terminated character from the console
    @mut keyreader = Stack.allocate(2).circular()
    @mut cons = WriteFile.console()
    cons.print("hello world!\n")
    printin("Press enter here to close the open console...")
    keyreader.read()
    @release cons // safely close the resource preemptively
    print("We are done. Press enter again...")
    keyreader.read()
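For readers coming from other languages, early release resembles closing a resource before its scope ends. The Python sketch below is a loose analogy, with io.StringIO standing in for the console. The important difference is that Python rejects reuse only at run time, whereas the text above states that the smoλ compiler reports such reuse as an error before the program runs.

```python
import io

console = io.StringIO()             # stands in for an open console or file
console.write("hello world!\n")
captured = console.getvalue()       # read the contents while still open

console.close()                     # release the resource before scope end

try:
    console.write("too late")       # reuse after release
except ValueError as error:
    reuse_error = str(error)
    print("rejected:", reuse_error)
```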