Let's take our first steps into programming with smoλ (pronounced like "small" but with "o" instead of "a"). The language simplifies a lot of traditional programming concepts while keeping the ability to write very fast yet safe code. Some level of control is sacrificed in the process, but this means that you do not need to worry too much about details.
Our first program prints a message; it is a tradition to print "Hello world" as
a first example in all programming manuals. To set things up,
download the smol executable
from the language's latest release.
Place it alongside the std/ directory and add both the containing folder
and a C/C++ compiler (e.g., GCC) to your system PATH. Create a file named main.s
with the text below. Finally, open a terminal in the same folder and run
smol main.s.
// main.s
@include std.core
service main()
    print("Hello world!")
> smol main.s
codegen --workers 1 ██████████ 7/7 files 26ms
compile --back gcc --runtime std/runtime/auto.h 139ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
Hello world!
A bunch of messages appeared! So let's go through them in order.
First, smoλ is a compiled language, meaning that it creates executable programs containing machine instructions. To this end, it first parses your program and transforms it to a simpler form during the codegen phase. Then, that representation is turned into binary machine code using an external backend. Turning programs into code is broadly known as compiling them. Finally, the generated executable runs.
The last two bullet points come from running the program. They are printed by the language's default runtime, that is, the instructions embedded in the generated executable for working with the operating system. The default runtime prints a link to the language's repository so that you can report bugs (and add a star!) and then automatically selects whether the application should be single-threaded or multi-threaded.
A quick preview on the source code of our first program: @include std.core
adds basic functionality, like print. Then
service main() is the actual program.
Why it's called a service is a mystery that will be addressed later.
The contents of the service need to be indented one tab (or four spaces) to the right.
--back [compiler] The compiler backend that is used to compile the intermediate C code representation produced by smoλ. The default is the highly robust gcc, but you may want to use another compiler installed on your system, or something like tcc (tinycc) for very fast compilation during prototyping. You can also use a C++ compiler to allow unsafe injection of code from that language too; everything has been configured to work with C99 or later, as well as C++11 or later.
--runtime [name] Determines how the compilation outcome will make use of the target platform's capabilities. This may change, for example, the memory allocation strategy for embedded devices, some of which require custom implementations of heap allocation or custom management of one huge preallocated memory segment. Such changes are controlled via runtime files, which are then picked up by the standard library or other smoλ code. Another affected characteristic is whether services are treated as parallel co-routines or eagerly executed. The runtime's name is either a path to a .h file or the name of such a file in the std/runtime/ directory. The default name is auto, corresponding to std/runtime/auto.h, which chooses between an eager and a co-routine implementation of services depending on their number. Two more runtimes are provided out-of-the-box: threads, which contains a co-routine implementation of services, and eager, which calls services eagerly.
--task [name] Controls what the compiler actually does with the input code. The default task is run, which produces and runs an executable. You can set the following options:
--workers [number] The number of threads that can be involved in the type system resolution. This only affects compilation. The default is a single worker.
A variable is a named box that holds a value. You give it a name, then
put a value in it with the pattern variable_name = value.
In programming, we say that you assign a value to a variable.
Names cannot start with numbers or contain
spaces or special symbols other than the underscore _.
They also cannot match existing variables or other operations.
In smoλ, two consecutive underscores are not allowed either, because the language
uses the combination for some internal workings. Some valid variable names are x, employee,
my_property, _temp_computation, MyDataStructure, var123.
Below is an example where we set a constant
text known during program creation (cstr)
to a variable.
// main.s
@include std.core
service main()
    greeting = "Hello world!"
    print(greeting)
> smol main.s
codegen --workers 1 ██████████ 7/7 files 26ms
compile --back gcc --runtime std/runtime/auto.h 139ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
Hello world!
Numbers: Numbers are values, too. There are three kinds you will use often:
u64: whole numbers without a sign (0, 1, 2, ...). The default when writing 2 or 42.
i64: whole numbers with a sign (-1, +0, +1, ...), obtained by transforming u64 values.
f64: numbers with a decimal point. The default when writing 2.0 or 3.14.
These are known as unsigned integers, signed integers, and float numbers, respectively.
Notice that the type mnemonics combine the first letters of those types with the number 64.
This is to let experienced programmers know that 64 bits are used to represent the numbers under the hood
(there is some historical baggage concerning the C language on why programmers would not easily trust
us if we did not explicitly promise a number).
Given that many bits, unsigned integers can represent numbers from 0 up to 2^64-1
and signed ones can represent numbers from -2^63 up to 2^63-1.
Floats follow the IEEE 754 standard, which is typically accurate to 15-17 significant digits.
For safety, you cannot mix operations between different types of numbers. For example, you cannot subtract
a float from 0 but only from 0.0.
However, you can convert between number formats with value.type().
You would be surprised how many bugs are prevented by requiring only compatible numbers. Below are some examples of numeric operations.
is_inf or is_nan
that can be made accessible with @include std.math.
Lack of native error checking from the standard results in performant code.
@include std.core
service main()
    print(1+2)
    print(2/3) // unsigned integer division
    print(2.0/3.0) // float division
    minus_one = 0.i64()-1.i64()
    print(minus_one)
> smol main.s
codegen --workers 1 ██████████ 7/7 files 24ms
compile --back gcc --runtime std/runtime/auto.h 69ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
3
0
0.666667
-1
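For readers who want to double-check the 64-bit limits and the two kinds of division, here is a small sketch written in Python rather than smoλ. Python integers are unbounded, so the limits are computed explicitly. The constant and function names are made up for this illustration and are not smoλ identifiers.

```python
import struct

# Limits of 64-bit integers; these names are illustrative, not smoλ identifiers.
U64_MAX = 2**64 - 1   # largest unsigned 64-bit value
I64_MIN = -(2**63)    # smallest signed 64-bit value
I64_MAX = 2**63 - 1   # largest signed 64-bit value

# Both extremes fit in exactly 8 bytes (64 bits).
assert len(struct.pack("<Q", U64_MAX)) == 8
assert len(struct.pack("<q", I64_MIN)) == 8

def fits_u64(value: int) -> bool:
    """Return True when value can be stored as an unsigned 64-bit integer."""
    return 0 <= value <= U64_MAX

# Integer division drops the remainder; float division keeps the fraction.
integer_quotient = 2 // 3
float_quotient = 2.0 / 3.0

print(integer_quotient)        # 0
print(f"{float_quotient:f}")   # 0.666667
```

The last two lines reproduce the 0 and 0.666667 printed by the example above.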
Mutable variables: After assigning to a variable, its value cannot normally change. To allow changes, declare it as mutable
by placing @mut before its first declaration/assignment.
Variables are immutable -that is, not mutable- by default to avoid many logic bugs. Always look out
for what might have changed if something is mutable.
// main.s
@include std.core
service main()
    @mut name = "Ada"
    print(name)
    name = "Lovelace" // allowed because we marked it mutable before
    print(name)
> smol main.s
codegen --workers 1 ██████████ 7/7 files 24ms
compile --back gcc --runtime std/runtime/auto.h 69ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
Ada
Lovelace
Sometimes we only want certain lines to run if a tested condition is true.
This is done with an if block.
The word if starts the block, then comes a condition,
then the code that should run if the test passes. Blocks are indented one more tab
(or four more spaces) to the right.
// main.s
@include std.core
service main()
    if true
        print("this always runs")
        print("still in the if block")
    print("back in main now")
> smol main.s
codegen --workers 1 ██████████ 7/7 files 24ms
compile --back gcc --runtime std/runtime/auto.h 69ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
this always runs
still in the if block
back in main now
@serial. This ignores any indentation,
and makes code blocks end in either return value
that stops the function by returning a value (this will be covered below)
or with then final_expression to run
the block's last expression. This mode can be used for serializing programs with minimal
storage overhead.
Notably, then can be skipped when it can be inferred;
typically use it for the last branch of conditions and in loops.
Booleans:
Above, the tested condition is just the value true, so the message will always be printed.
If you changed the condition to false, the inside would be skipped. These two values
(true and false) are
known as boolean ones, or bool for short.
More often, the test contains numerical or other comparisons that evaluate to a boolean value.
For example, 2 < 3 checks whether two is less than three.
// main.s
@include std.core
service main()
    if 2<3
        print("yes, two is smaller")
> smol main.s
codegen --workers 1 ██████████ 7/7 files 24ms
compile --back gcc --runtime std/runtime/auto.h 69ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
yes, two is smaller
There are several comparison operators you can use - some of these are defined for data other than numbers too:
== equal to
!= not equal to
< less than
<= less than or equal
> greater than
>= greater than or equal
We can also use elif (else if)
and else branches to cover alternatives.
Each branch is tried in order, until one runs. The rest are skipped. Here is an example:
@include std.core
service main()
    x = 5
    if x>0
        print("positive")
    elif x<0
        print("negative")
    else
        print("zero")
> smol main.s
codegen --workers 1 ██████████ 7/7 files 24ms
compile --back gcc --runtime std/runtime/auto.h 69ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
positive
A loop repeats a block while a condition is true. Syntactically, it
starts with while followed by a condition
and the block's contents. Similarly to conditions, loop contents reside in a further
indented code block.
If a variable changes inside the loop, it needs to be declared mutable at its first assignment.
Otherwise, the language would complain.
// main.s
@include std.core
service main()
    @mut i = 0
    while i<5
        print(i)
        i = i+1
> smol main.s
codegen --workers 1 ██████████ 7/7 files 24ms
compile --back gcc --runtime std/runtime/auto.h 69ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
0
1
2
3
4
The next snippet shows a pattern for looping through a range of unsigned integers 0 upto 4.
You could skip the 0, argument for further simplicity. More details will be fully presented later, but it
would be remiss to not mention this pattern here, as it prevents accidental bugs. Broadly, the next function
progresses the range while tracking values by assigning them to mutable variable i.
It also returns a boolean value on whether the loop should continue. The mutable variable is updated in every loop.
Notice the . before the while, which is how
range is transferred to next. With this pattern, you do not need to manually handle the increment, which
you might forget about or could be complicated. Similar patterns let you, for example, automate the process of reading
from files.
// main.s
@include std.core
service main()
    range(0, 5).while next(@mut u64 i)
        print(i)
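To make the role of next more concrete, the sketch below imitates the pattern in Python (not smoλ). The classes Cell and Range are hypothetical stand-ins invented for this illustration: next writes the current value into a mutable cell and reports whether the loop should go on.

```python
class Cell:
    """Hypothetical stand-in for a mutable variable such as `@mut u64 i`."""
    def __init__(self) -> None:
        self.value = 0

class Range:
    """Hypothetical stand-in for range(start, end); end is exclusive."""
    def __init__(self, start: int, end: int) -> None:
        self.position = start
        self.end = end

    def next(self, cell: Cell) -> bool:
        # Stop once the range is exhausted, leaving the cell untouched.
        if self.position >= self.end:
            return False
        # Otherwise hand the current value to the caller and advance.
        cell.value = self.position
        self.position += 1
        return True

visited = []
i = Cell()
r = Range(0, 5)
while r.next(i):      # the condition both updates i and decides on continuing
    visited.append(i.value)
print(visited)        # [0, 1, 2, 3, 4]
```

Because the increment lives inside next, the loop body cannot forget it.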
Sometimes, you may want to conditionally assign a value to
a variable. If the variable is mutable, you can place a different assignment within
each branch, though this requires the mutability -which is not as safe as immutable assignments-
and a previously set default value. As an alternative, smoλ provides
an algorithm control flow that starts a code block and
captures values obtained from internal statements of the form return value.
Returns immediately end the code block, even if they occur within internally defined blocks.
Below is an example, which evaluates to several possibilities.
All returns must provide the same type of values. Even if
there are nested conditions, all returns yield back a value to the algorithm.
@include std.core
service main()
    x = 0.0-2.0
    sign = algorithm
        if x>0.0
            return "positive"
        if x<0.0
            return "negative"
        return "zero"
    print(sign)
> smol main.s
codegen --workers 1 ██████████ 7/7 files 24ms
compile --back gcc --runtime std/runtime/auto.h 69ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
negative
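One way to picture an algorithm block, for readers coming from other languages, is as a small function that is called on the spot: the first return reached decides the value of the whole block. The Python sketch below (not smoλ) mirrors the example above under that assumption; the helper name compute_sign is made up for the illustration.

```python
x = 0.0 - 2.0

def compute_sign() -> str:
    # Each return ends the block immediately with that value.
    if x > 0.0:
        return "positive"
    if x < 0.0:
        return "negative"
    return "zero"  # reached only when no earlier return ran

sign = compute_sign()  # assigned exactly once, so no mutable variable is needed
print(sign)            # negative
```

This is only an analogy for the control flow; it says nothing about how the compiler actually implements the block.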
Breaking away from loops:
The algorithm structure can be used as a means of
breaking away from loops. Below is an example, where some commands are merged in the same line
for conciseness. Having an explicit return statement
in all situations is necessary - the language would complain otherwise. The example returns
the ok value provided by std.core,
which has no contents. As a side-note, the "algorithm" keyword is deliberately long to be
easy to spot and match nested returns to.
// main.s
@include std.core
service main()
    @mut i = 0
    limit = 5
    algorithm
        while true
            print(i)
            if i==limit
                return ok
            i = i+1
        return ok // runs if nothing else is returned
    print("ended while")
    print(i)
> smol main.s
codegen --workers 1 ██████████ 7/7 files 24ms
compile --back gcc --runtime std/runtime/auto.h 80ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
0
1
2
3
4
5
ended while
5
You can name a block of code and call it with inputs to obtain none, one, or multiple outputs. The named block is called a function and its inputs are called arguments. There are two kinds of functions:
def: A function with no calling cost. Delegates error and resource handling to its caller.
service: Safely handles errors and resources, including resource freeing on failure.
In simple programs, you will mostly declare def functions
and let them freely fail. Smoλ's philosophy is not to hopelessly try to recover from every failure state,
but to gracefully exit a bunch of dependent computations and try again. This is to strike a balance between error handling
code that pollutes the codebase and recovering from impactful failures. Services are more complex in that they run
independently -and potentially asynchronously- to each other.
Regardless of the type of function, you can declare arguments
as comma-separated variable types and names (each type corresponds to a name, separated by space).
Types are needed so that the service can know what inputs to expect.
For example, f64 x denotes an argument that is a float named x.
Arguments may also be nameless (consist of only the type), but more on this later.
There may be some additional notation before types too. This is described below.
// main.s
@include std.core
def affine(f64 x, f64 y, f64 z)
    return (x+y)*z
service main()
    result = affine(1.0, 2.0, 3.0)
    print(result)
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
9.000000
Mutability:
Inputs are passed "by value" (without affecting the call site) unless you explicitly allow changes.
Place @mut before argument names to declare that
the variable passed as an argument may be modified inside the function - and hence
must already be mutable. This also makes the argument variable internally mutable.
Importantly, services do not accept mutable arguments. The pattern there is to have one service
control the creation process of data, and share the outcome with other services.
Immutability by default is a contract of function inputs (not outputs); functions declare that
they will never alter values, but what they return can be assigned to mutable variables.
Below is an example.
@include std.core
def increment(@mut u64 x)
    x = x + 1
service main()
    @mut n = 10
    increment(n)
    print(n)
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
11
Similarly to mutable variables declared within blocks of code,
mutable arguments make prospective changes easy to spot.
Conversely, if you do not see @mut,
nothing changes.
Functions as arguments: You can use functions as arguments to help disambiguate between similarly-named alternatives. In that case, simply skip the variable name. This is useful for choosing a behavior without passing a dummy value.
@include std.core
def zero(f64)
    return 0.0
def zero(u64)
    return 0
service main()
    a = zero(f64)
    b = zero(u64)
    print(a)
    print(b)
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
0.000000
0
Currying:
The dot notation first_argument.function_call(other_arguments)
sends the value on the left as the first argument of a function.
This reads left-to-right and can be chained. Here is an example:
// main.s
@include std.core
def triple(f64 x)
    return x*3.0
service main()
    2
    .f64()
    .triple()
    .print()
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
6.000000
The same dot notation also works with loops provided by the standard library (like range) so you can write readable iterations.
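The rewriting performed by the dot notation can be sketched in Python (not smoλ). The Pipe helper and the scale function are purely hypothetical; they only demonstrate that a value placed on the left becomes the first argument of the function on the right, so that a chain applies functions left to right.

```python
class Pipe:
    """Hypothetical helper: wraps a value so functions can be chained left to right."""
    def __init__(self, value):
        self.value = value

    def then(self, function, *other_arguments):
        # The wrapped value becomes the FIRST argument of the function.
        return Pipe(function(self.value, *other_arguments))

def triple(x: float) -> float:
    return x * 3.0

def scale(x: float, factor: float) -> float:
    return x * factor

chained = Pipe(2).then(float).then(triple).then(scale, 0.5).value
nested = scale(triple(float(2)), 0.5)   # the same computation written inside-out
print(chained)  # 3.0
```

Both spellings compute the same result; the chained form simply reads in the order the operations happen.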
Reduction:
It is often desirable to iteratively apply a function to several values. This concept is known as reduction,
and is applied with a pattern like add(@all 1 2 1+2*2). To understand how this works,
first we need to understand that each expression within add will be evaluated separately, thus resulting
in the evaluation of add(@all 1 2 5).
Then, the called function is applied to the first two
arguments and replaces them with the result, yielding an under-the-hood representation
add(@all 3 5). This process is repeated until eventually all arguments are used and
8 is yielded. Here is an example that also showcases usage of printin to print multiple strings
in the same line:
// main.s
@include std.core
service main()
    @on Heap.dynamic()
    print(add(@all 1 2 3))
    name = "there"
    printin(@all "Hi "name"!\n") // does not use any memory
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
6
Hi there!
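The step-by-step folding described above corresponds to what many languages call a reduce or fold. The Python sketch below (not smoλ) replays add(@all 1 2 1+2*2) with the standard functools.reduce and itertools.accumulate; the function name add is reused here only for readability.

```python
from functools import reduce
from itertools import accumulate

def add(a: int, b: int) -> int:
    return a + b

# Each argument expression is evaluated first: 1, 2, 1+2*2 become 1, 2, 5.
arguments = [1, 2, 1 + 2 * 2]

# accumulate exposes the intermediate results of the fold: 1, then 1+2, then 3+5.
steps = list(accumulate(arguments, add))
total = reduce(add, arguments)

print(steps)  # [1, 3, 8]
print(total)  # 8
```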
You can return from a function early per return value. This works given that there is no
algorithm environment capturing the returned value.
// main.s
@include std.core
def abs(f64 x)
    if x<0.0
        return 0.0-x
    return x
service main()
    x = 0.0-1.0
    print(abs(x))
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
1.000000
You can return several named values at once and then access them by their variable's name with a
dot notation. For example, if the outcome returning with return x,y
is stored into a variable p, individual values can be accessed via p.x or p.y.
But p still represents the sequence of values x,y.
This is visually different from currying in that there is no function call parenthesis. But returned values would not
be retrievable from return x+1,y+1, as the additions have not been stored
in a named variable. This is also fine, and you may do it for convenience in some scenarios.
Finally, @args is a shorthand that returns all inputs if placed
at the beginning of a return statement. Below is an example.
// main.s
@include std.core
def Point(f64 x, f64 y)
    return @args
def moved(Point p, f64 dx, f64 dy)
    nx = p.x + dx
    ny = p.y + dy
    return Point(nx, ny)
service main()
    @mut p = Point(1.0, 2.0)
    print(p.x)
    print(p.y)
    p = p.moved(3.0, 4.0)
    print(p.x)
    print(p.y)
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
1.000000
2.000000
4.000000
6.000000
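Readers familiar with Python may find it helpful to compare named returns with a NamedTuple: the result is still a plain sequence of values, but each value can also be reached by name. The sketch below (Python, not smoλ) mirrors the Point example under that analogy.

```python
from typing import NamedTuple

class Point(NamedTuple):
    x: float
    y: float

def moved(p: Point, dx: float, dy: float) -> Point:
    # Build a new point instead of modifying the input.
    return Point(p.x + dx, p.y + dy)

p = Point(1.0, 2.0)
q = moved(p, 3.0, 4.0)

print(q.x, q.y)   # 4.0 6.0
x, y = q          # q is still the sequence of values x, y
```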
If a function returns only one value, use that directly without an extra field name.
To help you write secure code that does not arbitrarily access mutable fields, smoλ normally
prevents you from accessing or setting them. To be able to do so, add the @access
notation at the very start of respective arguments. Below is an example where, if one removed the notation, even immutable versions
of a type would not allow viewing certain fields.
// main.s
@include std.core
def Point(f64 _x, f64 _y)
    @mut x = _x
    @mut y = _y
    return x,y
def print(@access Point p)
    // p is not mutable but you still need `@access`
    // to look at p.x, p.y because their original
    // declaration was mutable
    printin(x)
    printin(",")
    print(y)
service main()
    @mut p = Point(1.0, 2.0)
    // print(p.x) // not allowed
    print(p)
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
1.000000,2.000000
Now is the right time to talk about services and their philosophy: they are basically
functions that run in parallel threads
(in truth: with a co-routine model but this is too advanced
for this tutorial) and, when something is wrong -including in one of their called functions-,
they complain and stop their work. When they stop, they also safely release any resources so that
you do not have memory leaks or other similar issues.
Anywhere, call fail("message") to stop the current service.
The caller can check result.err.bool().
If you don't check but try to use a value, the error will bubble up until it reaches a place that does.
Your code waits for called services to conclude their computations only when their values are used.
// main.s
@include std.core
service divide(f64 x, f64 y)
    if y==0.0
        fail("Division by zero")
    return x/y
service main()
    r = divide(1.0, 0.0)
    if r.err.bool()
        print("Could not compute.")
    else
        print(r)
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
Division by zero
Could not compute.
This approach keeps the happy path simple; you try running the service and, if it fails, decide what the next step is (ask again, use a default, stop).
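The check-or-bubble-up behavior can be pictured with a small Python sketch (not smoλ). The Result class and its get method are hypothetical names chosen for this illustration, and the sketch deliberately ignores parallel execution and resource release; it only shows that a failure travels with the result until someone either checks it or tries to use the value.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Result:
    """Hypothetical container for a service outcome: either a value or an error."""
    value: Optional[float] = None
    err: Optional[str] = None

    def get(self) -> float:
        # Using a failed result without checking passes the error on to the caller.
        if self.err is not None:
            raise RuntimeError(self.err)
        return self.value

def divide(x: float, y: float) -> Result:
    if y == 0.0:
        return Result(err="Division by zero")
    return Result(value=x / y)

r = divide(1.0, 0.0)
message = "Could not compute." if r.err else str(r.get())
print(message)  # Could not compute.
```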
Sometimes we want to treat several different types as if they were "the same kind of thing."
For example, an integer, a float, and a signed integer are all numbers with similar operations defined.
In other words, we can write code that reads largely the same for all three types of numbers.
In smoλ, you can avoid repeating that code by defining a union.
The latter groups together several types under a single name and you can define
functions that work with the union instead of each type separately. Its definition looks like
an assignment, where the right-hand-side types are separated by or.
union Number = f64 or u64 or i64
This definition means that Number can represent u64, f64, or i64.
Once the union is defined, you can write
functions that take a Number and automatically work with whichever
form it has. For example, the standard library has printing and arithmetic operator overloads,
so you can call print on a Number and it "just works."
Unions are lexically scoped within functions; the same name always represents the same type.
Thus, in the example below both numbers must be of the same type.
But you can add a union as part of another one to transfer all types under a different name.
// main.s
@include std.core // defines Number
def add1mul(Number a, Number b)
    // convert 1 to the correct Number primitive
    one = 1.Number()
    return (a+one)*(b+one)
service main()
    print(add1mul(1,2))
    print(add1mul(1.0, 2.0))
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
6
6.000000
Behind the scenes, smoλ makes sure that union usage is safe and the right code runs depending on which variant you are using; new functions immediately check if they can work for all variations of union members to report errors early.
Unions let us overload function names (like print) by dynamically
adapting them to the type. However, only the declared member types are allowed.
They also quickly create zero-cost overloaded variations of functions with the same code.
Think of unions as a way to say:
"I don't care if this is an i64 or an f64;
as long as it's a Number, I can use the overloaded operations
to write the same code conceptually."
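A rough analogue in Python (not smoλ) is a constrained type variable: within a single call, the name stands for exactly one of the listed types, which mirrors the rule that both arguments of add1mul must use the same variant. Python has no separate u64 and i64, so int covers both here, and the runtime check is an addition made for this sketch.

```python
from typing import TypeVar

# Within one call, Number stands for exactly one of the listed types.
Number = TypeVar("Number", int, float)

def add1mul(a: Number, b: Number) -> Number:
    # Runtime check mirroring the rule that both arguments share one variant.
    if type(a) is not type(b):
        raise TypeError("both arguments must use the same Number variant")
    one = type(a)(1)  # convert 1 to the variant in use
    return (a + one) * (b + one)

print(add1mul(1, 2))      # 6
print(add1mul(1.0, 2.0))  # 6.0
```

The same function body serves both variants, which is the code-sharing benefit described above.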
At this point, you may be wondering how smoλ could run operations that consume an indeterminate amount of memory. For example, you might want to combine strings or manage lists of arbitrary sizes.
This is where buffers and
@include std.mem come into play.
We start by touching on concepts of the latter, and explain
buffers later. Broadly, there are two memory
"devices" exposed by smoλ:
the Stack and the Heap. These correspond to the small but fast
memory your operating system uses to run your program, and the full extent of available memory.
Those devices provide memory allocation capabilities, but also grant you the ability to allocate more
memory either dynamically as needed or as very fast arenas.
Memory follows the same pattern as
all resources that smoλ uses; it is automatically freed when not in use anymore.
This is achieved with some minor restrictions that the compiler will let you know occasionally. For example,
you can declare memory (for example to use with @on
contexts) only outside of loops that would leak the memory in each iteration.
The benefit is that there is no running overhead or bottleneck for testing when memory is no longer needed and freeing it. Memory is released back to the operating system at function ends without unforeseen runtime checks. That is, memory management is a zero-cost abstraction.
Arenas: Arenas are pre-allocated memory regions (either on the heap or the stack) that cannot grow. They are typically made to consume a safe amount of memory and ensure that programs can successfully run. This often comes with some wastage if you cannot anticipate their size. New allocations on arenas are lightweight (they are simple memory address additions) and therefore execute quickly. In smoλ, arenas that run out of space cause services to fail, which is always done safely and without leaking resources.
The next example shows how to work with arenas. The @on
statement automatically adds the arena as a first argument to all operations inside the current code block.
// main.s
@include std.core
@include std.mem
@include std.vec
service main()
    // allocate a stack arena of 1024 bytes
    @mut memory = Stack.arena(1024)
    @on memory // automatically apply as first argument
    v = vector(5) // calls vector(memory,5)
    print(v.len())
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
5
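To see why arena allocations are cheap, consider the following Python sketch (not smoλ) of a so-called bump allocator. The Arena class is hypothetical and is not how the standard library is implemented; it only illustrates that handing out memory from a fixed region amounts to one addition, and that running out of space is reported as a failure instead of growing the region. The 8-byte element size is an assumption made for the example.

```python
class Arena:
    """Hypothetical bump allocator over a fixed-size byte region."""
    def __init__(self, capacity: int) -> None:
        self.memory = bytearray(capacity)  # reserved once, never grows
        self.offset = 0                    # start of the free space

    def allocate(self, size: int) -> memoryview:
        # Running out of space is an error; the region is never resized.
        if self.offset + size > len(self.memory):
            raise MemoryError("arena out of space")
        start = self.offset
        self.offset += size                # the whole allocation is one addition
        return memoryview(self.memory)[start:start + size]

arena = Arena(1024)
first = arena.allocate(5 * 8)    # room for five 8-byte numbers (assumed size)
second = arena.allocate(16)
print(arena.offset)              # 56
```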
If you switch to Heap instead of Stack, the same code
would allocate on random access memory. The details of how heap
memory is managed may be modified for different systems by passing
a different runtime to the compiler.
In the above snippet, mutability is needed to allow modification of memory contents. However, temporary values that have not yet been assigned to any symbol are mutable. You can also return a value from that block, therefore making the following equivalent syntax possible. Remember that the arena's consumed memory is released only when the vector is no longer needed.
// main.s
@include std.core
@include std.mem
@include std.vec
service main()
    @on Stack.arena(1024)
    v = vector(5)
    @on ok // empty @on context for safety
    print(v.len())
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
5
Dynamic memory:
You can also choose to maintain a collection of dynamically allocated memory segments,
for example by calling @on Heap.dynamic() instead
of creating an arena. If multiple of those segments are attached to the same dynamic
allocation, they are also released together, which means that they remain allocated as
long as any is in use. However, you can also make a new allocation each time you need some
dynamic memory.
Smoλ provides three main string types:
cstr (a constant string enclosed in quotations during compilation, null-terminated),
nstr (null-terminated with length information), and
str (a string segment). For most purposes, you can convert the other two to
str. That said, there are some places where the other two are needed,
because null-termination, that is, having an extra zero character at the end of strings, is required when exchanging information
with the operating system.
Most operations on strings are zero-cost abstractions over simple arithmetics, as happens
for example when retrieving substrings. The most heavyweight operations are copying and
concatenation, both of which require an arena or dynamic memory to store the result. Below
is an example, where it is important to note that concatenation yields an nstr
by default, which can be directly reduced to a simpler str if needed.
Similarly, conversion of cstr to the other types is lightweight.
// main.s
@include std.core
@include std.mem
def Segment(new, str value)
    // return all inputs
    // function returns are tuples of named elements
    return @args
def Segment(String _value)
    // convert from many string types
    value = _value.str()
    return new.Segment(value)
def combine(Segment[] segments)
    @mut combined = "".str() // mutable string with known size
    @on Stack.arena(1024) // automatically use as argument if needed (for string operations)
    segments
    .len()
    .range()
    .while next(@mut u64 i)
        combined = add(@all combined segments[i].value " ").str()
    return combined
service main()
    segments = Segment[] // buffer
    .push("I think.".Segment())
    .push("Therefore I am.".Segment())
    segments.combine().print()
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
I think. Therefore I am.
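The difference between null-terminated strings, string segments, and concatenation can be sketched in Python (not smoλ) with raw bytes. The helper c_length and the variable names are invented for this illustration, under the assumption that a segment is essentially an offset and a length over existing bytes.

```python
data = b"I think. Therefore I am.\0"   # constant bytes with a trailing zero byte

def c_length(buffer: bytes) -> int:
    """Length of a null-terminated string: scan until the zero byte."""
    return buffer.index(0)

# A segment is just (offset, length) arithmetic over existing bytes: no copy is made.
view = memoryview(data)
segment = view[9:9 + 15]          # the second sentence

# Concatenation, in contrast, needs fresh storage for the combined result.
combined = bytes(view[0:8]) + b" " + bytes(segment)

print(c_length(data))             # 24
print(bytes(segment).decode())    # Therefore I am.
```

Taking the substring costs only index arithmetic, whereas building combined allocates new memory, which is why the smoλ example above needs an arena for concatenation.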
Buffers represent collections of data that grow dynamically.
The simplest syntax is type[], which produces
a resizable array of that type. This is only allowed if the type is a new one.
You can grow buffers or modify their elements only as long as they remain mutable. For example,
you can create a buffer per v = u64[].push(1).push(2)
that is henceforth immutable.
// main.s
@include std.core
def data(u64 id, u64[] values)
    return @args
service main()
    @mut vals = u64[]
    .push(1)
    .push(2)
    p = data[].push(data(10, vals))
    print(p[0].id)
    print(p[0].values[0])
    print(vals[0])
> smol main.s
codegen --workers 1 ██████████ 7/7 files 27ms
compile --back gcc --runtime std/runtime/auto.h 68ms
running tests/unit/tutorial/hello
🔹 https://github.com/maniospas/smol
🔹 single threaded
10
1
1
Custom buffer allocation: By default, buffers use dynamically sized Heap memory to store their data. However, you can
also create them on memory allocations provided by the standard library; all types of memory
accept an .allocate(size) function for setting aside a fixed-size region, within
which buffers can grow. Below is an example. Do note that data can be pushed onto buffers during
their creation, while they are temporary variables -and hence mutable- and the result can be
assigned to an immutable variable.
// main.s
@include std.core
service main()
vals = u64[Heap.allocate(1024)] // 1024 max buffer size
.push(1)
.push(2)
print(vals[0])
> smol main.s codegen --workers 1 7/7 files 27ms compile --back gcc --runtime std/runtime/auto.h 68ms running tests/unit/tutorial/hello 🔹 https://github.com/maniospas/smol 🔹 single threaded 1
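As a rough analogy in Python rather than smoλ, the sketch below models a buffer that grows inside a region set aside up front and refuses to grow past it. The name FixedBuffer is hypothetical and is not part of the smoλ standard library. Capacity is counted in elements here, which is an assumption; the units of .allocate(size) are not stated above.

```python
from array import array


class FixedBuffer:
    """Hypothetical buffer of unsigned 64-bit integers inside a preallocated region."""

    def __init__(self, capacity):
        # Set aside the whole region up front; it never grows afterwards.
        self._region = array("Q", [0] * capacity)
        self._length = 0

    def push(self, value):
        # Growing is only allowed while there is room left in the region.
        if self._length == len(self._region):
            raise OverflowError("region is full")
        self._region[self._length] = value
        self._length += 1
        return self  # returning self allows chained pushes

    def __len__(self):
        return self._length

    def __getitem__(self, index):
        if not 0 <= index < self._length:
            raise IndexError("index out of range")
        return self._region[index]


vals = FixedBuffer(4).push(1).push(2)
print(vals[0])  # prints 1
```

Unlike smoλ, Python does not check mutability at compile time, so this sketch only illustrates the fixed-size region, not the rule that the finished buffer becomes immutable.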
Buffers preserve mutability rules: a buffer's contents can be changed or extended only if the buffer itself is mutable. That said, you can assign immutable buffers to mutable ones and alter the contents of the latter. Strings can also be stored in buffers or arrays, provided that they reside in the same memory surface. Below is an example.
// main.s
@include std.core
@include std.mem
service main()
boxes = str[]
.push("buffer start".str())
.push("buffer end".str())
b = boxes[0]
print(b)
> smol main.s codegen --workers 1 7/7 files 27ms compile --back gcc --runtime std/runtime/auto.h 68ms running tests/unit/tutorial/hello 🔹 https://github.com/maniospas/smol 🔹 single threaded buffer start
Here we pushed two strings into a string array.
Notice the explicit conversion to the correct string version using .str().
Buffers can be returned from services normally. In this case, their
memory is released by the returning service.
In the example below, samples constructs and returns a buffer.
The caller can then access its elements with buf[index].
@include std.core
service samples()
buf = u64[]
.push(42)
.push(10)
return buf
service main()
buf = samples()
print(buf[0])
> smol main.s codegen --workers 1 7/7 files 27ms compile --back gcc --runtime std/runtime/auto.h 68ms running tests/unit/tutorial/hello 🔹 https://github.com/maniospas/smol 🔹 single threaded 42
You can share the same data between multiple buffers, even if those are mutable,
so that changes made through one are visible through another. You can even use an allocated
memory region to declare a char[] buffer
and then use the same region to allocate strings. Safety is preserved in that an allocated
memory region accepts only the same kinds of data; the compiler runs some analysis
to determine this and raises an error if it finds a violation.
As a result, shared data may be modified through another buffer, but it is always safe to read
and write. Below is an example:
@include std.core
@include std.mem
service main()
@mut memory = Heap.allocate(1.KB())
@mut buf = char[memory]
.push("H".str().first)
.push("i".str().first)
.push("!".str().first)
s = memory.str(3)
print(s)
// now modify the buffer - it modifies the string
buf[2] = "#".str().first
print(s)
> smol main.s codegen --workers 1 7/7 files 27ms compile --back gcc --runtime std/runtime/auto.h 68ms running tests/unit/tutorial/hello 🔹 https://github.com/maniospas/smol 🔹 single threaded Hi! Hi#
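The same idea of two handles over one memory region can be sketched in Python with a bytearray and memoryview objects. This is only an analogy: the variable names mirror the smoλ example, but nothing here reflects how smoλ implements Heap.allocate or memory.str.

```python
# One shared region of raw bytes, standing in for an allocated memory region.
memory = bytearray(16)

# First view: an indexable, writable window over the region.
buf = memoryview(memory)
buf[0] = ord("H")
buf[1] = ord("i")
buf[2] = ord("!")

# Second view: the first three bytes of the same region, without copying.
s = memoryview(memory)[:3]
before = s.tobytes().decode("ascii")
print(before)  # prints Hi!

# Writing through the first view changes what the second view reads.
buf[2] = ord("#")
after = s.tobytes().decode("ascii")
print(after)  # prints Hi#
```

Both views hold bytes only, which loosely matches the rule that a region accepts only the same kinds of data; Python does not enforce this through compile-time analysis.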
In most languages, recursion means that a function calls itself, or that it calls another function that eventually calls itself. This is useful, but it usually comes at a cost: every call stacks on top of the previous one, and if you recurse too deeply, you might run out of memory (a “stack overflow”). Smoλ functions can only refer to previously declared or imported functions, in an effort to guard against such unconstrained behavior. But the same recursive computations are still possible.
In particular, instead of creating dependent calls, the language uses a trick called trampolining. Think of it like a ball bouncing back and forth between two or more functions: each one decides what the next step is, and we loop until the process finishes. This way, recursion does not pile up memory; it just reuses a single loop.
Below is an example that demonstrates how to do this using the tag primitive, which
we have not addressed so far. A tag is written as
:name and behaves like a number with a given value.
Importantly, tag values can overlap with function names, which the @dynamic
instruction uses to its advantage to execute different functions given different tag values.
The tag primitive type lets us reference :ping or :pong anywhere,
even before those are defined. Then, the @dynamic instruction chooses which function to call
among the listed options (ping or pong here) depending on the tag value. The values stored
alongside the tag are passed as the first arguments.
If something unexpected happens, such as a tag that matches none of the options, the service just fails safely, without memory leaks or crashes.
At the same time, types remain safe and are checked during compilation.
// main.s
@include std.core
def lambda(tag func, u64 n)
return @args
def ping(u64 n, cstr message)
printin(message)
print("ping")
return :pong.lambda(n)
def pong(u64 n, cstr message)
if n == 0
return :done.lambda(u64)
printin(message)
print("pong")
return :ping.lambda(n-1)
service main()
@mut pending = :pong.lambda(2)
while :done!=pending.func
pending = @dynamic(ping,pong) pending("next - ")
> smol main.s codegen --workers 1 7/7 files 22ms compile --back gcc --runtime std/runtime/auto.h 66ms running tests/unit/tutorial/hello 🔹 https://github.com/maniospas/smol 🔹 single threaded next - pong next - ping next - pong next - ping
To sum up, in the example above ping and pong call each other back and forth.
Instead of diving deeper into the stack, they just return a next state marked with a tag
(:ping, :pong, or :done).
The main loop keeps track of which step is next, and runs it until it reaches the :done tag.
So whenever you would normally write a recursive function, you can instead write it
as a loop with tags. It feels a bit like building a tiny “interpreter”
for your recursive logic that is both safe and performant. Do note that
@dynamic may inject code from all functions it
handles, so use it sparingly to avoid creating exceptionally large executables.
The next example applies the same pattern to compute Fibonacci numbers.
// main.s
@include std.core
def lambda(tag func, u64 n, u64 a, u64 b)
return @args
def fib_start(u64 n)
// initialize Fibonacci sequence with first two numbers
return :fib_step.lambda(n, 0, 1)
def fib_step(u64 n, u64 a, u64 b)
if n == 0
return :done.lambda(u64, a, b)
// compute next step
return :fib_step.lambda(n-1, b, a+b)
service main()
n = 10
@mut pending = :fib_start.lambda(n)
while :done != pending.func
pending = @dynamic(fib_start, fib_step) pending()
// when done, print result (b is the n-th Fibonacci number)
print("Fibonacci result:")
print(pending.b)
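For readers more familiar with other languages, the trampolining pattern can be sketched in Python as follows. This is an analogy under stated assumptions, not how smoλ implements @dynamic: string tags stand in for tag values, a dictionary named HANDLERS (hypothetical) stands in for the list of functions given to @dynamic, and a tuple stands in for the values returned by lambda.

```python
def fib_start(n):
    # Initial state: the first two Fibonacci numbers.
    return ("fib_step", (n, 0, 1))


def fib_step(n, a, b):
    if n == 0:
        return ("done", (n, a, b))
    # Return the next state instead of calling fib_step recursively.
    return ("fib_step", (n - 1, b, a + b))


# Dispatch table from tag to function, loosely playing the role of @dynamic.
HANDLERS = {"fib_start": fib_start, "fib_step": fib_step}


def trampoline(tag, args):
    # A single loop replaces the chain of nested calls.
    while tag != "done":
        tag, args = HANDLERS[tag](*args)
    return args


_, a, b = trampoline("fib_start", (10,))
print(a, b)  # prints 55 89
```

Because each step returns before the next one starts, the call stack never grows. Calling trampoline("fib_start", (5000,)) completes normally, whereas a directly recursive version would exceed Python's default recursion limit. An unknown tag raises KeyError here, which is only a loose stand-in for the safe failure described above.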