This post is Part I of a series on how smoλ handles resources. And I can already hear you sighing.
"Yet another take on memory safety. Aren't there enough out there?"
If you are at all familiar with programming language design, this is probably your first reaction. And, honestly, you would be right. But I believe there is a useful insight or two here. Not to mention that snippets like the next one are not going to explain themselves within smoλ's standard library. So I might as well give an explanation somewhere.
// from std/mem.device.s
smo ContiguousMemory (
    nominal,
    MemoryDevice,
    u64 size,
    Primitive,
    ptr mem,
    ptr underlying
)
-> @args
// from std/mem/arena.s
smo Arena(nominal type, ContiguousMemory contents)
    @noborrow
    length = 0
    size = contents.size
    with contents.Primitive:is(char)
    ---> type, contents, length, size
Don't feel guilty if your first reaction was to skip the above snippet. It's exactly what I would have done, too, because it makes little sense out of context. If you are more careful than me, you could also look more closely and decide that, with bloated data structures like these to move around, who needs Chromium in every app munching on our memory? We are going to consume it all ourselves first! And, wait, did I not say something about safety? Where's that?
I will address these concerns with a series of blog posts, starting in this one from the very basics of how one can have a safe memory release model. I will cover mutability and a fast memory system that can easily pivot between arenas and dynamic memory in the next installments.
In particular, here I will explain some of the zero-cost design principles of smoλ that help delay resource cleanup until it's safe. The principles discussed apply to any resource, from several kinds of memory to files.
Runtypes - Before starting, a brief recap: smoλ is centered around runtypes; functions whose returned values serve as both tuples and type fields. For example, the following declares a nominal type, so that if you write p=nominal:point2(x,y) you can access fields p.x and p.y.
smo point2(nominal, f64 _x, f64 _y)
    x = 2*_x
    y = 2*_y
    -> x,y // return statement
Nominal types - I shamelessly forced a couple more concepts upon you in the above explanation, because they are pretty important. First is nominal, which is a value indicating a type matched by its name instead of its structure (two f64 values). The language's typing is static, so nominal indicators are checked during compilation but are then removed and do not affect execution.
Currying - The language's currying symbol : transfers the left-hand side as the first argument to the right. For example, obj:fun means fun(obj), obj:fun(arg1,arg2) means fun(obj,arg1,arg2), and so on.
Smo vs services - In general, there are two ways to declare runtypes: as smo declarations, like above, or as services. The main difference is that the former are inlined, whereas services are implemented as co-routines and offer an opportunity for error handling when errors occur inside - to be addressed in another post.
A nice feature of smoλ is that its default runtype manipulation mechanism does not use heap memory. Instead, registers and the stack store fields as local variables. For example, the previous point2 runtype is simply stored in the underlying variables p__x, p__y.
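To make this concrete, here is roughly the shape of C the compiler could emit for point2; a hypothetical sketch with illustrative names, not the literal output:

#include <stdio.h>

int main() {
    double x = 1.0, y = 2.0;
    // p = nominal:point2(x, y) lowers to plain locals; no struct, no heap
    double p__x = 2*x; // field p.x
    double p__y = 2*y; // field p.y
    printf("%g\n", p__x + p__y); // print(p.x + p.y)
    return 0;
}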
The language transpiles to C that uses goto as its intermediate representation, mostly because I am uncomfortable around the LLVM stack's bloat. So these variable optimizations are deliberately delegated to modern C compilers (by default: gcc), which are very adept at them. What is C, after all, if not portable assembly? Jokes aside, LLVM is an engineering wonder and I am being stubborn here, but this would be the least of my moral failings.
As an example, if the value p.y is never used, p__y will be eliminated completely by the compilation process. Even if it is used elsewhere, inlining eventually means that we do not need to load p.y from memory because it is already in the cache. Of course, cache locality is still key. But our implementation is equipped with the tools to allow compilation to actually optimize for that. The end result: we are free to keep "metadata" variables that help us track resource properties, such as ptr underlying to track the common allocated memory, and u64 size to track the current accessible size. The costs of those variables are either eliminated by removing dead computations or would have been paid either way.
For example, z=point2(point2(x,y).x,y) print(z.x+z.y) has the same compilation outcome as the minimal required operations print(2*2*x+2*y).
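To illustrate (again a hypothetical lowering, not literal compiler output), the nested calls collapse as follows once dead fields are removed:

#include <stdio.h>

int main() {
    double x = 1.0, y = 2.0;
    // z = point2(point2(x,y).x, y): the inner point2's y field is dead
    double t__x = 2*x;           // inner .x; inner .y is eliminated
    double z__x = 2*t__x;        // outer .x
    double z__y = 2*y;           // outer .y
    printf("%g\n", z__x + z__y); // same as print(2*2*x + 2*y)
    return 0;
}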
As a final note, these optimizations do not occur for service arguments, since the latter are proper functions. But the cost there is negligible compared to co-routine switching (which is already reasonably lightweight but incurs costs due to mutexes).
Importantly, only one variable - typically of the pointer builtin type ptr - is sufficient to hold attached resources. For example, generic strings are declared as follows, where memory represents the memory resource they have been attached to, and first caches the first character; comparing that before the contents reduces cache misses when comparing strings.
// from std/builtins/str.s
smo str (
    nominal, // no storage or runtime cost
    ptr contents,
    u64 length,
    char first,
    ptr memory
)
-> @args
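To see why the cached first character helps, consider this sketch of string equality in C; the struct mirrors the declaration above, but it is my illustration rather than the standard library's actual comparison code:

#include <stdint.h>
#include <string.h>

// hypothetical C mirror of the str runtype above
struct str {
    char *contents;
    uint64_t length;
    char first;   // cached copy of the first character
    void *memory; // attached memory resource
};

int str_equals(struct str a, struct str b) {
    if (a.length != b.length) return 0;
    if (a.length == 0) return 1;
    // most mismatches are rejected here without dereferencing
    // contents, so no cache miss from touching the actual text
    if (a.first != b.first) return 0;
    return memcmp(a.contents, b.contents, (size_t)a.length) == 0;
}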
Concurrency aside, the main difference between smo and service is that resources like memory are gathered by the compiler and their release code is inserted at the end of service calls. Those working in domains where delays are critical therefore know exactly when systems are going to do work. But we also have the benefit that, upon failures like invalid file reads, we can safely terminate the service without leaking resources. Error handling is also not for now, though.
In the simplest case, there would only be the main service, and all releases would occur at its end - just before program termination.
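In C terms, a compiled service could look roughly like the sketch below (hypothetical names; __runtime_alloc and __runtime_free stand in for the runtime hooks that appear later in this post):

#include <stdlib.h>

// stand-ins for the runtime hooks mentioned later in this post
static void* __runtime_alloc(size_t n) { return malloc(n); }
static void  __runtime_free(void* p)   { free(p); }

void service_main() {
    void* mem = 0;          // zero-initialized so the release is always safe
    mem = __runtime_alloc(1024);
    if (!mem) goto finally; // failures still route through the gathered releases
    /* ... service body using mem ... */
finally:
    if (mem) { __runtime_free(mem); mem = 0; }
}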
By the way, releasing all resources related to a variable can also be done manually with @release var. In that case, smoλ will complain if you try to use/leak a resource later.
Something important to consider is what happens to returned data. For those, we just delay freeing until they are no longer used - as if they had been allocated by the calling service. The mechanism for doing so consists of detecting resource frees at compile time and delegating them to the caller. In short: if A calls B, the act of releasing resources on which B's return depends is delegated to A.
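A sketch of how that delegation could look after compilation to C (hypothetical shape, again with stand-in runtime hooks):

#include <stdlib.h>

static void* __runtime_alloc(size_t n) { return malloc(n); }
static void  __runtime_free(void* p)   { free(p); }

// B's return value depends on mem, so B itself does not free it...
char* service_B() {
    char* mem = __runtime_alloc(64);
    /* ... fill mem ... */
    return mem;
}

// ...instead, the release is planted at the end of the caller.
void service_A() {
    char* s = service_B();
    /* ... use s ... */
    if (s) { __runtime_free(s); s = 0; }
}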
Practical example - Below is an example using the language's built-in buffer system, where buffers are created by placing [] after a type. Default buffers reside on the heap to account for their dynamic nature, though this can be adjusted for systems with different memory models by adding new runtimes in the std/runtimes/ folder. The standard library's fast implementations of strings and vectors use other memory constructs, described next.
@include std.builtins // basic arithmetics, etc

service samples()
    buf = u64[]
    :push(42)
    :push(10)
    -> buf // return statement

service main()
    buf = samples()
    print(buf[0]) // prints 42
    -- // end block, buffer is deallocated here
Formalism - A particular family of programming fans may be ready to point out that I follow linear logic for ownership (hello there, Rust enthusiasts!). This is not entirely true, however, because I actually use linear temporal logic (LTL); the statement I am making is that "allocated memory is eventually deallocated, at the end of a service after which it is not referenced anymore". Ahem! Trying very hard to not spam notation here. For reals. But I promise to release a white paper for this... eventually. Just notice that the nature of this promise completely eliminates use-after-free, though it obviously comes at the cost of keeping some memory alive for longer.
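For the notation-curious, my informal rendering of that promise, where G reads "always" and F reads "eventually" (my sketch, not the white paper's formulation), would be something like:

G( alloc(m) -> F free(m) )

with free(m) discharged at the end of the last service that still references m.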
Safe arenas - The same principle is in action for other types of resources, such as arenas. I am not going to cover the memory model implemented by the standard library in this post, so it suffices to know that arenas (per their common definition) are pre-allocated chunks of memory where new allocations are as simple as incrementing pointers. The downside is that their size must be fixed beforehand.
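As a reference point for how cheap arena allocations are, a generic bump allocator in C is little more than the following (a sketch of the common technique, not smoλ's implementation):

#include <stdint.h>

typedef struct {
    char *base;    // pre-allocated chunk
    uint64_t used; // bump offset
    uint64_t size; // fixed capacity, chosen up front
} arena;

// allocating is just a pointer increment plus a bounds check
void *arena_alloc(arena *a, uint64_t n) {
    if (a->used + n > a->size) return 0; // the fixed-size downside
    void *p = a->base + a->used;
    a->used += n;
    return p;
}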
Without going into syntax details, in smoλ one can open an arena context via on Heap:arena(size). Memory contexts like this essentially provide an allocation mechanism for string concatenations and vector numeric operators. You can also replace the heap with Stack if you plan to remain within a service's call stack, but the compiler will raise an error if you try to return stack memory from a service. Or you could use on Heap:dynamic to allow dynamic allocations that are all freed together.
The example below uses an arena for string concatenations. Do note that the language has three types of strings: cstr, corresponding to raw text enclosed in quotations, like "there"; null-terminated strings nstr, which are normally the outcome of concatenation; and str, which is either null-terminated or a zero-cost substring view. Each of those types is convertible to the subsequent ones through zero-cost abstractions. The reason I am mentioning string types is to explain why we need to explicitly convert "there" to str in the code.
@include std.builtins
@include std.mem

service greet(str name)
    on Heap:arena(1024) // preallocate 1kB on the heap
        greeting = "Hi "+name+"!"
        ----> greeting

service main()
    greeting = greet("there":str)
    print(greeting)
    -- // end block, arena is deallocated here
Type details out of the way: we allocated a disproportionately large arena for string concatenation. This arena's freeing operations are attached to its allocated pointer, with string results keeping track of it. Hence, when the greeting string is returned, it is also accompanied by the whole arena and its deallocation code. Of course, "returning" the arena has no execution overhead other than moving its pointer value.
Automatic releases are similar to declaring destructors, but deallocation code is identified statically. Furthermore, despite the slightly bloated memory consumption in this case, speed gains from arenas (or fixed-size buffers) are what people often refer to when they mention the term "blazingly fast"; you get the benefits of minimized memory fragmentation, cache locality, and near-instantaneous allocation.
Smoλ aims to automate resource allocation and deallocation for higher-level data types. Still, you might be curious about which interfaces the language provides to allow the addition of new resource types. The way to transfer freeing code is by attaching it to underlying variables and moving it alongside those variables' values during compilation. I repeat, because it is worth repeating: we track memory VALUES during compilation, not variables, because we need to reason about the status of memory contents.
Below is an example of what resource acquisition looks like; this is how Heap memory can allocate a contiguous memory segment, for example to be used by arenas. Allocation code differs between types of memory; for instance, arenas themselves implement allocate based on pointer additions. The code below is a bit of a mess because it interweaves raw C. Any data that stores mem's value retains a link to the freeing code, and there is an additional dependency exploration during returns so that only one of those (either mem or, preferred if it exists, a returned value) owns the freeing code. Note that @unsafe is needed to convey that the author of this file - me - is who you should be trusting instead of the language. Now, I would not trust me too much, but that is another story...
// From std/mem/device.s, comments only here.
// File-level @about is needed whenever @unsafe
// is declared, to give a sense of why one could
// trust this unsafe file. The compiler can
// summarize these sources of presumed trust.
@unsafe
@about "Standard library implementation of memory management ..."

smo allocate(Heap, u64 size, Primitive)
    // Usually optimized away if, for example,
    // one called Heap:allocate(1024, char)
    if size==0
        -> fail("Cannot allocate zero size")
    // C header
    @head{#include <stdlib.h>}
    // A hack to declare a local variable with the
    // appropriate builtin type (among char,f64,u64,
    // i64) that the C code in @body below can see.
    // Optimized away.
    Primitive = Primitive
    // Direct C code here.
    @body{ptr mem=__runtime_alloc(size*sizeof(Primitive));} // malloc usually
    // Allocation safety - just fail the service whenever needed.
    if mem:bool:not
        -> fail("Failed a Heap allocation")
    // Attach freeing C code to the mem ptr address.
    // This is managed by the compiler and is curated to run correctly
    // even if the code above fails, triggering the release (uninitialized
    // variables are always set to zero).
    @finally mem {
        if(mem)
            __runtime_free(mem); // free usually
        mem=0;
    }
    -> nominal:ContiguousMemory(
        Heap,      // track type as zero-valued object (optimized away later)
        size,      // size of region within underlying memory
        Primitive, // track type as zero-valued object (optimized away later)
        mem,       // region start within underlying memory
        mem        // underlying memory
    )
Notice that there is a ton of information injected with the intent of tracking it during compilation and optimizing it away later. The resulting code may end up being just a malloc, an allocation test, and an eventual free. But the richness of all the intermediate information lets us do nice things during compilation, such as tracking where the memory should be released and the primitive alignment (64-bit for numbers or 8-bit for char primitives).
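Under those optimizations, the surviving C for something like Heap:allocate(1024, char) could be as little as this (a hypothetical residue, with the same stand-in hooks as before):

#include <stdlib.h>

static void* __runtime_alloc(size_t n) { return malloc(n); }
static void  __runtime_free(void* p)   { free(p); }

int example() {
    char* mem = __runtime_alloc(1024*sizeof(char)); // the malloc
    if (!mem) return -1;                            // the allocation test
    /* ... use mem ... */
    __runtime_free(mem);                            // the eventual free
    return 0;
}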
This post is part of the smoλ language's material. For more resources, check out these links: