Low-level code

Unsafe code

Crow is normally safe. Normal Crow functions can't do bad things like:

  • Allocate memory and never free it.
  • Free memory, then access it anyway.
  • Access global state.
  • Behave differently in different environments (depending on, e.g., endianness or the operating system).
    (A function that simply prints the name of your operating system would be safe, but it would need to be marked summon.)
  • Access other information not explicitly passed through parameters, like a stack trace.
  • Access private information about data structures. (E.g., distinguish two equal values based on reference identity.)

However, unsafe Crow running natively can do anything C can.
Although most code should do things the safe way, unsafe functionality is available in functions marked unsafe.
You can also mark any of your own functions unsafe if it must be called in a particular way.
An unsafe function can call other unsafe functions.

Many (but not all) unsafe operations also require the function to be marked native extern. That is explained in Extern functions rather than here.

    main void() unsafe, native extern
        x string* = null
        info log "{x.to::nat}"

The above example prints 0 because that happens to be the representation of a null pointer.

"Trusted" expressions

When a function is unsafe, it doesn't mean that every time you use it, it will do something bad.
It just means that it could, and the compiler can't verify that it won't.

If you're calling an unsafe function in a way that won't do something bad, you can mark the use as trusted.
trusted parses as a prefix, just like throw.

A common example of this is + functions on integer types, which check for overflow/underflow before calling a primitive unchecked add function.
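
As a rough illustration of that pattern, written in C rather than Crow, a wrapper can validate its inputs before performing the raw operation. The helper name checked_add_u64 is hypothetical, and this is a sketch of the idea, not Crow's actual implementation of +.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical helper: stores a + b in *out and returns true only if the
// sum fits in 64 bits. The raw addition is reached only after the check.
bool checked_add_u64(uint64_t a, uint64_t b, uint64_t *out) {
    if (a > UINT64_MAX - b) {
        return false; // the sum would not fit; refuse instead of wrapping
    }
    *out = a + b; // the "unchecked" add, known to be fine because of the check
    return true;
}
```

The check is what justifies trusting the inner operation: callers of the wrapper cannot make the raw addition misbehave.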

    import
        system/stdio: stdout
        system/unistd: write

    main void() summon, (native, posix) extern
        message nat8 array = "hello\n"::string to-bytes
        _ = trusted stdout write message.begin-pointer, message.size

The above example calls the Posix write function, which is dangerous in general; it takes a pointer and length separately and does bad things if length is wrong.
(Anything involving pointers is unsafe; so begin-pointer is also unsafe.)
But in this example, we know that the length is correct, making the call to write safe.
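
For comparison, here is a minimal C sketch of the same idea. The wrapper write_message is a hypothetical name; only write and strlen are real library calls. Because the length is derived from the buffer itself, the pointer and length passed to write cannot disagree.

```c
#include <string.h>
#include <unistd.h>

// Hypothetical helper: writes a NUL-terminated message to fd.
// The length is computed from the buffer, so the pointer/length pair
// handed to write() always matches.
ssize_t write_message(int fd, const char *message) {
    return write(fd, message, strlen(message));
}
```

Calling write_message(STDOUT_FILENO, "hello\n") would be the counterpart of the Crow example above.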

The entire argument expression to trusted may contain unsafe functions.

When deciding whether to mark a function as unsafe or to use trusted in its body, ask whether a malicious caller could cause the function to do bad things. (For example, a function taking pointer parameters is almost always unsafe, since it can't guarantee that the pointers are valid.)
If so, always mark it as unsafe.

For convenience, trusted can also be used as a function modifier. This makes the entire function body trusted.

Pointers

Crow supports pointers like in C. You can manually allocate memory, and mark the code trusted if you've ensured it will be freed.

Unlike in C, pointers are "const" by default. nat* is a readable pointer; nat mut* is readable and writable.

    import
        system/stdlib: free, malloc

    main void() trusted, (libc, native) extern
        ptr nat mut* = new-nat
        finally ptr free-nat
        *ptr := 3
        info log "{*ptr}"

    new-nat nat mut*() unsafe, (libc, native) extern
        (size-of@nat).malloc pointer-cast

    free-nat void(a nat mut*) unsafe, (libc, native) extern
        a.pointer-cast free
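
The C counterpart of this allocate, use, free pattern might look like the sketch below. The function name store_and_read is hypothetical, and returning 0 on allocation failure is a simplification chosen for the example.

```c
#include <stdlib.h>

// Hypothetical helper: allocate one unsigned long, use it, and free it
// on every path before returning.
unsigned long store_and_read(unsigned long value) {
    unsigned long *ptr = malloc(sizeof *ptr);
    if (ptr == NULL) {
        return 0; // allocation failed; there is nothing to free
    }
    *ptr = value;
    unsigned long result = *ptr;
    free(ptr); // released before the pointer goes out of scope
    return result;
}
```

The function as a whole is safe to call because the allocation never escapes it, which is the same reasoning that lets the Crow main be marked trusted.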

As in C, you can use &x to get a pointer to a local variable.
As in C, this is dangerous if the pointer outlives the variable's stack frame.

    main void() trusted, native extern
        x mut nat = 1
        ptr nat mut* = &x
        *ptr := 2
        info log "{x}"

The above can be trusted because the pointer doesn't outlive x.
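
The same rule applies in C. In the sketch below (both function names are hypothetical), the pointer to the local is only used while the local is still alive, so nothing dangles.

```c
// Hypothetical helper: increments the caller's variable through a pointer.
static void bump(unsigned *p) {
    *p += 1;
}

// Fine: the pointer to x is only used during the call to bump,
// while x still exists.
unsigned bump_local(void) {
    unsigned x = 1;
    bump(&x);
    return x;
}

// The dangerous pattern would be returning &x itself: the caller would then
// hold a pointer into a stack frame that no longer exists.
```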

Converting mutable and read-only pointers

Crow doesn't have any implicit conversions. To convert a read-only pointer to a mutable pointer, use as-mut. To convert back, use as-const.

    main void() trusted, native extern
        x mut nat = 1
        mut-ptr nat mut* = &x
        const-ptr nat* = mut-ptr as-const
        *mut-ptr := 2
        assert const-ptr.as-mut == mut-ptr
        info log "{*const-ptr}"
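
For readers coming from C, the sketch below mirrors the Crow example. The function name const_round_trip is hypothetical. Note the difference: C converts to a const pointer implicitly and only needs a cast for the reverse direction, while Crow spells out both directions.

```c
#include <stdbool.h>

// Hypothetical helper mirroring the Crow example above.
bool const_round_trip(void) {
    unsigned x = 1;
    unsigned *mut_ptr = &x;
    const unsigned *const_ptr = mut_ptr;    // implicit in C; as-const in Crow
    *mut_ptr = 2;                           // visible through const_ptr too
    unsigned *back = (unsigned *)const_ptr; // explicit cast; as-mut in Crow
    return back == mut_ptr && *const_ptr == 2;
}
```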

Accessing data from pointers

Reading through a pointer is slightly different from in C.
If a is a pointer to a record with a field x, a->x reads (or writes) the field.
a.x will also be defined, and return an inner pointer of a at the x field.
In C that would be written as &a->x, which is not supported in Crow.

    main void() trusted, native extern
        foo mut foo = 1,
        foo-ptr foo mut* = &foo
        x-ptr nat mut* = foo-ptr.x
        foo-ptr->x := 2
        info log "{foo.x} == {foo-ptr->x} == {*x-ptr}"
        *x-ptr := 3
        info log "{foo.x} == {foo-ptr->x} == {*x-ptr}"

    foo record(x mut nat) by-val, mut
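
A C rendering of the inner-pointer idea, as a sketch with a hypothetical function name, shows where the two notations line up: Crow's foo-ptr.x corresponds to C's &foo_ptr->x.

```c
struct foo { unsigned x; };

// Hypothetical demo: writes through both the record pointer and the
// inner pointer, then reads the field directly.
unsigned inner_pointer_demo(void) {
    struct foo foo = { 1 };
    struct foo *foo_ptr = &foo;
    unsigned *x_ptr = &foo_ptr->x; // Crow spells this foo-ptr.x
    foo_ptr->x = 2;                // write through the record pointer
    *x_ptr += 1;                   // write through the inner pointer
    return foo.x;                  // both writes hit the same storage
}
```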

Global variables

Globals are declared using the global keyword.
The below example declares a global x, writes to it, then reads it.

    main void() unsafe
        x := 1
        info log "{x}"

    x global(nat)

Declaring the global generates a getter function x and a setter function set-x (called by x := 1).
Both of these are unsafe.

It's hard to use a global variable in a safe way. You would have to be sure:

  • The global is only accessed by one thread at a time.
  • Global state does not leak across unrelated calls. (Basically, it must "act local".)

In short, you should never use globals in your own code. They exist to support C libraries that use them.

Thread-local variables

A thread-local variable works just like a global, but it can have a different value for each thread.
This means a real thread, not a Crow task. Which task gets assigned to which thread is unpredictable, so don't rely on it.

Again, these are included for compatibility with C.
They are also used in a few places in the standard library behind the scenes.

    main void() trusted
        x := 1
        info log "{x}"

    x thread-local(nat)

The above example can be trusted because the variable is always written before it is read, so state won't leak across unrelated calls.
(We also need to know that x isn't used anywhere else; we do know this, since it's private.)
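
The write-before-read discipline can be sketched in C11 as follows. The names scratch and doubled are hypothetical; static plays the role of Crow's private visibility, and _Thread_local gives each thread its own copy.

```c
// Private to this file, one copy per thread.
static _Thread_local unsigned scratch;

// Hypothetical helper: uses the thread-local purely as scratch space.
unsigned doubled(unsigned n) {
    scratch = n;        // always written first...
    return scratch * 2; // ...then read, so no earlier value can leak in
}
```

Because every call overwrites scratch before reading it, the result depends only on the argument, which is what allows the function to be treated as safe.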

Function pointers

There are also function pointer types like in C.

A function pointer type is written like a lambda type, but with the function keyword where the purity would normally appear.
(Function pointers are shared since they might point to a summon function, and so could have side effects.)
To get a function pointer, write & followed by the function name.

    main void()
        f nat function(x nat) = &double
        info log "{f[5]}"

    double nat(a nat)
        a * 2
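
The equivalent in C looks like this sketch. The function is named twice here because double is a keyword in C, and call_with_five is a hypothetical helper.

```c
unsigned twice(unsigned a) {
    return a * 2;
}

// Takes a function pointer and calls it, like f[5] in the Crow example.
unsigned call_with_five(unsigned (*f)(unsigned)) {
    return f(5);
}
```

Passing &twice (or just twice, which C converts automatically) to call_with_five yields 10.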

"By-val" and "by-ref"

Normally, Crow will choose for you whether a record type is a reference type or a value type.
(union, enum, and flags are always value types.)
Crow guesses what will have the best performance; it passes small records by value and large records by reference. If a record has a mutable field, it is always passed by reference (to ensure a mutation isn't lost by a copy).

You can override the default by explicitly marking a record by-val or by-ref.

    main void() unsafe, native extern
        # Value type, so its size is the sum of its fields' sizes.
        info log "r1: {size-of@r1}"
        # Reference type, so its size is the size of a pointer.
        info log "r2: {size-of@r2}"

    r1 record(x nat, y nat) by-val
    r2 record(x nat, y nat) by-ref

In the above example, the records would behave the same for safe code; the only difference is performance.
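
The size difference can be pictured in C, as a sketch that assumes 64-bit fields and models the by-ref record as a plain pointer to the struct. The type and function names are hypothetical, and Crow's real layout may differ.

```c
#include <stddef.h>
#include <stdint.h>

// Value-like record: stored inline, two 64-bit fields.
struct r1 { uint64_t x; uint64_t y; };

// Reference-like record: what gets passed around is only a pointer.
typedef struct r1 *r2;

size_t size_of_value(void) { return sizeof(struct r1); }
size_t size_of_reference(void) { return sizeof(r2); }
```

On a typical 64-bit target, the first function returns 16 and the second returns the size of a pointer.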

"Bare" functions

A function may be marked bare if it does not use the Crow runtime.
Basically, this means it should not allocate any garbage-collected data. That means it can only create records explicitly marked by-val. It also can't do anything involving tasks, such as using then or with : parallel.

The main reason to make a function bare is that C code might call it in an asynchronous callback. Crow code may only allocate if it's part of a Crow fiber, which is not the case inside such a callback.

    main void()
        # non-'bare' can call 'bare'
        foo

    foo void() bare
        # Can't allocate
        _ nat[] = 1,
        # Can't use the runtime
        _ nat = with : parallel
            1
        ()

A bare function can only call other bare functions, but any function can call a bare function.
(E.g., + is bare and anything can call it.)
Recall that unsafe works in the opposite way.

Memory slices (buffers)

Much low-level code deals with slices of memory (also known as buffers).
The Crow types t[] and t mut-slice are guaranteed to be contiguous slices of memory when compiling to native code.
So, they have begin-pointer and end-pointer functions.
To go in the reverse direction (from pointers to a slice), there are functions as-array and as-mut-slice.
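
Conceptually, a slice is a begin pointer plus a size, and the two directions of conversion are simple pointer arithmetic. The C sketch below uses hypothetical names throughout and is only a model of the idea, not a description of Crow's actual memory layout.

```c
#include <stddef.h>

// Hypothetical slice type: pointer to the first element plus a length.
struct slice {
    const unsigned *begin;
    size_t size;
};

const unsigned *begin_pointer(struct slice s) {
    return s.begin;
}

// One past the last element, as with C array bounds.
const unsigned *end_pointer(struct slice s) {
    return s.begin + s.size;
}

// Reverse direction: rebuild a slice from a begin/end pointer pair.
// The caller must guarantee both pointers refer to the same array.
struct slice as_slice(const unsigned *begin, const unsigned *end) {
    struct slice s = { begin, (size_t)(end - begin) };
    return s;
}
```

The unchecked precondition on as_slice is the kind of requirement that makes the pointers-to-slice direction unsafe.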

t mut[] also has begin-pointer and end-pointer functions, but the pointers aren't guaranteed to last.
A t mut[] is not directly a slice of memory but a reference to a slice, which may be re-allocated when elements are added to or removed from the array.
You can (unsafely) access this slice using temp-as-array and temp-as-mut-slice.

Often you will want to use a function like read to add elements to a mut[]. To do that, call a push-gc-safe-values n to allocate space in the mut[] (where n is the amount of space you need), then start writing at a.end-pointer - n.
When done writing, call a reduce-size-to x if you didn't use all the space you allocated. (GC-safe values are not safe in general, so it's important to not leave them accessible by safe code.)
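
The reserve, write, shrink sequence can be sketched in C with a small growable byte buffer. Everything here is hypothetical and only models the pattern: byte_vec stands in for a mut[], push_uninit for push-gc-safe-values, and fake_read for a read-like function that may fill less than the space offered.

```c
#include <stdlib.h>
#include <string.h>

struct byte_vec { unsigned char *data; size_t size; size_t capacity; };

// Grows the vector by n (n > 0) uninitialized bytes and returns a pointer
// to the start of the new region, i.e. end - n. Returns NULL on failure.
unsigned char *push_uninit(struct byte_vec *v, size_t n) {
    if (v->capacity - v->size < n) {
        size_t new_cap = v->size + n;
        unsigned char *p = realloc(v->data, new_cap);
        if (p == NULL) return NULL;
        v->data = p; // the old pointer may now be invalid
        v->capacity = new_cap;
    }
    v->size += n;
    return v->data + v->size - n;
}

// Drops unused space at the end; never grows the vector.
void reduce_size_to(struct byte_vec *v, size_t new_size) {
    if (new_size < v->size) v->size = new_size;
}

// Stand-in for read(): copies at most room bytes from src into dest.
size_t fake_read(unsigned char *dest, size_t room, const char *src) {
    size_t n = strlen(src);
    if (n > room) n = room;
    memcpy(dest, src, n);
    return n;
}

// Reserve room bytes, fill some of them, then shrink to what was used.
size_t append_from(struct byte_vec *v, const char *src, size_t room) {
    size_t old_size = v->size;
    unsigned char *dest = push_uninit(v, room);
    if (dest == NULL) return 0;
    size_t got = fake_read(dest, room, src);
    reduce_size_to(v, old_size + got); // hide the uninitialized tail
    return got;
}
```

Shrinking at the end matters for the same reason as in Crow: the unwritten tail holds placeholder contents that later code should never observe.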