Wrote some tutorials

2022-10-02 10:42:35 +11:00
committed by GitHub
parent b34f84cc34
commit 9e09279b87
4 changed files with 198 additions and 15 deletions

@@ -12,10 +12,11 @@ The host will provide all of the extensions needed on a case-by-case basis. Scri
* Simple C-like syntax
* Bytecode intermediate compilation
* `import` and `export` variables from the host program
* Optional, but robust type system
* Functions and types are first-class citizens
* `import` and `export` variables from the host program
* Fancy slice notation for strings, arrays and dictionaries
* Can re-direct output, error and assertion failure messages
* Open source under the zlib license
# Getting Started
@@ -23,6 +24,8 @@ The host will provide all of the extensions needed on a case-by-case basis. Scri
* [Quick Start Guide](quick-start-guide)
* Tutorials
* [Embedding Toy](embedding-toy)
* [Compiling Toy](compiling-toy)
* [Using Toy](using-toy)
* ~~[Standard Library](standard-library)~~
* [Types](types)
* [Developing Toy](developing-toy)

98
compiling-toy.md Normal file
@@ -0,0 +1,98 @@
# Compiling Toy
This tutorial is a sub-section of [Using Toy](using-toy) that has been spun off into its own page for the sake of brevity/sanity. It's recommended that you read the main article first.
The exact phases outlined here are entirely implementation-dependent - that is, they aren't required, and are simply how the canonical version of Toy works.
## How Compilation Works
There are four main phases to running a Toy source file. These are:
```
lexing -> parsing -> compiling -> interpreting
```
Each phase has a dedicated set of functions and structures, and there are intermediate structures between these phases that carry the information from one set to another.
```
source -> lexer -> token
token -> parser -> AST
AST -> compiler -> bytecode
bytecode -> interpreter -> result
```
## Lexer
Exactly how the source code is loaded into memory is left up to the user; however, once it's loaded, it can be bound to a `Lexer` structure.
```c
Lexer lexer;
initLexer(&lexer, source);
```
The lexer, when invoked, will break down the string of characters into individual `Tokens`.
The lexer does not need to be freed after use; however, the source code does.
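Since loading the source is left to the host, here is a minimal sketch of one way to do it using only the C standard library. The name `readSourceFile` is hypothetical and is not part of Toy's API; it simply returns a heap-allocated, NUL-terminated copy of the file's contents.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

//hypothetical helper (not part of Toy's API): reads a whole file into a
//heap-allocated, NUL-terminated buffer; returns NULL on failure
//the caller owns the buffer and must free() it
char* readSourceFile(const char* path, size_t* outSize) {
    assert(path != NULL);

    FILE* file = fopen(path, "rb");
    if (file == NULL) {
        return NULL;
    }

    //find the file's length by seeking to the end
    if (fseek(file, 0L, SEEK_END) != 0) {
        fclose(file);
        return NULL;
    }
    long length = ftell(file);
    if (length < 0 || fseek(file, 0L, SEEK_SET) != 0) {
        fclose(file);
        return NULL;
    }

    //one extra byte for the terminating '\0'
    char* buffer = malloc((size_t)length + 1);
    if (buffer == NULL) {
        fclose(file);
        return NULL;
    }

    size_t bytesRead = fread(buffer, 1, (size_t)length, file);
    fclose(file);

    if (bytesRead != (size_t)length) {
        free(buffer);
        return NULL;
    }

    buffer[bytesRead] = '\0';
    if (outSize != NULL) {
        *outSize = bytesRead;
    }
    return buffer;
}
```

The returned buffer could then be passed to `initLexer(&lexer, source)`, and released with `free(source)` once it is no longer needed.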
## Parser
The `Parser` structure takes a `Lexer` as an argument when initialized.
```c
Parser parser;
initParser(&parser, &lexer);
ASTNode* node = scanParser(&parser);
freeParser(&parser);
```
The parser takes tokens, one at a time, and converts them into structures called Abstract Syntax Trees, or ASTs for short. Each AST represents a single top-level statement within the Toy script. You'll know that the parser is finished when `scanParser()` begins returning `NULL` pointers.
The AST Nodes produced by `scanParser()` must be freed manually, and the parser itself should not be used again.
## Compiler
The actual compilation phase has two steps - instruction writing and collation.
```c
size_t size;
Compiler compiler;
initCompiler(&compiler);
writeCompiler(&compiler, node);
unsigned char* tb = collateCompiler(&compiler, &size);
freeCompiler(&compiler);
```
The writing step is the process in which AST nodes are compressed into bytecode instructions, while literal values are extracted and placed aside in a cache (usually in an intermediate state).
The collation step, however, is when the bytecode instructions, along with the now-flattened intermediate literals and function bodies, are combined. The bytecode header specified in [Developing Toy](developing-toy) is placed at the beginning of this blob of bytes during this step.
The Toy bytecode (abbreviated to `tb`), along with the `size` variable indicating the size of the bytecode, are the result of the compilation.
This bytecode can be saved into a file for later consumption by the host at runtime - ensure that the file has the `.tb` extension.
The bytecode loaded in memory is consumed and freed by `runInterpreter()`.
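As a rough illustration of saving bytecode for later use, the sketch below writes a buffer to disk with the C standard library. Both `saveBytecode` and `hasTbExtension` are hypothetical helpers, not part of Toy's API; refusing any path that lacks the `.tb` extension is just one possible way to enforce the naming rule above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

//hypothetical helper (not part of Toy's API): returns true if the path
//ends with ".tb" and has at least one character before the extension
bool hasTbExtension(const char* path) {
    size_t length = strlen(path);
    return length > 3 && strcmp(path + length - 3, ".tb") == 0;
}

//hypothetical helper: writes `size` bytes of bytecode to `path`
//returns false if the path lacks the ".tb" extension or the write fails
bool saveBytecode(const char* path, const unsigned char* tb, size_t size) {
    assert(path != NULL && tb != NULL);

    if (!hasTbExtension(path)) {
        return false;
    }

    //"wb" matters: bytecode is binary data, and may contain zero bytes
    FILE* file = fopen(path, "wb");
    if (file == NULL) {
        return false;
    }

    size_t written = fwrite(tb, 1, size, file);

    //fclose() flushes buffered data, so its result is checked as well
    bool closed = (fclose(file) == 0);

    return written == size && closed;
}
```

Because `runInterpreter()` consumes and frees the bytecode, a call such as `saveBytecode("script.tb", tb, size)` would need to happen before the bytecode is run.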
## Interpreter
The interpreter acts based on the contents of the bytecode given to it.
```c
Interpreter interpreter;
initInterpreter(&interpreter);
runInterpreter(&interpreter, tb, size);
freeInterpreter(&interpreter);
```
Exactly how it accomplishes this task is up to it - as long as the result matches expectations.
## REPL
An example program, called `toyrepl`, is provided alongside Toy's core. This program can handle many things, such as loading, compiling and executing Toy scripts; it's capable of compiling any valid Toy program for later use, even those that rely on non-standard libraries.
To get a list of options, run `toyrepl -h`.

@@ -27,17 +27,3 @@ There are some strict rules when interpreting these values (mimicking, but not c
All interpreter implementations retain the right to reject any bytecode whose header data does not conform to the above specification.
The latest version information can be found in [common.h](https://github.com/Ratstail91/Toy/blob/0.6.0/source/common.h#L7-L10)
## Embedded API
The functions intended for usage by the API are prepended with the C macro `TOY_API`. The exact value of this macro can vary by platform, or even be empty.
In addition, the macros defined in [literal.h](https://github.com/Ratstail91/Toy/blob/0.6.0/source/literal.h) are available for use when manipulating literals. These include:
* `IS_*` - check if a literal is a specific type
* `AS_*` - use the literal as a specific type
* `TO_*` - create a literal of a specific type
* `IS_TRUTHY` - check if a literal is truthy
* `MAX_STRING_LENGTH` - the maximum length of a string in Toy (can be altered if needed)
When you create a new Literal object, be sure to call `freeLiteral()` on it afterwards! If you don't, your program will leak memory as Toy has no internal tracker for such things.

96
using-toy.md Normal file
@@ -0,0 +1,96 @@
# Using Toy
This tutorial assumes that you've managed to embed Toy into your program by following the tutorial [Embedding Toy](embedding-toy).
Here, we'll look at some ways in which you can utilize Toy's C API within your host program.
Be aware that when you create a new Literal object, you must call `freeLiteral()` on it afterwards! If you don't, your program will leak memory as Toy has no internal tracker for such things.
## Embedded API Macros
The functions intended for usage by the API are prepended with the C macro `TOY_API`. The exact value of this macro can vary by platform, or even be empty. In addition, the macros defined in [literal.h](https://github.com/Ratstail91/Toy/blob/0.6.0/source/literal.h) are available for use when manipulating literals. These include:
* `IS_*` - check if a literal is a specific type
* `AS_*` - cast the literal to a specific type
* `TO_*` - create a literal of a specific type
* `IS_TRUTHY` - check if a literal is truthy
* `MAX_STRING_LENGTH` - the maximum length of a string in Toy (can be altered if needed)
## Structures Used Throughout Toy
The main unit of data within Toy's internals is `Literal`, which can contain any value that can exist within the Toy language. The exact implementation of `Literal` may change or evolve as time goes on, so it's recommended that you only interact with literals directly by using the macros and functions outlined [above](#embedded-api-macros). See the [types](types) page for information on what datatypes exist in Toy.
There are two main "compound structures" used within Toy's internals - the `LiteralArray` and `LiteralDictionary`. The former is an array of `Literal` instances stored sequentially in memory for fast lookups, while the latter is a key-value hashmap designed for efficient lookups based on a `Literal` key. These are both accessible via the language as well.
These compound structures hold **copies** of literals given to them, rather than taking ownership of existing literals.
## Compiling Toy Scripts
Please see [Compiling Toy](compiling-toy) for more information on the process of turning scripts into bytecode.
## Interpreting Toy
The `Interpreter` structure is the beating heart of Toy - you'll usually only need one interpreter, as it can be reset as needed.
The four basic functions are used as follows:
```c
//assume "tb" and "size" are the results of compilation
Interpreter interpreter;
initInterpreter(&interpreter);
runInterpreter(&interpreter, tb, size);
resetInterpreter(&interpreter); //You usually want to reset between runs
freeInterpreter(&interpreter);
```
In addition to this, you might also wish to "inject" a series of usable libraries into the interpreter, which can be `import`-ed within the language itself. This process only needs to be done once, after initialization, but before the first run.
```c
injectNativeHook(&interpreter, "standard", hookStandard);
```
A "hook" is a callback function which is invoked when the given library is imported. `standard` is the most commonly used library available.
```
import standard;
```
Hooks can simply inject native functions into the current scope, or they can do other, more esoteric things (though this is not recommended).
```c
//a utility structure for storing the native C functions
typedef struct Natives {
    char* name;
    NativeFn fn;
} Natives;

int hookStandard(Interpreter* interpreter, Literal identifier, Literal alias) {
    //the list of available native C functions that can be called from Toy
    Natives natives[] = {
        {"clock", nativeClock},
        {NULL, NULL}
    };

    //inject each native C function into the current scope
    for (int i = 0; natives[i].name; i++) {
        injectNativeFn(interpreter, natives[i].name, natives[i].fn);
    }

    return 0;
}
```
## Calling Toy from C
In some situations, you may find it convenient to call a function written in Toy from the host program. For this, a pair of utility functions has been provided:
```c
TOY_API bool callLiteralFn(Interpreter* interpreter, Literal func, LiteralArray* arguments, LiteralArray* returns);
TOY_API bool callFn (Interpreter* interpreter, char* name, LiteralArray* arguments, LiteralArray* returns);
```
The first argument must be an interpreter. The third argument is a pointer to a `LiteralArray` containing a list of arguments to pass to the function, and the fourth is a pointer to a `LiteralArray` where the return values can be stored (an array is used here for a potential future feature). The contents of the argument array are consumed and left in an indeterminate state (though the array is safe to free), while the returns array always has one value - if the function did not return a value, then it contains a `null` literal.
The second argument to each function is either the function to be called, as a `Literal`, or the name of the function within the interpreter's scope. The latter API simply finds the specified `Literal`, if it exists, and calls the former. As with most APIs, these return `false` if something went wrong.