Wrote some tutorials

2026-04-15 23:04:08 +10:00 · 2022-10-02 10:42:35 +11:00
parent b34f84cc34
commit 9e09279b87
4 changed files with 198 additions and 15 deletions
--- a/README.md
+++ b/README.md
@@ -12,10 +12,11 @@ The host will provide all of the extensions needed on a case-by-case basis. Scri

 * Simple C-like syntax
 * Bytecode intermediate compilation
-* `import` and `export` variables from the host program
 * Optional, but robust type system
 * functions and types are first-class citizens
+* `import` and `export` variables from the host program
 * Fancy slice notation for strings, arrays and dictionaries
+* Can redirect output, error and assertion failure messages
 * Open source under the zlib license

 # Getting Started
@@ -23,6 +24,8 @@ The host will provide all of the extensions needed on a case-by-case basis. Scri
 * [Quick Start Guide](quick-start-guide)
 * Tutorials
  * [Embedding Toy](embedding-toy)
+  * [Compiling Toy](compiling-toy)
+  * [Using Toy](using-toy)
  * ~~[Standard Libary](standard-library)~~
  * [Types](types)
 * [Developing Toy](developing-toy)
--- a/compiling-toy.md
+++ b/compiling-toy.md
@@ -0,0 +1,98 @@
+# Compiling Toy
+
+This tutorial is a sub-section of [Using Toy](using-toy) that has been spun off into its own page for the sake of brevity/sanity. It's recommended that you read the main article first.
+
+The exact phases outlined here are entirely implementation-dependent - that is, they aren't required, and are simply how the canonical version of Toy works.
+
+## How Compilation Works
+
+There are four main phases to running a Toy source file. These are:
+
+```
+lexing -> parsing -> compiling -> interpreting
+```
+
+Each phase has a dedicated set of functions and structures, and intermediate structures carry the information from one phase to the next.
+
+```
+source   -> lexer       -> token
+token    -> parser      -> AST
+AST      -> compiler    -> bytecode
+bytecode -> interpreter -> result
+```
+
+## Lexer
+
+Exactly how the source code is loaded into memory is left up to the user; once it's loaded, it can be bound to a `Lexer` structure.
+
+```c
+Lexer lexer;
+initLexer(&lexer, source);
+```
+
+The lexer, when invoked, will break down the string of characters into individual `Tokens`.
+
+The lexer does not need to be freed after use; however, the source code does.
+
+## Parser
+
+The `Parser` structure takes a `Lexer` as an argument when initialized.
+
+```c
+Parser parser;
+initParser(&parser, &lexer);
+
+ASTNode* node = scanParser(&parser);
+
+freeParser(&parser);
+```
+
+The parser takes tokens, one at a time, and converts them into structures called Abstract Syntax Trees, or ASTs for short. Each AST represents a single top-level statement within the Toy script. You'll know the parser is finished when `scanParser()` begins returning `NULL` pointers.
+
+The AST Nodes produced by `scanParser()` must be freed manually, and the parser itself should not be used again.
+
+## Compiler
+
+The actual compilation phase has two steps - instruction writing and collation.
+
+```c
+size_t size;
+Compiler compiler;
+
+initCompiler(&compiler);
+writeCompiler(&compiler, node);
+
+unsigned char* tb = collateCompiler(&compiler, &size);
+
+freeCompiler(&compiler);
+```
+
+The writing step is the process in which AST nodes are compressed into bytecode instructions, while literal values are extracted and placed aside in a cache (usually in an intermediate state).
+
+The collation step, in turn, is when the bytecode instructions, along with the now-flattened intermediate literals and function bodies, are combined. The bytecode header specified in [Developing Toy](developing-toy) is placed at the beginning of this blob of bytes during this step.
+
+The Toy bytecode (abbreviated to `tb`), along with the `size` variable indicating the size of the bytecode, are the result of the compilation.
+
+This bytecode can be saved into a file for later consumption by the host at runtime - ensure that the file has the `.tb` extension.
+
+The bytecode loaded in memory is consumed and freed by `runInterpreter()`.
+
+## Interpreter
+
+The interpreter acts based on the contents of the bytecode given to it.
+
+```c
+Interpreter interpreter;
+initInterpreter(&interpreter);
+runInterpreter(&interpreter, tb, size);
+freeInterpreter(&interpreter);
+```
+
+Exactly how it accomplishes this task is up to it - as long as the result matches expectations.
+
+## REPL
+
+An example program, called `toyrepl`, is provided alongside Toy's core. This program can handle many things, such as loading, compiling and executing Toy scripts; it's capable of compiling any valid Toy program for later use, even those that rely on non-standard libraries.
+
+To get a list of options, run `toyrepl -h`.
+
--- a/developing-toy.md
+++ b/developing-toy.md
@@ -27,17 +27,3 @@ There are some strict rules when interpreting these values (mimicking, but not c
 All interpreter implementations retain the right to reject any bytecode whose header data does not conform to the above specification.

 The latest version information can be found in [common.h](https://github.com/Ratstail91/Toy/blob/0.6.0/source/common.h#L7-L10)
-
-## Embedded API
-
-The functions intended for usage by the API are prepended with the C macro `TOY_API`. The exact value of this macro can vary by platform, or even be empty.
-
-In addition, the macros defined in [literal.h](https://github.com/Ratstail91/Toy/blob/0.6.0/source/literal.h) are available for use when manipulating literals. These include:
-
-* `IS_*` - check if a literal is a specific type
-* `AS_*` - use the literal as a specific type
-* `TO_*` - create a literal of a specific type
-* `IS_TRUTHY` - check if a literal is truthy
-* `MAX_STRING_LENGTH` - the maximum length of a string in Toy (can be altered if needed)
-
-When you create a new Literal object, be sure to call `freeLiteral()` on it afterwards! If you don't, your program will leak memory as Toy has no internal tracker for such things.
--- a/using-toy.md
+++ b/using-toy.md
@@ -0,0 +1,96 @@
+# Using Toy
+
+This tutorial assumes that you've managed to embed Toy into your program by following the tutorial [Embedding Toy](embedding-toy).
+
+Here, we'll look at some ways in which you can utilize Toy's C API within your host program.
+
+Be aware that when you create a new Literal object, you must call `freeLiteral()` on it afterwards! If you don't, your program will leak memory, as Toy has no internal tracker for such things.
+
+## Embedded API Macros
+
+The functions intended for usage by the API are prepended with the C macro `TOY_API`. The exact value of this macro can vary by platform, or even be empty. In addition, the macros defined in [literal.h](https://github.com/Ratstail91/Toy/blob/0.6.0/source/literal.h) are available for use when manipulating literals. These include:
+
+* `IS_*` - check if a literal is a specific type
+* `AS_*` - cast the literal to a specific type
+* `TO_*` - create a literal of a specific type
+* `IS_TRUTHY` - check if a literal is truthy
+* `MAX_STRING_LENGTH` - the maximum length of a string in Toy (can be altered if needed)
+
+## Structures Used Throughout Toy
+
+The main unit of data within Toy's internals is `Literal`, which can contain any value that can exist within the Toy language. The exact implementation of `Literal` may change or evolve as time goes on, so it's recommended that you only interact with literals directly by using the macros and functions outlined [above](#embedded-api-macros). See the [types](types) page for information on what datatypes exist in Toy.
+
+There are two main "compound structures" used within Toy's internals - the `LiteralArray` and `LiteralDictionary`. The former is an array of `Literal` instances stored sequentially in memory for fast lookups, while the latter is a key-value hashmap designed for efficient lookups based on a `Literal` key. These are both accessible via the language as well.
+
+These compound structures hold **copies** of literals given to them, rather than taking ownership of existing literals.
+
+## Compiling Toy Scripts
+
+Please see [Compiling Toy](compiling-toy) for more information on the process of turning scripts into bytecode.
+
+## Interpreting Toy
+
+The `Interpreter` structure is the beating heart of Toy - you'll usually only need one interpreter, as it can be reset as needed.
+
+The four basic functions are used as follows:
+
+```c
+//assume "tb" and "size" are the results of compilation
+Interpreter interpreter;
+
+initInterpreter(&interpreter);
+runInterpreter(&interpreter, tb, size);
+resetInterpreter(&interpreter); //You usually want to reset between runs
+freeInterpreter(&interpreter);
+```
+
+In addition to this, you might also wish to "inject" a series of usable libraries into the interpreter, which can be `import`-ed within the language itself. This process only needs to be done once, after initialization, but before the first run.
+
+```c
+injectNativeHook(&interpreter, "standard", hookStandard);
+```
+
+A "hook" is a callback function which is invoked when the given library is imported. `standard` is the most commonly used library available.
+
+```
+import standard;
+```
+
+Hooks can simply inject native functions into the current scope, or they can do other, more esoteric things (though this is not recommended).
+
+```c
+//a utility structure for storing the native C functions
+typedef struct Natives {
+	char* name;
+	NativeFn fn;
+} Natives;
+
+int hookStandard(Interpreter* interpreter, Literal identifier, Literal alias) {
+	//the list of available native C functions that can be called from Toy
+	Natives natives[] = {
+		{"clock", nativeClock},
+		{NULL, NULL}
+	};
+
+	//inject each native C function into the current scope
+	for (int i = 0; natives[i].name; i++) {
+		injectNativeFn(interpreter, natives[i].name, natives[i].fn);
+	}
+
+	return 0;
+}
+```
+
+## Calling Toy from C
+
+In some situations, you may find it convenient to call a function written in Toy from the host program. For this, a pair of utility functions has been provided:
+
+```c
+TOY_API bool callLiteralFn(Interpreter* interpreter, Literal func, LiteralArray* arguments, LiteralArray* returns);
+TOY_API bool callFn       (Interpreter* interpreter, char* name,   LiteralArray* arguments, LiteralArray* returns);
+```
+
+The first argument must be an interpreter. The third argument is a pointer to a `LiteralArray` containing a list of arguments to pass to the function, and the fourth is a pointer to a `LiteralArray` where the return values can be stored (an array is used here for a potential future feature). The contents of the argument array are consumed, leaving the array in an indeterminate state (though it is safe to free), while the returns array always has one value - if the function did not return a value, then it contains a `null` literal.
+
+The second argument to each function is either the function to be called as a `Literal`, or the name of the function within the interpreter's scope. The latter API simply finds the specified `Literal` if it exists and calls the former. As with most APIs, these return `false` if something went wrong.
+