Sorted out some folders for the docs

2026-04-15 14:54:07 +10:00 · 2023-02-16 21:44:06 +11:00
parent 329d085beb
commit 0b885d0a30
15 changed files with 30 additions and 65 deletions
--- a/deep-dive/building-toy.md
+++ b/deep-dive/building-toy.md
@@ -0,0 +1,68 @@
+# Building Toy
+
+This tutorial assumes you're using git, GCC, and make.
+
+To embed Toy into your program, add the [git repository](https://github.com/Ratstail91/Toy) as a submodule - here we'll assume you called it `Toy`.
+
+Toy's makefile uses the variable `TOY_OUTDIR` to define where the build command places its output. You MUST set this to a value, relative to the Toy directory.
+
+```make
+export LIBDIR = lib
+export TOY_OUTDIR = ../$(LIBDIR)
+```
+
+Next, you'll want to run make from within Toy's `source` directory, assuming the output directory has been created. There are two options for building Toy - `library` (default) or `static`; the former will create a shared library (and a .dll file on Windows), while the latter will create a static library.
+
+```make
+toy: $(LIBDIR)
+	$(MAKE) -C Toy/source
+
+$(LIBDIR):
+	mkdir $(LIBDIR)
+```
+
+Finally, link against the output library, and add Toy's source directory to the include path to access the header files.
+
+```make
+all: $(OBJ)
+	$(CC) $(CFLAGS) -o $(OUT) $(OBJ) -L$(LIBDIR) $(LIBS)
+```
+
+Here's a quick example makefile template you can use:
+
+```make
+CC=gcc
+
+export OUTDIR = out
+export LIBDIR = lib
+export TOY_OUTDIR = ../$(LIBDIR)
+
+IDIR+=. ./Toy/source
+CFLAGS+=$(addprefix -I,$(IDIR))
+LIBS+=-ltoy
+
+ODIR=obj
+SRC=$(wildcard *.c)
+OBJ=$(addprefix $(ODIR)/,$(SRC:.c=.o))
+
+OUT=./$(OUTDIR)/program
+
+all: toy $(OUTDIR) $(ODIR) $(OBJ)
+	$(CC) $(CFLAGS) -o $(OUT) $(OBJ) -L$(LIBDIR) $(LIBS)
+	cp $(LIBDIR)/*.dll $(OUTDIR) # for shared libraries
+
+toy: $(LIBDIR)
+	$(MAKE) -C Toy/source
+
+$(OUTDIR):
+	mkdir $(OUTDIR)
+
+$(LIBDIR):
+	mkdir $(LIBDIR)
+
+$(ODIR):
+	mkdir $(ODIR)
+
+$(ODIR)/%.o: %.c
+	$(CC) -c -o $@ $< $(CFLAGS)
+```
--- a/deep-dive/compiling-toy.md
+++ b/deep-dive/compiling-toy.md
@@ -0,0 +1,98 @@
+# Compiling Toy
+
+This tutorial is a sub-section of [Embedding Toy](deep-dive/embedding-toy) that has been spun off into its own page for the sake of brevity/sanity. It's recommended that you read the main article first.
+
+The exact phases outlined here are entirely implementation-dependent - that is, they aren't required, and are simply how the canonical version of Toy works.
+
+## How Compilation Works
+
+There are four main phases to running a Toy source file. These are:
+
+```
+lexing -> parsing -> compiling -> interpreting
+```
+
+Each phase has a dedicated set of functions and structures, and there are intermediate structures between these stages that carry the information from one set to another.
+
+```
+source   -> lexer       -> token
+token    -> parser      -> AST
+AST      -> compiler    -> bytecode
+bytecode -> interpreter -> result
+```
+
+## Lexer
+
+Exactly how the source code is loaded into memory is left up to the user; however, once it's loaded, it can be bound to a `Toy_Lexer` structure.
+
+```c
+Toy_Lexer lexer;
+Toy_initLexer(&lexer, source);
+```
+
+The lexer, when invoked, will break down the string of characters into individual `Tokens`.
+
+The lexer does not need to be freed after use; however, the source code does.
+
+## Parser
+
+The `Toy_Parser` structure takes a `Toy_Lexer` as an argument when initialized.
+
+```c
+Toy_Parser parser;
+Toy_initParser(&parser, &lexer);
+
+Toy_ASTNode* node = Toy_scanParser(&parser);
+
+Toy_freeParser(&parser);
+```
+
+The parser takes tokens, one at a time, and converts them into structures called Abstract Syntax Trees, or ASTs for short. Each AST represents a single top-level statement within the Toy script. You'll know the parser is finished when `Toy_scanParser()` begins returning `NULL` pointers.
+
+The AST Nodes produced by `Toy_scanParser()` must be freed manually, and the parser itself should not be used again.
+
+## Compiler
+
+The actual compilation phase has two steps - instruction writing and collation.
+
+```c
+size_t size;
+Toy_Compiler compiler;
+
+Toy_initCompiler(&compiler);
+Toy_writeCompiler(&compiler, node);
+
+unsigned char* tb = Toy_collateCompiler(&compiler, &size);
+
+Toy_freeCompiler(&compiler);
+```
+
+The writing step is the process in which AST nodes are compressed into bytecode instructions, while literal values are extracted and placed aside in a cache (usually in an intermediate state).
+
+The collation phase, however, is when the bytecode instructions, along with the now-flattened intermediate literals and function bodies, are combined. The bytecode header specified in [Developing Toy](developing-toy) is placed at the beginning of this blob of bytes during this step.
+
+The Toy bytecode (abbreviated to `tb`), along with the `size` variable indicating the size of the bytecode, are the result of the compilation.
+
+This bytecode can be saved into a file for later consumption by the host at runtime - ensure that the file has the `.tb` extension.
+
+The bytecode loaded in memory is consumed and freed by `Toy_runInterpreter()`.
+
+## Interpreter
+
+The interpreter acts based on the contents of the bytecode given to it.
+
+```c
+Toy_Interpreter interpreter;
+Toy_initInterpreter(&interpreter);
+Toy_runInterpreter(&interpreter, tb, size);
+Toy_freeInterpreter(&interpreter);
+```
+
+Exactly how it accomplishes this task is up to the implementation - as long as the result matches expectations.
+
+## REPL
+
+An example program, called `toyrepl`, is provided alongside Toy's core. This program can handle many things, such as loading, compiling and executing Toy scripts; it's capable of compiling any valid Toy program for later use, even those that rely on non-standard libraries.
+
+To get a list of options, run `toyrepl -h`.
+
--- a/deep-dive/developing-toy.md
+++ b/deep-dive/developing-toy.md
@@ -0,0 +1,29 @@
+# Developing Toy
+
+Toy's current version began as a specification, which changed and evolved as it was developed. The original specification was extremely bare-bones and not intended for a general audience, so this website is intended not just to teach how to use the language, but also to cover aspects of how to expand on the current canonical version of it.
+
+Here you'll find some of the implementation details, which must remain intact regardless of any other changes.
+
+## Bytecode Header Format
+
+Every instance of Toy bytecode will be divvied up into several sections, by necessity - however the only one with an actual concrete specification is the header. This section is used to define what version of Toy is currently running, as well as to prevent any future version clashes.
+
+The header consists of four values:
+
+* TOY_VERSION_MAJOR
+* TOY_VERSION_MINOR
+* TOY_VERSION_PATCH
+* TOY_VERSION_BUILD
+
+The first three are single unsigned bytes, embedded at the beginning of the bytecode in sequence. These represent the major, minor and patch versions of the language. The fourth value is a null-terminated string of unspecified data, which is *intended* but not required to specify the time that the language's compiler was itself compiled. The build string can also hold arbitrary data, such as the current maintainer's name, current fork of the language, or other versioning info.
+
+There are some strict rules when interpreting these values (mimicking, but not conforming to [semver.org](https://semver.org/)):
+
+* Under no circumstances should you ever run bytecode whose major version is different - there are definitely broken APIs involved.
+* Under no circumstances should you ever run bytecode whose minor version is above the interpreter's minor version - the bytecode could potentially use unimplemented features.
+* You may, at your own risk, attempt to run bytecode whose patch version is different.
+* You may, at your own risk, attempt to run bytecode whose build version is different.
+
+All interpreter implementations retain the right to reject any bytecode whose header data does not conform to the above specification.
+
+The latest version information can be found in [toy_common.h](https://github.com/Ratstail91/Toy/blob/main/source/toy_common.h#L7-L10).
--- a/deep-dive/embedding-toy.md
+++ b/deep-dive/embedding-toy.md
@@ -0,0 +1,119 @@
+# Embedding Toy
+
+This tutorial assumes that you've managed to embed Toy into your program by following the tutorial [Building Toy](deep-dive/building-toy).
+
+Here, we'll look at some ways in which you can utilize Toy's C API within your host program.
+
+Be aware that when you create a new Literal object, you must call `Toy_freeLiteral()` on it afterwards! If you don't, your program will leak memory as Toy has no internal tracker for such things.
+
+## Embedded API Macros
+
+The functions intended for usage by the API are prepended with the C macro `TOY_API`. The exact value of this macro can vary by platform, or even be empty. In addition, the macros defined in [literal.h](https://github.com/Ratstail91/Toy/blob/0.6.0/source/literal.h) are available for use when manipulating literals. These include:
+
+* `TOY_IS_*` - check if a literal is a specific type
+* `TOY_AS_*` - cast the literal to a specific type
+* `TOY_TO_*` - create a literal of a specific type
+* `TOY_IS_TRUTHY` - check if a literal is truthy
+* `TOY_MAX_STRING_LENGTH` - the maximum length of a string in Toy (can be altered if needed)
+
+## Structures Used Throughout Toy
+
+The main unit of data within Toy's internals is `Toy_Literal`, which can contain any value that can exist within the Toy language. The exact implementation of `Toy_Literal` may change or evolve as time goes on, so it's recommended that you only interact with literals directly by using the macros and functions outlined [above](#embedded-api-macros). See the [types](types) page for information on what datatypes exist in Toy.
+
+There are two main "compound structures" used within Toy's internals - the `Toy_LiteralArray` and `Toy_LiteralDictionary`. The former is an array of `Toy_Literal` instances stored sequentially in memory for fast lookups, while the latter is a key-value hashmap designed for efficient lookups based on a `Toy_Literal` key. These are both accessible via the language as well.
+
+These compound structures hold **copies** of literals given to them, rather than taking ownership of existing literals.
+
+## Compiling Toy Scripts
+
+Please see [Compiling Toy](compiling-toy) for more information on the process of turning scripts into bytecode.
+
+## Interpreting Toy
+
+The `Toy_Interpreter` structure is the beating heart of Toy - you'll usually only need one interpreter, as it can be reset as needed.
+
+The four basic functions are used as follows:
+
+```c
+//assume "tb" and "size" are the results of compilation
+Toy_Interpreter interpreter;
+
+Toy_initInterpreter(&interpreter);
+Toy_runInterpreter(&interpreter, tb, size);
+Toy_resetInterpreter(&interpreter); //You usually want to reset between runs
+Toy_freeInterpreter(&interpreter);
+```
+
+In addition to this, you might also wish to "inject" a series of usable libraries into the interpreter, which can be `import`-ed within the language itself. This process only needs to be done once, after initialization, but before the first run.
+
+```c
+Toy_injectNativeHook(&interpreter, "standard", Toy_hookStandard);
+```
+
+A "hook" is a callback function which is invoked when the given library is imported. `standard` is the most commonly used library available.
+
+```
+import standard;
+```
+
+Hooks can simply inject native functions into the current scope, or they can do other, more esoteric things (though this is not recommended).
+
+```c
+//a utility structure for storing the native C functions
+typedef struct Natives {
+	char* name;
+	Toy_NativeFn fn;
+} Natives;
+
+int Toy_hookStandard(Toy_Interpreter* interpreter, Toy_Literal identifier, Toy_Literal alias) {
+	//the list of available native C functions that can be called from Toy
+	Natives natives[] = {
+		{"clock", nativeClock},
+		{NULL, NULL}
+	};
+
+	//inject each native C function into the current scope
+	for (int i = 0; natives[i].name; i++) {
+		Toy_injectNativeFn(interpreter, natives[i].name, natives[i].fn);
+	}
+
+	return 0;
+}
+```
+
+## Calling Toy from C
+
+In some situations, you may find it convenient to call a function written in Toy from the host program. For this, a pair of utility functions has been provided:
+
+```c
+TOY_API bool Toy_callLiteralFn(Toy_Interpreter* interpreter, Toy_Literal func, Toy_LiteralArray* arguments, Toy_LiteralArray* returns);
+TOY_API bool Toy_callFn       (Toy_Interpreter* interpreter, char* name,       Toy_LiteralArray* arguments, Toy_LiteralArray* returns);
+```
+
+The first argument must be an interpreter. The third argument is a pointer to a `Toy_LiteralArray` containing a list of arguments to pass to the function, and the fourth is a pointer to a `Toy_LiteralArray` where the return values can be stored (an array is used here for a potential future feature). The contents of the argument array are consumed and left in an indeterminate state (but are safe to free), while the returns array always has one value - if the function did not return a value, then it contains a `null` literal.
+
+The second argument is either the function to be called as a `Toy_Literal` (for `Toy_callLiteralFn()`), or the name of the function within the interpreter's scope (for `Toy_callFn()`). The latter API simply finds the specified `Toy_Literal` if it exists and calls the former. As with most APIs, these return `false` if something went wrong.
+
+## Memory Allocation
+
+Depending on your platform of choice, you may want to alter how the memory is allocated within Toy. You can do this with the simple memory API:
+
+```c
+//signature returns the new pointer to be used
+typedef void* (*Toy_MemoryAllocatorFn)(void* pointer, size_t oldSize, size_t newSize);
+TOY_API void Toy_setMemoryAllocator(Toy_MemoryAllocatorFn);
+```
+
+Pass it a function which matches the above signature, and it'll be callable via the following macros:
+
+* `TOY_ALLOCATE(type, count)`
+* `TOY_FREE(type, pointer)`
+* `TOY_GROW_ARRAY(type, pointer, oldCount, newCount)`
+* `TOY_SHRINK_ARRAY(type, pointer, oldCount, newCount)`
+* `TOY_FREE_ARRAY(type, pointer, oldCount)`
+
+Also, the following macros are provided to calculate the ideal array capacities (the latter of which is for rapidly growing structures):
+
+* `TOY_GROW_CAPACITY(capacity)`
+* `TOY_GROW_CAPACITY_FAST(capacity)`
+
--- a/deep-dive/roadmap.md
+++ b/deep-dive/roadmap.md
@@ -0,0 +1,40 @@
+# Roadmap
+
+There are a few things I'd like to do with the language, namely bugfixes, new features and implementing a game with it.
+
+## Game And Game Engine
+
+The Toy programming language was designed from the beginning as though it was supposed to be embedded into an imaginary game engine. Well, now that the lang is nearly feature complete, it's time to start on that engine.
+
+To that end, I've begun working on this: [Airport Game](https://github.com/Ratstail91/airport).
+
+This is a simple game concept, which I can implement within a reasonable amount of time, before extracting parts to create the engine proper. It feels almost like a mobile game, so I'm hoping this engine will be runnable on Android (though at the time of writing, I've yet to investigate how).
+
+## New Features I'd Like One Day
+
+Some things I'd like to add in the future include:
+
+* ~~A fully featured standard library (see below)~~
+* ~~An external script runner utility library~~
+* A threading library
+* A random generation library (numbers, perlin noise, wave function collapse?)
+* ~~A timer library~~
+* Multiple return values from functions
+* ~~Ternary operator~~
+* Interpolated strings
+* ~~MSVC compilation~~
+
+Some of these have always been planned, but were sidelined or are incomplete for one reason or another.
+
+## Nope Features
+
+Some things I simply don't want to add at the current time include:
+
+* Classes & Structures
+* Do-while loops
+
+This is because reworking the internals to add an entirely new system like this would be incredibly difficult for very little gain.
+
+Ironically, I'd never used a do-while loop seriously until I started implementing this language.
+
+
--- a/deep-dive/testing-toy.md
+++ b/deep-dive/testing-toy.md
@@ -0,0 +1,7 @@
+# Testing Toy
+
+Toy uses GitHub CI for comprehensive automated testing - however, all of the tests are under `test/`, and can be executed by running `make test`. Doing so on Linux will attempt to use valgrind; to disable using valgrind, pass in `DISABLE_VALGRIND=true` as an environment variable.
+
+The tests consist of a number of different situations and edge cases which have been discovered, and should probably be thoroughly tested one way or another. There are also several "-bugfix.toy" scripts, each of which explicitly tests a bug that has been encountered at some point. The libs that are stored in `repl/` are also tested - their tests are under `/test/scripts/lib`; some error cases are also checked by the mustfail tests in `/test/scripts/mustfail`.
+
+GitHub CI also has access to the option `make test-sanitized` which attempts to use memory sanitation. I don't know enough about this to offer much commentary, only that several invisible issues are monitored this way.
--- a/deep-dive/theorizing-toy.md
+++ b/deep-dive/theorizing-toy.md
@@ -0,0 +1,32 @@
+# Theorizing Toy
+
+Sooner or later, every coder will try to create their own programming language. In my case, it took me over a decade and a half to realize that was an option, but once I did I read through a fantastic book called [Crafting Interpreters](https://craftinginterpreters.com/). This sent me down the rabbit hole, so to speak.
+
+Now, several years later, after multiple massive revisions, I think my language is nearing version 1.0 - not bad for a side project.
+
+The main driving idea behind the language has remained the same from the very beginning. I wanted a scripting language that could be embedded into a larger host program, which could be easily modified by the end user. Specifically, I wanted to enable easy modding for games made in an imaginary game engine.
+
+At the time of writing, I've started working on said engine, building it around Toy and adjusting Toy to fit the engine as needed. I don't know how long the engine's development will take, but I personally think the best way to build an engine is to build a game first, and then extract the engine from it. Thus, the project is currently called "airport", though the engine proper will likely have a name like "box" or "toybox".
+
+But this post isn't about the engine, it's about Toy - I want to explain, in some detail, my thought processes when developing it. Let's start at the beginning:
+
+```
+print "Hello world";
+```
+
+I've drawn the `print` keyword from the Lox language in Crafting Interpreters, for much the same reason as explained in the book: it's a simple and easy way to debug issues, and it's not intended for actual production use. You'll be able to print out any kind of value or variable with this statement - but it loses some context, like function implementations and the values of `opaque` literals.
+
+Let's touch on variables quickly - there are about a dozen variable types that can be used, depending on how you count them. This grew as the language progressed, and there are actually several literal types which you can't directly access (they're only used internally).
+
+There are also functions (which are also a type of literal), which are reusable chunks of bytecode that can be invoked with their names. OH! I haven't even talked about bytecode yet - one of the interesting aspects of Toy is that the source code must be compiled into an intermediate bytecode format (a trait also inherited from Lox) before it can be executed by the interpreter. Even I'm not entirely sure how the bytecode is laid out in its entirety, as the parsing and compilation steps take liberties when producing the final usable chunk of memory.
+
+I was originally not entirely certain that compiling to bytecode was the right choice as, for most programs to function and remain moddable, the source will need to be compiled on-the-fly. But after extensive benchmarking, it turns out that the compilation is the fastest part of execution.
+
+There's one major feature of most programming languages that Toy is missing - input. Currently, there's no standardized way to receive input from the user - however, this will likely be alleviated by various custom libraries down the road.
+
+One such example would be a game controller library - something which takes in button presses, and calls certain Toy functions to move a character around the game world. The thing is, not every game will need a controller - that's why each library is optional, and can be provided or omitted at the host's discretion. As a result, Toy is almost infinitely extensible, as most good scripting languages are.
+
+I don't know how well this language will do in the wild, once it gets some battle testing from actual users - but I do know that it'll become more and more of a grizzled beast as time goes on - that's inevitable for any piece of code. However, I would like to keep the core language nice and simple, as much as possible - something you can explain with just the quickstart page.
+
+Feedback and constructive criticism are always welcome.
+