I've ripped out the deep copying, and flattened the bucket usage, but
this results in no memory being freed or reused for the lifetime of the
program.
This is shown most clearly with this script:
```toy
fn makeCounter() {
var counter: int = 0;
fn increment() {
return ++counter;
}
return increment;
}
var tally = makeCounter();
while (true) {
var result = tally();
if (result >= 10_000_000) {
break;
}
}
```
The number of calls vs amount of memory consumed is:
```
function calls -> memory used in megabytes
1_000 -> 0.138128
10_000 -> 2.235536
100_000 -> 21.7021
1_000_000 -> 216.1712
10_000_000 -> 1520.823
```
Obviously this needs to be fixed, as ballooning to gigabytes of memory
in only a moment isn't practical. That will be the next task - to find
some way to free memory that isn't needed anymore.
See #163, #160
I build a self-referential system, then tried to copy only parts. I need
to step back and adjust my approach.
'Toy_private_deepCopyValue' and 'Toy_private_deepCopyScope' need to be
ripped out, and I need to simply accept there will be only one instance
of 'Toy_Bucket' that isn't freed until the top-level VM is.
I need an hour's break before I'll tackle this again.
See #163
Functions are having issues with being copied around, especially
between buckets, leading to the scopes getting looped. The program gets
stuck in 'incrementRefCount()'.
It's past my time limit, so I'll keep working on it tomorrow with a
fresh mind.
All function stuff is still untested.
See #163
The 'source' directory compiles, but the repl and tests are almost
untouched so far. There's no guarantee that the code in 'source' is
correct, so I'm branching this for a short time, until I'm confident the
whole project passes the CI again.
I'm adjusting the concepts of routines and bytecode to make them more
consistent, and tweaking the VM so it loads from an instance of
'Toy_Module'.
* 'Toy_ModuleBuilder' (formally 'Toy_Routine')
This is where the AST is compiled, producing a chunk of memory that can
be read by the VM. This will eventually operate on individual
user-defined functions as well.
* 'Toy_ModuleBundle' (formally 'Toy_Bytecode')
This collects one or more otherwise unrelated modules into one chunk of
memory, stored in sequence. It is also preprended with the version data for
Toy's reference implementation:
For each byte in the bytecode:
0th: TOY_VERSION_MAJOR
1st: TOY_VERSION_MINOR
2nd: TOY_VERSION_PATCH
3rd: (the number of modules in the bundle)
4th and onwards: TOY_VERSION_BUILD
TOY_VERSION_BUILD has always been a null terminated C-string, but from
here on, it begins at the word-alignment, and continues until the first
word-alignment after the null terminator.
As for the 3rd byte listed, since having more than 256 modules in one
bundle seems unlikely, I'm storing the count here, as it was otherwise
unused. This is a bit janky, but it works for now.
* 'Toy_Module'
This new structure represents a single complete unit of operation, such
as a single source file, or a user-defined function. It is divided into
three main sections, with various sub-sections.
HEADER (all members are unsigned ints):
total module size in bytes
jumps count
param count
data count
subs count
code addr
jumps addr (if jumps count > 0)
param addr (if param count > 0)
data addr (if data count > 0)
subs addr (if subs count > 0)
BODY:
<raw opcodes, etc.>
DATA:
jumps table
uint array, pointing to addresses in 'data' or 'subs'
param table
uint array, pointing to addresses in 'data'
data
heterogeneous data, including strings
subs
an array of modules, using recursive logic
The reference implementation as a whole uses a lot of recursion, so this
makes sense.
The goal of this rework is so 'Toy_Module' can be added as a member of
'Toy_Value', as a simple and logical way to handle functions. I'll
probably use the union pattern, similarly to Toy_String, so functions
can be written in C and Toy, and used without needing to worry which is
which.
Variable declaration was also causing this issue.
It was caused by a value being left on the stack after these statements.
It wasn't a quick fix, as chained assignments depended on it. Now, the
assignment opcode has a configuration option, indicating if the last
value should be left on the stack or not.
This also means the benchmark in 'scripts/benchpress.toy' will no longer
cause a stack overflow.
Fixed#171
The definition of '&&':
Return the first falsy value, or the last value, skipping the evaluation of other operands.
The definition of '||':
Return the first truthy value, or the last value, skipping the evaluation of other operands.
Toy now follows these definitions.
Fixed#154
By leaving 'null' on the stack, it won't cause stack underflows in a
bunch of erroneous situations. This will allow the repl (and other
situations) to continue if they want to.
I've also fixed some error messages in toy_table.c, which were formatted
badly.
Closes#162
Toy now fits into the C spec.
Fixed#158
Addendum: MacOS test caught an error:
error: a function declaration without a prototype is deprecated in all versions of C
That took 3 attempts to fix correctly.
Addendum: 'No new line at the end of file' are you shitting me?
I added reference values in 62ca7a1fb7,
but forgot to mention it. I'm now using references to assign to the
internals of an array, no matter how many levels deep it is.
I had intended to solve the Advent of Code puzzles in Toy, but they
don't work without arrays. I wasn't able to enable arrays in time, so
I need to focus on doing things correctly.
The most immediate tasks are marked with 'URGENT', and a lot of tests
need writing.
I was sidetracked by a strange display bug - turns out it was caused by
pointers - this commit fixes it.
The tests for if-then-else still aren't finished, but I'm knocking off
as it's past my time limit. I've marked 'TODO' and 'URGENT' using
comments, so finding the issues should be easy.
I've also added some support for compiler errors in general, but these
will get expanded on later.
I've also quickly added a valgrind option to the tests and found a few
leaks. I'll deal with these later.
Summary of changes:
* Clarified the lifetime of the bytecode in memory
* Erroneous routines exit without compiling
* Empty VMs don't run
* Added a check for malformed assignments
* Renamed "routine" to "module" within the VM
* VM no longer tries to free the bytecode - must be done manually
* Started experimenting with valgrind, not yet ready
I've brought the tests up to scratch, except for compounds im the
parser, because I'm too damn tired to do that over SSH. It looks like
collections are right-recursive, whixh was unintended but still works
just fine.
I've also added the '--verbose' flag to the repl to control the
debugging output.
Several obscure bugs have been fixed, and comments have been tweaked.
Mustfail tests are still needed, but that's a low priority. See #142.
Fixed#151
The assert keyword works, but I want to add a cmd option to suppress or
disable the errors.
The tests need some serious TLC. I know that, but I'm kicking it down
the road for now.
I was coding earlier this week, but my brain was so foggy I ended up not
knowing what I was doing. After a few days break, I've cleaned up the
mess, which took hours.
Changes:
* Variables can be assigned
* Added new value types as placeholders
* Added 'compare' and 'assign' to the AST
* Added duplicate opcode
* Added functions to copy and free values
* Max name length is 255 chars
* Compound assigns are squeezed into one word
To be completed:
* Tests for this commit's changes
* Compound assignments
* Variable access
Strings, due to their potentially large size, are stored outside of a
routine's code section, in the data section. To access the correct
string, you must read the jump index, then the real address from the
jump table - and extra layer of indirection will result in more flexible
data down the road, I hope.
Other changes include:
* Added string concat operator ..
* Added TOY_STRING_MAX_LENGTH
* Strings can't be created or concatenated longer than the max length
* The parser will display a warning if the bucket is too small for a
string at max length, but it will continue
* Added TOY_BUCKET_IDEAL to correspend with max string length
* The bucket now allocates an address that is 4-byte aligned
* Fixed missing entries in the parser rule table
* Corrected some failing TOY_BITNESS tests