The bytecode format
===
NOTE: This datestamp header is currently not implemented
There are four components in the datestamp header:
TOY_VERSION_MAJOR
TOY_VERSION_MINOR
TOY_VERSION_PATCH
TOY_VERSION_BUILD
The first three are each one unsigned byte, and the fourth is a null-terminated C-string.
* Under no circumstances should you ever run bytecode whose major version differs from the interpreter's major version
* Under no circumstances should you ever run bytecode whose minor version is above the interpreter's minor version
* You may, at your own risk, attempt to run bytecode whose patch version differs from the interpreter's patch version
* You may, at your own risk, attempt to run bytecode whose build version differs from the interpreter's build version
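The four rules above could be sketched in C roughly as follows. This is only an illustration: the header is not implemented yet, and the names VersionHeader, VersionCheck and checkBytecodeVersion are hypothetical, not part of Toy.

```c
#include <stdint.h>
#include <string.h>

// Hypothetical result codes; these names are not part of Toy.
typedef enum {
    VERSION_OK,      // all four components match
    VERSION_AT_RISK, // patch or build differs; run at your own risk
    VERSION_REJECT   // major differs, or the bytecode's minor is newer
} VersionCheck;

// Hypothetical in-memory view of the four header components.
typedef struct {
    uint8_t major;
    uint8_t minor;
    uint8_t patch;
    const char* build; // null-terminated C-string
} VersionHeader;

// Applies the rules in the order they are listed in these notes.
VersionCheck checkBytecodeVersion(const VersionHeader* bytecode, const VersionHeader* interpreter) {
    if (bytecode->major != interpreter->major) return VERSION_REJECT;
    if (bytecode->minor > interpreter->minor) return VERSION_REJECT;
    if (bytecode->patch != interpreter->patch) return VERSION_AT_RISK;
    if (strcmp(bytecode->build, interpreter->build) != 0) return VERSION_AT_RISK;
    return VERSION_OK;
}
```

Returning three states instead of a boolean lets the caller decide whether "at your own risk" means a warning or a refusal.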
An additional note: The contents of the build string may be anything, such as:
* the compilation date and time of the interpreter
* a marker identifying the current fork and/or branch
* identification information, such as the developer's copyright
* a link to Rick Astley's "Never Gonna Give You Up" on YouTube
Please note that in the final bytecode, if the null terminator of TOY_VERSION_BUILD is not 4-byte aligned, extra space will be allocated to round out the header's size to a multiple of 4. The contents of the extra bytes are undefined.
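A minimal sketch of the padding rule, assuming the header is exactly the three version bytes followed by the build string and its terminator, with the total rounded up to a multiple of 4. The names paddedHeaderSize and writeVersionHeader are hypothetical. The padding is zero-filled here purely for convenience, since the notes leave its contents undefined.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Size of the version header once rounded up to a multiple of 4.
// ASSUMPTION: 3 version bytes + build string + null terminator, nothing else.
size_t paddedHeaderSize(const char* build) {
    size_t raw = 3 + strlen(build) + 1;
    return (raw + 3) & ~(size_t)3; // round up to the next multiple of 4
}

// Writes the header into 'out' and returns the number of bytes written,
// or 0 if 'capacity' is too small.
size_t writeVersionHeader(uint8_t* out, size_t capacity,
                          uint8_t major, uint8_t minor, uint8_t patch,
                          const char* build) {
    size_t buildLen = strlen(build) + 1; // include the null terminator
    size_t total = paddedHeaderSize(build);
    if (capacity < total) return 0;

    out[0] = major;
    out[1] = minor;
    out[2] = patch;
    memcpy(out + 3, build, buildLen);
    // The padding bytes are undefined by the format; zero is one valid choice.
    memset(out + 3 + buildLen, 0, total - (3 + buildLen));
    return total;
}
```

For example, a build string of "ab" gives 3 + 2 + 1 = 6 raw bytes, so the header occupies 8 bytes with 2 bytes of padding.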
===
Bytecode Format Structure
.header:
N total size # size of this routine, including all data and subroutines
N .jumps count # the number of entries in the jump table (should be data count + routine count)
N .param count # the number of parameter fields expected (used for subroutines)
N .data count # the number of data fields present
N .subs count # the number of subroutines present
.code start # absolute address of .code; mandatory
.param start # absolute address of .param; omitted if not needed
.datatable start # absolute address of .datatable; omitted if not needed
.data start # absolute address of .data; omitted if not needed
.subs start # absolute address of .subs; omitted if not needed
# additional metadata fields can be added later
.code:
# opcode instructions read and 'executed' by the interpreter (aligned to 4-byte widths)
[READ, TOY_VALUE_STRING, Toy_StringType, stringLength] [jumpIndex]
.param:
# a list of names, stored in .data, to be used for any provided function arguments
.jumptable:
# a layer of indirection for quickly looking up values in .data and .subs
0 -> {string, 0x00}
4 -> {fn, 0xFF}
.data:
# data that can't be cleanly embedded into .code, such as strings
"Hello world\0"
.subs:
# an extension of .data, used exclusively for subroutines (they also follow this spec, recursively)
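To illustrate the indirection through .jumptable, here is a rough sketch of resolving a jumpIndex from .code to a string in .data. Everything about the encoding is an assumption made for the example: each jump table entry is taken to be a 4-byte offset relative to the start of .data, stored in host byte order, and jumpIndex is taken to count entries (so entry 1 sits at byte 4, as in the listing above). The names readJumpEntry and lookupString are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// ASSUMPTION: each jump table entry is a 4-byte offset, in host byte order.
static uint32_t readJumpEntry(const uint8_t* jumptable, uint32_t jumpIndex) {
    uint32_t offset;
    // memcpy avoids relying on the table being suitably aligned for uint32_t
    memcpy(&offset, jumptable + (size_t)jumpIndex * 4, sizeof offset);
    return offset;
}

// ASSUMPTION: the entry's offset is relative to the start of .data.
const char* lookupString(const uint8_t* jumptable, const uint8_t* data, uint32_t jumpIndex) {
    return (const char*)(data + readJumpEntry(jumptable, jumpIndex));
}
```

An entry that refers to a subroutine would presumably resolve against .subs in the same way, but the notes do not say how the two kinds of entry are told apart, so that case is left out.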