This allows for syntax like `foo['a'] = 1` and more complex assignments
like `foo.bar()[a() + b()] += 1`.
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
These were getting annoying. I've disabled these warnings:
* missing-function-docstring
* missing-class-docstring
* missing-module-docstring
* line-too-long
and also squashed an explicit `open()` encoding warning.
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
...oops.
The introduction and usage of `Compiler::emit_assign` deferred the
declaration of a name until *after* the RHS of an assignment had been
evaluated. Normally, this is not a problem because you shouldn't be
using the LHS that you're assigning in the RHS expression... except
when you're defining a function (for example, a recursive one).
Declaring the name before the RHS is evaluated has been enabled for
functions *only* and will be enabled for other types as necessary.
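A minimal sketch of the ordering issue, assuming names are resolved at
compile time. `Expr`, `Compiler`, and `check_expr` here are hypothetical
stand-ins and not the real compiler types; only the name `emit_assign`
comes from this change.

```rust
use std::collections::HashSet;

// Hypothetical toy AST, just enough to show a self-referencing function.
enum Expr {
    Name(String),
    Int(i64),
    Function { body: Vec<Expr> },
}

struct Compiler {
    declared: HashSet<String>,
}

impl Compiler {
    fn new() -> Self {
        Compiler { declared: HashSet::new() }
    }

    // Fail if the expression uses a name that has not been declared yet.
    fn check_expr(&self, expr: &Expr) -> Result<(), String> {
        match expr {
            Expr::Name(n) if !self.declared.contains(n) => {
                Err(format!("undeclared name: {}", n))
            }
            Expr::Name(_) | Expr::Int(_) => Ok(()),
            Expr::Function { body } => body.iter().try_for_each(|e| self.check_expr(e)),
        }
    }

    fn emit_assign(&mut self, name: &str, rhs: &Expr) -> Result<(), String> {
        // Functions only: declare the name first so the body can refer to it.
        if matches!(rhs, Expr::Function { .. }) {
            self.declared.insert(name.to_string());
        }
        self.check_expr(rhs)?;
        // Everything else is declared after the RHS has been handled.
        self.declared.insert(name.to_string());
        Ok(())
    }
}
```

With this ordering, a function whose body mentions its own name compiles,
while something like `x = x` with no earlier `x` is still rejected.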
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
This is an ongoing off-by-one bug that I have not expended enough
brainpower to try to fix. Sometimes lexer outputs were getting chopped
off by one character at the EOF. I have changed a <= to a < to detect if
we're at EOF and that appears to have fixed everything. Getting really
tired of this but hopefully that's all that's needed.
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
* Objects will set their __ty__ during instantiation to the Obj type if
no __ty__ has been set
* In Object::get_vtable_attr we were borrowing the type as mutable,
which was not necessary and was causing issues when trying to borrow
it mutably twice. This is probably because warnings of various types
have been turned off and that will be investigated soon.
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
* __eq__ recursively checks equality (need a way to check for cycles)
* contains checks if the map contains a given index
* get will get an item from the map, but return nil if it doesn't exist
as a key
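A rough sketch of the `contains` and `get` behavior described above. It
uses `std::collections::HashMap` as a stand-in for the hashbrown table
and restricts keys to strings for brevity; the `Value` and `Map` types
are hypothetical and not the real object types.

```rust
use std::collections::HashMap;

// Hypothetical value type with an explicit nil.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Nil,
    Int(i64),
}

struct Map {
    items: HashMap<String, Value>,
}

impl Map {
    fn new() -> Self {
        Map { items: HashMap::new() }
    }

    fn insert(&mut self, key: &str, value: Value) {
        self.items.insert(key.to_string(), value);
    }

    // True if the key is present, even if its value is nil.
    fn contains(&self, key: &str) -> bool {
        self.items.contains_key(key)
    }

    // Return the stored value, or nil when the key is missing.
    fn get(&self, key: &str) -> Value {
        self.items.get(key).cloned().unwrap_or(Value::Nil)
    }
}
```

Note that `get` alone cannot tell a missing key from a key that maps to
nil, which is why `contains` is a separate check.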
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
We should have a preliminary implementation of maps going right now.
Thus far we have:
* Map.insert
* Map.__index__
* Map.remove
And some other minor functions. The big news with this is the couple of
pretty hot `unsafe` calls that borrow the VM mutably in two different
closures, simultaneously. This should be safe since these two different
closures aren't being called at the same time, somehow. Maybe one could
be calling the other. But that's not happening (I checked).
This also adds the hashbrown crate to handle the actual hashtable
implementation, so we don't have to implement our own hashtables.
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
This is a big one.
For a while, builtin functions were a bit cumbersome and not easily
re-entrant. If you needed to call a function from within a builtin
function, the only method of doing so was to take a `FunctionState`
parameter, which would either be "Begin", meaning the function was being
called for the first time, or "Resume", meaning the function was being
re-entered. This meant that if we wanted to call another function within
this function, we'd have to set up a whole `match` statement to figure
out whether we were re-entering the function or starting out. It was a
mess and not very ergonomic, and most importantly, made it very
difficult to implement hashmaps.
Now, builtin functions are handled a little more elegantly. A native
function is pushed to the stack, where it is detected in the
`Vm::dispatch()` function. It is then called, like normal. If the
builtin function then needs to call *another* function, it will push
that function to the stack and call it, and then call `Vm::resume()` to
resume VM execution. `Vm::dispatch()` is then called again, this time
with the current function on top of the stack. If it's another builtin
function, the above is repeated. If it's a user-defined function, then
bytecode is executed in the main `loop` inside of resume. Ultimately, we
are able to compose builtin functions like we would any other internal
function to the program. Overall this should speed things up a little,
make them a whole lot easier to read, and make them a million times
easier to compose with other builtin parts of Rust.
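To illustrate the shape of this, here is a heavily simplified sketch. It
is a toy, not the real VM: only the `dispatch` and `resume` names come
from the description above, and `Op`, `Function`, `call`, and the two
example functions are made up for the example. Unlike the real VM, the
toy pops a function off the call stack before running it.

```rust
// Toy bytecode for "user-defined" functions.
enum Op {
    PushInt(i64),
    Add,
}

enum Function {
    // A builtin is a plain Rust function that receives the VM.
    Builtin(fn(&mut Vm)),
    User(Vec<Op>),
}

struct Vm {
    values: Vec<i64>,
    calls: Vec<Function>,
}

impl Vm {
    fn new() -> Self {
        Vm { values: Vec::new(), calls: Vec::new() }
    }

    // Push a function and hand control back to the VM to run it.
    fn call(&mut self, function: Function) {
        self.calls.push(function);
        self.resume();
    }

    fn resume(&mut self) {
        if let Some(function) = self.calls.pop() {
            self.dispatch(function);
        }
    }

    fn dispatch(&mut self, function: Function) {
        match function {
            // Builtins are called directly and may call back into the VM.
            Function::Builtin(native) => native(self),
            // User functions run their ops in a loop.
            Function::User(ops) => {
                for op in ops {
                    match op {
                        Op::PushInt(i) => self.values.push(i),
                        Op::Add => {
                            let b = self.values.pop().expect("missing operand");
                            let a = self.values.pop().expect("missing operand");
                            self.values.push(a + b);
                        }
                    }
                }
            }
        }
    }
}

fn add_two_and_three() -> Function {
    Function::User(vec![Op::PushInt(2), Op::PushInt(3), Op::Add])
}

// A builtin that calls a user function like any other Rust function,
// with no Begin/Resume state machine.
fn double_of_sum(vm: &mut Vm) {
    vm.call(add_two_and_three());
    let sum = vm.values.pop().expect("callee left no value");
    vm.values.push(sum * 2);
}
```

Calling `vm.call(Function::Builtin(double_of_sum))` on a fresh `Vm`
leaves a single value on the value stack, computed by nesting a user
function inside a builtin.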
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
* BaseObj felt a bit redundant. For everything that BaseObj did, we use
Obj instead.
* Object::equals was a little weird. It was used for giving back
equality, except when it wasn't. It's a little better defined now,
here's what I'm shooting for:
* *In general*, Object::equals will return true when two objects
refer to the same object.
* The exception to this rule is for "constant" objects, or "copy on
write" objects. These include, but are not limited to: Int, Float,
Bool, Nil, Str. Their base values are immutable and are the heart
of object equality.
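A small sketch of the rule, assuming objects live behind `Rc`. The
`Object` enum and the free function `equals` are hypothetical; the real
`Object::equals` is a method on the object types.

```rust
use std::rc::Rc;

// Hypothetical object model with a few "constant" types and one
// mutable container type.
#[derive(Debug)]
enum Object {
    Int(i64),
    Str(String),
    Nil,
    List(Vec<Rc<Object>>),
}

fn equals(a: &Rc<Object>, b: &Rc<Object>) -> bool {
    match (a.as_ref(), b.as_ref()) {
        // Constant objects compare by their immutable base value.
        (Object::Int(x), Object::Int(y)) => x == y,
        (Object::Str(x), Object::Str(y)) => x == y,
        (Object::Nil, Object::Nil) => true,
        // Everything else is equal only if it is the same object.
        _ => Rc::ptr_eq(a, b),
    }
}
```

Two separately allocated `Int(1)` values are equal here, while two lists
with identical contents are not unless they are the same allocation.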
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
* Obj.__call__ should generically call Obj.__init__, passing along the
arguments given to __call__
* Ty::call function (in Rust) sets up the stack frame correctly now.
Before, the only arguments visible to a `Ty.__call__` function would
have been the values passed to it. However, the Ty.__call__ also needs
*itself* passed to the __call__ function so we can retrieve the
__init__ function and eventually end up calling that.
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
Add support for +=, -=, *=, and /= operators. This is basically just
syntactic sugar, but it's still nice to have:
    a += 1
compiles to the equivalent of
    a = a + 1
with all the same implications of scoping rules.
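As a sketch of what "the equivalent of" means, the rewrite can be done on
the AST before compiling. The `Expr` and `Stmt` types below are a
hypothetical toy AST, not the real one, and the actual compiler may well
do this at a different stage.

```rust
// Hypothetical toy AST.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Name(String),
    Int(i64),
    Binary(Box<Expr>, char, Box<Expr>),
}

#[derive(Debug, Clone, PartialEq)]
enum Stmt {
    Assign { target: String, value: Expr },
    // `op` is the arithmetic half of the operator: '+' for `+=`, etc.
    AugAssign { target: String, op: char, value: Expr },
}

// Rewrite `target op= value` into `target = target op value`.
// Plain assignments are returned unchanged.
fn desugar(stmt: Stmt) -> Stmt {
    match stmt {
        Stmt::AugAssign { target, op, value } => {
            let lhs = Expr::Name(target.clone());
            Stmt::Assign {
                target,
                value: Expr::Binary(Box::new(lhs), op, Box::new(value)),
            }
        }
        other => other,
    }
}
```

Because the result is an ordinary assignment, whatever scoping rules
apply to `a = a + 1` apply to `a += 1` as well.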
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
Previously, the CompileError error messages were just `Debug::fmt`
written to stdout and there wasn't really a backtrace in the code
included. Now, when there is an error in an imported file, it will
display a backtrace of the files included that caused this error.
These are not perfect error messages and are a bit rough around the
edges but they are good enough for now.
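For illustration only, an error type carrying an import chain could look
something like the sketch below. The field names and the message layout
are assumptions made up for this example; they are not the actual
`CompileError` definition or its real output.

```rust
use std::fmt;

// Hypothetical error type; the real CompileError differs.
#[derive(Debug)]
struct CompileError {
    message: String,
    // File where the error occurred first, outermost importer last.
    import_trace: Vec<String>,
}

impl fmt::Display for CompileError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "error: {}", self.message)?;
        for (i, path) in self.import_trace.iter().enumerate() {
            let label = if i == 0 { "in" } else { "imported from" };
            write!(f, "\n    {} {}", label, path)?;
        }
        Ok(())
    }
}
```

Implementing `Display` separately from `Debug` is what lets the
user-facing message differ from the `Debug::fmt` dump used before.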
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
We are splitting a 1200-line file into an 800-line file and a 400-line
file. It's actually a lot easier to read now; those visitors rarely
change and they get in the way of me reading the file (with my eyes,
not with a program).
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
This brings stuff into the local scope, but it is a little funky with
local scopes that are above the current level (in the same function or
module).
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
* .gitignore now ignores *.got for *anything* under the tests/ directory
* runtests.sh ignores files in the tests/ directory that have the string
"test_import_" in them, so they are not run as tests themselves
* Add a couple of basic module functionality tests
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
This is a big change because it touches a lot of stuff, but here is the
overview:
* Import syntax:
```
import foo
import bar from foo
import bar from "foo.npp"
import bar, baz from foo
import * from foo
import "foo.npp"
```
* These are all valid imports. They should be pretty
straightforward, maybe with exception of the last item. If you are
importing a path directly, but not importing any members from it,
it does not insert anything into the current namespace, and just
executes the file. This is probably going to be unused but I want
to include it for completeness. We can always remove it later
before a hypothetical 1.0 release.
* The "from" keyword is only ever used as a keyword here, and I am
allowing it to be used as an identifier elsewhere. Don't export
it, because that's weird and wrong and won't work.
* Modules:
* Doing an `import foo` will look for "foo.npp" at compile-time,
relative to the importer's directory, parse it, and compile it.
The importer will then attempt to execute the module with the new
`EnterModule` op. This instruction will execute the module kind of
like a function, assigning the module's global namespace to an
object that you can pass around.
* `import bar from foo` and `import bar from "foo.npp"` et al syntax
is not currently implemented in the compiler.
* There is a new "Module" object that represents a potentially
un-initialized module. This can't be referred to directly in code.
* VM:
* The VM operates around Module objects now. If you want to "call" a
new module, you should call `enter_module`. This is how the main
chunk is invoked.
* TODOs:
* `exit_module` function in the VM
* Finish up module implementation in compiler
* Built-in modules
* Sub-modules - e.g. `import foo.bar` - how does naming work for
this?
* Module directories. In Python you have `foo/__init__.py` and in
Rust you have `foo/mod.rs`.
* Probably a "Namespace" object that explicitly denotes "this is an
imported module that you're dealing with"
* Tests, tests, tests
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
We use objects and builders to make AST items now; it's a lot less
clunky than doing string manipulation everywhere in the Python file.
It's also more flexible and will open up more options in the future for
things like enum AST items.
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
If an assignment statement has a list as its RHS, it will not create a
constant. We now do a conditional lookup of the last constant index
instead.
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
When the compiler inserts a constant, it will first check to see if that
constant already has been created, so we aren't making millions of the
same constant value - e.g. we can reuse the same integer.
However, the .equals() function on all Object values was returning a
false positive against Ints and Floats that have the same numeric value,
i.e. Float(1.0) == Int(1). If, for example, a float 1.0 was inserted as
a constant, and then an integer 1 was used as a constant later, it was
erroneously retrieving the float 1.0 as an interned pointer value.
This is fixed by checking if the two values' types are equal as well.
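A minimal sketch of the bug and the fix, using a hypothetical `Const`
enum and `ConstPool` in place of the real `Object` values and compiler
state:

```rust
// Hypothetical constant values.
#[derive(Debug, Clone, Copy)]
enum Const {
    Int(i64),
    Float(f64),
}

impl Const {
    fn as_f64(&self) -> f64 {
        match *self {
            Const::Int(i) => i as f64,
            Const::Float(f) => f,
        }
    }

    // Loose numeric equality, in the spirit of the old behavior:
    // Int(1) and Float(1.0) compare equal.
    fn loose_eq(&self, other: &Const) -> bool {
        self.as_f64() == other.as_f64()
    }

    // Equality for interning: the types must match as well.
    fn same_const(&self, other: &Const) -> bool {
        match (*self, *other) {
            (Const::Int(a), Const::Int(b)) => a == b,
            // Compare bit patterns so 0.0 and -0.0 stay distinct.
            (Const::Float(a), Const::Float(b)) => a.to_bits() == b.to_bits(),
            _ => false,
        }
    }
}

struct ConstPool {
    values: Vec<Const>,
}

impl ConstPool {
    fn new() -> Self {
        ConstPool { values: Vec::new() }
    }

    // Return the index of an identical constant, or append a new one.
    fn intern(&mut self, value: Const) -> usize {
        if let Some(idx) = self.values.iter().position(|c| c.same_const(&value)) {
            return idx;
        }
        self.values.push(value);
        self.values.len() - 1
    }
}
```

If `intern` used `loose_eq` instead of `same_const`, interning
`Float(1.0)` and then `Int(1)` would hand back the float's slot for the
integer, which is the failure described above.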
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
Strings can be converted to a list of strings, split up by character.
Strings can also be indexed by character.
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
This introduces:
* new syntax for list literals: put comma-separated values between
braces for your new list
* new syntax for indexing: do `foo[index]` to get the value in `foo` at
`index`. Lists also allow negative indices. Any type that wants to
be indexed can define its own __index__ function as well.
* new VM instruction, BuildList. List literals were a lot easier to
implement using this rather than creating a new list, creating a
temporary stack value, and then duplicating + pushing to that
temporary value over and over.
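Roughly, the instruction and the negative indexing can be sketched like
this. It is a toy stack machine with hypothetical `Value` and `Op`
types; only the `BuildList` name comes from this change, and the sketch
assumes negative indices count back from the end as they do in Python.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    List(Vec<Value>),
}

enum Op {
    Push(Value),
    // Pop `n` values off the stack and push one list containing them.
    BuildList(usize),
}

// Run the ops and return the final stack. Assumes the compiler always
// emits `n` pushes before a `BuildList(n)`.
fn run(ops: Vec<Op>) -> Vec<Value> {
    let mut stack = Vec::new();
    for op in ops {
        match op {
            Op::Push(v) => stack.push(v),
            Op::BuildList(n) => {
                // split_off keeps the items in the order they were pushed.
                let start = stack.len() - n;
                let items = stack.split_off(start);
                stack.push(Value::List(items));
            }
        }
    }
    stack
}

// Index a list; a negative index counts from the end (-1 is the last item).
fn index(list: &[Value], idx: i64) -> Option<&Value> {
    let len = list.len() as i64;
    let real = if idx < 0 { idx + len } else { idx };
    if real < 0 || real >= len {
        None
    } else {
        list.get(real as usize)
    }
}
```

A three-element literal then compiles to three pushes followed by
`BuildList(3)`, with no temporary or repeated duplicate-and-push steps.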
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
If there are warnings in the crate build, we should avoid them while
running each individual test, otherwise we get nonstop warnings.
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
They were looking at the first item on the stack, rather than the last
item on the stack, for their `to_repr` value.
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
This is hopefully going to make navigating the source tree easier.
Hopefully.
The only types that don't get their own files are:
* function types (UserFunction, BuiltinFunction, Method), which all live
in obj/function.rs
* Nil, which lives in obj.rs
* Obj, which lives in obj.rs
Type definitions and init_types now live in obj/ty.rs.
New obj::prelude module for common imports.
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
* I noticed that `fn call(...)` in all objects was identical, so I made
a macro for it. This should make things a little easier to read, since
do_call is about 30 lines a pop.
* Bool has a constructor now, plus to_int and to_float implementations
* Add tests for constructors and add new bool tests
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
* __call__ on a type will construct a new value for a couple of types,
based on the arguments passed to that constructor. For example,
`Str(foo)` just ends up calling `foo.to_str`. However, this opens up
the door for more complex constructors.
* __init__ is available, but for all objects that currently have it, it
just does a no-op because they are copy-on-write, and are
instantiated on creation.
* Builtin functions sometimes call other functions. However, when a VM
would handle an `Op::Return`, it was expecting the callee function to
be on top of the stack after discarding the stack items. A lot of the
.call()s were not pushing the function to the stack beforehand, so
this was causing stack misalignment when it really mattered. It went
undetected until now because every function that was using .call() had
stack items that were safe to discard.
Hopefully we should be in a good place to implement the rest of the
builtins that have not been implemented, and then we can start working
on implementing containers.
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
Let's talk about to_repr and to_str.
to_repr tries to do what Python's `repr` function does - that is, it
converts an object into a developer-readable (but maybe not
human-readable) string. This function is implemented for every object,
and may very well just write out "<MyType at 0x12345678>".
to_str, on the other hand, tries to turn an object into an explicitly
human-readable format. In Python (which we are modeling a lot of our
design after), the str() function usually will end up calling `repr()`
itself, if no other implementation has been provided.
Previously in our implementation, there was a bit of a disconnect
between `to_repr` and `to_str`, versus `Debug` and `Display`. `to_repr`
would kind of do its own thing, and then maybe call either `Display` or
`Debug` to format an object. Consequently, `to_str` would kind of do its
own thing too - usually calling `to_repr` but not always.
This change attempts to strengthen the definitions of `to_repr` and
`to_str`. *In general*, a call to `to_repr` should be calling an
object's `Debug::fmt` function, and *in general* a call to `to_str()`
should be calling an object's `Display::fmt` function. Often, the
`Display::fmt` will just end up calling `Debug::fmt` itself, but now
the `to_str()` and `to_repr()` interfaces are much better defined than
they used to be.
The only major downside is that we are giving up the `Debug`
implementation for language logic, rather than
debugging-the-language-itself logic. I can see this biting us down the
road if we ever need a Rust-style `Debug` implementation, but for now, I
think this is going to serve our needs just fine.
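As a sketch of the intended mapping, with a hypothetical `Object` trait
and two made-up types (`Str` and `Opaque`) standing in for the real
objects:

```rust
use std::fmt;

// A string-like object: repr is quoted, str is the bare text.
struct Str(String);

impl fmt::Debug for Str {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{:?}", self.0)
    }
}

impl fmt::Display for Str {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

// An object with no nicer human-readable form.
struct Opaque;

impl fmt::Debug for Opaque {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "<Opaque at {:p}>", self)
    }
}

impl fmt::Display for Opaque {
    // Display just falls back to Debug, like str() falling back to repr().
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        fmt::Debug::fmt(self, f)
    }
}

// Hypothetical trait tying the language-level calls to the Rust traits.
trait Object: fmt::Debug + fmt::Display {
    fn to_repr(&self) -> String {
        format!("{:?}", self)
    }

    fn to_str(&self) -> String {
        format!("{}", self)
    }
}

impl Object for Str {}
impl Object for Opaque {}
```

The trade-off mentioned above shows up directly: `Debug` on `Str` now
produces language-level repr output, so it is no longer free for
debugging the interpreter itself.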
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
* Finalize is implemented using the procmacro (I didn't realize this was
available)
* The `base` BaseObjInst member for objects is now the first member in
the structure. It will probably be shuffled around by the optimizer,
but I prefer that it be the first thing so it is clear what these
things are.
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
This is a simple one. It does not make sense for an infix language that
uses `-` as a first-class binary (and more importantly, unary) operator.
I liked the idea, but I don't think it was going to work. Plus, I wasn't
using it for builtin functions in the first place, so why keep it
around? Underscores are just fine for our purposes.
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>