Commit Graph

52 Commits

Author SHA1 Message Date
4a7644b84a Update AST generator to be more structured
We use objects and builders to make AST items now, it's a lot less
clunky than doing string manipulation everywhere in the Python file.
It's also more flexible and will open up more options in the future for
things like enum AST items

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-10-03 11:37:00 -07:00
3fb3bf7f91 Squash dead code warning for Method::self_binding
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-10-01 09:31:27 -07:00
80448899d8 Remove nightly features
Don't actually need these since we aren't defining our own pointers

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-30 21:58:21 -07:00
a7d7d8e564 Add List.extend and List.to_list, plus some more tests
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-30 19:53:31 -07:00
a370d3a56f Add List::len method
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-30 17:46:03 -07:00
e5756d6f1a Don't always assume that an assignment statement will result in a constant
If an assignment statement has a list as its RHS, it will not create a
constant. We now do a conditional lookup of the last constant index
instead.

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-30 17:39:41 -07:00
8d1cd710b0 Add tests for string indexing and converting lists to strings
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-30 17:36:36 -07:00
9ec12774fd Check type equality when inserting a constant
When the compiler inserts a constant, it will first check to see if that
constant already has been created, so we aren't making millions of the
same constant value - e.g. we can reuse the same integer.

However, the .equals() function on all Object values was returning a
false positive against Ints and Floats that have the same numeric value,
i.e. Float(1.0) == Int(1). If, for example, a float 1.0 was inserted as
a constant, and then an integer 1 was used as a constant later, it was
erroneously retrieving the float 1.0 as an interned pointer value.

This is fixed by checking if the two values' types are equal as well.
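
A minimal sketch of the fix described above, not the project's actual code. `Const`, `ConstPool`, `same_type`, and `insert` are hypothetical names; only the loose `equals` behaviour (Float(1.0) equals Int(1)) and the added type check come from this message.

```rust
// Hypothetical stand-in for the interpreter's constant values.
#[derive(Debug, Clone)]
enum Const {
    Int(i64),
    Float(f64),
}

impl Const {
    fn as_f64(&self) -> f64 {
        match self {
            Const::Int(i) => *i as f64,
            Const::Float(f) => *f,
        }
    }

    // Loose numeric equality: Float(1.0) equals Int(1).
    fn equals(&self, other: &Const) -> bool {
        self.as_f64() == other.as_f64()
    }

    // True when both values are the same variant (the "type" in this sketch).
    fn same_type(&self, other: &Const) -> bool {
        std::mem::discriminant(self) == std::mem::discriminant(other)
    }
}

#[derive(Default)]
struct ConstPool {
    values: Vec<Const>,
}

impl ConstPool {
    // Reuse an existing constant only if both the type and the value match.
    fn insert(&mut self, value: Const) -> usize {
        if let Some(index) = self
            .values
            .iter()
            .position(|c| c.same_type(&value) && c.equals(&value))
        {
            return index;
        }
        self.values.push(value);
        self.values.len() - 1
    }
}
```

Without the `same_type` check, inserting `Const::Int(1)` after `Const::Float(1.0)` would hand back the float's index.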

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-30 17:33:43 -07:00
32b11f1d86 Add Str::index and Str::to_list
Strings can be converted to a list of strings, split up by character.

Strings can also be indexed by character.

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-30 16:49:48 -07:00
dab474a037 Add lists
This introduces:

* new syntax for list literals, put comma-separated values between
  braces for your new list
* new syntax for indexing, do `foo[index]` to get the value in `foo` at
  `index`. Lists allow negative indices too. Any type that wants to
  be indexed can include their own __index__ function as well.
* new VM instruction, BuildList. List literals were a lot easier to
  implement using this rather than creating a new list, creating a
  temporary stack value, and then duplicating + pushing to that
  temporary value over and over.
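
A rough sketch of the two ideas above, under assumptions: `Value`, `build_list`, and `resolve_index` are hypothetical names, and the real BuildList instruction and index handling may look quite different. It assumes BuildList carries an element count and that negative indices count back from the end of the list.

```rust
// Hypothetical stand-in for VM stack values.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    List(Vec<Value>),
}

// BuildList(count): pop the top `count` values and push one list that holds
// them in the order they were pushed. Returns None on stack underflow.
fn build_list(stack: &mut Vec<Value>, count: usize) -> Option<()> {
    let start = stack.len().checked_sub(count)?;
    let items = stack.split_off(start);
    stack.push(Value::List(items));
    Some(())
}

// Map a possibly negative index onto 0..len, or None if it is out of range.
fn resolve_index(index: i64, len: usize) -> Option<usize> {
    let len = i64::try_from(len).ok()?;
    let resolved = if index < 0 { index + len } else { index };
    if (0..len).contains(&resolved) {
        Some(resolved as usize)
    } else {
        None
    }
}
```

With a three-element list, `resolve_index(-1, 3)` resolves to the last element and `resolve_index(3, 3)` is rejected.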

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-30 16:33:58 -07:00
0a21f01ea7 Add RUSTFLAGS=-Awarnings to runtests.sh
If there are warnings in the crate build, we should avoid them while
running each individual test, otherwise we get nonstop warnings.

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-30 16:32:52 -07:00
8be3586e42 Fix bug in print and println builtins
They were looking at the first item on the stack, rather than the last
item on the stack, for their `to_repr` value.

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-30 16:28:52 -07:00
43183d6553 Most object types get their own file now
This is hopefully going to make navigating the source tree easier.
Hopefully.

The only types that don't get their own files are:

* function types (UserFunction, BuiltinFunction, Method), which all live
  in obj/function.rs
* Nil, which lives in obj.rs
* Obj, which lives in obj.rs

Type definitions and init_types now live in obj/ty.rs.

New obj::prelude module for common imports.

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-30 15:15:41 -07:00
724a6b6f99 Add Nil constructor and tests
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-30 12:48:45 -07:00
3d0da0ec85 Add do_call macro, implement Bool builtins, add tests
* I noticed that `fn call(...)` in all objects was identical, so I made
  a macro for it. This should make things a little easier to read, since
  do_call is about 30 lines a pop.
* Bool has a constructor now, and to_int and to_float implementations
* Add tests for constructors and add new bool tests

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-30 12:43:02 -07:00
b9429d7c19 Add __call__, __init__, and fix a few stack bugs
* __call__ on a type will construct a new value for a couple of types,
  based on the arguments passed to that constructor. For example,
  `Str(foo)` just ends up calling `foo.to_str`. However, this opens up
  the door for more complex constructors.
* __init__ is available, but for all objects that currently have it, it
  just does a no-op because they are copy-on-write, and are
  instantiated on creation.
* Builtin functions sometimes call other functions. However, when a VM
  would handle an `Op::Return`, it was expecting the callee function to
  be on top of the stack after discarding the stack items. A lot of the
  .call()s were not pushing the function to the stack beforehand, so
  this was causing stack misalignment when it really mattered. It went
  undetected until now because every function that was using .call() had
  stack items that were safe to discard.

Hopefully we should be in a good place to implement the rest of the
builtins that have not been implemented, and then we can start working
on implementing containers.

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-30 12:13:25 -07:00
ac6dad9dbd Change to_repr/to_str implementation story
Let's talk about to_repr and to_str.

to_repr tries to do what Python's `repr` function does - that is, it
converts an object into a developer-readable (but maybe not
human-readable) string. This function is implemented for every object,
and may very well just write out "<MyType at 0x12345678>".

to_str, on the other hand, tries to turn an object into an explicitly
human-readable format. In Python (which we are modeling a lot of our
design after), the str() function usually will end up calling `repr()`
itself, if no other implementation has been provided.

Previously in our implementation, there was a bit of a disconnect
between `to_repr` and `to_str`, versus `Debug` and `Display`. `to_repr`
would kind of do its own thing, and then maybe call either `Display` or
`Debug` to format an object. Consequently, `to_str` would kind of do its
own thing too - usually calling `to_repr` but not always.

This change attempts to strengthen the definitions of `to_repr` and
`to_str`. *In general*, a call to `to_repr` should be calling an
object's `Debug::fmt` function, and *in general* a call to `to_str()`
should be calling an object's `Display::fmt` function. Often, the
`Display::fmt` will just end up calling `Debug::fmt` itself, but now
the `to_str()` and `to_repr()` interfaces are much better defined than
they used to be.

The only major downside is that we are giving up the `Debug`
implementation for language logic, rather than
debugging-the-language-itself logic. I can see this biting us down the
road if we ever need a Rust-style `Debug` implementation, but for now, I
think this is going to serve our needs just fine.
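
To make the convention concrete, here is a small sketch. The `Str` and `Int` structs and the free functions `to_repr` and `to_str` are simplified stand-ins, not the real object types. `Debug` carries the repr form and `Display` carries the str form.

```rust
use std::fmt;

// Simplified stand-ins for the language's object types.
struct Str(String);
struct Int(i64);

// to_repr form: developer-readable, so strings are quoted.
impl fmt::Debug for Str {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "'{}'", self.0)
    }
}

// to_str form: the bare string contents.
impl fmt::Display for Str {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

impl fmt::Debug for Int {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

// Nothing special to show for str, so Display just defers to Debug.
impl fmt::Display for Int {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        fmt::Debug::fmt(self, f)
    }
}

// In general: to_repr goes through Debug, to_str goes through Display.
fn to_repr<T: fmt::Debug>(value: &T) -> String {
    format!("{:?}", value)
}

fn to_str<T: fmt::Display>(value: &T) -> String {
    format!("{}", value)
}
```

The cost mentioned above shows up here too: `{:?}` on `Str` now prints the language-level repr, so there is no separate Rust-level debug view of the struct.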

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-27 08:10:09 -07:00
2f84f2c5bb Fix minor format stank with genast.py
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-26 11:09:07 -07:00
9d5d094c5b Big Object naming refactor
* trait Obj -> Object
* Remove *Inst suffix from all object types. ObjInst -> Obj, IntInst ->
  Int, etc
* Type -> Ty, type_inst() -> ty(), type_name() -> ty_name()

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-26 11:07:12 -07:00
3a9bee0e35 Shuffle object implementation stuff
* Finalize is implemented using the procmacro (I didn't realize this was
  available)
* The `base` BaseObjInst member for objects is now the first member in
  the structure. It will probably be shuffled around by the optimizer
  but I prefer it is the first thing so it is clear what these things
  are.

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-26 10:52:47 -07:00
645105c2c5 Remove kebab-case from parser. It was not being used.
This is a simple one. It does not make sense for an infix language that
uses `-` as a first-class binary (and more importantly, unary) operator.
I liked the idea, but I don't think it was going to work. Plus, I wasn't
using it for builtin functions in the first place, so why keep it
around? Underscores are just fine for our purposes.

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-26 10:47:53 -07:00
1dd058ae18 Add binary and hex number parsing
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-26 10:03:54 -07:00
cd9617d2fd Add a few more conversion methods to Int, Float, Str
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-26 09:25:09 -07:00
0d126b8ba3 Add FloatInst method implementations and tests
FloatInst should be fully implemented now and have a suite of tests to
make sure those methods are doing what they should be.

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-26 09:03:13 -07:00
f020155453 Add integration tests
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-25 11:42:51 -07:00
38a2064b08 Re-shuffle Obj::type_inst and BaseObjInst::type_inst
There was a moment during the refactor that I was thinking about getting
rid of the `__type__` attribute as the source of truth for the type
instance, but I think that was a bit more than I could chew. However I
forgot to re-add the default implementation for Obj::type_inst, so that
has been added back in.

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-25 10:27:53 -07:00
2203957ebb Typesystem global instance churn, again
I don't know if I'm ever going to get this right.

It's a massive pain having to pass around the base "Method" type
everywhere. It really makes a lot more sense to have it already defined
someplace statically available. It makes things like getting an attribute
or vtable entry a lot more ergonomic. Previously we'd have to pass in
the Method type every time, which was silly. Now we can just let the
MethodInst::instantiate() function query it directly. Like, this is
100000% better.

Also, I got rid of get_attr_lazy in favor of get_vtable_attr. I think
that I want to unify get_attr and get_vtable_attr, but that would
require a GC pointer to the "self" object on every object that you
create. That's a bit iffy.

But for now, things are feeling a little better and all the tests are
passing, so that's good at least.

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-25 10:22:03 -07:00
6c64697cde Include some TODOs for functions
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-25 09:04:37 -07:00
f35053a6c1 Fix some bugs uncovered by testing
* to_str on objects will call to_repr by default
* print() and println() will call to_str by default instead of to_repr
* Fix Str.to_repr to include single quotes
* Fix Int.__pos__ and Int.__neg__ arg counts

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-25 08:48:11 -07:00
11a5a1247e Fix mix-up in the parser
>= and > had gotten mixed up and were being parsed as each other. This
is fixed.

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-24 17:16:47 -07:00
c8d670ba59 Implement IntInst methods
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-24 16:57:22 -07:00
890467e02c Compiler emits return instructions
Another failure on my part to write the compiler correctly. oops

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-24 16:54:36 -07:00
ebd5bf96c3 Implement Str methods
* to_str
* to_repr
* to_bool
* len
* __add__
* __mul__

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-24 14:47:44 -07:00
a5d106bdfd Add not_implemented_{un,bin} functions, cleanup unused "not implemented" functions
These are specifically functions for the BaseObjInst:: that need some
kind of "not implemented" function for re-use.

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-24 12:46:45 -07:00
ef83796ccc Implement BaseObjInst::eq and ::neq
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-24 12:38:13 -07:00
8f9d634a15 Reorder derives, I guess
Formatters do the darnedest things

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-24 12:37:46 -07:00
56001856be Manually implement Debug for MethodInst
MethodInst's self_binding was causing endless recursion issues, this
just skips over it and uses the normal formatting for it
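
One way to write such a manual implementation, as a sketch only. The fields other than `self_binding` are assumptions, and `Rc<dyn Any>` stands in for the real object pointer type.

```rust
use std::any::Any;
use std::fmt;
use std::rc::Rc;

struct MethodInst {
    name: String,      // hypothetical field
    arg_count: usize,  // hypothetical field
    // Points back at the bound object, which may itself hold this method.
    #[allow(dead_code)]
    self_binding: Rc<dyn Any>,
}

impl fmt::Debug for MethodInst {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Format every field except self_binding, so printing a method never
        // walks back into the object that owns it.
        f.debug_struct("MethodInst")
            .field("name", &self.name)
            .field("arg_count", &self.arg_count)
            .finish_non_exhaustive()
    }
}
```

`finish_non_exhaustive` prints a trailing `..` to signal that a field was left out on purpose.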

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-24 12:30:26 -07:00
3e769e9c48 Compile RHS of binary expressions (oopsie)
Not sure how this happened besides being a gigantic moron, I completely
forgot to do `self.compile_expr(&expr.rhs)`.

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-24 12:27:53 -07:00
4b5d2af117 Move builtin object methods from src/obj.rs to src/builtins.rs
Builtin functions are now living in the builtins file. They're still
part of BaseObjInst but they just are in a different file. Also,
implement the base "not" function.

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-24 11:50:51 -07:00
3a7c04686a Add function yielding and resuming
Sometimes, a builtin function may need to call out to another function
(user-defined or otherwise). Previously, we were just calling the
function and popping the stack frame, leaving no room for the new
function to be called. This introduces a `FunctionResult` and
`FunctionState` that get passed between these builtin functions. A
builtin function will receive a FunctionState that tells it whether it
is currently beginning or being resumed and can act accordingly. A
builtin function will in turn return a FunctionResult, which can either
be to return and push a value to the stack, return without pushing a
value (value is already on top of the stack), or yield execution back to
the VM (implying that a new stack frame has been pushed with a new
function to execute).

Having to call a new function and resume is a bit unwieldy and
un-ergonomic, and making a macro to help write these would be nice, but
it looks like a procedural macro may be required to really enable this.
For now, we will write these yields by hand and once it becomes truly
too much, we can start looking at writing a macro library to handle this
case.
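
A hand-written yield could look roughly like the sketch below. Only the names `FunctionState` and `FunctionResult` come from this change; the variant names, the `Vm` struct, and the driver loop are assumptions made for illustration.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
}

// Tells a builtin whether it is starting fresh or being resumed.
#[derive(Debug, Clone, Copy, PartialEq)]
enum FunctionState {
    Begin,
    Resume,
}

// What a builtin asks the VM to do next.
#[derive(Debug, PartialEq)]
enum FunctionResult {
    ReturnPush(Value), // return and push this value
    Return,            // return; the value is already on top of the stack
    Yield,             // a new frame was pushed; run it, then resume me
}

struct Vm {
    stack: Vec<Value>,
    pending_calls: usize, // stands in for pushed stack frames
}

// A builtin that calls out to another function on Begin and hands back the
// callee's result on Resume.
fn example_builtin(vm: &mut Vm, state: FunctionState) -> FunctionResult {
    match state {
        FunctionState::Begin => {
            vm.pending_calls += 1;
            FunctionResult::Yield
        }
        FunctionState::Resume => FunctionResult::Return,
    }
}

// Simplified driver: a real VM would execute the pushed frame; here the
// callee is faked by pushing a fixed value.
fn run_builtin(vm: &mut Vm, f: fn(&mut Vm, FunctionState) -> FunctionResult) {
    let mut state = FunctionState::Begin;
    loop {
        match f(vm, state) {
            FunctionResult::ReturnPush(value) => {
                vm.stack.push(value);
                return;
            }
            FunctionResult::Return => return,
            FunctionResult::Yield => {
                vm.pending_calls -= 1;
                vm.stack.push(Value::Int(42));
                state = FunctionState::Resume;
            }
        }
    }
}
```

A builtin that yields more than once would need to remember which step it is on, which is where writing these by hand gets unwieldy.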

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-24 11:34:07 -07:00
d69a60f42c Fix tests that were broken from the last commit
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-24 09:05:31 -07:00
078aef70ea Split up src/obj.rs
* common macros are in their own private module
* functions are in their own obj::function module

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-24 09:03:34 -07:00
d699ad2ff5 Move MethodInst stack mangling to MethodInst::call
When a MethodInst is called as a function in the VM, it needs to push
its `self_binding` member to the stack. Previously, we were downcasting
(if possible) to MethodInst in the VM, but really, we are calling
`MethodInst::call` anyway, so it makes more sense to do
MethodInst-specific stuff in the MethodInst-specific function.

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-24 08:44:05 -07:00
3545488ef8 Add TODO notice on a sorta-important scoping bug
Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-24 08:37:29 -07:00
fe586526df Fix get_attr_lazy
I had misunderstood/misused the ? suffix operator for the Option type.
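
The actual bug is not shown here, but one common way to misuse `?` on an `Option` in a lookup like this is sketched below, using hypothetical `lookup_buggy` and `lookup_fixed` functions over plain hashmaps. `?` returns `None` from the whole function on a miss, so any fallback written after it never runs.

```rust
use std::collections::HashMap;

// Buggy: `?` returns None from the function as soon as `attrs` misses,
// so the vtable fallback on the next line is never consulted.
fn lookup_buggy(
    attrs: &HashMap<String, i64>,
    vtable: &HashMap<String, i64>,
    name: &str,
) -> Option<i64> {
    let found = attrs.get(name).copied()?;
    Some(found).or_else(|| vtable.get(name).copied())
}

// Fixed: chain the fallback with or_else instead of bailing out early.
fn lookup_fixed(
    attrs: &HashMap<String, i64>,
    vtable: &HashMap<String, i64>,
    name: &str,
) -> Option<i64> {
    attrs
        .get(name)
        .copied()
        .or_else(|| vtable.get(name).copied())
}
```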

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-23 21:48:44 -07:00
9c4898ff8d Add base object function stubs(!)
Big step here, we have function stubs available for everybody. Most of
them panic. Each type will eventually have its own implementations for
different operators.

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-23 21:34:10 -07:00
0d04090a99 Implement vtables and method resolution
Types now can have vtable elements which are used by instances to bind
themselves to methods. When Op::GetAttr is executed, it calls a new
function, Obj::get_attr_lazy. This will search:

* attributes on the object
* vtable on the object's type
* vtable on the object's type's type,
* etc.

This searches up the type tree for a named value. If it exists as an
attribute, it will be returned immediately. If it exists in the type's
vtable, then it will be inserted as an attribute. If the vtable value is
a function, the object that it is being called on will be bound to that
method as the `self` parameter.
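
The search order can be sketched as follows. This is a simplified model, not the real implementation: objects and types live in plain vectors and are referred to by index, `Value`, `Ty`, `Obj`, and `Heap` are hypothetical, and the type chain ends at `None` rather than looping back on itself.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Function(String),
    // A function with its `self` already bound to an object.
    BoundMethod { function: String, self_id: usize },
}

struct Ty {
    vtable: HashMap<String, Value>,
    ty: Option<usize>, // this type's own type, if any
}

struct Obj {
    attrs: HashMap<String, Value>,
    ty: usize,
}

struct Heap {
    types: Vec<Ty>,
    objects: Vec<Obj>,
}

impl Heap {
    fn get_attr_lazy(&mut self, obj_id: usize, name: &str) -> Option<Value> {
        // 1. Attributes on the object itself win immediately.
        if let Some(value) = self.objects[obj_id].attrs.get(name) {
            return Some(value.clone());
        }
        // 2. Otherwise walk the object's type, then the type's type, etc.
        let mut next = Some(self.objects[obj_id].ty);
        while let Some(ty_id) = next {
            if let Some(entry) = self.types[ty_id].vtable.get(name) {
                // Functions get bound to the object they were looked up on.
                let value = match entry {
                    Value::Function(function) => Value::BoundMethod {
                        function: function.clone(),
                        self_id: obj_id,
                    },
                    other => other.clone(),
                };
                // Cache the result as an attribute for later lookups.
                self.objects[obj_id]
                    .attrs
                    .insert(name.to_string(), value.clone());
                return Some(value);
            }
            next = self.types[ty_id].ty;
        }
        None
    }
}
```

Caching the bound method as an attribute means a second lookup of the same name stops at step 1.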

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-23 20:59:00 -07:00
8b931e9d12 Revamp object system, start using gc crate
Wow, what a ride. I think everything should be working now. In short:

* Objects use the `gc` crate, which has a `Gc` garbage-collected pointer
  type. I may choose to implement my own in contiguous memory in the
  future. We will see.
* The type system is no longer global. This is a bit of a burden,
  because now, whenever you want to create a new object, you need to
  pass its type object into the `Obj::instantiate` method, as well as
  its `::create` static method.
* This burden is somewhat alleviated by the `ObjFactory` trait, which
  helps create new objects as long as you have access to a `builtins`
  hashmap. So something that would normally look like this:

    fn init_builtins(builtins: &mut HashMap<String, ObjP>) {
        let print_builtin = upcast_obj(BuiltinFunctionInst::create(
            ObjP::clone(&builtins.get("BuiltinFunction").unwrap()),
            "print",
            print,
            1
        ));
        builtins.insert("print".to_string(), print_builtin);
        // other builtins inserted here...
    }

  now looks like this:

    fn init_builtins(builtins: &mut HashMap<String, ObjP>) {
        let print_builtin = builtins.create_builtin_function("print", print, 1);
        builtins.insert("print".to_string(), print_builtin);
    }

(turns out, if all you need is a HashMap<String, ObjP>, you can
implement ObjFactory for HashMap<String, ObjP> itself(!))

Overall, I'm happier with this design, and I think this is what is going
to get merged. It's a little weird to be querying type names that are
used in the language itself to get those type objects, but whatever
works, I guess.

Next up is vtables.

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-23 18:12:32 -07:00
24b06851c7 WIP: move mutability to be internal to the object instead of the pointer
I'm not super happy with this. But, the RwLock has been moved to the
`BaseObjInst::attrs` member. Although this is not exactly how it appears
in code, it basically does this:

    type Ptr<T> = Arc<RwLock<T>>;

    struct BaseObjInst {
        attr: HashMap<String, Ptr<dyn Obj>>,
        // etc
    }

becomes

    type Ptr<T> = Arc<T>;

    struct BaseObjInst {
        attr: RwLock<HashMap<String, ObjP>>,
        // etc
    }

This makes things a lot more ergonomic (don't have to use try_read() and
try_write() everywhere), but it also eliminates compile-time errors that
would catch mutability errors. This is currently rearing its ugly head
when initializing the typesystem, since `Type` needs to hold a circular
reference to itself (which it already shouldn't be doing since it's a
reference-counted pointer!). Currently, all tests are failing because of
this limitation.

There are a couple of ways around this limitation.

The first solution would be just copying all of the object
instantiation code into the `init_types` function and avoid calling
`some_base_type.instantiate()`. This would probably be literal
copy-pasting, or maybe an (ugly) macro, and probably a nightmare to
maintain long-term. I don't like this option, but it would make
everything "just work" with reference-counted pointers.

The second solution would be to write our own garbage collector, which
would allow for circular references and (hypothetically) mutably
updating these references. This is something that I am looking into,
because I really want a RefCell that you can pass around in a more
ergonomic way.

I think the fundamental error that I'm running into is trying to borrow
the same value multiple times mutably, which you *really* shouldn't be
doing. I believe I need to write better code that does the same thing.

The only unsolved problem is circular references. This is not a problem
right now because I'm not writing code that has circular references
besides the base typesystem (which is not a problem because they need to
live the entire lifetime of the program), but it will be a latent
problem until it gets fixed.

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-22 20:40:15 -07:00
16f3dc960c Base initial commit
Still WIP, working on object system still, which in Rust, makes me want
to kill myself

Signed-off-by: Alek Ratzloff <alekratz@gmail.com>
2024-09-20 16:04:30 -07:00