SableCC generates a ton of classes to represent the AST in a type-strong fashion, and so I hacked away at the rather monstrous SableCC source code to make it generate classes that represent a graph instead of a tree. In addition to being a graph, they needs to support choices and sequences. At first, I made one ChoiceNode class that was generic, such that ChoiceNode<PExp> was a choice node that was guaranteed to unfold to a node of some subtype of PExp. It became apparent, however, that Java's type system is simply not strong enough to strongly represent this type of structure; or at least not in a way that would gain me any convenience. The amount of unsafe typecasts that were added because of the generics vastly outweighed the few they might spare me.
I then went a different route. Avoiding generic types, I generated separate classes to represent different types of choice nodes, so the class that unfolds to a PExp node is distinct from the class that unfolds to, say, a PStatement node. This is quite convenient, because any analysis will certainly have to treat such nodes very differently, and any type parameter would be invisible at runtime. A sequence of nodes is strictly distinguished from just a node, and as such, there are also separate choice nodes for PExp sequences, PStatement sequences, and so on. Sequences can be created using EmptySequence, SingletonSequence, and Concat nodes, of which there also exist one for each production in the AST.
These classes validate at runtime that their children have the correct type, and although short of being statically typed, the rigid structure of the AST graph has proven terrific for analysis. I believe any synthesized property in the AST can be approximated with a monotone backward analysis on the AST graph. Inherited properties can likewise be approximated with a montone forward analysis.
I managed to implement abstract type-checking for a simple expression-based language with integers and booleans in just 250 lines of Java code! For example, suppose we want to analyse the input:
No comments:
Post a Comment