viewvc-4intranet/misc/elemx/java/java.y

459 lines
7.5 KiB
Plaintext
Raw Normal View History

I've been screwing around for *days* trying to tweak this darned Java grammar. Forget it for now, and checkpoint what I've done. The elx-python and elx-java programs produce "element" text files which can then be consumed by elx_html.py to produce syntax-colored HTML output. elx_page.sh is a little wrapper around that to produce a full page. The Python stuff seems quite fine, and is blazing fast (the scanner runs over 100k lines/second). The Java grammar is horked, so it does work (the Java scanner seems fine; just the grammar). Lots more to do on this, but I'm out for the weekend, so this it's time to check point this. Maybe someone will have better ideas on how to fix the Java parser. Note that this stuff requires the 'msta' and 'shilka' programs from the 'cocom' toolkit. I still need to package this up nicely (some copyright/license notices, better makefiles, some minimal doc, etc). I also want to mess around with profiling the syntax coloring. It appears to be a bit slower than enscript right now, but it shouldn't be since we've already parsed the file by that point (gonna try binary element files, and some perf improvements to elx_html.py). General theory: write elx-* for any language needing cross-referencing capabilities (things that only need hiliting can continue to use enscript). The elx-* programs can be implemented in any fashion, as long as it produces the element description file. Python, Perl, pure C, automated scanner/parser, whatever. Future steps include using the element description file to get function names, to index them, and to feed that into the HTML generation. It would also be quite possible to feed the element descriptions right into a database and query from there. git-svn-id: http://viewvc.tigris.org/svn/viewvc/trunk@498 8cb11bc2-c004-0410-86c3-e597b4017df7
2002-03-15 04:54:28 +03:00
%token KR_abstract
%token KR_boolean KR_break KR_byte /* KR_byvalue */
%token KR_case /* KR_cast */ KR_catch KR_char KR_class /* KR_const */ KR_continue
%token KR_default KR_do KR_double
%token KR_else KR_extends
%token KR_false KR_final KR_finally KR_float KR_for /* KR_future */
/* %token KR_generic KR_goto */
%token KR_if KR_implements KR_import /* KR_inner */ KR_instanceof KR_int KR_interface
%token KR_long
%token KR_native KR_new KR_null
/* %token KR_operator KR_outer */
%token KR_package KR_private KR_protected KR_public
%token /* KR_rest */ KR_return
%token KR_short KR_static KR_super KR_switch KR_synchronized
%token KR_this KR_throw KR_throws KR_transient KR_true KR_try
%token /* KR_var */ KR_void KR_volatile
%token KR_while
%token TK_OP_ASSIGN TK_OPERATOR TK_IDENTIFIER TK_LITERAL
%token TK_DIM TK_INC_DEC
%start CompilationUnit
/* the standard if/then/else conflict */
/* %expect 1 */
%{
#include "elx.h"
void yyerror(const char *msg);
int yylex(void);
/* ### should come from an elx-python.h or something */
void issue_token(char which);
%}
%export {
/* the main parsing function */
int yyparse(void);
/* need to define the 'not found' in addition to the regular keywords */
#define KR__not_found 0
}
%%
TypeSpecifier
: TypeName
| TypeNameDims
;
TypeNameDims
: TypeName TK_DIM+
;
TypeNameDot
: NamePeriod
| PrimitiveType '.'
;
TypeName
: PrimitiveType
| NamePeriod TK_IDENTIFIER
| TK_IDENTIFIER
;
NamePeriod
: TK_IDENTIFIER '.'
| NamePeriod TK_IDENTIFIER '.'
;
TypeNameList
: TypeName / ','
;
PrimitiveType
: KR_boolean
| KR_byte
| KR_char
| KR_double
| KR_float
| KR_int
| KR_long
| KR_short
| KR_void
;
CompilationUnit
: PackageStatement [ImportStatements] [TypeDeclarations]
| ImportStatements [TypeDeclarations]
| TypeDeclarations
;
PackageStatement
: KR_package (TK_IDENTIFIER / '.') ';'
;
TypeDeclarations
: TypeDeclaration+
;
TypeDeclaration
: ClassDeclaration
| InterfaceDeclaration
;
ImportStatements
: ImportStatement+
;
ImportStatement
: KR_import TK_IDENTIFIER ('.' TK_IDENTIFIER)* [".*"] ';'
;
/*
QualifiedName
: TK_IDENTIFIER / '.'
;
*/
ClassDeclaration
: [Modifiers] KR_class TK_IDENTIFIER [Super] [Interfaces] ClassBody
;
Modifiers
: Modifier+
;
Modifier
: KR_abstract
| KR_final
| KR_public
| KR_protected
| KR_private
| KR_static
| KR_transient
| KR_volatile
| KR_native
| KR_synchronized
;
Super
: KR_extends TypeNameList
;
Interfaces
: KR_implements TypeNameList
;
ClassBody
: '{' FieldDeclaration* '}'
;
FieldDeclaration
: FieldVariableDeclaration
| MethodDeclaration
| ConstructorDeclaration
| StaticInitializer
;
FieldVariableDeclaration
: [Modifiers] TypeSpecifier VariableDeclarators ';'
;
VariableDeclarators
: VariableDeclarator / ','
;
VariableDeclarator
: DeclaratorName ['=' VariableInitializer]
;
VariableInitializer
: Expression
| '{' [ArrayInitializers] '}'
;
ArrayInitializers
: VariableInitializer ( ',' [VariableInitializer] )*
;
MethodDeclaration
: [Modifiers] TypeSpecifier MethodDeclarator [Throws] MethodBody
;
MethodDeclarator
: DeclaratorName '(' [ParameterList] ')' TK_DIM*
;
ParameterList
: Parameter / ','
;
Parameter
: TypeSpecifier DeclaratorName
;
DeclaratorName
: TK_IDENTIFIER TK_DIM*
;
Throws
: KR_throws TypeNameList
;
MethodBody
: Block
| ';'
;
ConstructorDeclaration
: [Modifiers] ConstructorDeclarator [Throws] Block
;
ConstructorDeclarator
: TypeName '(' [ParameterList] ')'
;
StaticInitializer
: KR_static Block
;
InterfaceDeclaration
: [Modifiers] KR_interface TK_IDENTIFIER [ExtendsInterfaces] InterfaceBody
;
ExtendsInterfaces
: KR_extends TypeNameList
;
InterfaceBody
: '{' FieldDeclaration+ '}'
;
Block
: '{' LocalVariableDeclarationOrStatement* '}'
;
LocalVariableDeclarationOrStatement
: LocalVariableDeclarationStatement
| Statement
;
LocalVariableDeclarationStatement
: TypeSpecifier VariableDeclarators ';'
;
Statement
: EmptyStatement
| LabeledStatement
| ExpressionStatement ';'
| SelectionStatement
| IterationStatement
| JumpStatement
| GuardingStatement
| Block
;
EmptyStatement
: ';'
;
LabeledStatement
: TK_IDENTIFIER ':' LocalVariableDeclarationOrStatement
| KR_case ConstantExpression ':' LocalVariableDeclarationOrStatement
| KR_default ':' LocalVariableDeclarationOrStatement
;
ExpressionStatement
: Expression
;
SelectionStatement
: KR_if '(' Expression ')' Statement [KR_else Statement]
| KR_switch '(' Expression ')' Block
;
IterationStatement
: KR_while '(' Expression ')' Statement
| KR_do Statement KR_while '(' Expression ')' ';'
| KR_for '(' ForInit ForExpr [ForIncr] ')' Statement
;
ForInit
: ExpressionStatements ';'
| LocalVariableDeclarationStatement
| ';'
;
ForExpr
: [Expression] ';'
;
ForIncr
: ExpressionStatements
;
ExpressionStatements
: ExpressionStatement / ','
;
JumpStatement
: KR_break [TK_IDENTIFIER] ';'
| KR_continue [TK_IDENTIFIER] ';'
| KR_return [Expression] ';'
| KR_throw Expression ';'
;
GuardingStatement
: KR_synchronized '(' Expression ')' Statement
| KR_try Block Finally
| KR_try Block Catches
| KR_try Block Catches Finally
;
Catches
: Catch+
;
Catch
: KR_catch '(' TypeSpecifier [TK_IDENTIFIER] ')' Block
;
Finally
: KR_finally Block
;
ArgumentList
: Expression / ','
;
PrimaryExpression
: TK_LITERAL
| KR_true | KR_false
| KR_this
| KR_null
| KR_super
| '(' Expression ')'
;
PostfixExpression
: PrimaryExpression Trailers
| TypeName AltTrailers
| TypeNameDot FollowsPeriod Trailers
| TypeNameDot DimAllocation TypeTrailers
| TypeNameDims TypeTrailers
| KR_new TypeName AltTrailers
| KR_new TypeNameDims TypeTrailers
;
DimAllocation
: KR_new TypeNameDims
;
PostfixDims
: TK_DIM+ '.' KR_class
;
FollowsPeriod
: KR_this
| KR_class
| KR_super
| KR_new TypeName NoPeriodsTrailer
;
NoPeriodsTrailer
: '[' Expression ']'
| '(' [ArgumentList] ')'
;
NoDimTrailer
: NoPeriodsTrailer
| '.' (FollowsPeriod | TK_IDENTIFIER)
;
AnyTrailer
: NoDimTrailer
| PostfixDims
;
Trailers
: (AnyTrailer | DimAllocation NoDimTrailer)* [DimAllocation]
;
AltTrailers
: NoPeriodsTrailer Trailers
|
;
TypeTrailers
: NoDimTrailer Trailers
|
;
CastablePrefixExpression
: PostfixExpression [TK_INC_DEC]
| LogicalUnaryOperator CastExpression
;
LogicalUnaryOperator
: '~'
| '!'
;
UnaryOperator
: '+'
| '-'
| TK_INC_DEC
;
/* note: we don't actually have grammar for a cast. we just rely on:
(expr) (argument)
as our parse match */
CastExpression
: /* '(' PrimitiveType ')' CastExpression
| '(' NamePeriod TK_IDENTIFIER TK_DIM TK_DIM* ')' CastablePrefixExpression
| '(' NamePeriod TK_IDENTIFIER ')' CastablePrefixExpression
| '(' TK_IDENTIFIER TK_DIM TK_DIM* ')' CastablePrefixExpression
| '(' TK_IDENTIFIER ')' CastablePrefixExpression
| */ UnaryOperator CastExpression
| CastablePrefixExpression
;
BinaryExpression
: CastExpression
| BinaryExpression TK_BINARY CastExpression
| BinaryExpression KR_instanceof TypeSpecifier
;
ConditionalExpression
: BinaryExpression
| BinaryExpression '?' Expression ':' ConditionalExpression
;
AssignmentExpression
: ConditionalExpression [AssignmentOperator AssignmentExpression]
;
AssignmentOperator
: '='
| TK_OP_ASSIGN
;
Expression
: AssignmentExpression
;
ConstantExpression
: ConditionalExpression
;
/*
TK_OPERATOR : OP_LOR | OP_LAND
| OP_EQ | OP_NE | OP_LE | OP_GE
| OP_SHL | OP_SHR | OP_SHRR
;
*/
TK_BINARY : TK_OPERATOR | '+' | '-' | '*'