viewvc-4intranet/misc/elemx/python/scanner.h


I've been screwing around for *days* trying to tweak this darned Java grammar. Forget it for now, and checkpoint what I've done.

The elx-python and elx-java programs produce "element" text files which can then be consumed by elx_html.py to produce syntax-colored HTML output. elx_page.sh is a little wrapper around that to produce a full page.

The Python stuff seems quite fine, and is blazing fast (the scanner runs over 100k lines/second). The Java grammar is horked, so it doesn't work (the Java scanner seems fine; it's just the grammar).

Lots more to do on this, but I'm out for the weekend, so it's time to checkpoint this. Maybe someone will have better ideas on how to fix the Java parser.

Note that this stuff requires the 'msta' and 'shilka' programs from the 'cocom' toolkit.

I still need to package this up nicely (some copyright/license notices, better makefiles, some minimal doc, etc). I also want to mess around with profiling the syntax coloring. It appears to be a bit slower than enscript right now, but it shouldn't be, since we've already parsed the file by that point (gonna try binary element files, and some perf improvements to elx_html.py).

General theory: write elx-* for any language needing cross-referencing capabilities (things that only need hiliting can continue to use enscript). The elx-* programs can be implemented in any fashion, as long as they produce the element description file. Python, Perl, pure C, automated scanner/parser, whatever.

Future steps include using the element description file to get function names, to index them, and to feed that into the HTML generation. It would also be quite possible to feed the element descriptions right into a database and query from there.

git-svn-id: http://viewvc.tigris.org/svn/viewvc/trunk@498 8cb11bc2-c004-0410-86c3-e597b4017df7
2002-03-15 04:54:28 +03:00
#ifndef SCANNER_H
#define SCANNER_H
#ifdef __cplusplus
extern "C" {
#endif /* __cplusplus */
/* constants and errors returned by the scanner */
enum
{
    SCANNER_EOF = -1,           /* returned by get_char_t and
                                   scanner_get_token to symbolize EOF */
    E_TOO_MANY_INDENTS = -100,  /* too many indents */
    E_DEDENT_MISMATCH,          /* no matching indent */
    E_BAD_CONTINUATION,         /* character occurred after \ */
    E_BAD_NUMBER,               /* parse error in a number */
    E_UNKNOWN_TOKEN,            /* dunno what we found */
    E_UNTERM_STRING             /* unterminated string constant */
};
typedef int (*get_char_t)(void *user_ctx);
void *scanner_begin(get_char_t getfunc, void *user_ctx);
int scanner_get_token(void *ctx);
void scanner_identifier(void *ctx, const char **ident, int *len);
void scanner_token_range(void *ctx, int *start, int *end);
void scanner_token_linecol(void *ctx,
                           int *sline, int *scol, int *eline, int *ecol);
void scanner_end(void *ctx);
#ifdef __cplusplus
}
#endif /* __cplusplus */
#endif /* SCANNER_H */
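
The header leaves the character source up to the caller: scanner_begin takes a get_char_t callback plus an opaque user_ctx pointer. As a minimal sketch of what such a callback might look like, the block below reads from an in-memory buffer. The string_source struct and string_source_getc function are hypothetical names invented for this illustration and are not part of the original sources; the enum and typedef are repeated from scanner.h only so the sketch compiles on its own.

```c
#include <stddef.h>

/* Repeated from scanner.h so this sketch is self-contained. */
enum { SCANNER_EOF = -1 };
typedef int (*get_char_t)(void *user_ctx);

/* Hypothetical user context: a read cursor over an in-memory buffer. */
struct string_source
{
    const char *data;   /* bytes to feed to the scanner */
    size_t len;         /* number of bytes in data */
    size_t pos;         /* index of the next byte to return */
};

/* A callback matching get_char_t.  Returns the next byte as a
   non-negative int, or SCANNER_EOF once the buffer is exhausted.
   Calling it again after EOF keeps returning SCANNER_EOF. */
static int string_source_getc(void *user_ctx)
{
    struct string_source *src = user_ctx;

    if (src->pos >= src->len)
        return SCANNER_EOF;

    /* Cast through unsigned char so bytes >= 0x80 cannot be
       mistaken for a negative value such as SCANNER_EOF. */
    return (unsigned char)src->data[src->pos++];
}
```

With the real scanner linked in, such a callback would presumably be handed over as scanner_begin(string_source_getc, &src), with the returned pointer passed to the other scanner_* calls and released by scanner_end. The header does not document the token values that scanner_get_token returns beyond SCANNER_EOF and the negative E_* codes, so any driver loop has to take those from the scanner implementation.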