http://savannah.nongnu.org/projects/phtml-parse/

phtml_parse is a C library for parsing ASP and PHP files into their
component parts, in particular identifying which parts are PHP (or
Perl) code and which parts are HTML text.  It is designed for
compatibility with Perl's Apache::ASP module or with PHP version 4.

The parsing rules have had to be determined empirically, using the
test files included in the tests directory.  The documentation has
also been consulted, but it is unclear on some points (for example,
can a START or END tag occur inside quoted text in the SCRIPT?), so
not everything is documented.  When writing scripts, it is best to
stay carefully within the documented behavior, since that is much
easier to guarantee across different implementations.

There are currently two interfaces defined.  The first is the standard
C interface (see phtml_parse.h).  The second is the Perl interface,
provided as an XSUB.  The Perl interface is distributed separately.

There is one example program included in the distribution, called
asp_parse (source in phtml_parse_demo.c).  It simply parses a file
supplied on its stdin and outputs the parsed contents, with each
parsed chunk preceded by a line saying whether it is script text or
tag.
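
As a rough illustration of this kind of labelled output, here is a
minimal sketch in C.  It does not use phtml_parse.h or any function
from this library: label_chunks and emit_chunk are hypothetical names
invented for the example, and the "<%" / "%>" delimiters and the
"--- text ---" header format are assumptions.  The splitting rule is
deliberately naive and is not the rule set implemented by phtml_parse.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of phtml_parse): append one labelled
 * chunk to out, never writing past cap bytes.  Empty chunks are skipped.
 * Returns the new length of the string held in out. */
static size_t emit_chunk(char *out, size_t cap, size_t used,
                         const char *label, const char *start, size_t len)
{
    if (len == 0 || used + 1 >= cap)
        return used;
    int n = snprintf(out + used, cap - used, "--- %s ---\n%.*s\n",
                     label, (int)len, start);
    if (n < 0)
        return used;                 /* encoding error: leave out unchanged */
    if ((size_t)n >= cap - used)
        return cap - 1;              /* output was truncated to fit */
    return used + (size_t)n;
}

/* Hypothetical, naive splitter: everything between "<%" and the next
 * "%>" is labelled "script", everything else is labelled "text".
 * Quoting inside the script is ignored on purpose. */
size_t label_chunks(const char *src, char *out, size_t cap)
{
    assert(src != NULL && out != NULL && cap > 0);

    size_t used = 0;
    const char *p = src;
    out[0] = '\0';

    while (*p != '\0') {
        const char *start_tag = strstr(p, "<%");
        if (start_tag == NULL) {
            /* No more script: the remainder is plain text. */
            used = emit_chunk(out, cap, used, "text", p, strlen(p));
            break;
        }
        used = emit_chunk(out, cap, used, "text", p, (size_t)(start_tag - p));

        const char *body = start_tag + 2;
        const char *end_tag = strstr(body, "%>");
        if (end_tag == NULL) {
            /* Unterminated script: treat the remainder as script. */
            used = emit_chunk(out, cap, used, "script", body, strlen(body));
            break;
        }
        used = emit_chunk(out, cap, used, "script", body,
                          (size_t)(end_tag - body));
        p = end_tag + 2;
    }
    return used;
}
```

Called on the input <p>Hi</p><% x = 1 %>bye with a large enough
buffer, the sketch produces a text chunk, a script chunk and a second
text chunk, each preceded by its header line.  Because it stops at the
first "%>" it sees, an input such as <% s = "%>" %> is cut in the
middle of the quoted string, which is the kind of case the
documentation leaves open.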

Bug Reports: please report bugs on savannah.nongnu.org


