Re: HTML/XML parser
On Wed, 11 Jun 1997, Kasper Peeters wrote:
> Not quite. Most of the incorrect HTML files are incorrect because of
> missing end tags and incorrect nesting. Both can be handled and
> corrected without knowing anything else about the structure. I want to
> avoid cluttering the parser for correct HTML with tricks and guessing
> algorithms. Maybe look at it as a two-stage parser (both could be done
> by PCCTS).
>
> Kasper
What are the chances that HTML on the net will get better as time goes on?
If it's good, then we could feed HTML directly to the parser, and if
it encounters bad HTML it could pass it to a 'less efficient' 'bad HTML
parser'.
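
To make the fallback idea concrete, here is a rough sketch in Python (not
PCCTS) using the standard library's html.parser module. All class and
function names are made up for illustration. The fast path only checks that
tags nest correctly; the slow path is tried only when that check fails, and
it does nothing more than balance tags. It knows nothing about HTML's own
rules for implied end tags, and it drops comments and declarations, so treat
it as a sketch of the two-stage shape rather than a real repair tool.

```python
from html.parser import HTMLParser

# Elements that never take an end tag (illustrative subset, not complete).
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class StrictChecker(HTMLParser):
    """Fast path: raise ValueError on the first nesting error."""

    def __init__(self):
        super().__init__()
        self.stack = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if not self.stack or self.stack[-1] != tag:
            raise ValueError(f"unexpected </{tag}>")
        self.stack.pop()


class Repairer(HTMLParser):
    """Slow path: re-emit the markup, inserting missing end tags."""

    def __init__(self):
        # Keep character references as written so they can be re-emitted.
        super().__init__(convert_charrefs=False)
        self.stack = []
        self.out = []

    def handle_starttag(self, tag, attrs):
        self.out.append(self.get_starttag_text())
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray end tag with no matching start tag: drop it
        # Close everything opened since the matching start tag.
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append(f"</{open_tag}>")
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def finish(self):
        self.close()
        # Close whatever is still open at end of input.
        while self.stack:
            self.out.append(f"</{self.stack.pop()}>")
        return "".join(self.out)


def is_well_nested(html):
    checker = StrictChecker()
    try:
        checker.feed(html)
        checker.close()
    except ValueError:
        return False
    return not checker.stack


def parse(html):
    """Return (markup, used_fallback); the fallback runs only on bad input."""
    if is_well_nested(html):
        return html, False
    repairer = Repairer()
    repairer.feed(html)
    return repairer.finish(), True
```

With this, parse("<p>ok</p>") returns the input untouched and never runs the
repair pass, while parse("<b><i>text</b>") falls back and returns
"<b><i>text</i></b>". The cost of the second pass is paid only by documents
that need it.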