Lauren’s Right Knee And XML

    May 2, 2006
    WebProNews Staff

Tim Bray co-edited the XML specification and also wrote Lark, which was the first XML processor. Until recently, Bray had kept Lark under wraps.

Bray recounted the story of Lark in a recent blog post. He developed what became Lark and released it in December 1996.

While Lark was in development, Bray and his fiancée, Lauren, traveled to Australia to get married. An unfortunate knee injury kept Lauren out of action for the remainder of their trip.

“So I broke out my computer and finished the work I’d already started on my XML processor,” Bray wrote, “and decided to call it Lark for Lauren’s Right Knee.”

Bray described how Lark worked, calling it “a pure deterministic finite automaton (DFA) parser, with a little teeny state stack.” Lark worked well enough, and it was very fast. “This was before the time of standardized XML APIs, but Lark had a stream API that influenced SAX, and a DOM-like tree API; both worked just fine,” Bray wrote.
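
For readers unfamiliar with the approach, the following is a minimal sketch of a table-driven DFA parser with a small stack, written in Python for brevity. It is not Lark's code or API: the state names, the transition table, and the parse function are all hypothetical, and it handles only a toy subset of XML (elements and text, with no attributes, entities, or comments). The stack here simply tracks open element names, which may differ from how Lark uses its state stack.

```python
# Toy DFA-driven parser for a tiny XML subset: elements and text only.
# Illustrative sketch; these states and tables are NOT taken from Lark.

TEXT, LT, OPEN_NAME, CLOSE_START, CLOSE_NAME, EMPTY = range(6)


def char_class(ch):
    """Map a character to the input class used by the transition table."""
    if ch in "<>/":
        return ch
    if ch.isalnum() or ch in "_-.:":
        return "name"  # loose approximation of XML name characters
    return "other"


# (state, character class) -> (next state, action)
TRANSITIONS = {
    (TEXT, "<"): (LT, "flush_text"),
    (TEXT, ">"): (TEXT, "keep"),
    (TEXT, "/"): (TEXT, "keep"),
    (TEXT, "name"): (TEXT, "keep"),
    (TEXT, "other"): (TEXT, "keep"),
    (LT, "name"): (OPEN_NAME, "keep"),
    (LT, "/"): (CLOSE_START, "skip"),
    (OPEN_NAME, "name"): (OPEN_NAME, "keep"),
    (OPEN_NAME, ">"): (TEXT, "start"),
    (OPEN_NAME, "/"): (EMPTY, "skip"),
    (EMPTY, ">"): (TEXT, "empty"),
    (CLOSE_START, "name"): (CLOSE_NAME, "keep"),
    (CLOSE_NAME, "name"): (CLOSE_NAME, "keep"),
    (CLOSE_NAME, ">"): (TEXT, "end"),
}


def parse(doc):
    """Return a list of (event, value) tuples, in the style of a stream API."""
    state = TEXT
    buf = []     # characters collected for the current token
    stack = []   # names of the elements that are currently open
    events = []
    for pos, ch in enumerate(doc):
        key = (state, char_class(ch))
        if key not in TRANSITIONS:
            raise ValueError(f"unexpected {ch!r} at offset {pos}")
        state, action = TRANSITIONS[key]
        if action == "keep":
            buf.append(ch)
            continue
        if action == "skip":
            continue
        # Every remaining action completes a token.
        token = "".join(buf)
        buf.clear()
        if action == "flush_text":
            if token:
                events.append(("text", token))
        elif action == "start":
            events.append(("start", token))
            stack.append(token)
        elif action == "empty":
            events.append(("start", token))
            events.append(("end", token))
        elif action == "end":
            if not stack or stack.pop() != token:
                raise ValueError(f"mismatched </{token}> at offset {pos}")
            events.append(("end", token))
    if state != TEXT or stack:
        raise ValueError("document ended inside a tag or an open element")
    if buf:
        events.append(("text", "".join(buf)))
    return events


if __name__ == "__main__":
    for event in parse("<a>hi<b/></a>"):
        print(event)
```

Each input character costs one table lookup and at most one small action, which is the property that tends to make this style of parser fast; the stack is needed only because element nesting cannot be tracked by a finite automaton alone.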

However, Bray never built support for namespaces into Lark, and as technology’s heavy hitters (IBM, Microsoft, etc.) developed XML processors of their own, Lark faded into the background.

Then, O’Reilly author and standards activist Rick Jelliffe made it known on the XML-dev mailing list that he wanted to find a finite state machine for XML. Bray noticed the request and passed Lark along to Jelliffe.

In closing his post, Bray said that, if he were so motivated, he could do even more with Lark:

I bet if I went through and simply removed support for anything coming out of the <!DOCTYPE>, including all entity processing, then discarded the DOM stuff, then added namespace support and SAX and StAX APIs, it would be less than half its current size.

Then if I reworked the I/O, knowing what I know now and stealing some tricks that James Clark uses in expat, I bet it would be the fastest Java XML parser on the planet for XML docs without a DOCTYPE; by a wide margin. It’s hard to beat a DFA.

And it would still be fully XML 1.0 compliant.


David Utter is a staff writer for WebProNews covering technology and business.