XML 2007 Conference
Marriott Copley Place
Boston, Massachusetts, USA
3-5 December 2007

XML and XPath in the Wild

Stewart Taylor (Intel Corporation), Adam Lee (Stanford University)
XML and the Web Berkeley/Clarendon
Chair: Robin LaFontaine (DeltaXML)

This talk outlines our team’s findings on the properties of XML documents and XPath expressions “in the wild”.

As part of an ongoing effort to develop XML processing hardware and software, our team has collected thousands of samples of various XML species from the web to analyze in our lab. Dissecting these critters with various statistical tools, we developed a characterization of “typical” XML documents in each of several familiar species, including RSS and XHTML. Belaboring the metaphor further, we also cloned these species, feeding their statistical characteristics to a custom-designed tool that generates XML documents matching a statistical profile.
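As a rough illustration of the kind of structural statistics such a characterization might start from, here is a minimal Python sketch. It is not the tool described in the talk: the function name `profile_xml`, the chosen metrics (element count, maximum depth, attributes per element, tag frequencies), and the `SAMPLE` document are all assumptions made up for this example.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def profile_xml(xml_text):
    """Collect a few simple structural statistics from one XML document.

    Hypothetical helper for illustration; the metrics are assumptions,
    not the ones used by the authors.
    """
    root = ET.fromstring(xml_text)
    tag_counts = Counter()
    max_depth = 0
    total_attrs = 0
    # Iterative depth-first walk; each stack entry is (element, depth).
    stack = [(root, 1)]
    while stack:
        elem, depth = stack.pop()
        tag_counts[elem.tag] += 1
        total_attrs += len(elem.attrib)
        max_depth = max(max_depth, depth)
        stack.extend((child, depth + 1) for child in elem)
    n_elements = sum(tag_counts.values())
    return {
        "elements": n_elements,
        "max_depth": max_depth,
        "attrs_per_element": total_attrs / n_elements,
        "tag_counts": tag_counts,
    }


# A tiny made-up RSS-like sample, only for demonstration.
SAMPLE = (
    '<rss version="2.0"><channel><title>t</title>'
    "<item><title>a</title></item>"
    "<item><title>b</title></item>"
    "</channel></rss>"
)

if __name__ == "__main__":
    stats = profile_xml(SAMPLE)
    print(stats["elements"], stats["max_depth"])
    print(stats["tag_counts"].most_common(3))
```

Aggregating such per-document numbers over a large sample of one species would give distributions that a generator could, in principle, sample from to produce synthetic documents with a similar shape.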

The talk will also describe a related investigation into XPath, in which we extracted expressions from hundreds of open source projects. We found some illuminating patterns in XPath usage in those projects.

Stewart Taylor

Intel Corporation

Stewart Taylor is a software architect at Intel Corporation. In his many years at Intel, he has worked on numerous software projects in multimedia and information processing, most notably the Intel® Integrated Performance Primitives and the Intel® XML Software Suite. He is the author of Intel® Integrated Performance Primitives and Optimizing Applications for Multi-core Processors.

Adam Lee

Stanford University

Adam has worked on XML usage model frameworks and computer system design. His work includes developing B2B XML content-level secured document sharing models, structural and statistical XML usage models, random XML document generation, embedded real-time data acquisition systems, database security, and scalable clustered database systems.

Adam holds an MS in Electrical Engineering from Stanford University, and a BS in Engineering and a BS in Economics, in Computer Science and Finance, from the School of Engineering and the Wharton School of the University of Pennsylvania. He is currently a Senior Member of Technical Staff in the Server Technology group of Oracle Corporation. Besides engineering work, Adam enjoys music and is a vocalist and a composer of classical music.
