Apache VXQuery will be a standards-compliant XML Query processor implemented in Java. The focus is on the evaluation of queries on large amounts of XML data. Specifically, the goal is to evaluate queries on large collections of relatively small XML documents. To achieve this, queries will be evaluated on a cluster of shared-nothing machines.
Many large collections of relatively small documents exist, such as the EDGAR dataset and the OpenStreetMap dataset. However, we are not aware of any open source XQuery processor available today that can process these datasets in parallel and make the information they contain accessible.