Hyracks supports several basic data types stored in byte arrays. The byte arrays are accessed through objects called pointables. A pointable tracks the bytes of a single value within a larger storage array. Some pointables can also convert the bytes into a desired format, such as a numeric type. The most basic pointable stores three values: the byte array, the starting offset, and the length.
In Apache VXQuery, the TaggedValuePointable is used to read a result from this byte array. The first byte defines the data type and tells us which pointable to use for reading the rest of the data.
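As a rough illustration of the idea above, the following sketch models a pointable as a view over a byte array with a start offset and a length, and reads a tagged value by inspecting its first byte. The class names `SimplePointable` and `TaggedValueSketch`, the tag constants, and the big-endian integer layout are assumptions made for this example; they are not the real Hyracks or VXQuery classes or tag values.

```java
// Hypothetical sketch; not the real Hyracks/VXQuery classes or tag values.
final class SimplePointable {
    byte[] bytes;   // shared storage array
    int start;      // offset of the first byte of this value
    int length;     // number of bytes that belong to this value

    void set(byte[] bytes, int start, int length) {
        this.bytes = bytes;
        this.start = start;
        this.length = length;
    }
}

final class TaggedValueSketch {
    // Illustrative tag values only (assumption).
    static final byte TAG_BOOLEAN = 1;
    static final byte TAG_INT = 2;

    // Reads the tag byte, then decodes the remaining bytes accordingly.
    static Object read(SimplePointable p) {
        byte tag = p.bytes[p.start];
        int valueStart = p.start + 1;
        switch (tag) {
            case TAG_BOOLEAN:
                return p.bytes[valueStart] != 0;
            case TAG_INT:
                // Assumes a 4-byte big-endian integer.
                return ((p.bytes[valueStart] & 0xff) << 24)
                        | ((p.bytes[valueStart + 1] & 0xff) << 16)
                        | ((p.bytes[valueStart + 2] & 0xff) << 8)
                        | (p.bytes[valueStart + 3] & 0xff);
            default:
                throw new IllegalArgumentException("Unknown tag: " + tag);
        }
    }

    public static void main(String[] args) {
        // Two tagged values stored back to back in one array.
        byte[] storage = { TAG_BOOLEAN, 1, TAG_INT, 0, 0, 1, 44 };
        SimplePointable p = new SimplePointable();
        p.set(storage, 0, 2);
        System.out.println(read(p));
        p.set(storage, 2, 5);
        System.out.println(read(p));
    }
}
```

The same `SimplePointable` instance is pointed at two different values in one array, which is the reuse pattern the start offset makes possible.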
Fixed length data types are stored in a field of set size. The following table lists each data type with the Hyracks or custom VXQuery pointable that implements it and its size in bytes.
Data Type | Pointable Name | Data Size (bytes) |
xs:boolean | BooleanPointable | 1 |
xs:byte | BytePointable | 1 |
xs:date | XSDatePointable | 6 |
xs:dateTime | XSDateTimePointable | 12 |
xs:dayTimeDuration | LongPointable | 8 |
xs:decimal | XSDecimalPointable | 9 |
xs:double | DoublePointable | 8 |
xs:duration | XSDurationPointable | 12 |
xs:float | FloatPointable | 4 |
xs:gDay | XSDatePointable | 6 |
xs:gMonth | XSDatePointable | 6 |
xs:gMonthDay | XSDatePointable | 6 |
xs:gYear | XSDatePointable | 6 |
xs:gYearMonth | XSDatePointable | 6 |
xs:int | IntegerPointable | 4 |
xs:integer | LongPointable | 8 |
xs:negativeInteger | LongPointable | 8 |
xs:nonNegativeInteger | LongPointable | 8 |
xs:nonPositiveInteger | LongPointable | 8 |
xs:positiveInteger | LongPointable | 8 |
xs:short | ShortPointable | 2 |
xs:time | XSTimePointable | 8 |
xs:unsignedByte | ShortPointable | 2 |
xs:unsignedInt | LongPointable | 8 |
xs:unsignedLong | LongPointable | 8 |
xs:unsignedShort | IntegerPointable | 4 |
xs:yearMonthDuration | IntegerPointable | 4 |
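To make the fixed sizes concrete, here is a minimal sketch of storing a 4-byte and an 8-byte value side by side in one array and reading them back by offset. It uses `java.nio.ByteBuffer` from the Java standard library rather than the Hyracks pointables themselves; the class name `FixedLengthSketch` is hypothetical, and big-endian byte order is an assumption.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of fixed-length storage; not the Hyracks pointable API.
final class FixedLengthSketch {
    static final int INT_SIZE = 4;   // e.g. xs:int
    static final int LONG_SIZE = 8;  // e.g. xs:integer

    // ByteBuffer uses big-endian byte order by default.
    static void putInt(byte[] bytes, int start, int value) {
        ByteBuffer.wrap(bytes, start, INT_SIZE).putInt(value);
    }

    static int getInt(byte[] bytes, int start) {
        return ByteBuffer.wrap(bytes, start, INT_SIZE).getInt();
    }

    static void putLong(byte[] bytes, int start, long value) {
        ByteBuffer.wrap(bytes, start, LONG_SIZE).putLong(value);
    }

    static long getLong(byte[] bytes, int start) {
        return ByteBuffer.wrap(bytes, start, LONG_SIZE).getLong();
    }

    public static void main(String[] args) {
        // One storage array holding an int followed by a long.
        byte[] storage = new byte[INT_SIZE + LONG_SIZE];
        putInt(storage, 0, 42);
        putLong(storage, INT_SIZE, -7L);
        System.out.println(getInt(storage, 0));
        System.out.println(getLong(storage, INT_SIZE));
    }
}
```

Because every value of a given type occupies the same number of bytes, the reader only needs the start offset to locate it.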
Some information cannot be stored in a fixed length value. The following data types are stored in variable length values. Because the size varies, the first two bytes are used to store the length of the total value in bytes. QName is the one exception to this rule because the QName field has three distinct variable length fields. In this case, three strings are stored one right after another, each with its own two-byte length, which accounts for the six bytes of overhead.
Please note that all strings are stored in UTF-8. The encoded characters range in size from one to three bytes. UTF8StringWriter supports writing a character sequence into the UTF8StringPointable format.
Data Type | Pointable Name | Data Size (bytes) |
xs:anyURI | UTF8StringPointable | 2 + length |
xs:base64Binary | XSBinaryPointable | 2 + length |
xs:hexBinary | XSBinaryPointable | 2 + length |
xs:NOTATION | UTF8StringPointable | 2 + length |
xs:QName | XSQNamePointable | 6 + length |
xs:string | UTF8StringPointable | 2 + length |
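The "2 + length" layout can be sketched with the Java standard library. `DataOutputStream.writeUTF` writes a two-byte length followed by modified UTF-8, in which each Java `char` takes one to three bytes. That this matches the UTF8StringPointable layout byte for byte is an assumption, and the class name `LengthPrefixedStringSketch` is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch of a length-prefixed string; not the Hyracks API.
final class LengthPrefixedStringSketch {
    // Encodes a string as a 2-byte length followed by the encoded characters.
    static byte[] write(String s) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            new DataOutputStream(out).writeUTF(s);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Number of bytes that follow the 2-byte length prefix.
    static int encodedLength(byte[] bytes, int start) {
        return ((bytes[start] & 0xff) << 8) | (bytes[start + 1] & 0xff);
    }

    // Total size of the stored value: prefix plus encoded characters.
    static int totalSize(byte[] bytes, int start) {
        return 2 + encodedLength(bytes, start);
    }

    // Decodes the string that starts at the given offset.
    static String read(byte[] bytes, int start) {
        try {
            ByteArrayInputStream in =
                    new ByteArrayInputStream(bytes, start, totalSize(bytes, start));
            return new DataInputStream(in).readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Three characters that encode to 1, 2, and 3 bytes respectively.
        byte[] encoded = write("a\u00e9\u20ac");
        System.out.println(totalSize(encoded, 0)); // 2 + (1 + 2 + 3) = 8
        System.out.println(read(encoded, 0).length());
    }
}
```

Note that the stored length counts bytes, not characters, so a three-character string can occupy six bytes after the prefix.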
For many string functions, we have used string iterators to traverse the string. The iterator lets the caller ignore the details of byte sizes and character counts. The iterator returns the next character or an end of string value. Iterators can be stacked to transform the string into a desired form.
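The iterator idea might look roughly like the sketch below: one iterator decodes characters from the stored bytes, and a second iterator wraps it to change each character on the way through. The interface `CharIterator`, the classes `Utf8BytesIterator`, `UpperCaseIterator`, and `StringIteratorSketch`, and the `END` marker are all hypothetical names chosen for this example, and the decoder assumes a two-byte length prefix followed by one to three byte sequences.

```java
// Hypothetical sketch of stacked string iterators; not the VXQuery classes.
interface CharIterator {
    int END = -1;   // end of string marker

    void reset();   // restart from the first character

    int next();     // next character, or END
}

// Decodes characters from a length-prefixed byte sequence.
final class Utf8BytesIterator implements CharIterator {
    private final byte[] bytes;
    private final int start;   // first encoded byte, after the length prefix
    private final int end;     // one past the last encoded byte
    private int pos;

    Utf8BytesIterator(byte[] bytes, int offset) {
        int len = ((bytes[offset] & 0xff) << 8) | (bytes[offset + 1] & 0xff);
        this.bytes = bytes;
        this.start = offset + 2;
        this.end = this.start + len;
        this.pos = this.start;
    }

    public void reset() {
        pos = start;
    }

    public int next() {
        if (pos >= end) {
            return END;
        }
        int b0 = bytes[pos] & 0xff;
        if (b0 < 0x80) {                       // one-byte sequence
            pos += 1;
            return b0;
        }
        if ((b0 & 0xe0) == 0xc0) {             // two-byte sequence
            int c = ((b0 & 0x1f) << 6) | (bytes[pos + 1] & 0x3f);
            pos += 2;
            return c;
        }
        // Otherwise assume a three-byte sequence.
        int c = ((b0 & 0x0f) << 12)
                | ((bytes[pos + 1] & 0x3f) << 6)
                | (bytes[pos + 2] & 0x3f);
        pos += 3;
        return c;
    }
}

// Wraps another iterator and upper-cases each character it returns.
final class UpperCaseIterator implements CharIterator {
    private final CharIterator inner;

    UpperCaseIterator(CharIterator inner) {
        this.inner = inner;
    }

    public void reset() {
        inner.reset();
    }

    public int next() {
        int c = inner.next();
        return c == END ? END : Character.toUpperCase(c);
    }
}

final class StringIteratorSketch {
    // Drains an iterator into a Java String.
    static String collect(CharIterator it) {
        StringBuilder sb = new StringBuilder();
        for (int c = it.next(); c != CharIterator.END; c = it.next()) {
            sb.appendCodePoint(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Length prefix 3, then 'a' (1 byte) and U+00E9 (2 bytes).
        byte[] stored = { 0, 3, 0x61, (byte) 0xC3, (byte) 0xA9 };
        CharIterator it = new UpperCaseIterator(new Utf8BytesIterator(stored, 0));
        System.out.println(collect(it).length());
    }
}
```

The consumer only sees characters and the end marker, so further transformations can be added by wrapping another iterator around the stack.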
The array backed value store is a key design element of Hyracks. The object is used to manage an output array. The system creates an array large enough to hold your output and grows it if necessary. The array can be reused and can hold multiple pointable results, because each pointable records its own starting offset.
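A minimal sketch of such a store, under the assumption that it is essentially a growable byte array with a reset operation, is shown below. The class `GrowableByteStore` and its methods are hypothetical and are not the Hyracks API.

```java
import java.util.Arrays;

// Hypothetical sketch of a reusable, growable output array; not the Hyracks API.
final class GrowableByteStore {
    private byte[] data = new byte[8];
    private int length;   // number of bytes currently in use

    // Forgets the current contents so the same array can be reused.
    void reset() {
        length = 0;
    }

    // Appends a value and returns the start offset at which it was stored.
    int append(byte[] src) {
        ensureCapacity(length + src.length);
        int start = length;
        System.arraycopy(src, 0, data, start, src.length);
        length += src.length;
        return start;
    }

    // Grows the backing array, keeping existing bytes, when it is too small.
    private void ensureCapacity(int needed) {
        if (needed > data.length) {
            data = Arrays.copyOf(data, Math.max(needed, data.length * 2));
        }
    }

    byte[] getByteArray() {
        return data;
    }

    int getLength() {
        return length;
    }

    public static void main(String[] args) {
        GrowableByteStore store = new GrowableByteStore();
        int first = store.append(new byte[] { 1, 2, 3 });
        int second = store.append(new byte[] { 4, 5, 6, 7, 8, 9, 10 });
        // Each result is located by its start offset within the shared array.
        System.out.println(first + " " + second + " " + store.getLength());
    }
}
```

In this sketch the backing array may be replaced when it grows, so a caller should fetch the array with `getByteArray()` after writing rather than holding on to an earlier reference.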