the initial size of intervals in interval-based labelling 
schemes which leave space for insertions. Whereas 
constructing labels in multiplicative labelling schemes 
which can easily cope with insertions and hybrid 
labelling schemes are computationally expensive and 
complex (Haw and Lee, 2011). For these reasons, 
prefix based labelling approaches appear to be more 
suitable for dynamic XML data (Sans and Laurent, 
2008). Therefore, this research concerns prefix 
labelling schemes where the labelling summarizes both 
the position of the node in the tree and also maintains 
the document order during updating.     
The first prefix labelling scheme which 
considered document order was introduced by 
(Tatarinov et al., 2002) and is called Dewey labelling 
scheme. It assigns integer labels based on the Dewey 
decimal classification system for libraries. Although 
this scheme is the most widely used (He, 2015) in 
XML query processing since it easily identifies the 
structural relationship between XML nodes, it does 
not support node insertion. 
Recently many prefix-based XML labelling 
schemes have been proposed in the literature to support 
node insertions amongst them the SCOOTER labelling 
scheme (O’Connor and Roantree, 2012). Unlike 
Dewey, SCOOTER labels are based on quaternary 
strings and represent node order lexicographically 
rather than numerically. However, like all dynamic 
labelling schemes, SCOOTER suffers from what is 
called the overflow problem in certain circumstances 
(Ghaleb and Mohammed, 2013). 
3  ENCODING METHODS 
A key factor for all dynamic XML labelling schemes 
is how their labels are physically encoded, decoded 
and stored in a computer. In the logical representation 
of prefix labelling schemes there is always a delimiter 
“.” but this delimiter is encoded and stored separately 
from the label value (Li et al., 2008). Therefore, the 
logical interpretation of a label in the computer 
immediately affects the label size on disk as well the 
computational cost of encoding/decoding between 
the logical and physical representations (O’Connor 
and Roantree, 2013).  
All existing dynamic labels storage schemes can 
be categorised into four classes: length fields, control 
tokens, separators, and prefix-free codes. 
3.1  Length Field 
Concept of a length field is a field to store the length 
of a node label (as a fixed length bit number) directly 
before the node label value. The length of labels can 
vary widely depending on the node’s position within 
the XML tree. Since XML trees are arbitrarily wide 
and arbitrarily deep there restriction on the number of 
nodes might be inserted later, as a consequence in a 
dynamic XML the number of node insertions is 
limited to the capacity of the fixed length field 
yielding to the overflow problem.  
3.2  Control Tokens   
Control tokens are tokens used to indicate the position 
of a label value within a specific-level interval and 
these tokens are used to determine how the 
subsequent bit sequence of the label value is 
interpreted. An example of control tokens is UTF-8 
(Yergeau, 2003) which is employed by the Dewey 
labelling scheme to encode Dewey labels, where each 
component of Dewey path is encoded in UTF-8 and 
then concatenated together in the same path order 
(Tatarinov et al., 2002). However, this encoding 
method causes overflow when a code value goes 
beyond 2
31
. 
3.3  Separator 
In prefix based labelling schemes a separator “.” is 
usually encoded and stored separately from the label 
itself. In a separator storage scheme a predefined bit 
sequence is reserved as a delimiter and not a part of 
the label value. For instance, the quaternary encoding 
QED (Li and Ling, 2005) and SCOOTER (O’Connor 
and Roantree, 2012) employed their own separator 
storage scheme in which the digit “0” is used only for 
separators and therefore the separator code size 
remain constant no matter how big the label size 
might become. This approach results in slow bit-by-
bit or byte-by-byte comparison operation during 
decoding because of the process needed to recognize 
bit “0” or “00” as a separator rather than the binary 
representation of the code itself. Consequently, it 
degrades query performance. 
3.4  Prefix-free Codes   
Prefix-free codes are based on the (Elias, 1975) 
proposition that a prefix set S is said to be a prefix 
code if and only if no member of S is the beginning 
of another. A prefix-free code approach often requires 
fewer bits to represent a label than a control token 
scheme since the prefix-free codes can be adjusted 
according to the number of members within a prefix 
set (Härder et al., 2007). An example of a dynamic 
labelling scheme that uses prefix-free codes is