The present application is concerned with methods and apparatus for encoding lattice data.
Lattices are directed acyclic graphs comprising a number of nodes interconnected by directed links. An example of a system generating lattice data is a speech recognition application, where the lattice data is utilised to represent large numbers of alternative hypotheses as a detected speech signal is processed. The lattice data generated by a speech recognition system can then be utilised as the input to another application. In the case of lattice data for speech recognition systems, the nodes and links of the lattice are associated with other data identifying words or other speech units such as phonemes, together with probabilities, so that the lattice data records the alternative hypotheses a detected signal might represent.
Where lattice data is generated, two types of lattice data can be identified. Firstly there is data associated with the individual links and nodes of the lattice which identifies the hypotheses the lattice represents. Secondly there is a lattice structure identifying how the nodes of the lattice are interconnected.
Lattice structures are commonly stored by storing link data in the form of start-node end-node data. More compact representations of link data can be achieved by storing links sorted by start node (so that only end node data need be stored) and by using node offsets rather than absolute node numbers. The main problem with this approach is that the resulting encoding does not necessarily bring out any regularities in the lattice. That is to say two visually identical substructures can result in two completely different symbol sequences. This is because the link data and node numbers (or offsets) are independent of the encoding of particular substructures. Compressing the data representing the lattice structure stored in this form is therefore difficult.
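By way of illustration only, the difficulty described above might be sketched in Python as follows. The function name `offset_encode` and the two example link lists are hypothetical and are not taken from any described embodiment; the sketch assumes links are stored sorted by start node with end nodes expressed as offsets from the start node.

```python
from collections import defaultdict

def offset_encode(links):
    """Group links by start node and store each end node as an offset.

    links: iterable of (start_node, end_node) pairs.
    Returns one list of offsets per start node, in start-node order.
    """
    by_start = defaultdict(list)
    for start, end in sorted(links):
        by_start[start].append(end - start)
    return [by_start[s] for s in sorted(by_start)]

# Two diamonds with the same shape (one start node, two middle nodes,
# one end node) whose nodes happen to be numbered differently.
diamond_a = [(0, 1), (0, 2), (1, 3), (2, 3)]
diamond_b = [(4, 5), (4, 7), (5, 9), (7, 9)]

encoding_a = offset_encode(diamond_a)
encoding_b = offset_encode(diamond_b)
print(encoding_a)
print(encoding_b)
```

Although the two substructures have the same shape, the two symbol sequences differ because they depend on how the nodes happen to be numbered, which is why a conventional compressor finds little repetition to exploit.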
Depending upon pruning parameters, lattice structures generated by for example speech recognition systems can be very large. If large lattice structures are to be stored, large amounts of storage are required. If large lattice structures are to be transmitted across a network a large amount of transmission capacity is required. There is therefore a requirement to compress lattice structures efficiently both for storage and transmission.
A representation of lattice structures is therefore needed which has a higher potential for expressing equivalent structures using the same symbol sequence and hence is more susceptible to higher compression using conventional compression techniques.
An embodiment of the present invention will now be described, by way of example only, with reference to the accompanying drawings, in which:
FIG. 1 is a schematic block diagram of a document retrieval system embodying a lattice processor and a regeneration module in accordance with the present invention;
FIG. 2 is a schematic illustration of an exemplary lattice structure of a decoding by a speech recognition system of a signal representing the phrase “Taj Mahal drawing . . . ;”
FIG. 3 is a schematic block diagram of the functional units of the lattice processor of FIG. 1;
FIG. 4 is a flow diagram of an overview of the processing operations performed by the lattice processor of FIG. 1;
FIG. 5 is a flow diagram of the processing operations performed by the embedding module of the lattice processor of FIG. 1;
FIGS. 6A-6E are schematic illustrations of the processing of an exemplary lattice structure by the embedding module of the lattice processor of FIG. 1;
FIG. 7 is a flow diagram of the processing operations performed by the link encoding module of the lattice processor of FIG. 1;
FIGS. 8A-8H are schematic illustrations of the processing of the exemplary embedded lattice structure of FIG. 6E by the link encoding module of the lattice processor of FIG. 1;
FIGS. 9A and B are a flow diagram of the processing operations performed by the shape encoding module of the lattice processor of FIG. 1;
FIGS. 10A, B and C are schematic illustrations of the encoding of the exemplary embedded lattice of FIG. 8H by the encoding module of the lattice processor of FIG. 1;
FIGS. 11A-D are a flow diagram of the processing operations performed by the decoding module of the lattice processor of FIG. 1;
FIG. 12 is a flow diagram of the processing operations performed by the decoding module of the lattice processor of FIG. 1 to process link encodings;
FIGS. 13A-13F are schematic illustrations of the regenerations of a portion of an embedded lattice utilising link encoding data;
FIGS. 14A-14G are schematic illustrations of the regeneration of an exemplary embedded lattice of FIG. 6A by the decoding module of the lattice processor of FIG. 1; and
FIG. 15 is a flow diagram of the processing operations performed by the document storage unit of FIG. 1 in response to receipt of compressed data.
A specific embodiment of the present invention will now be described by way of example only.
Referring to FIG. 1, which is a schematic block diagram of a document retrieval system, a client computer 1 is provided connected to a document storage unit 2 via a network 3 . Also connected to the client computer 1 is a microphone 5 .
In this embodiment, the client computer 1 and document storage unit 2 each comprise programmable computers configured by reading computer instructions from a storage medium such as a disk 8 or by downloading instructions in the form of a signal 9 via the network 3 to become configured into a number of notional functional modules.
In this embodiment, the functional modules for the client computer 1 comprise: a speech recognition module 10 , a lattice processor 11 and a data compression module 12 . The functional modules of the document storage unit 2 comprise: a decompression unit 15 , a regeneration module 17 and a document retrieval unit 19 .
The speech recognition module 10 of the client computer 1 is arranged to receive electrical signals generated by the microphone 5 in response to detecting audio signals for example a phrase spoken by a user. The speech recognition module 10 processes the received electrical signals to generate a speech lattice representative of multiple hypotheses of what phrase the audio signal spoken by a user and detected by the microphone 5 represents.
By way of example, FIG. 2 is a schematic illustration of a lattice structure of a hypothetical combined phoneme and word decoding of the expression “Taj Mahal drawing”. In the figure nodes are represented by circles and the circles are arranged from left to right in order of increasing time. The links in the lattice structure of FIG. 2 are represented by arrows between the nodes. These links are each shown either associated with words represented by upper case letters or phonemes represented by lower case letters. In this embodiment the speech recognition module 10 is arranged to generate and store speech lattice data in a conventional form for example in the form: [node number, list of links from node, time represented by node, other data associated with node]. The link data for the speech lattice is also stored in a conventional manner, for example the form [start node, end node, data associated with link].
A speech lattice generated by the speech recognition module 10 can be very large. To avoid the transmission of large amounts of data via the network 3 , after a speech lattice has been generated the lattice processor 11 generates a compact representation of the lattice structure. As will be described in detail later, the compact representation is such to encode the same lattice substructures using the same symbol sequences and hence is particularly susceptible to being compressed using conventional data compression techniques.
Once a compact representation of the lattice structure has been generated this representation together with the data associated with links and nodes in the original generated speech lattice is then compressed using conventional data compression techniques by the data compression module 12 . The compressed data is then sent via the network 3 to the document storage unit 2 .
The document storage unit 2, using the decompression unit 15 and the regeneration module 17, processes the compressed data to regenerate the original speech lattice. The regenerated speech lattice is then passed to the document retrieval unit 19, which retrieves a document in response to the received speech lattice. This retrieved document is then sent back via the network 3 to the client computer 1.
Structure of Lattice Processor
Referring to FIG. 3 which is a schematic block diagram of the functional components of the lattice processor 11 , in this embodiment the lattice processor 11 comprises: a data store 20 , an embedding module 22 , a link encoding module 24 , a shape encoding module 26 , a decoding module 28 and an output module 29 .
More specifically, in this embodiment the data store 20 is arranged to receive speech lattice data from the speech recognition module 10 . This speech lattice data comprising speech node data 30 and speech link data 32 is then stored in the data store 20 .
In this embodiment the speech node data 30 comprises data defining a speech lattice stored in a conventional form such as in the form of data in the form [a node number, a list of connected nodes, a time represented by the node and other data such as for example transition probabilities associated with nodes]. The speech link data 32 comprises data associating links in the lattice structure defined by the speech node data 30 with data identifying words or phonemes. This speech link data 32 is also stored in a conventional form such as data in the form [start node, end node, other data such as data identifying a word or phone or a transition probability associated with that link].
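Purely as an illustrative sketch, the conventional node and link records described above might be modelled in Python as follows. The class names, field names and the example label are assumptions introduced for illustration and do not appear in the described embodiment.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class SpeechNode:
    """One record of speech node data (hypothetical field names)."""
    number: int                 # node number
    connected: List[int]        # list of connected nodes
    time: float                 # time represented by the node
    other: Dict[str, object] = field(default_factory=dict)  # e.g. probabilities

@dataclass
class SpeechLink:
    """One record of speech link data (hypothetical field names)."""
    start: int                  # start node
    end: int                    # end node
    other: Dict[str, object] = field(default_factory=dict)  # e.g. word or phoneme

# Hypothetical records; the time value and the label are invented.
node = SpeechNode(number=0, connected=[1, 2], time=0.0)
link = SpeechLink(start=0, end=1, other={"label": "TAJ"})
```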
The embedding module 22 is arranged to process the speech node data 30 to generate timing data 34 identifying the different times represented by the nodes of the speech node data 30 . This timing data is then stored in the data store 20 . The embedding module 22 then as will be described in detail later generates a first representation of the lattice structure defined by the speech node data 30 . This representation is in the form of: an embedding table 36 , dummy links and nodes data 38 and cross links data 39 .
In this embodiment the representation of the lattice structure generated by the embedding module 22 is then processed by the link encoding module 24 . More specifically, the link encoding module 24 processes the representation of the lattice structure to cause the lattice structure to be simplified by removing some links and nodes from the lattice structure whilst generating link encoding data 40 which identifies the links and nodes removed from the lattice structure. This link encoding data 40 is also stored in the data store 20 .
After the lattice structure has been simplified by the link encoding module 24, the shape encoding module 26 then processes the simplified lattice structure and generates a final encoding of the lattice structure. In this embodiment, this encoding is in the form of a shape encoding 42, a link list 43 and a node list 44, which are all stored within the data store 20. As will be described in detail later, in contrast to the representation of the original lattice structure by the speech node data 30, the shape encoding 42 and link list 43 comprise data encoding the same lattice structure in a representation which causes equivalent structures to be encoded using the same symbol sequence and hence is susceptible to higher compression using conventional compression techniques.
After the shape encoding module 26 has generated the final encoding of the lattice structure, the decoding module 28 then processes the final encoding to determine an ordering of the links in the encoded lattice structure. This is achieved by the decoding module 28 regenerating the encoded lattice structure and then determining an ordering of the links in the regenerated lattice structure.
When an ordering of links has been determined by the decoding module 28, the output module 29 then outputs to the data compression module 12: the timing data 34, the data identifying dummy nodes and links 38 and the cross links data 39 generated by the embedding module 22; the shape encoding 42 and link list 43 generated by the shape encoding module 26; and the other data associated with nodes and links of the original speech lattice by the speech node data 30 and speech link data 32 respectively. The other data associated with nodes is ordered using the node list 44 generated by the shape encoding module 26, and the other data associated with links is ordered in an order determined by the decoding module 28.
The data output by the output module 29 is then compressed by the data compression module 12 in a conventional manner so that the compressed data can be sent via the network 3 before being decompressed and the speech lattice regenerated by the document storage unit 2 .
A more detailed overview of the processing of the lattice processor 11 will now be described with reference to FIG. 4 which is a flow diagram of the processing performed by the lattice processor 11 .
Overview of Processing Performed by Lattice Processor
Initially (S 4 - 1 ) when a speech lattice represented by speech node data 30 and speech link data 32 is received by the lattice processor 11 and stored in the data store 20 , the embedding module 22 determines a first representation of the lattice structure represented by the speech node data 30 . This initial representation is in the form of timing data 34 , and an embedding table 36 stored in the data store 20 .
More specifically, when speech node data 30 is received by the lattice processor 11 , the embedding module 22 first generates timing data 34 comprising a list of all of the different times represented by the nodes of the speech node data 30 .
As has previously been stated, the speech node data 30 identifies for each node in the speech lattice a time represented by the node. Thus for example in the case of the speech lattice shown in FIG. 2, these different timings are represented by the different left/right positions of the circles representing nodes in the exemplary lattice of FIG. 2.
As the speech node data 30 expressly associates time data with each node, this time data will be repeated where more than one node is associated with the same time. It is therefore possible to reduce the amount of data representing the times represented by the nodes by removing this repetition. More specifically, if a list of the different timings is generated, individual nodes can then be identified as being associated with a particular timing in the list. In this way, where many nodes are associated with the same timing, this timing data is recorded once rather than being explicitly included many times as time data within the speech node data 30 associated with each node.
After timing data 34 for a lattice structure has been determined and stored by the embedding module 22 in the data store 20 , an initial representation of the lattice structure encoded by the speech node data 30 is generated by the embedding module 22 .
In this embodiment, this initial representation comprises for each of the timings hereafter referred to as layers identified by times in the list of timing data 34 , a list of nodes which are associated by the speech node data 30 with those timings, together with copies of the data identifying pairs of start nodes and end nodes for each of the links in the lattice. This data is stored by the embedding module 22 as embedding data and link data within the embedding table 36 in the data store 20 .
FIG. 6A is an exemplary illustration of a lattice structure in which the nodes are represented by circles enclosing numbers and the links between nodes are shown in the form of arrows between the circles. In FIG. 6A the nodes sharing the same timing data are shown in the same horizontal position.
In the case of the lattice of FIG. 6A, the initial representation of the illustrated lattice structure generated by the embedding module 22 would be of the form of the following embedding data stored within the embedding table 36 :
| Layer No | List of Nodes |
| 1 | [0] |
| 2 | [1, 2] |
| 3 | [3, 4, 5] |
| 4 | [6] |
| 5 | [7, 8] |
| 6 | [9, 10] |
Here the layer number indicates the column in which nodes are contained and the list of nodes indicates the nodes in that column. In this example the link data comprises a list of data identifying a start node and an end node for each of the arrows shown in FIG. 6A. Thus link data in the form of links (0, 1), (0, 2), (1, 3), (1, 4), (1, 5) etc. would also be stored in the embedding table 36.
As will be described in detail later the initial representation generated by the embedding module 22 is then processed so as to generate an alternative representation.
In the case of the lattice of FIG. 6A this alternative representation is illustrated by FIG. 6E. Comparing the structures of FIG. 6A and FIG. 6E, in contrast to FIG. 6A where some of the arrows (for example the arrows between nodes 3 and 7 and nodes 4 and 7) link nodes in columns which are not adjacent to one another, in FIG. 6E additional dummy nodes and dummy links have been added to the representation so that each node is connected to one or more nodes in an adjacent column. In FIG. 6E these additional nodes are shown by squares appearing in the lattice and the added links are illustrated by dotted lines.
Additionally, compared with FIG. 6A, in FIG. 6E an extra layer has been added to the lattice so that the final layer contains a single node, shown in FIG. 6E as a dummy node D4. The structure has also been altered by rearranging the order of nodes within columns and removing a link (the link between nodes 5 and 6) so that the resultant structure shown in FIG. 6E contains no links which cross one another in the manner shown by links 2-3 and 1-5 in FIG. 6A. That is to say, the links and nodes in the representation identify a planar graph, i.e. a graph which can be embedded in a plane without any cross links.
Data identifying the nodes and links which are added or removed as a result of processing by the embedding module 22 are stored by the embedding module 22 in the data store 20 as dummy links/nodes data 38 and cross links data 39 respectively.
Once a revised representation of the lattice structure has been determined by the embedding module 22 , the lattice processor 11 then proceeds to generate an encoding of this revised structure. This encoding is performed in a two stage process.
Initially (S 4 - 2 ) the link encoding module 24 causes some portions of the structure to be encoded in the form of data associated with links between nodes. Specifically as will be described in detail later, the link encoding module 24 encodes portions of the revised lattice structure which correspond to either a number of links connected to one another in series or by way of a number of parallel paths which are not connected to any other nodes in the lattice structure.
Thus for example in the case of the lattice structure illustrated by FIG. 6E, an encoding of the links and nodes between node 6 and node D 4 via nodes 8 and 10 of FIG. 6E is determined, as is an encoding of the parallel paths between node 1 and node 7 via nodes 4 and 5 of FIG. 6E.
FIG. 8H is an example of the simplified lattice structure corresponding to the lattice of FIG. 6E after linear and parallel paths have been encoded. In FIG. 8H, nodes are illustrated by circles and links between nodes are indicated by arrows. Associated with each of the links shown in FIG. 8H is a code which is written adjacent to the arrow. These codes are generated by the link encoding module 24 and stored as link encoding data 40 in the data store 20 .
After the link encoding module 24 has generated a simplified lattice structure and link encoding data 40, the simplified lattice structure will, as is illustrated by the lattice structure of FIG. 8H, be a planar graph having no cross-links, the links and nodes of which define a number of interconnected areas. In the case of FIG. 8H these areas are, going from left to right: a diamond formed by the nodes 0, 2, 3 and 1 and their associated links; a rhombus formed by the nodes 2, 6, 7 and 3 and their associated links; a triangle formed by the nodes 1, 3 and 7 and their associated links; and a further triangle formed by the nodes 6, 7 and D4 and the links between those nodes.
The shape encoding module 26 then (S 4 - 3 ) proceeds to encode the lattice structure originally processed by the link encoding module 24 in the form of 3 lists of symbols.
In this embodiment these lists comprise: a shape encoding 42 identifying the manner in which the areas defined by the simplified lattice generated by the link encoding module 24 link to one another; a link list 43 being an ordered concatenation of the link encoding data 40 associated with the links in the simplified lattice; and a node list 44 identifying the order in which the node numbers appear in the simplified lattice and associated link encoding data.
As will become apparent when the generation of these lists are described in detail, these lists encode the lattice structure processed by the link encoding module 24 in a manner which causes the same sub structures to be encoded by the same symbol sequences and hence generates an encoding which is highly susceptible for data compression using conventional data compression techniques.
After these three lists have been created by the shape encoding module 26 and stored in the data store 20 , the shape encoding 42 , link list 43 and node list 44 are then (S 4 - 4 ) processed by the decoding module 28 . The result of the decoding by the decoding module 28 is to generate a further representation of the lattice structure represented by the speech node data 30 stored in the embedding table 36 of the data store 20 . The process of generating a lattice from the shape encoding 42 , link list 43 and node list 44 generated for the lattice structure of FIG. 6A will be described in detail later.
The final representation determined by decoding the shape encoding 42 , link list 43 and node list 44 for the structure of FIG. 6A is shown in FIG. 14G. Comparing the illustration of the lattice structures illustrated by FIG. 14G and FIG. 6A, it is visually possible to establish that the interconnections between links shown by FIGS. 6A and 14G are identical and the nodes are shown in the same columns, although the ordering of the representations of the nodes within the columns differs between the two illustrations. However, as the ordering within columns is not dependent upon the speech node data 30 , although the embedded lattices of FIG. 6A and FIG. 14G are not identical, they are representative of the same lattice structure.
After the actual embedded lattice structure encoded by the data stored in the data store 20 has been determined by the decoding module 28, the output module 29 then (S4-5) passes to the data compression module 12 the timing data 34, the list of dummy nodes and dummy links 38, the cross links data 39, and the shape encoding 42 and link list 43 which represent this final embedded representation as decoded by the decoding module 28.
Additionally, as will be described later, the output module 29 also passes to the data compression module 12 ordered lists of data associated with nodes and links by the speech node data 30 and speech link data 32.
The data associated with nodes by the speech node data 30 is ordered in the order corresponding to the node list 44 stored in the data store 20. The data associated with links by the speech link data 32 is ordered based on the final lattice structure generated by the decoding module 28 for the speech lattice. For example in the case of the speech lattice of FIG. 14G, the link data might be ordered corresponding to the columns of links appearing from top to bottom and then from left to right. Thus, for example, initially the data for link 0-2 would be passed by the output module 29 to the data compression module 12, followed by the link data for the following links in the order 0-1, 2-6, 2-3, 1-3, 1-4, 1-5 etc., corresponding to the order of links in FIG. 14G.
As previously stated, the shape encoding data 42 and link list 43 for encoding the lattice structure expresses equivalent structures using exactly the same symbol sequence. This means that where many of the lattice substructures are similar, the data passed by the lattice processor 11 to the data compression module 12 can be efficiently compressed using conventional compression techniques so as to reduce the amount of data which needs to be transmitted via the network 3 .
When the compressed data is decompressed by the decompression unit 15 , the decompressed data is utilised to regenerate the original speech lattice which is then used to cause the document retrieval unit 19 to access a particular document stored in the document storage unit 2 .
The detailed processing of data by the embedding module 22, the link encoding module 24, the shape encoding module 26 and the decoding module 28 will now each be considered in turn.
Processing Performed by the Embedding Module
The processing of the embedding module 22 of the lattice processor 11 will now be described in detail with reference to FIG. 5 and FIGS. 6A-E.
Initially (S 5 - 1 ) when a speech lattice in the form of speech node data 30 and speech link data 32 is received and stored in the data store 20 by the lattice processor 11 , the embedding module 22 then proceeds to generate timing data 34 identifying the individual times represented by nodes in the speech node data 30 .
More specifically, in this embodiment where speech node data 30 is stored in the data store 20 in the form of: [Node number, list of connecting nodes, timing of node, other data associated with node], the embedding module 22 initially processes the speech node data 30 to generate a list of all the timings represented by the nodes of the speech node data 30 . The embedding module 22 then processes the generated list to create an ordered list of timings. Duplicate timings represented in the list are then removed so that timing data 34 comprising a list of different times represented by different nodes by the speech node data 30 ordered in the order of the different timings is created. This list is then stored as timing data 34 in the data store 20 of the lattice processor 11 .
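The generation of the timing data described above might, by way of example only, be sketched in Python as follows. The function name `distinct_timings` and the time values are hypothetical; the sketch assumes each node's time is available as a mapping from node number to time.

```python
def distinct_timings(node_times):
    """Return the ordered list of different times represented by the nodes.

    node_times: mapping of node number -> time represented by that node.
    Duplicate times are removed and the result is sorted in time order.
    """
    return sorted(set(node_times.values()))

# Hypothetical times: several nodes share the same time.
node_times = {0: 0.00, 1: 0.12, 2: 0.12, 3: 0.25, 4: 0.25, 5: 0.25}

timing_data = distinct_timings(node_times)

# Each node can then refer to a position in the list instead of
# repeating the time value itself.
time_index = {node: timing_data.index(t) for node, t in node_times.items()}
```

In this sketch six time values are reduced to a list of three, with each node carrying only an index into that list.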
After a list of timings has been stored in the form of timing data 34 in the data store 20 , the embedding module 22 then (S 5 - 2 ) generates initial embedding data for the lattice structure represented by the speech node data 30 stored in the data store 20 .
Specifically the embedding module 22 determines for each of the timings hereinafter referred to as layers represented by the timing data 34 a list of nodes which have time data corresponding to the identified times. These lists are stored as embedding data in the embedding table 36 one for each time identified by the timing data list 34 . Additionally, for each of the links between nodes identified by the speech node data 30 , the embedding module 22 then generates and stores link data in the embedding table 36 comprising data identifying a start node and an end node for each link.
Thus, for example, in the case of the lattice structure illustrated by FIG. 6A, where nodes are illustrated by circles containing node numbers, links are illustrated by arrows connecting the circles, and nodes associated with the same timing (and hence in the same layer) are shown above one another in vertical columns, the following data would be stored in the embedding table 36.
| Embedding Data | |
| Layer No | List of Nodes |
| 1 | [0] |
| 2 | [1, 2] |
| 3 | [3, 4, 5] |
| 4 | [6] |
| 5 | [7, 8] |
| 6 | [9, 10] |
It will be appreciated that together the timing data 34 and embedding table 36 encode all of the lattice structure of the speech lattice received from the speech recognition module 10. Further, the timing data 34 and embedding table 36 implicitly record the data identifying the timings associated with each node.
The direction of the links between nodes is also implicitly encoded in the data within the embedding table 36. This is because the lists of nodes identified by different layers, and hence different timings, are ordered in timing order, so that nodes representing earlier times appear in the lists for earlier layer numbers whilst nodes representing later times appear in the lists for later layers in the embedding table 36. It is therefore implicit that where a link is recorded between two nodes, the direction of that link, as shown by the arrows in FIG. 6A, will be from the node in the layer for the earlier timing to the node in the layer for the later timing.
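As a minimal sketch only, the construction of the embedding data and the implicit link direction might be expressed in Python as follows. The function names `build_layers` and `link_direction` are hypothetical; the node-to-layer assignment is taken from the embedding data table given above for FIG. 6A.

```python
def build_layers(node_layer):
    """Return, for each layer in order, the list of nodes in that layer.

    node_layer: mapping of node number -> layer number (1-based).
    """
    layers = [[] for _ in range(max(node_layer.values()))]
    for node in sorted(node_layer):
        layers[node_layer[node] - 1].append(node)
    return layers

def link_direction(a, b, node_layer):
    """Orient a link from the earlier layer to the later layer."""
    return (a, b) if node_layer[a] < node_layer[b] else (b, a)

# Layer numbers as listed in the embedding data table for FIG. 6A.
node_layer = {0: 1, 1: 2, 2: 2, 3: 3, 4: 3, 5: 3,
              6: 4, 7: 5, 8: 5, 9: 6, 10: 6}

layers = build_layers(node_layer)
# The direction is recovered whichever way round the pair is recorded.
oriented = link_direction(6, 2, node_layer)
```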
After the initial embedding for a lattice structure has been determined, the embedding module 22 then (S 5 - 3 ) proceeds to modify the initial stored data to add a number of additional nodes to the embedding table 36 and update the link data so that each link only connects from nodes of one layer to nodes in an adjacent layer and the list of nodes of the first and last layers comprise a single node.
Specifically the embedding module 22 proceeds to consider each of the links, stored in the embedding table 36 in turn. For each of the links the layer number of the start node is compared with the layer number of the end node of the link. Wherever the layer numbers for the start node and end node are separated by a single number, no further action is taken. Wherever a start node is linked to an end node separated by more than one layer, a dummy node is added to each of the intervening layers. The link data is then updated to replace the link data being processed with a series of items of link data identifying links between the start node and the end node in each layer via each of the added dummy nodes.
Thus, for example, in the case of the embedding illustrated by FIG. 6A, each of the links would be considered in turn. When the link between node 2 and node 6 was processed, node 2 would be identified as being in the second layer and node 6 would be identified in the fourth layer. The embedding module 22 would then proceed to add a dummy node in the list of nodes for the third layer and replace the link between node 2 and node 6 by two links, a link between node 2 and the added dummy node and a link between the added dummy node and node 6 .
More specifically, in the case of the lattice shown in FIG. 6A, the links between 2 and 6 , 3 and 7 and 4 and 7 would be identified as connecting nodes in layers separated by more than one layer. The embedding data stored in the embedding table 36 would then be updated as follows:
| Layer No | List of Nodes |
| 1 | [0] |
| 2 | [1, 2] |
| 3 | [3, 4, 5, D1] |
| 4 | [D2, D3, 6] |
| 5 | [7, 8] |
| 6 | [9, 10] |
Data identifying these dummy nodes is then stored by the embedding module 22 in the dummy links/nodes data 38 in the data store 20 .
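By way of example only, the addition of dummy nodes for links spanning more than one layer might be sketched in Python as follows. The function name `add_dummy_nodes` is hypothetical, and only the links expressly named in the text are used. In this sketch dummy nodes are appended to the end of each layer's list, so the order within layer 4 differs from the table above; the order within a layer is in any event revised by the later re-ordering step.

```python
def add_dummy_nodes(layers, links):
    """Replace each link spanning several layers by single-layer hops.

    layers: list of node lists, index 0 holding layer 1.
    links: list of (start, end) pairs running from earlier to later layers.
    Returns (new_layers, new_links, dummies).
    """
    layers = [list(nodes) for nodes in layers]
    layer_of = {n: i for i, nodes in enumerate(layers) for n in nodes}
    new_links, dummies = [], []
    for start, end in links:
        prev = start
        # One dummy node for every layer strictly between start and end.
        for i in range(layer_of[start] + 1, layer_of[end]):
            dummy = f"D{len(dummies) + 1}"
            dummies.append(dummy)
            layers[i].append(dummy)
            new_links.append((prev, dummy))
            prev = dummy
        new_links.append((prev, end))
    return layers, new_links, dummies

layers = [[0], [1, 2], [3, 4, 5], [6], [7, 8], [9, 10]]
# The link 0-1 joins adjacent layers; 2-6, 3-7 and 4-7 each skip a layer.
links = [(0, 1), (2, 6), (3, 7), (4, 7)]
new_layers, new_links, dummies = add_dummy_nodes(layers, links)
```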
The embedding module 22 then considers the list of nodes for the first and last layers of the data stored in the embedding table 36 . If the first layer contains more than one node an additional 0 layer is added containing a dummy node and dummy links are added between this initial dummy node and the nodes in the first layer.
The list of nodes of the final layer is then considered. Where more than one node appears in the final layer an additional layer is added to the embedding data containing a single dummy node and links to the added dummy node from each of the nodes in the final layer are added to the list of links. Data identifying all the added dummy nodes is then added to the dummy link/node data 38 within the data store 20 .
Thus, in the case of the example of FIG. 6A as the first layer contains a single node, node 0 , no dummy nodes are added. However, the final layer initially contains the nodes 9 and 10 . An additional final layer is then added containing a single dummy node. All of the nodes in the previously final layer in this case nodes 9 and 10 are then recorded as being connected to this final node in the extra layer.
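As a rough sketch of this step, assuming the same dictionary layout for layers as a convenience (the function name `add_terminal_dummies` and the `next_dummy` counter are hypothetical), the first and last layers could be checked like this:

```python
def add_terminal_dummies(layers, links, next_dummy):
    """Ensure the lattice has a single start node and a single end node.

    layers:     dict mapping layer number -> list of node names (assumed layout)
    links:      list of (start_node, end_node) pairs
    next_dummy: number to use for the next dummy node label (assumption)
    """
    layers = {k: list(v) for k, v in layers.items()}
    links = list(links)
    added = []
    first, last = min(layers), max(layers)
    if len(layers[first]) > 1:
        dummy = f"D{next_dummy}"
        next_dummy += 1
        layers[first - 1] = [dummy]  # extra layer before the first layer
        links = [(dummy, n) for n in layers[first]] + links
        added.append(dummy)
    if len(layers[last]) > 1:
        dummy = f"D{next_dummy}"
        layers[last + 1] = [dummy]  # extra layer after the last layer
        links += [(n, dummy) for n in layers[last]]
        added.append(dummy)
    return layers, links, added


layers = {1: [0], 2: [1, 2], 3: [3, 4, 5, "D1"],
          4: ["D2", "D3", 6], 5: [7, 8], 6: [9, 10]}
new_layers, new_links, added = add_terminal_dummies(layers, [(8, 10)], 4)
print(new_layers[7], new_links, added)
```

For the layers of FIG. 6A the first layer is left alone, and a seventh layer holding D4 is created with links from nodes 9 and 10.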
FIG. 6B is a schematic illustration of the lattice of FIG. 6A after dummy nodes have been added to the lattice. The dummy nodes are shown in FIG. 6B by squares labelled D1, D2, D3 and D4. The extra links added to the lattice are shown in FIG. 6B by dotted lines. As can be seen from FIG. 6B, none of the arrows appearing in the Figure crosses more than one column, and the entire lattice has a single start node, node 0, on the left of the illustration and a single end node, labelled D4, on the right of the illustration.
After dummy nodes have been added to the lattice to ensure that links connect nodes of adjacent columns, and dummy nodes have been added to ensure that the lattice contains a single start node and a single end node, the embedding data in the embedding table 36 is then processed (S 5 - 4 ) to re-order the lists of nodes associated with each layer so that the lattice embedding represented by the embedding data in the embedding table 36 contains a limited number of links which cross one another.
In this embodiment the embedding module 22 processes the embedding data stored in the embedding table 36 using conventional techniques such as are described by Roberto Tamassia, “Graph Drawing”, in “CRC Handbook of Discrete and Computational Geometry”, CRC Press, ed. Jacob Goodman and Joseph O'Rourke, 1997, which is hereby incorporated by reference.
More specifically, starting with the second layer, each of the nodes in the layer is initially assigned an order number corresponding to the ordering of the nodes in that layer. Thus in the case of the nodes of FIG. 6B, Node 1 would be assigned an order 1 and node 2 would be assigned an order 2 . The nodes in the next layer are then ordered on the basis of the average order value for the nodes in the previous layer linked to those nodes.
Thus in the case of FIG. 6B, if nodes 1 and 2 are given values 1 and 2 respectively, node 3, which is linked to both node 1 and node 2, would be assigned a value of (1+2)/2=1.5. Nodes 4 and 5, which are only connected to node 1, would each be assigned a value of 1. Node D1, which is only connected to node 2, would be assigned a value of 2.
The embedding module 22 then proceeds to re-order the nodes in the next layer so that the nodes having the lowest order numbers associated with them appear at the head of the list and nodes associated with increasingly higher numbers appear later in the list.
Thus in the case of FIG. 6B, the list representing the column for the third layer would be re-ordered from the list [3, 4, 5, D1] to become the list [4, 5, 3, D1]. The next layer is then re-ordered in a similar way until the final node in the final layer is reached. The re-ordering of the nodes in each layer is then repeated starting from the second to last layer of nodes and re-ordering the immediately previous layer on the basis of order values determined from average values of order numbers identified by links from that last layer. This processing is repeated for the next previous layer until the first layer is reached.
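One forward step of this averaging can be illustrated with the short sketch below. The function name `reorder_by_average` is hypothetical. Placing nodes that have no link from the previous layer first, and keeping tied nodes in their existing order, are assumptions, since the description does not specify either.

```python
def reorder_by_average(prev_layer, layer, links):
    """Re-order `layer` by the average position of linked nodes in `prev_layer`.

    prev_layer, layer: lists of node names in their current order
    links:             list of (start_node, end_node) pairs
    """
    # Order numbers start at 1, as in the worked example.
    position = {node: i + 1 for i, node in enumerate(prev_layer)}

    def average_order(node):
        linked = [position[s] for s, e in links if e == node and s in position]
        # Assumption: a node with no link from the previous layer sorts first.
        return sum(linked) / len(linked) if linked else 0.0

    # sorted() is stable, so nodes with equal averages keep their order.
    return sorted(layer, key=average_order)


links = [(1, 3), (2, 3), (1, 4), (1, 5), (2, "D1")]
print(reorder_by_average([1, 2], [3, 4, 5, "D1"], links))
```

With the links named in the example (node 3 linked to nodes 1 and 2, nodes 4 and 5 linked to node 1, D1 linked to node 2) the averages are 1.5, 1, 1 and 2, which gives the order 4, 5, 3, D1. The backward pass would apply the same function with the roles of start and end nodes exchanged.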
FIG. 6C is an exemplary illustration of the lattice of FIG. 6B after the nodes have been reordered to minimise crossing links. As can be seen from FIGS. 6B and 6C, whereas in FIG. 6B the links between node 2 and node 3 and between node 1 and nodes 4 and 5 all cross one another, in FIG. 6C only a single pair of crossing links remains, namely the link between node 3 and node D2 and the link between node 5 and node 6.
After the embedding module 22 has reordered the nodes in the layers of the embedding data in the embedding table 36 , the embedding module 22 then (S 5 - 5 ) considers each of the links in turn and removes crossing links from the revised lattice and stores data identifying the removed links in the cross link data 39 stored within the data store 20 .
More specifically, each link in each layer is considered in turn. Whenever the link is determined to cross another link, a count of the number of links crossed by that link is incremented by one. When counts have been generated for all the links, each set of links between adjacent layers is considered in turn. Where any crossing links exist, one link having the greatest count associated with it is removed and data identifying this link is stored in the cross link data 39. The counts of crossing links in that layer are then updated, and links continue to be removed until no crossing links remain. In this way the crossing links which are removed and stored as cross link data 39, separately from the embedding in the embedding table 36, are selected so as to correspond to the links which cross the greatest number of other links. That is to say, where one link crosses two other links, that single link will be removed rather than the two links which it crosses.
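The removal of crossing links between one pair of adjacent layers might be sketched as follows. The function name `remove_crossings` and the node names in the usage example are hypothetical. Where several links share the greatest count, this sketch simply removes the first one found, which is an assumption made for the illustration.

```python
def remove_crossings(upper, lower, links):
    """Remove links between two adjacent layers until none cross.

    upper, lower: ordered lists of node names in the two layers
    links:        list of (upper_node, lower_node) pairs
    Returns (kept_links, removed_links).
    """
    up = {node: i for i, node in enumerate(upper)}
    lo = {node: i for i, node in enumerate(lower)}

    def crosses(p, q):
        # Two links cross when their end points appear in opposite orders.
        return (up[p[0]] - up[q[0]]) * (lo[p[1]] - lo[q[1]]) < 0

    kept, removed = list(links), []
    while True:
        counts = [sum(crosses(p, q) for q in kept) for p in kept]
        if not counts or max(counts) == 0:
            break
        # Remove a link that crosses the greatest number of other links.
        removed.append(kept.pop(counts.index(max(counts))))
    return kept, removed


kept, removed = remove_crossings(
    ["a", "b", "c"], ["x", "y", "z"], [("a", "z"), ("b", "x"), ("c", "y")]
)
print(kept, removed)
```

In the usage example the link from a to z crosses both of the other links, so it alone is removed and the two links it crossed are kept.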
Referring to FIG. 6C, after the nodes in the layers are re-ordered to minimise the number of crossing links the embedding data for the example lattice of FIG. 6A will be reordered so as to be in the order represented by FIG. 6C. In FIG. 6C there is a single pair of crossing links being the link between node 3 and D 2 and the link between node 5 and 6 . As both of these links cross only one other link, one of these links is randomly selected and stored in the cross-link data 39 .
FIG. 6D is an illustrative example of the lattice of FIG. 6C after the link between node 5 and node 6 has been removed. As can be seen from FIG. 6D each of the arrows representing links connects two nodes in two adjacent columns and none of the links represented by the arrows crosses any of the other links represented by arrows.
After sufficient links have been removed from the link data in the embedding table 36 and stored as cross link data 39 , so that the lattice represented by the data in the embedding table 36 contains no cross-links, the embedding module 22 then (S 5 - 6 ) considers each node in turn except for the nodes in the first and last layers and determines whether that node is connected to nodes in both the previous and subsequent layers.
That is to say, the embedding module 22 considers each node in turn and checks the link data within the embedding table 36 to establish whether the node being processed appears as a start node in at least one link and as an end node in at least one link stored as link data within the embedding table 36 for the lattice. Wherever a node that fails this check is identified, the embedding module 22 then adds link data to connect that node to other nodes in the lattice.
More specifically, each node except the node in the last layer is considered in turn. If a node is determined not to be represented as a start node by link data within the embedding table 36 the nodes adjacent to the unlinked node in the list of nodes in the layer are then considered.
Thus in the case of the lattice of FIG. 6D node 5 which is not a start node for any link would be identified and the two adjacent nodes, nodes 3 and 4 would be considered. The nodes connected by links having the adjacent nodes as start nodes are then identified. In the case of FIG. 6D the nodes in question will be dummy nodes D 2 and D 3 .
If both of these nodes are dummy nodes, as they are in the case of FIG. 6D, a new dummy node is added between these nodes in the embedding data for the layer including the two dummy nodes, and a link between the unconnected node and the new dummy node is added.
The embedding module 22 then considers the nodes linked to the two dummy nodes. If these are also dummy nodes, an extra dummy node in the next layer is added between the two identified dummy nodes in that layer and a link between the most recently added dummy node and the new dummy node is added to the embedding data in the embedding table.
Eventually, in a layer connected to one of the neighbours of an unconnected node, a node which is not a dummy node will be reached. When this layer is reached, the embedding module 22 then adds a link to the path connecting the unconnected node in the previous layer to the real node in the next layer.
After an unconnected node has been connected to the lattice where the unconnected node did not appear as a start node, similar processing is undertaken to add additional nodes and links to nodes which are identified as not being an end node apart from the single node appearing in the first layer. Whenever nodes and links are added to the data stored in the embedding table 36 data identifying the dummy nodes and dummy links added is stored as dummy links/node data 38 in the data store 20 .
FIG. 6E is a schematic illustration of the lattice of FIG. 6A after the lattice of FIG. 6A has been processed by the embedding module 22 . As can be seen from FIG. 6E in the processed lattice all links connect between nodes of adjacent layers and none of the links in the lattice of FIG. 6E crosses any other link.
This means that the paths connecting nodes implicitly identify the layer the nodes are associated with. This is because nodes in the first layer will be identified with paths from the first node one node long, whereas nodes in the second layer will be connected to the node in the first layer by a path two nodes long and in general nodes in the nth layer will be connected to the first node by a path containing n nodes.
At this stage of processing the lattice of FIG. 6A the following embedding data would be stored in the embedding table representing the structure of FIG. 6E together with link data identifying each of the links in the structure.
| Embedding Data | |
| Layer No | Nodes In Layer |
| 1 | [0] |
| 2 | [2, 1] |
| 3 | [D1, 3, 5, 4] |
| 4 | [6, D2, D5, D3] |
| 5 | [8, 7] |
| 6 | [10, 9] |
| 7 | [D4] |
The embedding module 22 then orders (S 5 - 7 ) the link data stored in the embedding table. Specifically, the embedding module 22 initially orders the link data in the order in which the start nodes of the links appear in the embedding data for the different layers. Links having the same start node are then ordered by the order in which their end nodes appear in the embedding data.
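This two-level ordering amounts to sorting on a pair of ranks, as the following sketch suggests. The function name `order_links` and the dictionary layout are assumptions, and the usage example covers only the second and third layers together with the links between them that are named earlier in the description, so it is not the complete link data for FIG. 6E.

```python
def order_links(layers, links):
    """Order links by start node position, then by end node position.

    layers: dict mapping layer number -> ordered list of node names
    links:  list of (start_node, end_node) pairs
    """
    rank = {}
    for layer_no in sorted(layers):
        for node in layers[layer_no]:
            rank[node] = len(rank)  # position in a layer-by-layer reading
    return sorted(links, key=lambda link: (rank[link[0]], rank[link[1]]))


layers = {2: [2, 1], 3: ["D1", 3, 5, 4]}
links = [(1, 3), (2, 3), (1, 4), (1, 5), (2, "D1")]
print(order_links(layers, links))
```

Because node 2 now precedes node 1 in the second layer, the links starting at node 2 come first, and within each group the end nodes follow the order D1, 3, 5, 4 of the third layer.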
Thus in the case of the links of FIG. 6E, the following ordered link data would be stored in the embedding table 36 :
Additionally stored within the data store 20 will be the following data identifying the dummy nodes added to the lattice of FIG. 6A: Dummy nodes: D 1 , D 2 , D 3 , D 4 , D 5 .
The following data would be stored as dummy link data: ( 5 , 7 ) and cross link data: ( 5 , 6 ).
This embedded lattice structure from which cross links and links crossing adjacent layers have been removed is then passed for processing by the link encoding module 24 which proceeds to simplify this generated lattice structure as will now be described.
Processing by the Link Encoding Module
The processing performed by the link encoding module 24 of the lattice processor 11 will now be described in detail with reference to FIGS. 7 and 8 A-H.
The link encoding module 24 processes the embedding table 36 generated by the embedding module 22 so as to simplify the lattice structure represented by the data in that table. Specifically, the link encoding module 24 encodes portions of the lattice structure generated by the embedding module 22 where those portions identify linear paths between nodes, that is, paths through nodes which are each connected to only two other nodes, and where multiple linear paths connect the same two nodes.
Referring to FIG. 7, when the embedding module 22 has generated and stored embedding data and link data in the embedding table 36, the link encoding module 24 initially (S 7 - 1 ) generates initial link encoding data 40.
Specifically, for each of the links identified by link data in the embedding table 36 , data comprising a single symbol, in this embodiment the letter E is stored as link encoding 40 within the data store 20 as link encoding for the identified link.
FIG. 8A is a schematic illustration of the lattice of FIG. 6E where each of the links is associated with link encoding 40 comprising a single symbol E. As will be described in detail later, the subsequent processing of the link encoding module 24 modifies the link encoding 40 and the data stored within the embedding table 36 so as to remove links and nodes from the lattice represented by the data within the embedding table 36, whilst modifying the link encoding 40 to identify where in the lattice the nodes and links have been removed.
After the initial link encoding data 40 has been stored, the link encoding module 24 then selects a first node for processing (S 7 - 2 ). In this embodiment the first node for processing is selected from the head of the list identifying nodes in the second to last layer of the lattice structure defined by the embedding data in the embedding table 36 .
In the case of the exemplary lattice of FIG. 8A, the node selected on the second to last layer would be node 10 .
After an initial node has been selected, the link encoding module 24 then (S 7 - 3 ) selects an initial link for processing. In this embodiment the links from the selected node are selected in the order in which the links appear in the link data stored in the embedding table having that node as a start node. In the present example of FIG. 8A as there is a single link from node 10 to node D 4 , it is this link which is first selected for processing.
Once a link has been selected for processing, the link encoding module 24 then (S 7 - 4 ) considers the end node of the link being processed. Specifically the link encoding module 24 determines whether the end node identified by the link appears only once as a start node and only once as an end node in the list of links in the embedding data stored in the embedding table 36 .
If this is the case then there is a single path which passes through that node and the link encoding module 24 then proceeds to modify the link encoding data 40 and the data within the embedding table 36 to encode the presence of that node on that single path (S 7 - 5 ).
Specifically where the link being processed comprises link data (N 1 ,N 2 ) where N 1 identifies the start node and N 2 identifies the end node of the link, and there is another single link (N 2 , N 3 ) having N 2 as a start node the link encoding module 24 proceeds to delete these items of link data from the embedding table 36 and replaces them with a single item of link data (N 1 , N 3 ), where the new link is included in the ordered link data in the position of the deleted link (N 1 ,N 2 ).
The link encoding module 24 then utilises the link encoding data for the two links which have been deleted to generate link encoding data for the new link.
Specifically the link encoding module 24 appends the link encoding data for the link (N 1 ,N 2 ) to the end of the encoding data for the link (N 2 ,N 3 ). The link encoding module 24 then adds to the head of the newly generated list of symbols either the letter D where the node N 2 corresponds to a dummy node as identified by data within the dummy nodes 38 or a symbol P and the number of the node identifying that the numbered node is being pushed from the representation in the embedding table. The link encoding module 24 then deletes the identified node N 2 from the list of nodes in the embedding data stored within the embedding table 36 .
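The merging of two links through a node lying on a single path can be sketched as below. This is an illustration under stated assumptions: the function name `contract_node`, the storage of each link as a (start, end, code) tuple and the rendering of the push symbol as a string such as "P10" are all hypothetical, and the removal of the node from the list of nodes for its layer is omitted.

```python
def contract_node(links, n2, dummies):
    """Merge links (N1, N2) and (N2, N3) when N2 lies on a single path.

    links:   ordered list of (start, end, code), where code is a list of symbols
    n2:      the node to be removed
    dummies: set of names identifying dummy nodes
    """
    incoming = [link for link in links if link[1] == n2]
    outgoing = [link for link in links if link[0] == n2]
    if len(incoming) != 1 or len(outgoing) != 1:
        return list(links)  # N2 is not on a single path; nothing to do
    n1, _, code_in = incoming[0]
    _, n3, code_out = outgoing[0]
    # "D" marks a removed dummy node; "P<n>" marks a pushed real node.
    prefix = "D" if n2 in dummies else f"P{n2}"
    merged = (n1, n3, [prefix] + code_out + code_in)
    result = []
    for link in links:
        if link is incoming[0]:
            result.append(merged)  # new link takes the place of (N1, N2)
        elif link is not outgoing[0]:
            result.append(link)
    return result


links = [(8, 10, ["E"]), (9, "D4", ["E"]), (10, "D4", ["E"])]
print(contract_node(links, 10, {"D1", "D2", "D3", "D4", "D5"}))
```

In the usage example node 10 appears once as an end node and once as a start node, so the two links through it are replaced by one link from node 8 to D4 whose code starts with the push symbol for node 10.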
Thus in the case of the links between node 8 and node 10 and node 10 and node D 4 in FIG. 8A after deleting the link data identifying the links between node 8 and node 10 and node 10 and node D 4 , new link data comprising data identifying a link between node 8