com.lowagie.text.pdf
Class BidiOrder

java.lang.Object
  extended by com.lowagie.text.pdf.BidiOrder

public final class BidiOrder
extends java.lang.Object

Reference implementation of the Unicode 3.0 Bidi algorithm.

This implementation is not optimized for performance. It is intended as a reference implementation that closely follows the specification of the Bidirectional Algorithm in The Unicode Standard version 3.0.

Input:
There are two levels of input to the algorithm, since clients may prefer to supply some information from out-of-band sources rather than relying on the default behavior.

  1. Unicode type array
  2. Unicode type array, with externally supplied base line direction

Output:
Output is separated into several stages as well, to better enable clients to evaluate various aspects of implementation conformance.

  1. levels array over entire paragraph
  2. reordering array over entire paragraph
  3. levels array over line
  4. reordering array over line

Note that for conformance, algorithms are only required to generate correct reordering and character directionality (odd or even levels) over a line. Generating identical level arrays over a line is not required. Bidi explicit format codes (LRE, RLE, LRO, RLO, PDF) and BN can be assigned arbitrary levels and positions as long as the other text matches.
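The directionality half of that criterion can be checked mechanically. The following is a minimal sketch, not part of this class: LevelConformance and sameDirectionality are hypothetical names, and the ignorable flags (marking positions that hold explicit format codes or BN) are assumed to be computed by the caller. It compares only the odd/even parity of levels; the reordering arrays would still need to be compared separately.

```java
/** Hypothetical helper, not part of BidiOrder: compares two line-level arrays by directionality only. */
final class LevelConformance {
    private LevelConformance() {}

    /**
     * Returns true if both arrays agree on odd/even parity at every position
     * that is not flagged as ignorable (explicit format codes and BN).
     */
    static boolean sameDirectionality(byte[] expected, byte[] actual, boolean[] ignorable) {
        if (expected.length != actual.length || expected.length != ignorable.length) {
            return false;
        }
        for (int i = 0; i < expected.length; i++) {
            if (ignorable[i]) {
                continue; // arbitrary levels are allowed at these positions
            }
            // Odd levels are right-to-left, even levels are left-to-right.
            if ((expected[i] & 1) != (actual[i] & 1)) {
                return false;
            }
        }
        return true;
    }
}
```

Two arrays such as {0, 1, 2, 1} and {2, 3, 0, 1} pass this check even though no level value matches exactly.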

As the algorithm is defined to operate on a single paragraph at a time, this implementation is written to handle single paragraphs. Thus rule P1 is presumed by this implementation: the data provided to the implementation is assumed to be a single paragraph, and either contains no 'B' codes, or a single 'B' code at the end of the input. 'B' is allowed as input to illustrate how the algorithm assigns it a level.
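A caller that splits text into paragraphs itself may want to confirm this precondition before constructing a BidiOrder. The following is a minimal sketch under that assumption: ParagraphInput and isSingleParagraph are hypothetical names, and the paragraph separator code is passed in (a caller would supply the B constant) so that the sketch assumes nothing about the constant's value.

```java
/** Hypothetical helper, not part of BidiOrder: checks the single-paragraph precondition. */
final class ParagraphInput {
    private ParagraphInput() {}

    /**
     * Returns true if the paragraph separator code appears nowhere in the
     * array, or only as the final element.
     */
    static boolean isSingleParagraph(byte[] types, byte paragraphSeparator) {
        // Every position except the last must be free of separator codes.
        for (int i = 0; i < types.length - 1; i++) {
            if (types[i] == paragraphSeparator) {
                return false;
            }
        }
        return true;
    }
}
```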

Also note that rules L3 and L4 depend on the rendering engine that uses the result of the bidi algorithm. This implementation assumes that the rendering engine expects combining marks in visual order (e.g. to the left of their base character in RTL runs) and that it adjusts the glyphs used to render mirrored characters that are in RTL runs so that they render appropriately.

Author:
Doug Felt

Field Summary
static byte AL
          Right-to-Left Arabic
static byte AN
          Arabic Number
static byte B
          Paragraph Separator
static byte BN
          Boundary Neutral
static byte CS
          Common Number Separator
static byte EN
          European Number
static byte ES
          European Number Separator
static byte ET
          European Number Terminator
static byte L
          Left-to-right
static byte LRE
          Left-to-Right Embedding
static byte LRO
          Left-to-Right Override
static byte NSM
          Non-Spacing Mark
static byte ON
          Other Neutrals
static byte PDF
          Pop Directional Format
static byte R
          Right-to-Left
static byte RLE
          Right-to-Left Embedding
static byte RLO
          Right-to-Left Override
static byte S
          Segment Separator
static byte TYPE_MAX
          Maximum bidi type value.
static byte TYPE_MIN
          Minimum bidi type value.
static byte WS
          Whitespace
 
Constructor Summary
BidiOrder(byte[] types)
          Initialize using an array of direction types.
BidiOrder(byte[] types, byte paragraphEmbeddingLevel)
          Initialize using an array of direction types and an externally supplied paragraph embedding level.
BidiOrder(char[] text, int offset, int length, byte paragraphEmbeddingLevel)
           
 
Method Summary
 byte getBaseLevel()
          Return the base level of the paragraph.
static byte getDirection(char c)
           
 byte[] getLevels()
           
 byte[] getLevels(int[] linebreaks)
          Return levels array breaking lines at offsets in linebreaks.
 int[] getReordering(int[] linebreaks)
          Return reordering array breaking lines at offsets in linebreaks.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

L

public static final byte L
Left-to-right

See Also:
Constant Field Values

LRE

public static final byte LRE
Left-to-Right Embedding

See Also:
Constant Field Values

LRO

public static final byte LRO
Left-to-Right Override

See Also:
Constant Field Values

R

public static final byte R
Right-to-Left

See Also:
Constant Field Values

AL

public static final byte AL
Right-to-Left Arabic

See Also:
Constant Field Values

RLE

public static final byte RLE
Right-to-Left Embedding

See Also:
Constant Field Values

RLO

public static final byte RLO
Right-to-Left Override

See Also:
Constant Field Values

PDF

public static final byte PDF
Pop Directional Format

See Also:
Constant Field Values

EN

public static final byte EN
European Number

See Also:
Constant Field Values

ES

public static final byte ES
European Number Separator

See Also:
Constant Field Values

ET

public static final byte ET
European Number Terminator

See Also:
Constant Field Values

AN

public static final byte AN
Arabic Number

See Also:
Constant Field Values

CS

public static final byte CS
Common Number Separator

See Also:
Constant Field Values

NSM

public static final byte NSM
Non-Spacing Mark

See Also:
Constant Field Values

BN

public static final byte BN
Boundary Neutral

See Also:
Constant Field Values

B

public static final byte B
Paragraph Separator

See Also:
Constant Field Values

S

public static final byte S
Segment Separator

See Also:
Constant Field Values

WS

public static final byte WS
Whitespace

See Also:
Constant Field Values

ON

public static final byte ON
Other Neutrals

See Also:
Constant Field Values

TYPE_MIN

public static final byte TYPE_MIN
Minimum bidi type value.

See Also:
Constant Field Values

TYPE_MAX

public static final byte TYPE_MAX
Maximum bidi type value.

See Also:
Constant Field Values

Constructor Detail

BidiOrder

public BidiOrder(byte[] types)
Initialize using an array of direction types. Types range from TYPE_MIN to TYPE_MAX inclusive and represent the direction codes of the characters in the text.

Parameters:
types - the types array

BidiOrder

public BidiOrder(byte[] types,
                 byte paragraphEmbeddingLevel)
Initialize using an array of direction types and an externally supplied paragraph embedding level. The embedding level may be -1, 0, or 1. -1 means to apply the default algorithm (rules P2 and P3), 0 is for LTR paragraphs, and 1 is for RTL paragraphs.
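How the three permitted values could be interpreted is sketched below. This is an illustration of rules P2 and P3 as the Unicode Bidirectional Algorithm states them, not this class's source: ParagraphLevelSketch and its methods are hypothetical names, the strong type codes are passed in as parameters rather than assumed, and rejecting values outside -1, 0, and 1 with an exception is an assumption of the sketch.

```java
/** Hypothetical sketch, not part of BidiOrder: selecting the paragraph embedding level. */
final class ParagraphLevelSketch {
    private ParagraphLevelSketch() {}

    /**
     * Rules P2 and P3: the first strong type (L, R, or AL) decides the level.
     * Returns 0 if the first strong type is L or if there is none, 1 for R or AL.
     */
    static byte defaultLevel(byte[] types, byte l, byte r, byte al) {
        for (byte t : types) {
            if (t == l) {
                return 0;
            }
            if (t == r || t == al) {
                return 1;
            }
        }
        return 0; // no strong character found
    }

    /** Applies an externally supplied level, or falls back to the default rules for -1. */
    static byte resolve(byte requested, byte[] types, byte l, byte r, byte al) {
        if (requested == -1) {
            return defaultLevel(types, l, r, al);
        }
        if (requested == 0 || requested == 1) {
            return requested;
        }
        throw new IllegalArgumentException("paragraph embedding level must be -1, 0, or 1");
    }
}
```

A caller would pass the L, R, and AL constants of this class for the last three arguments.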

Parameters:
types - the types array
paragraphEmbeddingLevel - the externally supplied paragraph embedding level.

BidiOrder

public BidiOrder(char[] text,
                 int offset,
                 int length,
                 byte paragraphEmbeddingLevel)

Method Detail

getDirection

public static final byte getDirection(char c)

getLevels

public byte[] getLevels()

getLevels

public byte[] getLevels(int[] linebreaks)
Return levels array breaking lines at offsets in linebreaks.
Rule L1.

The returned levels array contains the resolved level for each bidi code passed to the constructor.

The linebreaks array must include at least one value. The values must be in strictly increasing order (no duplicates) between 1 and the length of the text, inclusive. The last value must be the length of the text.

Parameters:
linebreaks - the offsets at which to break the paragraph
Returns:
the resolved levels of the text
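
The constraints on the linebreaks array can be verified before the call. The following is a minimal sketch: LinebreakCheck and isValid are hypothetical names, not part of this class, and whether the class itself validates its argument is not stated here.

```java
/** Hypothetical helper, not part of BidiOrder: checks the documented linebreaks constraints. */
final class LinebreakCheck {
    private LinebreakCheck() {}

    /**
     * Returns true if linebreaks is non-empty, strictly increasing, starts at
     * 1 or greater, and ends exactly at textLength.
     */
    static boolean isValid(int[] linebreaks, int textLength) {
        if (linebreaks == null || linebreaks.length == 0) {
            return false;
        }
        int previous = 0; // forces the first offset to be at least 1
        for (int offset : linebreaks) {
            if (offset <= previous) {
                return false; // duplicate, decreasing, or non-positive offset
            }
            previous = offset;
        }
        // The final offset must be the end of the text; this also bounds all others.
        return previous == textLength;
    }
}
```

For a ten-character paragraph, {3, 7, 10} is accepted, while {3, 3, 10}, {3, 7}, and {0, 10} are rejected.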

getReordering

public int[] getReordering(int[] linebreaks)
Return reordering array breaking lines at offsets in linebreaks.

The reordering array maps from a visual index to a logical index. Lines are concatenated from left to right. So for example, the fifth character from the left on the third line is

 getReordering(linebreaks)[linebreaks[1] + 4]
(linebreaks[1] is the position after the last character of the second line, which is also the index of the first character on the third line, and adding four gets the fifth character from the left).
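
The same lookup, generalized to any line and column, can be sketched as follows. VisualLookup and its methods are hypothetical names, not part of this class, and the reordering array in the usage note is written by hand for illustration rather than produced by this class.

```java
/** Hypothetical helper, not part of BidiOrder: reads a visual-to-logical reordering array. */
final class VisualLookup {
    private VisualLookup() {}

    /** Logical index of the character at the given column (0-based, from the left) on the given line (0-based). */
    static int logicalIndex(int[] reordering, int[] linebreaks, int line, int column) {
        // A line starts where the previous one ended; the first line starts at 0.
        int lineStart = (line == 0) ? 0 : linebreaks[line - 1];
        int lineEnd = linebreaks[line];
        if (column < 0 || lineStart + column >= lineEnd) {
            throw new IndexOutOfBoundsException("column " + column + " is outside line " + line);
        }
        return reordering[lineStart + column];
    }

    /** Rearranges logical-order text into visual order, with lines concatenated left to right. */
    static String toVisual(String logical, int[] reordering) {
        StringBuilder visual = new StringBuilder(reordering.length);
        for (int logicalPosition : reordering) {
            visual.append(logical.charAt(logicalPosition));
        }
        return visual.toString();
    }
}
```

With the hand-written reordering {0, 1, 2, 5, 4, 3} and linebreaks {3, 6}, the first character from the left on the second line is logicalIndex(reordering, linebreaks, 1, 0), which is logical index 5, and toVisual("abcDEF", reordering) yields "abcFED".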

The linebreaks array must include at least one value. The values must be in strictly increasing order (no duplicates) between 1 and the length of the text, inclusive. The last value must be the length of the text.

Parameters:
linebreaks - the offsets at which to break the paragraph.

getBaseLevel

public byte getBaseLevel()
Return the base level of the paragraph.



iText 2.1.7