Module bidi
Lua port of the reference implementation of the Unicode Bidirectional Algorithm (UAX #9).
Info:
- Copyright: 2016
- License: MIT
- Author: Deepak Jois <deepak.jois@gmail.com>
Class Paragraph
Paragraph.new (types, pairTypes, pairValues[, paragraphEmbeddingLevel=-1]) | Initialize a new paragraph. |
Paragraph:getLevels (linebreaks) | Return levels array breaking lines at offsets in linebreaks (Rule L1). |
Paragraph:getReordering (linebreaks) | Return reordering array breaking lines at offsets in linebreaks. |
Helper Functions
codepoints_to_types (codepoints) | Generate Bidi_Class property (directional codes) for each codepoint in the input array. |
codepoints_to_pair_types (codepoints) | Generate Bidi_Paired_Bracket_Type property for each codepoint in the input array. |
codepoints_to_pair_values (codepoints) | Generate an array of unique integers identifying which pair of brackets a bracket character belongs to. |
get_visual_reordering (codepoints[, dir=nil[, linebreaks=nil]]) | Generate a visual reordering of codepoints after applying the Unicode Bidirectional Algorithm. |
Constants
MAX_DEPTH | Embedding levels are numbers that indicate how deeply the text is nested, and the default direction of text on that level. |
Class Paragraph
- Paragraph.new (types, pairTypes, pairValues[, paragraphEmbeddingLevel=-1])
-
Initialize a new paragraph.
Initialize a new paragraph using several arrays of direction and other types and an externally supplied paragraph embedding level.
Parameters:
- types Bidi_Class property (directional codes) for each character in the original string. Codes must correspond to the values in the luaucdn module. This can be generated from the original input text using the codepoints_to_types function.
- pairTypes Bidi_Paired_Bracket_Type property for each character in the original string. Codes must correspond to the values in the luaucdn module. This can be generated from the original input text using the codepoints_to_pair_types function.
- pairValues Array of unique integers identifying which pair of brackets (or canonically equivalent set) a bracket character belongs to. For example, in the string [Test(s)> the characters ( and ) would share one value and [ and > share another (assuming that ] and > are canonically equivalent). Characters that have Bidi_Paired_Bracket_Type n (None) may always get a single value like 0. This can be generated from the input text using the codepoints_to_pair_values function.
- paragraphEmbeddingLevel The embedding level may be 0 (LTR), 1 (RTL) or -1 (auto). -1 means apply the default algorithm (rules P2 and P3). (default -1)
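To illustrate what a paragraphEmbeddingLevel of -1 asks for, here is a minimal sketch of rules P2 and P3 from UAX #9. It is not this module's code: the function name default_paragraph_level is hypothetical, and it uses string class names for readability, whereas the module expects the numeric codes from luaucdn.

```lua
-- Sketch of UAX #9 rules P2/P3 (hypothetical helper, not the module's code).
-- `types` is an array of Bidi_Class names given as strings for readability.
local function default_paragraph_level(types)
  local isolate_depth = 0
  for i = 1, #types do
    local t = types[i]
    if t == "LRI" or t == "RLI" or t == "FSI" then
      -- P2: skip everything up to the matching PDI (or the end of the paragraph)
      isolate_depth = isolate_depth + 1
    elseif t == "PDI" then
      if isolate_depth > 0 then
        isolate_depth = isolate_depth - 1
      end
    elseif isolate_depth == 0 then
      if t == "L" then return 0 end                -- first strong char is LTR
      if t == "R" or t == "AL" then return 1 end   -- first strong char is RTL
    end
  end
  return 0  -- P3: no strong character found, so the level defaults to 0
end

print(default_paragraph_level({ "ON", "AL", "L" }))  --> 1
```

Passing 0 or 1 explicitly skips this detection and forces the paragraph direction.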
- Paragraph:getLevels (linebreaks)
-
Return levels array breaking lines at offsets in linebreaks (Rule L1).
The returned levels array contains the resolved level for each bidi code passed to the constructor.
Parameters:
- linebreaks The linebreaks array must include at least one value. The values must be in strictly increasing order (no duplicates) between 1 and the length of the text, inclusive. The last value must be the length of the text.
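The constraints on linebreaks can be checked before calling into the module. The following is a sketch that mirrors the rules exactly as stated above; validate_linebreaks is a hypothetical helper and is not part of the module.

```lua
-- Hypothetical pre-flight check for a linebreaks array, following the
-- documented rules: non-empty, strictly increasing, each value within
-- 1..text_length, and the last value equal to text_length.
local function validate_linebreaks(linebreaks, text_length)
  if #linebreaks == 0 then
    return false, "linebreaks must include at least one value"
  end
  local previous = 0
  for i = 1, #linebreaks do
    local offset = linebreaks[i]
    if offset < 1 or offset > text_length then
      return false, "offset out of range at position " .. i
    end
    if offset <= previous then
      return false, "offsets must be strictly increasing"
    end
    previous = offset
  end
  if linebreaks[#linebreaks] ~= text_length then
    return false, "last offset must equal the text length"
  end
  return true
end

print(validate_linebreaks({ 3, 7, 10 }, 10))  --> true
```

Returning a message alongside false makes it easy to surface the specific rule that was violated.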
- Paragraph:getReordering (linebreaks)
-
Return reordering array breaking lines at offsets in linebreaks.
The reordering array maps from a visual index to a logical index. Lines are concatenated from left to right. So for example, the fifth character from the left on the third line is para:getReordering(linebreaks)[linebreaks[1] + 4] (linebreaks[1] is the position after the last character of the second line, which is also the index of the first character on the third line, and adding four gets the fifth character from the left).
Parameters:
- linebreaks The linebreaks array must include at least one value. The values must be in strictly increasing order (no duplicates) between 1 and the length of the text, inclusive. The last value must be the length of the text.
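Because the array maps visual positions to logical positions, producing the display order is a single lookup per position. The sketch below uses a hand-written reordering array rather than real library output, and assumes the indices are 1-based; apply_reordering is a hypothetical helper.

```lua
-- reordering[visual_index] = logical_index (assumed 1-based here).
local function apply_reordering(items, reordering)
  local visual = {}
  for visual_index = 1, #reordering do
    visual[visual_index] = items[reordering[visual_index]]
  end
  return visual
end

-- Upper-case letters stand in for right-to-left characters.
local logical = { "a", "b", "C", "D", "E" }
local reordering = { 1, 2, 5, 4, 3 }  -- hand-written for illustration
local visual = apply_reordering(logical, reordering)
print(table.concat(visual))  --> abEDC
```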
Helper Functions
bidi.Paragraph
type.
- codepoints_to_types (codepoints)
-
Generate Bidi_Class property (directional codes) for each codepoint in the
input array.
Codes will correspond to the values in the luaucdn module.
Parameters:
- codepoints list of codepoints in the original input string.
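The helpers take an array of codepoints rather than a string. One way to build that array from UTF-8 text is sketched below; utf8_to_codepoints is a hypothetical helper, it assumes the input is valid UTF-8, and it avoids the utf8 library so that it also runs on Lua 5.1. On Lua 5.3 and later, { utf8.codepoint(s, 1, -1) } gives the same result.

```lua
-- Decode a valid UTF-8 string into an array of codepoints (no validation).
local function utf8_to_codepoints(s)
  local codepoints = {}
  local i = 1
  while i <= #s do
    local lead = s:byte(i)
    local cp, extra
    if lead < 0x80 then
      cp, extra = lead, 0          -- 1-byte sequence (ASCII)
    elseif lead < 0xE0 then
      cp, extra = lead - 0xC0, 1   -- 2-byte sequence
    elseif lead < 0xF0 then
      cp, extra = lead - 0xE0, 2   -- 3-byte sequence
    else
      cp, extra = lead - 0xF0, 3   -- 4-byte sequence
    end
    for j = 1, extra do
      -- each continuation byte contributes its low 6 bits
      cp = cp * 64 + (s:byte(i + j) - 0x80)
    end
    codepoints[#codepoints + 1] = cp
    i = i + extra + 1
  end
  return codepoints
end

-- "a" followed by HEBREW LETTER ALEF (U+05D0, bytes 0xD7 0x90)
local cps = utf8_to_codepoints("a\215\144")
print(cps[1], cps[2])  --> 97	1488
```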
- codepoints_to_pair_types (codepoints)
-
Generate Bidi_Paired_Bracket_Type property for each codepoint
in the input array.
Codes will correspond to the values in the luaucdn module.
Parameters:
- codepoints list of codepoints in the original input string.
- codepoints_to_pair_values (codepoints)
-
Generate an array of unique integers identifying which pair of brackets a
bracket character belongs to.
For example, in the string [Test(s)> the characters ( and ) would share one value and [ and > share another (assuming that ] and > are canonically equivalent). Characters that have Bidi_Paired_Bracket_Type n (None) may always get a single value like 0.
Parameters:
- codepoints list of codepoints in the original input string.
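The [Test(s)> example can be reproduced with a toy lookup table. This is only an illustration of the shape of the output: the table PAIR_ID and the function sketch_pair_values are hypothetical, the ids 1 and 2 are arbitrary, and the real function derives its values from Unicode data rather than from a hand-written table.

```lua
-- Hypothetical bracket table: each bracket maps to the id of its pair.
-- ">" is treated as canonically equivalent to "]", as in the example.
local PAIR_ID = {
  ["("] = 1, [")"] = 1,
  ["["] = 2, ["]"] = 2, [">"] = 2,
}

local function sketch_pair_values(chars)
  local values = {}
  for i = 1, #chars do
    values[i] = PAIR_ID[chars[i]] or 0  -- 0 for characters that are not brackets
  end
  return values
end

-- Split the ASCII example string into single characters.
local chars = {}
for ch in ("[Test(s)>"):gmatch(".") do
  chars[#chars + 1] = ch
end

local values = sketch_pair_values(chars)
print(table.concat(values, " "))  --> 2 0 0 0 0 1 0 1 2
```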
- get_visual_reordering (codepoints[, dir=nil[, linebreaks=nil]])
-
Generate a visual reordering of codepoints after applying the Unicode
Bidirectional Algorithm.
This function can be directly called with the list of codepoints in the original string. It does the heavy lifting of calling all the other helper functions, applying sensible defaults, and removing unneeded characters from the visually re-ordered string.
Parameters:
- codepoints list of codepoints in the original input string.
- dir The externally supplied direction: either 'ltr', 'rtl' or nil (for auto). (default nil)
- linebreaks Offsets in the codepoints array where line breaks must be applied. When not provided, the default value is an array with a single offset beyond the range of the input text. The values in linebreaks must be in strictly increasing order (no duplicates) between 1 and the length of the text, inclusive. The last value must be the length of the text. (default nil)
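A short usage sketch follows. The mapping from dir to a paragraph embedding level is an assumption inferred from the two parameter descriptions ('ltr' to 0, 'rtl' to 1, nil to -1); dir_to_embedding_level is a hypothetical helper, not a module function. The call into the module is guarded with pcall so the snippet still runs when the module is not installed, and it assumes the module is loaded with require("bidi") and that the result is an array.

```lua
-- Assumed correspondence between `dir` and paragraphEmbeddingLevel.
local function dir_to_embedding_level(dir)
  if dir == "ltr" then return 0 end
  if dir == "rtl" then return 1 end
  if dir == nil then return -1 end
  error("dir must be 'ltr', 'rtl' or nil, got " .. tostring(dir))
end

print(dir_to_embedding_level("rtl"))  --> 1

-- Guarded call: only runs if a module named "bidi" is available.
local ok, bidi = pcall(require, "bidi")
if ok and type(bidi) == "table" and bidi.get_visual_reordering then
  -- "ab" followed by HEBREW LETTER ALEF and HEBREW LETTER BET
  local codepoints = { 0x61, 0x62, 0x5D0, 0x5D1 }
  local visual = bidi.get_visual_reordering(codepoints, "ltr")
  print(#visual)  -- assumed to be an array of reordered codepoints
end
```

Leaving dir and linebreaks as nil is the simplest call: the direction is then detected from the text and the input is treated as a single line.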