Closeup of the Text Alignment Tool

Text alignment refers to the work of taking two texts and indicating which elements in the first correspond to which elements in the second.

Miklal’s Text Alignment Tool is a software tool designed to produce gold-standard alignments. The software can be generalized to hand-curate any text-alignment data, whether a text and its translation, variant witnesses to a common work, or parallel passages.

If you are responsible for an existing text alignment that could use improvement, or are interested in producing a new alignment of texts in the same or different languages, Miklal can help. Please send us an email.

Contact Us

The Text Alignment Tool

Most existing aligned biblical texts suffer from a variety of common deficiencies:

  • Quality: An undesirably high number of errors in the text alignment
  • Consistency: An inconsistent method of aligning various grammatical structures
  • Detail: An undesirably low granularity of alignment, e.g. phrase instead of word, or word instead of morpheme

The Text Alignment Tool is designed to significantly improve text alignments in these areas, using clear visualizations, built-in consistency rules, tools for various comparisons, and algorithmic quality checks.

The following video provides a demonstration of the Text Alignment Tool and its functions.

Clear Visualization

The main panel of the Text Alignment Tool presents the alignment data clearly, so that it is easy for a human aligner to understand the alignment and spot mistakes or inconsistencies in the data.

  • Language helps: Morphological information, English glosses, and past links help the aligner understand the Hebrew and decide how to align it to the English translation.
  • Blank rows: The blank rows are algorithmically inserted to maximize the number of horizontal lines and minimize the sum of the distances between all linked elements.
  • Colors and lines: Colored text and lines indicate a linking relationship between the two texts.
Main Panel of Text Alignment Tool
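
The blank-row idea can be illustrated with a small sketch. The page does not describe the tool's actual algorithm, so everything here is an assumption: the function names `layout_rows` and `total_link_distance` are hypothetical, tokens are identified by position, and links are (source index, target index) pairs. The dynamic program below directly maximizes only the number of links that end up on a single row (horizontal lines) and prefers fewer rows as a tiebreak; the sum of distances is computed afterwards as a check rather than optimized.

```python
from typing import List, Optional, Set, Tuple

Row = Tuple[Optional[int], Optional[int]]  # (source index, target index); None = blank cell


def layout_rows(n_src: int, n_tgt: int, links: Set[Tuple[int, int]]) -> List[Row]:
    """Pad both columns with blanks so that as many links as possible are horizontal."""
    # best[i][j] = (horizontal links, -rows used) for laying out src[i:] and tgt[j:]
    best = [[(0, 0)] * (n_tgt + 1) for _ in range(n_src + 1)]
    move = [[""] * (n_tgt + 1) for _ in range(n_src + 1)]
    for i in range(n_src, -1, -1):
        for j in range(n_tgt, -1, -1):
            if i == n_src and j == n_tgt:
                continue  # nothing left to place
            options = []
            if i < n_src and j < n_tgt:  # put both tokens on one row
                h, r = best[i + 1][j + 1]
                options.append(((h + int((i, j) in links), r - 1), "both"))
            if i < n_src:  # source token alone, blank cell in the target column
                h, r = best[i + 1][j]
                options.append(((h, r - 1), "src"))
            if j < n_tgt:  # target token alone, blank cell in the source column
                h, r = best[i][j + 1]
                options.append(((h, r - 1), "tgt"))
            best[i][j], move[i][j] = max(options)

    # Follow the recorded moves from the start to rebuild the row layout.
    rows: List[Row] = []
    i = j = 0
    while i < n_src or j < n_tgt:
        step = move[i][j]
        if step == "both":
            rows.append((i, j))
            i, j = i + 1, j + 1
        elif step == "src":
            rows.append((i, None))
            i += 1
        else:
            rows.append((None, j))
            j += 1
    return rows


def total_link_distance(rows: List[Row], links: Set[Tuple[int, int]]) -> int:
    """Sum of vertical distances (in rows) between the two ends of every link."""
    src_row = {s: r for r, (s, _) in enumerate(rows) if s is not None}
    tgt_row = {t: r for r, (_, t) in enumerate(rows) if t is not None}
    return sum(abs(src_row[s] - tgt_row[t]) for s, t in links)


# Toy example: three source tokens, two target tokens, the middle source token unlinked.
example_links = {(0, 0), (2, 1)}
example_rows = layout_rows(3, 2, example_links)
# example_rows == [(0, 0), (1, None), (2, 1)]: one blank target cell keeps both links horizontal
```

A layout that also handles crossing or many-to-many links well would need a cost function that accounts for distance during the search; this sketch leaves that out for brevity.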

Consistency Rules

Case-specific consistency rules

Before a text alignment project begins, case-specific rules are written that specify how each structure in the source language should be aligned to the target-language structures used to translate it, as illustrated in the image to the right.

 

All-Occurrences Viewer

The aligner can easily dig deeper using the Source Detailed pane, looking up all the occurrences of a given lexeme to check for consistency across passages.
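
As a rough illustration of an all-occurrences lookup, the sketch below groups every occurrence of a lexeme by how it was rendered in the target text. The `Link` record, the `occurrences` function, the transliterated lexeme, and the `ref-N` passage labels are all hypothetical toy values chosen for the example; the tool's own data model is not described here.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Link(NamedTuple):
    ref: str      # passage reference (toy labels here)
    lexeme: str   # source-language dictionary form
    target: str   # target-language word(s) it was linked to


def occurrences(links: List[Link], lexeme: str) -> Dict[str, List[str]]:
    """Group every occurrence of `lexeme` by its (case-folded) target rendering."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for link in links:
        if link.lexeme == lexeme:
            grouped[link.target.lower()].append(link.ref)
    return dict(grouped)


# Invented toy records, not taken from any real alignment.
toy_links = [
    Link("ref-1", "melek", "king"),
    Link("ref-2", "melek", "King"),
    Link("ref-3", "melek", "ruler"),
    Link("ref-3", "bayit", "house"),
]
by_rendering = occurrences(toy_links, "melek")
# by_rendering == {"king": ["ref-1", "ref-2"], "ruler": ["ref-3"]}
```

Grouping by rendering makes a minority translation such as the single "ruler" entry stand out, which is the kind of cross-passage comparison the viewer is meant to support.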


Algorithmic Quality Checks

The software uses natural language processing and graph theory to check algorithmically for errors and inconsistencies.

  • Natural language processing
    • conformity to consistency rules
    • uncommon or improbable links
    • consistent treatment of n-grams
  • Graph theory
    • consistent primary status in grouped tokens
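
One simple way to surface uncommon or improbable links is to compare how often each rendering of a lexeme occurs relative to all links for that lexeme. The sketch below is an assumption about how such a check might look, not the tool's method: the function name `improbable_links`, the 0.2 threshold, and the toy data are all hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def improbable_links(
    pairs: List[Tuple[str, str]], threshold: float = 0.2
) -> List[Tuple[str, str, float]]:
    """Return (lexeme, target, share) for renderings below `threshold` of a lexeme's links."""
    pair_counts = Counter(pairs)
    lexeme_counts = Counter(lexeme for lexeme, _ in pairs)
    flagged = []
    for (lexeme, target), count in pair_counts.items():
        share = count / lexeme_counts[lexeme]  # relative frequency of this rendering
        if share < threshold:
            flagged.append((lexeme, target, share))
    return sorted(flagged)


# Invented toy data: nine consistent links and one likely slip.
toy_pairs = [("melek", "king")] * 9 + [("melek", "kin")]
suspects = improbable_links(toy_pairs)
# suspects == [("melek", "kin", 0.1)]
```

A flagged link is only a candidate for review; a rare rendering can be perfectly correct, so a check like this would complement rather than replace the human aligner's judgment.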

The following presentation provides more details about the algorithmic quality checking used in the Text Alignment Tool.


Learn More

The following slides, presented at the LaTeCH 2014 conference and the SBL 2015 conference respectively, provide more information about our work in text alignment.



Interested in our Text Alignment Tool?

If you are interested in producing a similar product using our Text Alignment Tool or adapting our tool for use in another text alignment project, please contact us.

Contact us