g2 is a command-line utility that provides a front-end to the [LibEth] character transcoding services. g2 comes with the [LibEth] source code but is undocumented. This Wiki page attempts to provide some useful documentation.
Prior to g2 there was a sera2any utility, which collected together a number of separate sera2xyz utilities; the sera2any name was in turn inspired by the Mule any2ps utility. g2 accepts more than just [SERA] as input, and the name is short for "Ge'ez To Any" (or geez2any).
Building g2 assumes that you have already built and installed the LibEth library. g2 can be dynamically linked against the shared LibEth (typically libeth.so residing in /usr/lib or /usr/local/lib), which will reduce the executable size of g2. Alternatively, g2 can be compiled with LibEth built into the executable by compiling against the static library (libeth.a). This will result in a larger g2 executable, but it can be copied between Linux systems (or Solaris systems, etc.) and should still be able to run without LibEth installed.
With a shared libeth:
gcc gezXfer.c common.c tables.c -o g2 -leth -lm
With a static libeth:
gcc gezXfer.c common.c tables.c /path/to/libeth.a -o g2 -lm
-lm links in the math library, needed only for the pow function. If LibEth was built to use its internal pow function, then the math library is not required and the -lm flag may be omitted.
-g can be added if you want to debug g2. Likewise LibEth must be compiled with the -g flag if your debugging session is to enter the libeth routines.
g2 [options] filein > fileout
By default, [SERA] input and UTF-8 output are assumed. g2 recognizes the following command line switches; archaic switches are not documented here:
| Flag | Argument | Meaning |
|---|---|---|
| -fromdos | none | Remove DOS ^M (carriage return) at end of lines. |
| -todos | none | Insert DOS ^M (carriage return) at end of lines. Same as -tvout dos. Cannot be used with -tvout options. |
| -h | none | Help. Presently out of date. |
| -html | none | Indicates input document is HTML. HTML tags will not be transliterated. |
| -l | <iso-639-code> | Set language context for input/output, one of: amh (Amharic); ti or tir (Tigrinya); gz or gez (Ge'ez); la or lat (Latin). Tigrinya: a ⇒ ኣ. Example: ./g2 -l amh filein > fileout |
| -i | <input-encoding> | Input encoding, one of: |
| -o | <output-encoding> | Output encoding, one of: aausisa (transliteration system in use at SISA); acis; acuwork; addis98; addisword; addiswp; alpas; braille (Ethiopic Braille convention under Unicode Braille support); brana (Brana); cbhale (CBHale encoding, multifont); dehai (Dehai email network's transliteration system); dejene; ed ("Ed" transliteration system used by the SIL "Ed" editor for Amharic); enhpfr; ethiome (EthioMicroEmacs encoding, multifont); ethiop (Ethiop transliteration); ethiopic; ethiosoft; ethiosys (EthioSystems encoding, multifont); ethiowalia; fidel; geez; geezab; geezbausi; geezedit; geezfont; geezinga; geeztypenet (Phonetic Systems Ge'ezTypeNet font encoding); ies (Institute of Ethiopian Studies transliteration system); image (output as links to images, old ENH system); iso (ISO transliteration system for Ethiopic, old proposal); jis (Japanese Industrial Standard, used for Ethiopic before web browsers supported Unicode); jun (short for "JUNET", the Japanese Unix Network encoding used in Mule); latex (LaTeX encoding, if TeX support is enabled); mainz (Mainz University's transliteration system); mono; monoalt; nci (New Concepts Incorporated encoding); ncic (National Computer and Information Center encoding, used in Agafari fonts); ncic_et (NCIC modified encoding of the Ejji Tsihuf font); omnitech (OmniTech corporation's encoding); phonetic (Phonetic Systems encoding); powergeez (PowerGe'ez encoding); qubee (Qubee transliteration); sera (System for Ethiopic Representation in ASCII); tex (TeX encoding, if TeX support is enabled); tfanus; tfanusnew; visgeez (VisualGe'ez encoding); visgeez2k (VisualGe'ez 2000 encoding); wazema (Wazema encoding). |
| -tvin | <variant> | Input encoding variant, or "secondary encoding", one of: utf8 (UTF-8, 8-Bit UCS Transformation Format, the default with -i uni); utf16 (UTF-16, 16-Bit UCS Transformation Format, or "two byte" Unicode). |
| -tvout | <variant> | Output encoding variant, or "secondary encoding", one of: Clike (uppercase "C-Like" character escape: ካ ⇒ `\x12AB`); decimal (decimal address value format: ካ ⇒ `d4779`); dos (insert DOS ^M, carriage return, at end of lines; same as -todos; cannot be combined with other -tvout options); escd (XML/HTML entity in decimal form: ካ ⇒ `&#4779;`); esch (XML/HTML entity in lowercase hexadecimal form: ካ ⇒ `&#x12ab;`); Esch (XML/HTML entity in uppercase hexadecimal form: ካ ⇒ `&#x12AB;`); java (lowercase Java character escape: ካ ⇒ `\u12ab`); Java (uppercase Java character escape: ካ ⇒ `\u12AB`); name (lowercase Unicode character name: ካ ⇒ ethiopic syllable kaa); Name (uppercase Unicode character name: ካ ⇒ ETHIOPIC SYLLABLE KAA); uplus (lowercase U+wxyz character escape: ካ ⇒ `U+12ab`); Uplus (uppercase U+WXYZ character escape: ካ ⇒ `U+12AB`); utf7 (UTF-7, 7-Bit UCS Transformation Format); utf8 (UTF-8, 8-Bit UCS Transformation Format, the default with -i uni); utf16 (UTF-16, 16-Bit UCS Transformation Format, or "two byte" Unicode); xml (lowercase XML tag character escape: ካ ⇒ `<U12ab>`); Xml (uppercase XML tag character escape: ካ ⇒ `<12AB>`); zerox (lowercase 0x character escape: ካ ⇒ `0x12ab`); Zerox (uppercase 0x character escape: ካ ⇒ `0x12AB`). |
| -rtf | none | Make output in RTF. This feature works with the circa 1997 definition of RTF. |
| -s | none | Substitute Latin spaces with Ge'ez wordspace. |
| -S | <string> | Convert the string following the flag instead of reading from a file or stdin. Use quotation marks when the string contains multiple words separated by spaces. This flag is useful for quickly looking up a character address. Examples: ./g2 -tvout uplus -S ka ⇒ `U+12ab`; ./g2 -tvout java -S ka ⇒ `\u12ab`; ./g2 -tvout esch -S ka ⇒ `&#x12ab;` |
| -stats | <output-encoding> | Print tables of statistics in fidel.out and fidel2.out. This is likely broken but repairable. |
| -u | none | Make output UPPERCASE. This flag can be used with the -tvout variants clike, esch, java, name, uplus, xml and zerox. ./g2 -tvout Uplus ... and ./g2 -tvout uplus -u ... are identical. |
| -v | none | Print version and exit. |
| -x | none | Close string. When used with the -html option, ensures that, if appropriate, a closing </font> tag closes a block of text. This makes more sense when used with blocks of text through the perl interface and not on the command line (I'm probably forgetting the use case here). |
| -z -0 | none | Treat ዐ (Ayn-Ge'ez) as 0 (Zero) in a numeric context, e.g. 1ዐ2, ዐ234, 12.ዐ5, 12,ዐዐ5. This was a common problem with Geezigna documents. |
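
To make a few of the -tvout formats and the -z behaviour from the table above concrete, here is a minimal Python sketch. It is not g2's or LibEth's code. The function names tvout_forms and ayn_to_zero are hypothetical, the entity forms assume standard XML/HTML numeric character references, and the "numeric context" rule used here (a run of ASCII digits and ዐ, optionally separated by . or , and containing at least one ASCII digit) is only an assumption about what -z does.

```python
import re
import unicodedata

AYN = "\u12d0"  # ዐ, the character that -z treats as zero

# Assumed numeric context: digits and ዐ, with optional "." or "," separators.
_NUMERIC_RUN = re.compile(r"[0-9\u12d0]+(?:[.,][0-9\u12d0]+)*")


def tvout_forms(ch):
    """Return a subset of the -tvout escape forms for one character."""
    cp = ord(ch)
    name = unicodedata.name(ch)
    return {
        "Clike": "\\x%04X" % cp,
        "decimal": "d%d" % cp,
        "escd": "&#%d;" % cp,      # assumed decimal character reference
        "esch": "&#x%04x;" % cp,   # assumed lowercase hex reference
        "Esch": "&#x%04X;" % cp,   # assumed uppercase hex reference
        "java": "\\u%04x" % cp,
        "Java": "\\u%04X" % cp,
        "name": name.lower(),
        "Name": name,
        "uplus": "U+%04x" % cp,
        "Uplus": "U+%04X" % cp,
        "zerox": "0x%04x" % cp,
        "Zerox": "0x%04X" % cp,
    }


def ayn_to_zero(text):
    """Replace ዐ with 0 inside runs that contain at least one ASCII digit."""
    def fix(match):
        run = match.group(0)
        if any(c in "0123456789" for c in run):
            return run.replace(AYN, "0")
        return run  # a lone ዐ (e.g. at the start of a word) is left alone
    return _NUMERIC_RUN.sub(fix, text)


if __name__ == "__main__":
    for key, value in tvout_forms("\u12ab").items():  # ካ
        print(key, value)
    print(ayn_to_zero("12,\u12d0\u12d05"))
```

Requiring at least one ASCII digit in the run is what keeps ordinary words that begin with ዐ untouched; a real implementation may well use a different rule.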