Info: (groff) Using Symbols


 
 5.19.4 Using Symbols
 --------------------
 
 A "glyph" is a graphical representation of a "character".  While a
 character is an abstraction of semantic information, a glyph is
 something that can be seen on screen or paper.  A character has many
 possible representation forms (for example, the character 'A' can be
 written in an upright or slanted typeface, producing distinct glyphs).
 Sometimes, a sequence of characters maps to a single glyph: this is a
 "ligature"--the most common is 'fi'.
 
    Space characters never become glyphs in GNU 'troff'.  If not
 discarded (as when trailing on text lines), they are represented by
 horizontal motions in the output.
 
    A "symbol" is simply a named glyph.  Within 'gtroff', all glyph names
 of a particular font are defined in its font file.  If the user requests
 a glyph not available in this font, 'gtroff' looks up an ordered list of
 "special fonts".  By default, the PostScript output device supports the
 two special fonts 'SS' (slanted symbols) and 'S' (symbols) (the former
 is looked up before the latter).  Other output devices use different
 names for special fonts.  Fonts mounted with the 'fonts' keyword in the
 'DESC' file are globally available.  To install additional special fonts
 locally (i.e., for a particular font), use the 'fspecial' request.
 
    Here are the exact rules for how 'gtroff' searches for a given symbol:
 
    * If the symbol has been defined with the 'char' request, use it.
      This hides a symbol with the same name in the current font.
 
    * Check the current font.
 
    * If the symbol has been defined with the 'fchar' request, use it.
 
    * Check whether the current font has a font-specific list of special
      fonts; test all fonts in the order of appearance in the last
      'fspecial' call if appropriate.
 
    * If the symbol has been defined with the 'fschar' request for the
      current font, use it.
 
    * Check all fonts in the order of appearance in the last 'special'
      call.
 
    * If the symbol has been defined with the 'schar' request, use it.
 
    * As a last resort, consult all fonts loaded up to now for special
      fonts and check them, starting with the lowest font number.  This
      can sometimes lead to surprising results since the 'fonts' line in
      the 'DESC' file often contains empty positions, which are filled
      later on.  For example, consider the following:
 
           fonts 3 0 0 FOO
 
      This mounts font 'FOO' at font position 3.  We assume that 'FOO' is
      a special font, containing glyph 'foo', and that no font has been
      loaded yet.  The line
 
           .fspecial BAR BAZ
 
      makes font 'BAZ' special only if font 'BAR' is active.  We further
      assume that 'BAZ' is really a special font, i.e., the font
      description file contains the 'special' keyword, and that it also
      contains glyph 'foo' with a special shape fitting to font 'BAR'.
      After executing 'fspecial', font 'BAR' is loaded at font
      position 1, and 'BAZ' at position 2.
 
      We now switch to a new font 'XXX', trying to access glyph 'foo'
      that is assumed to be missing.  There are neither font-specific
      special fonts for 'XXX' nor any other fonts made special with the
      'special' request, so 'gtroff' starts the search for special fonts
      in the list of already mounted fonts, with increasing font
      positions.  Consequently, it finds 'BAZ' before 'FOO' even for
      'XXX', which is not the intended behaviour.
 
    See ⇒Device and Font Description Files, and ⇒Special Fonts,
 for more details.
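
    The search order above can be modeled outside 'gtroff' to make the
 surprising result of the 'FOO'/'BAZ' example easier to follow.  The
 following Python sketch is only an illustration: the 'Font' and 'State'
 classes, their fields, and 'find_glyph' are hypothetical names invented
 here and do not correspond to anything in the 'groff' sources.

```python
from dataclasses import dataclass, field


@dataclass
class Font:
    name: str
    glyphs: frozenset = frozenset()
    special: bool = False  # font description contains the 'special' keyword


@dataclass
class State:
    # Every field is a stand-in for formatter state, not a real groff structure.
    char: set = field(default_factory=set)       # names defined with 'char'
    fchar: set = field(default_factory=set)      # names defined with 'fchar'
    fschar: dict = field(default_factory=dict)   # font name -> 'fschar' names
    schar: set = field(default_factory=set)      # names defined with 'schar'
    fspecial: dict = field(default_factory=dict) # font name -> fonts from last 'fspecial'
    special: list = field(default_factory=list)  # fonts from last 'special'
    mounted: dict = field(default_factory=dict)  # font position -> Font


def find_glyph(name, current, st):
    """Return a label saying where NAME was found, or None."""
    if name in st.char:
        return "char"
    if name in current.glyphs:
        return "font " + current.name
    if name in st.fchar:
        return "fchar"
    for font in st.fspecial.get(current.name, []):
        if name in font.glyphs:
            return "fspecial " + font.name
    if name in st.fschar.get(current.name, set()):
        return "fschar"
    for font in st.special:
        if name in font.glyphs:
            return "special " + font.name
    if name in st.schar:
        return "schar"
    # Last resort: mounted special fonts, lowest font position first.
    for position in sorted(st.mounted):
        font = st.mounted[position]
        if font.special and name in font.glyphs:
            return "mounted " + font.name
    return None


# The scenario from the text: 'fonts 3 0 0 FOO' and '.fspecial BAR BAZ'.
foo = frozenset({"foo"})
BAR = Font("BAR")
BAZ = Font("BAZ", foo, special=True)
FOO = Font("FOO", foo, special=True)
XXX = Font("XXX")
state = State(fspecial={"BAR": [BAZ]}, mounted={1: BAR, 2: BAZ, 3: FOO})

print(find_glyph("foo", BAR, state))  # fspecial BAZ
print(find_glyph("foo", XXX, state))  # mounted BAZ
```

    In this model, font 'XXX' reaches 'BAZ' only through the last-resort
 scan of mounted fonts, because 'BAZ' occupies a lower position than
 'FOO'.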
 
    The 'groff_char(7)' man page houses a complete list of predefined
 special character names, but the availability of any as a glyph is
 device- and font-dependent.  For example, say
 
      man -Tdvi groff_char > groff_char.dvi
 
 to obtain those available with the DVI device and default font
 configuration.(1)  (⇒Using Symbols-Footnote-1) If you want to use
 an additional macro package to change the fonts used, 'groff' (or
 'gtroff') must be run directly.
 
      groff -Tdvi -mec -man groff_char.7 > groff_char.dvi
 
    Special character names not listed in 'groff_char(7)' are derived
 algorithmically, using a simplified version of the Adobe Glyph List
 (AGL) algorithm, which is described in
 <https://github.com/adobe-type-tools/agl-aglfn>.  The (frozen) set of
 names that can't be derived algorithmically is called the "'groff' glyph
 list (GGL)".
 
    * A glyph for a Unicode character U+XXXX[X[X]] that is not a
      composite character is named 'uXXXX[X[X]]'.  X must be an uppercase
      hexadecimal digit.  Examples: 'u1234', 'u008E', 'u12DB8'.  The
      largest Unicode value is 0x10FFFF.  There must be at least four 'X'
      digits; if necessary, add leading zeroes (after the 'u').  No zero
      padding is allowed for character codes greater than 0xFFFF.
      Surrogates (i.e., Unicode values greater than 0xFFFF represented
      with character codes from the surrogate area U+D800-U+DFFF) are not
      allowed either.
 
    * A glyph representing more than a single input character is named
 
           'u' COMPONENT1 '_' COMPONENT2 '_' COMPONENT3 ...
 
      Example: 'u0045_0302_0301'.
 
      For simplicity, all Unicode characters that are composites must be
      maximally decomposed to NFD;(2) (⇒Using Symbols-Footnote-2)
      for example, 'u00CA_0301' is not a valid glyph name since U+00CA
      (LATIN CAPITAL LETTER E WITH CIRCUMFLEX) can be further decomposed
      into U+0045 (LATIN CAPITAL LETTER E) and U+0302 (COMBINING
      CIRCUMFLEX ACCENT).  'u0045_0302_0301' is thus the glyph name for
      U+1EBE, LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND ACUTE.
 
    * 'groff' itself maintains a table to decompose all algorithmically
      derived glyph names that are composites.  For example, 'u0100'
      (LATIN CAPITAL LETTER A WITH MACRON) is automatically decomposed
      into 'u0041_0304'.  Additionally, a glyph name of the GGL is
      preferred to an algorithmically derived glyph name; 'groff' also
      automatically does the mapping.  Example: The glyph 'u0045_0302' is
      mapped to '^E'.
 
    * Glyph names of the GGL can't be used in composite glyph names; for
      example, '^E_u0301' is invalid.
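
    The naming rules above can be checked mechanically.  The Python
 sketch below is not part of 'groff'; the function names are invented
 for illustration.  It relies on the Unicode database shipped with
 Python for NFD decomposition, which may not match the table built into
 a given 'groff' release, and it ignores the preference for GGL names.

```python
import re
import unicodedata

# One component: at least four uppercase hex digits, no zero padding
# beyond four digits, and nothing above 10FFFF.
_COMPONENT = r"(?:[0-9A-F]{4}|[1-9A-F][0-9A-F]{4}|10[0-9A-F]{4})"
_NAME = re.compile("u" + _COMPONENT + "(?:_" + _COMPONENT + ")*")


def is_wellformed(name):
    """Check the syntax of a 'uXXXX[_XXXX...]' glyph name."""
    if not _NAME.fullmatch(name):
        return False
    # Code points from the surrogate area are not allowed.
    return all(not 0xD800 <= int(part, 16) <= 0xDFFF
               for part in name[1:].split("_"))


def is_fully_decomposed(name):
    """Check that the components are already maximally decomposed (NFD)."""
    if not is_wellformed(name):
        return False
    text = "".join(chr(int(part, 16)) for part in name[1:].split("_"))
    return unicodedata.normalize("NFD", text) == text


def glyph_name(text):
    """Derive the algorithmic glyph name for a character or sequence."""
    decomposed = unicodedata.normalize("NFD", text)
    return "u" + "_".join(f"{ord(ch):04X}" for ch in decomposed)


print(glyph_name("\u1EBE"))               # u0045_0302_0301
print(glyph_name("\u0100"))               # u0041_0304
print(is_fully_decomposed("u00CA_0301"))  # False
print(is_wellformed("u012DB8"))           # False
```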
 
  -- Escape sequence: \(nm
  -- Escape sequence: \[name]
  -- Escape sequence: \[base-glyph combining-component ...]
      Typeset a special character NAME (two-character name NM) or a
      composite glyph consisting of BASE-GLYPH overlaid with one or more
      COMBINING-COMPONENTs.  For example, '\[A ho]' is a capital letter
      "A" with a "hook accent" (ogonek).
 
      There is no special syntax for one-character names--the analogous
      form '\N' would collide with other escape sequences.  However, the
      four escape sequences '\'', '\-', '\_', and '\`', are translated on
      input to the special character escape sequences '\[aa]', '\[-]',
      '\[ul]', and '\[ga]', respectively.
 
      A special character name of length one is not the same thing as an
      ordinary character: that is, the character 'a' is not the same as
      '\[a]'.
 
      If NAME is undefined, a warning in category 'char' is produced and
      the escape is ignored.  ⇒Warnings, for information about the
      enablement and suppression of warnings.
 
      GNU 'troff' resolves '\[...]' with more than a single component as
      follows:
 
         * Any component that is found in the GGL is converted to the
           'uXXXX' form.
 
         * Any component 'uXXXX' that is found in the list of
           decomposable glyphs is decomposed.
 
         * The resulting elements are then concatenated with '_' in
           between, dropping the leading 'u' in all elements but the
           first.
 
      As with the 'tr' request, no check is made for the existence of
      any component.
 
      Examples:
 
      '\[A ho]'
           'A' maps to 'u0041', 'ho' maps to 'u02DB', thus the final
           glyph name would be 'u0041_02DB'.  This is not the expected
           result: the ogonek glyph 'ho' is a spacing ogonek, but for a
           proper composite a non-spacing ogonek (U+0328) is necessary.
           Looking into the file 'composite.tmac', one can find
           '.composite ho u0328', which changes the mapping of 'ho' while
           a composite glyph name is constructed, causing the final glyph
           name to be 'u0041_0328'.
 
      '\[^E u0301]'
      '\[^E aa]'
      '\[E a^ aa]'
      '\[E ^ ']'
           '^E' maps to 'u0045_0302', thus the final glyph name is
           'u0045_0302_0301' in all forms (assuming proper calls of the
           'composite' request).
 
      It is not possible to define glyphs with names like 'A ho' within a
      'groff' font file.  This is not really a limitation; instead, you
      have to define 'u0041_0328'.
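
      The resolution steps and examples above can be traced with a
      small Python model.  It is a sketch under stated assumptions: the
      three tables are tiny hypothetical excerpts that mirror only the
      names used in this section, and 'resolve' is an invented name.  The
      model applies the 'composite' mapping to every component; this
      section does not say whether GNU 'troff' treats the base glyph
      differently.

```python
# Hypothetical excerpts.  The real data lives in groff itself and in
# 'composite.tmac'; only names used in the examples appear here.
COMPOSITE = {"ho": "u0328", "aa": "u0301", "a^": "u0302",
             "^": "u0302", "'": "u0301"}
GGL = {"A": "u0041", "E": "u0045", "^E": "u0045_0302",
       "ho": "u02DB", "aa": "u00B4"}
DECOMPOSE = {"u00CA": "u0045_0302", "u0100": "u0041_0304"}


def resolve(components, composite=COMPOSITE):
    """Build the final glyph name for '\\[c1 c2 ...]'."""
    elements = []
    for name in components:
        name = composite.get(name, name)  # rewriting set up by '.composite'
        name = GGL.get(name, name)        # GGL name -> 'uXXXX' form
        name = DECOMPOSE.get(name, name)  # decompose composites
        elements.append(name)             # no existence check is made
    # Join with '_', dropping the leading 'u' of all but the first element.
    return elements[0] + "".join("_" + e[1:] for e in elements[1:])


print(resolve(["A", "ho"], composite={}))  # u0041_02DB
print(resolve(["A", "ho"]))                # u0041_0328
for form in (["^E", "u0301"], ["^E", "aa"],
             ["E", "a^", "aa"], ["E", "^", "'"]):
    print(resolve(form))                   # u0045_0302_0301 each time
```

      Passing an empty 'composite' table reproduces the unwanted spacing
      ogonek result described for '\[A ho]'.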
 
  -- Escape sequence: \C'xxx'
      Typeset the glyph of the special character XXX.  Normally, it is
      more convenient to use '\[XXX]', but '\C' has some advantages: it
      is compatible with AT&T device-independent 'troff' (and therefore
      available in compatibility mode(3) (⇒Using Symbols-Footnote-3))
      and can interpolate special characters with
      ']' in their names.  The delimiter need not be a neutral
      apostrophe; see ⇒Delimiters.
 
  -- Request: .composite id1 id2
      Map special character name ID1 to ID2 if ID1 is used in '\[...]'
      with more than one component.  See above for examples.  This is a
      strict rewriting of the special character name; no check is
      performed for the existence of a glyph for either.  A set of
      default mappings for many accents can be found in the file
      'composite.tmac', loaded by the default 'troffrc' at startup.
 
  -- Escape sequence: \N'n'
      Typeset the glyph with code N in the current font ('n' is _not_ the
      input character code).  The number N can be any non-negative
      decimal integer.  Most devices only have glyphs with codes between
      0 and 255; the Unicode output device uses codes in the range
      0-65535.  If the current font does not contain a glyph with that
      code, special fonts are _not_ searched.  The '\N' escape sequence
      can be conveniently used in conjunction with the 'char' request:
 
           .char \[phone] \f[ZD]\N'37'
 
      The code of each glyph is given in the fourth column in the font
      description file after the 'charset' command.  It is possible to
      include unnamed glyphs in the font description file by using a name
      of '---'; the '\N' escape sequence is the only way to use these.
 
      No kerning is applied to glyphs accessed with '\N'.  The delimiter
      need not be a neutral apostrophe; see ⇒Delimiters.
 
    A few escape sequences are also special characters.
 
  -- Escape sequence: \'
      An escaped neutral apostrophe is a synonym for '\[aa]' (acute
      accent).
 
  -- Escape sequence: \`
      An escaped grave accent is a synonym for '\[ga]' (grave accent).
 
  -- Escape sequence: \-
      An escaped hyphen-minus is a synonym for '\[-]' (minus sign).
 
  -- Escape sequence: \_
      An escaped underscore ("low line") is a synonym for '\[ul]'
      (underrule).  On typesetting devices, the underrule is
      font-invariant and drawn lower than the underscore '_'.
 
  -- Request: .cflags n c1 c2 ...
      Assign properties encoded by the number N to characters C1, C2, and
      so on.
 
      Input characters, including special characters introduced by an
      escape sequence, have certain properties associated with them.(4)
      (⇒Using Symbols-Footnote-4) These properties can be modified with
      this request.  The first argument is the sum of the desired flags
      and the remaining arguments are the characters to be assigned those
      properties.  Spaces between the CN arguments are optional.  Any
      argument CN can be a character class defined with the 'class'
      request rather than an individual character.  ⇒Character
      Classes.
 
      The non-negative integer N is the sum of any of the following.
      Some combinations are nonsensical, such as '33' (1 + 32).
 
      '1'
           Recognize the character as ending a sentence if followed by a
           newline or two spaces.  Initially, characters '.?!' have this
           property.
 
      '2'
           Enable breaks before the character.  A line is not broken at a
           character with this property unless the characters on each
           side both have non-zero hyphenation codes.  This exception can
           be overridden by adding 64.  Initially, no characters have
           this property.
 
      '4'
           Enable breaks after the character.  A line is not broken at a
           character with this property unless the characters on each
           side both have non-zero hyphenation codes.  This exception can
           be overridden by adding 64.  Initially, characters
           '\-\[hy]\[em]' have this property.
 
      '8'
           Mark the glyph associated with this character as overlapping
           other instances of itself horizontally.  Initially, characters
           '\[ul]\[rn]\[ru]\[radicalex]\[sqrtex]' have this property.
 
      '16'
           Mark the glyph associated with this character as overlapping
           other instances of itself vertically.  Initially, the
           character '\[br]' has this property.
 
      '32'
           Mark the character as transparent for the purpose of
           end-of-sentence recognition.  In other words, an
           end-of-sentence character followed by any number of characters
           with this property is treated as the end of a sentence if
           followed by a newline or two spaces.  This is the same as
           having a zero space factor in TeX.  Initially, characters
           '"')]*\[dg]\[dd]\[rq]\[cq]' have this property.
 
      '64'
           Ignore hyphenation codes of the surrounding characters.  Use
           this in combination with values 2 and 4 (initially, no
           characters have this property).
 
           For example, if you need an automatic break point after the
           en-dash in numeric ranges like "3000-5000", insert
 
                .cflags 68 \[en]
 
           into your document.  However, this practice can lead to bad
           layout if done thoughtlessly; in most situations, a better
           solution than changing the 'cflags' value is to insert '\:'
           right after the hyphen at the places that really need a break
           point.
 
      The remaining values were implemented for East Asian language
      support; those who use alphabetic scripts exclusively can disregard
      them.
 
      '128'
           Prohibit a line break before the character, but allow a line
           break after the character.  This works only in combination
           with flags 256 and 512 and has no effect otherwise.
           Initially, no characters have this property.
 
      '256'
           Prohibit a line break after the character, but allow a line
           break before the character.  This works only in combination
           with flags 128 and 512 and has no effect otherwise.
           Initially, no characters have this property.
 
      '512'
           Allow line break before or after the character.  This works
           only in combination with flags 128 and 256 and has no effect
           otherwise.  Initially, no characters have this property.
 
      In contrast to values 2 and 4, the values 128, 256, and 512 work
      pairwise.  If, for example, the left character has value 512, and
      the right character 128, no break will be automatically inserted
      between them.  If we use value 6 instead for the left character, a
      break after the character can't be suppressed since the neighboring
      character on the right doesn't get examined.
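
      Because N is a sum of distinct powers of two, a value such as 68
      can be split back into its individual flags.  The Python sketch
      below does this; the 'FLAGS' table paraphrases the descriptions
      above, and 'decode_cflags' is an invented helper, not a 'groff'
      facility.  Rejecting values with bits above 512 is a choice of this
      sketch rather than documented 'groff' behavior.

```python
# Short paraphrases of the flag descriptions in this section.
FLAGS = {
    1: "ends a sentence",
    2: "break allowed before",
    4: "break allowed after",
    8: "overlaps itself horizontally",
    16: "overlaps itself vertically",
    32: "transparent for end-of-sentence recognition",
    64: "ignore surrounding hyphenation codes",
    128: "prohibit break before (pairwise)",
    256: "prohibit break after (pairwise)",
    512: "allow break before or after (pairwise)",
}


def decode_cflags(n):
    """Return the list of flag values whose sum is N."""
    if n < 0 or n & ~sum(FLAGS):
        raise ValueError(f"not a sum of known cflags values: {n}")
    return [bit for bit in FLAGS if n & bit]


for bit in decode_cflags(68):
    print(bit, FLAGS[bit])
# 4 break allowed after
# 64 ignore surrounding hyphenation codes
```

      Here 68 decodes to 4 + 64, the combination used in the '\[en]'
      example above.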
 
  -- Request: .char c [contents]
  -- Request: .fchar c [contents]
  -- Request: .fschar f c [contents]
  -- Request: .schar c [contents]
      Define a new character or glyph C to be CONTENTS, which can be
      empty.  More precisely, 'char' defines a 'groff' object (or
      redefines an existing one) that is accessed with the name C on
      input, and produces CONTENTS on output.  Every time glyph C needs
      to be printed, CONTENTS is processed in a temporary environment and
      the result is wrapped up into a single object.  Compatibility mode
      is turned off and the escape character is set to '\' while CONTENTS
      is processed.  Any emboldening, constant spacing, or track kerning
      is applied to this object rather than to individual glyphs in
      CONTENTS.
 
      An object defined by these requests can be used just like a normal
      glyph provided by the output device.  In particular, other
      characters can be translated to it with the 'tr' or 'trin'
      requests; it can be made the leader character with the 'lc'
      request; repeated patterns can be drawn with it using the '\l' and
      '\L' escape sequences; and words containing C can be hyphenated
      correctly if the 'hcode' request is used to give the object a
      hyphenation code.
 
      There is a special anti-recursion feature: use of the object within
      its own definition is handled like a normal character (not defined
      with 'char').
 
      The 'tr' and 'trin' requests take precedence if 'char' accesses the
      same symbol.
 
           .tr XY
           X
               => Y
           .char X Z
           X
               => Y
           .tr XX
           X
               => Z
 
      The 'fchar' request defines a fallback glyph: 'gtroff' only checks
      for glyphs defined with 'fchar' if it cannot find the glyph in the
      current font.  'gtroff' carries out this test before checking
      special fonts.
 
      'fschar' defines a fallback glyph for font F: 'gtroff' checks for
      glyphs defined with 'fschar' after the list of fonts declared as
      font-specific special fonts with the 'fspecial' request, but before
      the list of fonts declared as global special fonts with the
      'special' request.
 
      Finally, the 'schar' request defines a global fallback glyph:
      'gtroff' checks for glyphs defined with 'schar' after the list of
      fonts declared as global special fonts with the 'special' request,
      but before the already mounted special fonts.
 
      ⇒Character Classes.
 
  -- Request: .rchar c ...
  -- Request: .rfschar f c ...
      Remove definition of each ordinary or special character C, undoing
      the effect of a 'char', 'fchar', or 'schar' request.  Those
      supplied by font description files cannot be removed.  Spaces and
      tabs may separate C arguments.
 
      The request 'rfschar' removes glyph definitions defined with
      'fschar' for font F.