Bug#1041158: perl: pod2man propagates duplicate space in the man page

Russ Allbery rra at debian.org
Sat Jul 15 20:18:52 BST 2023


Vincent Lefevre <vincent at vinc17.net> writes:

> pod2man propagates duplicate space in the man page (contrary to
> pod2text).

> For instance, /usr/sbin/pam_getenv contains

> This tool  will print out the value [...]

> with 2 space characters between "tool" and "will" (this is probably
> unwanted, but this shouldn't yield inconsistencies in their handling).

> The pod2text utility generates only one space:

> zira:~> pod2text /usr/sbin/pam_getenv | grep tool
>     This tool will print out the value of *env_var* from /etc/environment.

> but pod2man keeps both spaces, so that one gets them in the man page:
> "pod2man /usr/sbin/pam_getenv | man -l -" gives

> [...]
> DESCRIPTION
>        This tool  will print out the value of env_var from /etc/environment.
> [...]

What surprises me here is that *roff doesn't collapse the spaces.  I have
the impression that it used to and that this behavior changed, although
I'm not sure.

The underlying issue here is unfortunately quite complex, and it's very
easy to break things in this area while trying to fix unfortunate
rendering like this.  See the discussion in the Pod::Man manual page
under CAVEATS in the current version (which I think may only be in
experimental at this point):

  Sentence spacing
    Pod::Man copies the input spacing verbatim to the output *roff document.
    This means your output will be affected by how nroff generally handles
    sentence spacing.

    nroff dates from an era in which it was standard to use two spaces after
    sentences, and will always add two spaces after a line-ending period (or
    similar punctuation) when reflowing text. For example, the following
    input:

        =pod

        One sentence.
        Another sentence.

    will result in two spaces after the period when the text is reflowed. If
    you use two spaces after sentences anyway, this will be consistent,
    although you will have to be careful to not end a line with an
    abbreviation such as "e.g." or "Ms.". Output will also be consistent if
    you use the *roff style guide (and XKCD 1285 <https://xkcd.com/1285/>)
    recommendation of putting a line break after each sentence, although
    that will consistently produce two spaces after each sentence, which may
    not be what you want.

    If you prefer one space after sentences (which is the more modern
    style), you will unfortunately need to ensure that no line in the middle
    of a paragraph ends in a period or similar sentence-ending punctuation.
    Otherwise, nroff will add two spaces after that sentence when
    reflowing, and your output document will have inconsistent spacing.
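
As a rough illustration of the reflow rule described in that excerpt, here
is a toy Python model.  It is only a sketch, not nroff's actual algorithm:
the function name join_filled is made up for this example, the set of
sentence-ending characters is an assumption, and real nroff also deals
with line length, hyphenation, and punctuation followed by closing quotes
or brackets.

```python
# Toy model of joining input lines in fill mode (an assumption-laden sketch,
# not nroff's real implementation).
SENTENCE_END = (".", "?", "!")  # assumed set of sentence-ending characters


def join_filled(lines):
    """Join input lines into one paragraph.

    Spacing inside a line is copied through untouched.  At a line break,
    two spaces are inserted if the line ended in sentence-ending
    punctuation, otherwise one.
    """
    out = ""
    for i, line in enumerate(lines):
        line = line.rstrip()
        out += line
        if i < len(lines) - 1:
            out += "  " if line.endswith(SENTENCE_END) else " "
    return out


# Period at the end of an input line: two spaces after the sentence.
print(repr(join_filled(["One sentence.", "Another sentence."])))
# Same text, period in the middle of a line: the single space is kept.
print(repr(join_filled(["One sentence. Another", "sentence."])))
```

The two calls reflow the same words but produce different spacing after
the first sentence, which is the inconsistency the excerpt warns about.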

For various historical reasons, Pod::Text defaults to collapsing all
spacing to single spaces, and you have to set the sentence option (the -s
flag to pod2text) to get behavior closer to pod2man's.  If you run
pod2text -s on that same file, you will see the same problem.
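
The difference between the two modes can be sketched in a few lines of
Python.  This is a simplified model for illustration, not the Pod::Text
code: the function names are hypothetical, and the sample string reuses
the text from the report with a line break placed arbitrarily.

```python
import re

# Text from the report, with two spaces between "tool" and "will".  The
# position of the line break is an assumption made for this example.
SAMPLE = "This tool  will print out\nthe value"


def collapse_all(text):
    """Model of the default: every run of whitespace becomes one space."""
    return re.sub(r"\s+", " ", text.strip())


def keep_interior(text):
    """Model of preserving input spacing: join lines, keep inner spaces."""
    return " ".join(line.strip() for line in text.splitlines())


print(repr(collapse_all(SAMPLE)))   # the double space is gone
print(repr(keep_interior(SAMPLE)))  # the double space survives
```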

The difference in defaults is partly historical accident and backwards
compatibility, but the more defensible justification is that *roff has its
own opinions about whitespace formatting and prefers two spaces after
periods, so it's much easier to get consistent output from *roff if
pod2man doesn't try to second-guess one space vs. two spaces.
(Determining whether a given double space is at the end of a sentence in a
way compatible with multiple languages is incredibly hard to do and is not
something I'm very enthused about trying to tackle in a Perl core module.)
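
To give a sense of why this is hard, here is a deliberately naive Python
heuristic.  It is a hypothetical sketch and not something Pod::Man does:
it treats sentence-ending punctuation followed by spaces and an ASCII
capital letter as the end of a sentence, and it already goes wrong on the
English abbreviations mentioned in the CAVEATS text, before any other
language is considered.

```python
import re

# Naive rule (an assumption for illustration): ".", "?" or "!" followed by
# one or more spaces and an uppercase ASCII letter ends a sentence.
NAIVE_END = re.compile(r"[.?!] +(?=[A-Z])")


def naive_sentence_breaks(text):
    """Return the offsets where the heuristic thinks a sentence ends."""
    return [m.start() for m in NAIVE_END.finditer(text)]


# One real sentence boundary, found correctly.
print(len(naive_sentence_breaks("One sentence.  Another sentence.")))
# No real sentence boundary inside this text, yet the heuristic reports
# breaks after "Ms." and after "e.g.".
print(len(naive_sentence_breaks("Ask Ms. Smith, e.g. Tuesday.")))
```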

-- 
Russ Allbery (rra at debian.org)              <https://www.eyrie.org/~eagle/>



