[xml/sgml-pkgs] Bug#3180: Dealing with ancient bug report #3180: linuxdoc-sgml semantics and formatting problems

Agustin Martin agmartin at debian.org
Wed Jan 7 18:25:02 UTC 2009


On Sun, Jun 02, 1996 at 02:26:00AM +0100, Ian Jackson wrote:
> Package: linuxdoc-sgml
> Version: 1.5-2
... 
>  To debian-devel and the package maintainer: I think that most of
>  these problems are upstream ones; some may require considerable
>  effort to fix, so I'm causing the bug system to CC the linuxdoc-sgml
>  mailing list.

Hi, Ian

I am mailing you about this really ancient bug report to get your feedback
on what to do with it.

Note that linuxdoc-tools has been unmaintained upstream for some years.
However, it is still actively used by the Linux Documentation Project and in
a number of other places (such as the Spanish TeX FAQ I maintain).

Because it is not maintained upstream and has been frozen for years, I do
not think we should add more features to the 'raw' linuxdoc format. However,
any improvement in the way 'raw' linuxdoc files are processed, including bug
fixes, is welcome.

Some time ago, I started fixing some of the most (for me) annoying
linuxdoc-tools bugs found when dealing with the Spanish TeX FAQ. Initially
the fixes lived only in locally modified files (along with the associated
bug reports), but since there was no reply to the bug reports I started
NMU'ing some of my changes and finally put the package under the control of
the 'Debian XML/SGML Group', with myself taking care of it.

As mentioned above, I do not intend to make deep changes to the package
beyond fixing bugs when possible and rewriting/reorganizing things in a
way I like better, so I only intend to keep this package in minimally
reasonable shape.

> 1. There is no tag meaning `this is a metasyntactic variable in an
> example', a la <VAR> in HTML. 

This would have been fine for upstream some years ago, but I do not think
we should add more stuff to raw linuxdoc format at this time.

> 2. The <verb> tag doesn't appear to allow you to change font.  The
> <quote> tag doesn't appear to follow your own linewrapping.  This
> means that there's no way of getting an example laid out as you
> present it, but with metasyntactic variables in italics or angle
> brackets or whatever and typed text in a fixed-width font.

I am afraid this is a limitation of linuxdoc.
 
> 3. The info formatting is screwed up if you have section headings with
> <tt> parts.  That info can't deal with `' in node names is a
> deficiency of info which the GNU people know about, but linuxdoc-sgml
> should work around it by stripping out the `' when they appear in a
> node name.

I improved this in 0.9.30. I hope most problems with ' are fixed now (there
is one more fix in the experimental branch, waiting for the lenny release).
There is one more problem related to the info backend: the lack of a
paragraph/subparagraph equivalent.
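
As an illustration only (this is not the code used in linuxdoc-tools), the
workaround requested in point 3 could be sketched in Python as below. The
function name strip_info_quotes is hypothetical, and the sketch assumes that
only matched `...' pairs need to be removed from a heading before it is used
as a node name.

```python
import re

# Matches a `...' pair whose content contains no further quote characters.
_QUOTED = re.compile(r"`([^`']*)'")


def strip_info_quotes(heading):
    """Remove matched `...' quote pairs from a heading (hypothetical helper).

    Unpaired apostrophes, as in "don't", are left untouched.
    """
    previous = None
    # Repeat until nothing changes, so that ``nested'' pairs are handled too.
    while previous != heading:
        previous = heading
        heading = _QUOTED.sub(r"\1", heading)
    return heading
```

Stripping only matched pairs, rather than every ' character, keeps ordinary
apostrophes in headings intact.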

> 4. There are unpredictable gaps around <tscreen><verb> blocks.  There
> should either be no gap (I would prefer this, or at least like to have
> a way to get it) so that you could use them in a running paragraph
>               like this
> or there should consistently be a gap on both sides or only
> afterwards.  (By unpredictable I mean that info produces a gap after,
> but HTML a gap on both sides.)  Whether or not there is supposed to be
> a gap should be documented; if some of the output formats can't do it
> (eg HTML) then fair enough.

For the text format, the last maintainer added an optional limit on the
number of consecutive empty lines allowed. Far from optimal, but at least
a bit better than before. On the downside, this also affects empty lines
in <tscreen> and similar environments.
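
To make the idea concrete, here is a minimal Python sketch of such a limit.
It is not the actual linuxdoc-tools implementation; the function name
limit_blank_lines and the max_blank parameter are assumptions made for the
example.

```python
def limit_blank_lines(text, max_blank=2):
    """Keep at most max_blank consecutive empty lines (hypothetical helper)."""
    kept = []
    blank_run = 0
    for line in text.split("\n"):
        if line.strip() == "":
            blank_run += 1
            if blank_run > max_blank:
                # Too many empty lines in a row: drop this one.
                continue
        else:
            blank_run = 0
        kept.append(line)
    return "\n".join(kept)
```

Because the filter works on the finished text and knows nothing about
markup, it would also squeeze empty lines that were intentional inside
verbatim blocks, which is the drawback described above.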

> 5. The cross-referencing mechanism is rendered so differently in all
> the output formats that I tried that it's practically impossible to
> think of text that's suitable for putting around the <ref...> and in
> the name="...".  I'd like to see some markup with which I could
> produce:
>   see also <A href="...">section 2.3 on wombats</A>.  (HTML)
>   see also section 2.3 on wombats (page 23).          (PostScript)
>   see also section 2.3 on wombats (line 200).         (Text)
>   see also *Note: wombats.                            (Info)
> Perhaps I can already do this.  If so then I'd like to see it
> documented :-).

I am afraid the default behavior is more like the info output, unless you
hardcode more info.

> 6. There are several other oddities with the info formatting.
> (a) Right at the start of programmer.info I see:
>     \input texinfo

For us this is not needed, but the texinfo format manual says that it should
be there (I guess it is only needed when calling texinfo itself).

> (b) All the bits where I said <em> have come out with `' round them.
>     I think something like *this* or _this_ would be more
>     appropriate.

Currently using @emph, so this should be fixed.

> (c) List headings (<descrip> <tag> HERE </tag> have `' round them too.
>     Is this really appropriate ?

They are rendered in .texi as @table + @item, and it is makeinfo that adds
the `'. I do not see an easy alternative.

> (d) The filename heading before the first node lists a file in /tmp.
>     I don't know if this is fixable, but it would be nice if it said
>     that the thing came originally from programmer.sgml or whatever.

The output file can be postprocessed to change this info. Unfortunately,
makeinfo is too rigid to do this when first creating the info file. I am not
sure this is worth the effort, but it is something I do not rule out.
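
A rough Python sketch of that kind of postprocessing follows. It is not
part of linuxdoc-tools: the function name rewrite_info_source is
hypothetical, and the sketch assumes that the temporary file name appears on
the first line of the generated info file, which may differ between makeinfo
versions.

```python
def rewrite_info_source(info_text, temp_path, source_name):
    """Replace temp_path with source_name on the first line of an info file.

    Only the first line is touched, so later occurrences of the temporary
    path (if any) are left alone.
    """
    first_line, newline, rest = info_text.partition("\n")
    return first_line.replace(temp_path, source_name) + newline + rest
```

For instance, it could be called with the temporary name
/tmp/sgml2info15281tmp2 from the report and programmer.sgml as the
replacement.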

> (f) When formatting using sgml2info I see:
>     Making info file `programmer.info' from `/tmp/sgml2info15281tmp2'.
>     /tmp/sgml2info15281tmp2:73: Misplaced `{'.
>     /tmp/sgml2info15281tmp2:75: Misplaced `}'.

I do not see this error any longer; it may have been fixed by some of the
previous changes.

> 7. There are some oddities with the text formatting.
> (a) The table of contents has some mangled entries, possibly it too
>     cannot cope with <tt> inside a heading.

Maybe. I rewrote the text backend and improved toc generation. Things seem
to be working better now, although with a single font.

> (b) The title page/screen/thing is quite cheesy.  A bit more
>     whitespace, and having some of it centred and/or underlined (ie
>     with a seperate row of characters) would be better.
> (c) <em> doesn't show up at all.  How about *this* ?

groff escapes are currently used.

> (d) Section headings come *after* the relevant paragraph !  They also
>     don't stand out at all.  Perhaps this is due to my failure to use
>     <P> after section headings.  Why can't the SGML parser infer this
>     ?  I thought SGML was supposed to be good at that kind of thing.

Yes. This is documented in the manual.

> (e) Footnotes have a rather odd format; it looks deliberate, though.
>     I presume there's no better way to do this ?  It would be good at
>     least to leave footnotes until after the end of the paragraph.

That seems equally confusing, and the alternative of leaving footnotes to
the end of the text seems even worse.

> (f) Cross-references come out with ``'' around them.  Why ?

I guess they were put there to better signal the reference, even if no groff
escapes are used.

> (g) Using <tscreen> inside a <descrip> resets the indenting.  I
>     presume this is very hard to fix.

I am afraid so. But I am not very fluent with groff, so help is welcome.

> (h) The existence of a paragraph break after <descrip> entries is
>     unpredictable.

At first glance it seemed that groff gets confused by multiple consecutive
<p> and empty lines. After a deeper look, my guess is that this is a
page-break related problem: with the current code, groff seems to be
breaking pages in what is supposed to be continuous text output, and things
get joined at a page break.

The workaround is to play with the number of <p> or empty lines. Really
suboptimal. I wish I could tell groff that this is continuous text.

> 8. There was one oddity with the HTML formatting: my section headings
> come out with the following paragraph attached.  Interestingly, I can
> see from the HTML that is generated that it has spotted the end of the
> heading, but it has only put the </A> there and not the </H2>.

I guess you mean #182775, which should have been fixed in 0.9.21-0.6.
 
> I hope noone minds me making these comments.  I'm trying to be helpful
> and constructive by supplying bug reports :-).

No problem at all from my (limited) side. Sorry my reply took so long; I
tried to look at most elements of your really exhaustive bug report before
actually replying.

So, you at least now know the current status of this bug report.
Comments on what to do, suggestions, patches and the like are of course
welcome. If you have any, please look first at the experimental branch in
the git repo, where the most up-to-date code resides.

Regards,

-- 
Agustin




