[Shootout-list] Directions of various benchmarks

Jon Harrop jon@ffconsultancy.com
Wed, 18 May 2005 18:19:53 +0100


On Wednesday 18 May 2005 17:20, Isaac Gouy wrote:
> --- José Bollo <jose.bollo@tele2.fr> wrote:
> > that would be a long debate; here is my humble opinion: relaxing the
> > 'same way' rule will improve the competition, and then every
> > implementer will copy the algorithm of the fastest realisation. So
> > 'no same way' will tend towards 'same way'. BUT what will happen for
> > languages where there is no competitor? Their performance will
> > degrade over time. Not so good, then, because everyone has to watch
> > each other to stay in.
>
> Correct - this results in a cycle of rewriting with different
> algorithms until programmers become exhausted or bored - it measures
> programmer effort.

I disagree with José and I partly disagree with Isaac here.

Not every implementor will be able to copy the algorithm of the fastest
implementation, because not all languages can express algorithms
sufficiently succinctly to stay within the 100 LOC limit.

Regarding rewriting with different algorithms: firstly, I think it is not
necessarily a bad thing for the shootout to measure programmer effort to
some extent, as obscure and rarely used languages will not receive the same
effort as common languages, and it is useful to know which languages are
obscure and rarely used. Secondly, in my experience, the algorithm "search
space" is only inexhaustibly large for medium to large programs
(i.e. >1000 LOC), so I don't think there will be a "problem" with radical
new algorithms in 100-line programs.

My ray tracer is a good example here. My original C, C++ and OCaml
implementations all use a scene tree, spherical bounding volumes and a
generic intersection routine. There are many variations on this theme (I've
tried most of the simple alterations):

1. Replace the scene tree with a direct array (e.g. a skip array).
2. Replace the explicitly stored scene tree with an implicit functional
scene.
3. Tighten the spherical bounds.
4. Use some other bounding geometry (e.g. axis-aligned cuboids or a k-D
tree).
5. Have a specialised intersection routine for shadow rays which terminates
when any intersection is found, without trying to find the nearest
intersection.
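To make the design concrete, here is a minimal C++ sketch of the structure
described above plus optimisation 5 (the any-hit shadow-ray routine). This
is an editor's illustration, not the actual ray tracer code; all names
(Vec, Node, nearest, occluded) are invented for the example:

```cpp
// Sketch of a scene tree with spherical bounding volumes, a generic
// nearest-hit intersection routine, and the specialised any-hit routine
// for shadow rays (optimisation 5). Illustrative only, not the real code.
#include <cassert>
#include <cmath>
#include <limits>
#include <memory>
#include <vector>

struct Vec { double x, y, z; };
Vec operator-(Vec a, Vec b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
double dot(Vec a, Vec b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Ray { Vec orig, dir; };  // dir is assumed to be normalised

// Ray-sphere test: distance to the nearest positive hit, or infinity.
double ray_sphere(const Ray &r, Vec centre, double radius) {
  const double inf = std::numeric_limits<double>::infinity();
  Vec v = centre - r.orig;
  double b = dot(v, r.dir);
  double disc = b * b - dot(v, v) + radius * radius;
  if (disc < 0) return inf;
  double d = std::sqrt(disc), t2 = b + d;
  if (t2 < 0) return inf;
  double t1 = b - d;
  return t1 > 0 ? t1 : t2;
}

// Scene tree: a node is either a leaf sphere or a bounding sphere
// enclosing its children.
struct Node {
  Vec centre;
  double radius;
  std::vector<std::unique_ptr<Node>> children;  // empty => leaf
};

// Generic intersection: nearest hit, pruning subtrees whose bounding
// sphere is missed or lies beyond the current nearest hit.
double nearest(const Ray &r, const Node &n, double best) {
  double t = ray_sphere(r, n.centre, n.radius);
  if (t >= best) return best;        // bound missed or too distant
  if (n.children.empty()) return t;  // leaf: a real hit
  for (auto &c : n.children) best = nearest(r, *c, best);
  return best;
}

// Specialised shadow-ray routine: any hit closer than `limit` suffices,
// so stop at the first one instead of searching for the nearest.
bool occluded(const Ray &r, const Node &n, double limit) {
  double t = ray_sphere(r, n.centre, n.radius);
  if (t >= limit) return false;
  if (n.children.empty()) return true;
  for (auto &c : n.children)
    if (occluded(r, *c, limit)) return true;
  return false;
}
```

Note how little the two routines differ, and why the specialised one only
pays off once there is LOC budget to duplicate the traversal.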

The C implementation is already >100 LOC so none of these optimisations can
be applied there. The C++ implementation is almost 100 LOC so some
performance-reducing simplifications have to be applied before more
sophisticated approaches can be added. The OCaml implementation has plenty
of room and can implement most of these proposed changes without even
reaching 100 LOC.

IMHO, despite the artificial 100-LOC limit, the results accurately reflect
real-world programming. When writing my vector graphics library:

  http://www.chem.pwf.cam.ac.uk/~jdh30/programming/opengl/smoke/

the 8,000-LOC C++ implementation proved too verbose for me (a single
programmer) to overhaul in order to implement various important
optimisations. When I rewrote the program in OCaml, equivalent
functionality required 2,000 LOC, and the latest version (16,000 LOC)
contains a vast amount of extra functionality which would be totally
inaccessible to a single C++ programmer.

I had a similar experience when reimplementing part of Mathematica for
Wolfram Research. My reimplementation of the lexer, parser and core
interpreter was 800 lines of OCaml. Their original implementation was
100,000 lines of C. My JIT compiler was 2,000 LOC and implemented
functionality completely inaccessible to them (dozens of programmers)
from C.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists