[Shootout-list] Directions of various benchmarks

John Skaller skaller@users.sourceforge.net
Sat, 21 May 2005 14:45:16 +1000


On Wed, 2005-05-18 at 10:49 +0200, Pascal Obry wrote:
> José,
>
>  > It is usefull as a supported feature of the language yes/no.
>
> Yes, but don't see the point for the Shootout. We are trying to compare
> languages.

The ease and nature of bindings to C are a part of most practical
language designs, so why not exhibit this? The performance of bindings
is also a concern. For example, no one cares much about the cost
of binding to Gtk -- however numerical people care a LOT about
bindings to Fortran data types, because there are a lot of very
fast, even standardised, Fortran libraries around
(LAPACK/BLAS for example). To consider using C, or OCaml,
the numerical programmer needs to know whether Fortran-compatible
arrays are supported. This directly influenced the design of
recent extensions to OCaml to support unboxed big arrays
of floating-point types -- it basically gives a zero-cost interface
to these Fortran codes from OCaml.
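
To make "Fortran-compatible array" concrete, here is a minimal C++ sketch
of the layout in question: unboxed doubles in one contiguous block, stored
column-major. The type name ColMajorMatrix and its members are hypothetical,
chosen only for illustration; no actual LAPACK/BLAS call is shown.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper type: a dense matrix stored column-major, the
// layout Fortran routines such as those in LAPACK/BLAS expect.
struct ColMajorMatrix {
    std::size_t rows, cols;
    std::vector<double> data;  // unboxed doubles in one contiguous block

    ColMajorMatrix(std::size_t r, std::size_t c)
        : rows(r), cols(c), data(r * c, 0.0) {}

    // Element (i, j), zero-based: each column is contiguous in memory.
    double& at(std::size_t i, std::size_t j) { return data[j * rows + i]; }

    // Pointer that could be handed to a Fortran routine without copying.
    double* raw() { return data.data(); }
};
```

Because the storage already has the layout the library expects, crossing
the language boundary only requires passing raw() and the dimensions;
nothing has to be converted or copied.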

However, this is not meant to be 'open slather' for using C libraries
for everything. Rather, a couple of tests might be designed
to demonstrate a simple, and a not quite so simple, C binding,
and to measure the cost of using it.

> It is better to stick (when possible) to plain Ada or plain Eiffel
> or plain put-your-language-here otherwise most of the data will have no
> meaning. If everybody uses the same C regexp package we will end-up with the
> same speed, I don't think anybody will expect something else... great
> achievement !

Ah, but you're assuming the C library use is the *best* solution
and also the fastest. It may not be best because of the work involved,
it may not be fastest because the native language can do it easily
without paying the cost of a binding, and it may be that the C library
is intrinsically inferior in some circumstances to another solution,
such as the Felix regexp facility, which generates a DFA, whilst PCRE
uses a messy backtracking NFA-like model.
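
As a rough illustration of why a DFA can win, here is a hand-written C++
sketch for the made-up pattern ab*c. It is not what Felix actually
generates; the function names and the pattern are assumptions for the
example. The point is only that a DFA takes exactly one transition per
input character and never backtracks.

```cpp
#include <string>

// Hand-built DFA for the illustrative pattern ab*c (whole-string match).
// States: 0 = start, 1 = seen 'a' then zero or more 'b',
//         2 = accept, 3 = dead.
int step(int state, char ch) {
    switch (state) {
        case 0: return ch == 'a' ? 1 : 3;
        case 1: return ch == 'b' ? 1 : (ch == 'c' ? 2 : 3);
        default: return 3;  // no edges leave accept; dead stays dead
    }
}

// One transition per input character, no backtracking.
bool matches_ab_star_c(const std::string& input) {
    int state = 0;
    for (char ch : input) state = step(state, ch);
    return state == 2;
}
```

The running time is linear in the length of the input regardless of the
pattern, whereas a backtracking matcher may retry the same input
positions many times on unfavourable patterns.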

Another, trivial, example is sorting: a C++ template is vastly superior
to C's qsort -- it uses a better algorithm, AND it avoids boxing
(access through pointers). FISh 1.6 can sort something like 40% faster
than C, using the same algorithm, just by avoiding boxing.
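
The boxing point can be sketched in C++ by putting the two interfaces
side by side. The wrapper names below are hypothetical; std::qsort and
std::sort are the standard library calls. No timings are claimed here,
only the difference in how elements are accessed.

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// Comparator for qsort: elements arrive as untyped pointers, and every
// comparison is an indirect call through a function pointer.
int compare_ints(const void* a, const void* b) {
    int x = *static_cast<const int*>(a);
    int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);  // avoids the overflow risk of x - y
}

void sort_with_qsort(std::vector<int>& v) {
    if (v.empty()) return;  // do not hand qsort a possibly null pointer
    std::qsort(v.data(), v.size(), sizeof(int), compare_ints);
}

// The template version knows the element type at compile time, so the
// comparison can be inlined and operates on the ints directly.
void sort_with_template(std::vector<int>& v) {
    std::sort(v.begin(), v.end());
}
```

Both produce the same ordering; the difference is that qsort must treat
every element as an opaque block of bytes, while the template is
specialised for the actual type.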

The real problem is defining what 'plain put-your-language-here'
actually means. C++ naturally includes C as a 'subset' and
so does Felix -- and looking at some of the Eiffel and Ada binding
stuff it would seem at least two other implementations also treat
efficient and simple binding to C as an important concern.

Perhaps the 'border' between Eiffel and C, and between Ada and C,
is clear, but between C++ and C, and between Felix and C++ and C,
it is not so clear. (I will have to look at D again too ..)

I guess the point is, most languages are designed to use libraries,
and many ship without very much in their 'standard' for various
reasons, so it isn't entirely reasonable to just discount the
C binding ability of these systems, since often the intent is
to enhance the core system with C bindings.

In most of these systems, the 'core' data structures are
very often constructed exactly like an external C library,
even if originally they were not -- for example Python used
to have 'built-in' dictionaries, but today they are NOT
built-in, they're just another data structure: the only
real language support for the standard ones is that
the language has constructors (dictionary literals)
that build them. Once built, they're manipulated just
like any other associative data structure -- by binding
dynamically using the object system.

Another example from the real world: many C++ programmers
simply assume 'boost' as part of their 'standard library'.
I would argue 'C++ without boost' and 'C++ with boost' are
distinct translators -- the set of 'extensions' a translator
is allowed to use is part of the specification of the translator.

Thus, gcc, and gcc + PCRE are two distinct translators: the first
cannot use PCRE (it's an external library). The second can,
because it is *defined* as part of the translator.

To avoid a plethora of translators and external libraries ..
we just have to design tests that don't benefit from them.

-- 
John Skaller, skaller at users.sf.net
PO Box 401 Glebe, NSW 2037, Australia Ph:61-2-96600850
Download Felix here: http://felix.sf.net