[Shootout-list] Directions of various benchmarks

Bengt Kleberg bengt.kleberg@ericsson.com
Fri, 27 May 2005 09:51:15 +0200


On 2005-05-25 14:28, John Skaller wrote:
...deleted
> I have some basic code now that does some benchmarking,
> mainly to check that my Felix optimisation work is going ahead:
> the whole thing, including benchmarking and the web display, is fairly simple,
> but I'm finding some extensions vital already: 
> 
> * I have to compare several Felix versions
> * I must be able to handle multiple test machines
> * I need to limit the testing time
> * I need to know how accurate the results are

the current shootout does the time limitation bit on a per-test basis. 
do you mean the same thing here, or do you mean on a global scale (i.e. 
run these tests on these languages in no more than x hours, adjusting 
all ''n'' to fit the time)?
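to make the difference concrete, a rough python sketch of the two 
policies. the Test fields, the numbers and the linear-cost assumption 
are all invented for illustration:

from dataclasses import dataclass

@dataclass
class Test:
    name: str
    n: int                 # the workload argument
    seconds_per_n: float   # measured or guessed cost per unit of n

PER_TEST_LIMIT = 300.0     # seconds; a fixed per-test cap

def exceeds_per_test_limit(test):
    # policy 1: every test is cut off at the same fixed limit
    return test.seconds_per_n * test.n > PER_TEST_LIMIT

GLOBAL_BUDGET = 4 * 3600.0  # "no more than x hours"

def fit_to_budget(tests, budget=GLOBAL_BUDGET):
    # policy 2: one budget for the whole run; shrink every n
    # proportionally, assuming cost is roughly linear in n
    estimate = sum(t.seconds_per_n * t.n for t in tests)
    if estimate > budget:
        scale = budget / estimate
        for t in tests:
            t.n = max(1, int(t.n * scale))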


> My basic idea for generalisation is a 'test record',
> which reports the result of a single test, and contains
> identifying information like:
> 
> * datetime
> * test machine key
> * test key
> * test argument
> * result

do you mean result as in the current 3 metrics (time, memory usage and 
loc), or result as in the outcome (success/failure) of a test?
are these text or binary records?
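for what it is worth, here is how i picture such a record in python. 
the field names follow your list; the three metrics and the json-lines 
encoding are my assumptions, not anything decided:

from dataclasses import dataclass, asdict
from datetime import datetime
import json

@dataclass
class TestRecord:
    timestamp: str       # the datetime, ISO 8601
    machine_key: str     # key into the test-machine table
    test_key: str        # identifies the test that ran
    argument: int        # the n it was run with
    time_s: float        # metric 1: elapsed time
    memory_kb: int       # metric 2: peak memory usage
    loc: int             # metric 3: lines of code

def to_line(r):
    # one json object per line: text, appendable, and two files
    # of these merge by simple concatenation
    return json.dumps(asdict(r))

print(to_line(TestRecord(datetime.utcnow().isoformat(), "amd64-box-1",
                         "ackermann/felix", 11, 4.2, 1024, 37)))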


> and to have a huge list of these to which results
> from many sources can be aggregated over time. 
> 
> Auxiliary tables include a description of the
> test machines by key: processor, memory, cache size,
> speed, hostname, etc.
> 
> A test consists of a source key (which identifies
> the code to execute), a translator key (which
> identifies the translator), a script to build the
> test, and a test to run it (which accepts the
> argument).

ok with this, apart from the last item. what do you mean by ''a test 
to run it''?
moreover, you might want to have the build depend upon both the 
translator key and the test key, since the shootout has different 
build flags (and test flags) for different translators and tests; 
something like the lookup sketched below.
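a toy python lookup to show what i mean; every flag value here is 
invented, only the (translator, test) key shape matters:

DEFAULT_FLAGS = {
    "gcc": ["-O3"],
    "flx": ["--optimise"],
}
SPECIFIC_FLAGS = {
    ("gcc", "ackermann"): ["-O3", "-fomit-frame-pointer"],
}

def build_flags(translator_key, test_key):
    # fall back to the translator's defaults when no
    # (translator, test) specific flags exist
    return SPECIFIC_FLAGS.get((translator_key, test_key),
                              DEFAULT_FLAGS.get(translator_key, []))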


> With this 'database-like' kind of design, there are
> three jobs -- running benchmarks, analysing the
> results, and displaying them.

...deleted

> I also need multiple architecture results. One immediate
> reason is: ackermann's function with Felix is now:
> 
> * the same speed as Ocamlopt and gccopt on AMD64
> * FASTER than gccopt on x86
> 
> Anyhow, I wonder if refactoring the Shootout into
> 3 separate processes as indicated above, with
> documented data structures interconnecting them,
> would make sense. In particular, to run the tests
> and get a single 'set' of output, which is mergable
> with other sets and can be fed into a statistical
> analyser to generate data for the web site, or
> other display mechanism.

sounds ok to me. i think it ought to be very simple to get the latest 
results for all languages and all tests out of the system, since i 
suppose that is the number one use of this data.
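given an aggregated list of such records (dicts shaped like the record 
sketch above; key names are my assumptions), that query is a one-pass 
scan in python:

def latest_results(records):
    # keep only the newest record per (test, machine) pair
    latest = {}
    for r in sorted(records, key=lambda r: r["timestamp"]):
        latest[(r["test_key"], r["machine_key"])] = r
    return latest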


one thing i consider important is to have a per-language 
configuration. i would appreciate it if i could add flags for a new 
language without editing a file that holds lots of other languages' 
flags; see the sketch below.
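what i have in mind is one small file per language, e.g.:

conf/
    gcc.conf       # only gcc's flags live here
    ocamlopt.conf
    felix.conf     # adding a language == adding a file

and a loader along these lines (the key=value format is just an 
example):

import os

def load_language_conf(conf_dir="conf"):
    conf = {}
    for name in os.listdir(conf_dir):
        if name.endswith(".conf"):
            with open(os.path.join(conf_dir, name)) as f:
                # ignore comments and blank lines
                conf[name[:-5]] = dict(
                    line.strip().split("=", 1)
                    for line in f
                    if "=" in line and not line.lstrip().startswith("#"))
    return conf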



bengt