League Problems

Griz:

--- Quote from: Light ---Surely if it is a statistical draw after x rounds it should count as a loss to the challenger and if they are entered in their current order in the leagues you should get the same result? or am I missing something?
--- End quote ---
yes.  

there is no 'statistical draw' as it stands now ...
if there is no 'statistical decision' ... it goes on and on and on and on ...
never ending.
that's one point ... how long do we let it do so before understanding
that a statistical draw means a draw?  so pick one already!
however ...
it should go to the bot which has won the most rounds up to then ...
because with 'statistical draw' set up as it is now ...
a challenger can have won many more rounds than the defender ...
and although it might be fine to give the match to the defender
IF it has already been established as being ranked higher ...
we are talking here about establishing the ranking in the first place ...
the 'initial' ranking ... establishing a league ranking ...
and in that case, when no 'order' has yet been imposed ...
there is no 'challenger/defender' per se ...
that has yet to be determined ...
so it should go to the bot having the most rounds won at that
time, in that particular case ... because that's all we have to
work with.
and I am telling you, as it is set up now ...
there are factors other than bot skill in play here ...
which have more of an effect in determining who 'ranks higher'.
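
[a minimal sketch, in Python, of the two ways of settling a match that never reaches a statistical decision. the function name and the policy labels are made up for illustration; they are not options in the actual program.]

```python
def settle_unfinished_match(challenger_rounds, defender_rounds, policy="most_rounds"):
    """Pick a result for a match that was stopped without a statistical winner.

    The policy names are hypothetical labels:
      "defender"    - the defender takes it whatever the score
      "most_rounds" - the bot with more rounds won takes it; equal scores are a draw
    """
    if policy == "defender":
        return "defender"
    if policy == "most_rounds":
        if challenger_rounds > defender_rounds:
            return "challenger"
        if defender_rounds > challenger_rounds:
            return "defender"
        return "draw"
    raise ValueError(f"unknown policy: {policy}")


# A challenger ahead 12 rounds to 7 when the match is stopped:
print(settle_unfinished_match(12, 7, policy="defender"))
print(settle_unfinished_match(12, 7, policy="most_rounds"))
```

under "defender" the score at the stopping point is ignored; under "most_rounds" it decides the match, which only matters when no ranking exists yet to say who the defender is.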

let me give you an example of how the present 'initial ranking' ...
ie ... the establishing/rerunning of a league ...  
leaves something to be desired ...
that the results will vary greatly depending upon
the initial positions/ranking.
[if you don't believe it ... run the following league ...
then randomly alter the initial order and run it again.]

I just finished running a 'mini-league' of six bots ...
starting positions [everybody has to be somewhere]
1 - Carnatus Orbis  
2 - Kyushu
3 - HDV4  
4 - Vex Pefidiosos  
5 - Devincio Eversor  
6 - D Scarab 3  

the League results, final ranking:

1 - Carnatus Orbis  
2 - Vex Pefidiosos  
3 - HDV4  
4 - Devincio Eversor  
5 - Kyushu  
6 - D Scarab 3  

now ... as it turned out, D.Scarab only had one match ...
which it lost to Kyushu [for some strange reason] ...
so it never had a chance to go up against anyone else.
also ... Carnatus Orbis only had 3 matches ...
and only Kyushu ever went up against all other 5.
there were actually a total of only 9 matches played ...
out of the 15 possible combinations/iterations.
[1-2, 2-3, 1-3, 2-4, 3-4, 1-4, 2-5, 3-5, 2-6 using the initial order]

so in addition to keeping track of who beat who in
league play, I also ran those other contests ...
[1-5, 1-6, 3-6, 4-5, 4-6, 5-6] note 6's (D Scarab) 4 missed matches
with a surprising result ...
ole D Scarab 3 kicked butt not only on Devincio Eversor,
HDV4, and Vex but Carnatus Orbis as well!!!
yet here he is ranked 6th, 5 behind the 'champ' ...
whom he is able to defeat.
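
[the bookkeeping above can be checked with a short Python sketch. the position numbers and the list of played matches are the ones given in this post; the variable and function names are just for illustration.]

```python
from itertools import combinations

# Initial league positions from the post (1 = top).
bots = {1: "Carnatus Orbis", 2: "Kyushu", 3: "HDV4",
        4: "Vex Pefidiosos", 5: "Devincio Eversor", 6: "D Scarab 3"}

# The 9 matches the league actually ran, as pairs of initial positions.
played = {(1, 2), (2, 3), (1, 3), (2, 4), (3, 4), (1, 4), (2, 5), (3, 5), (2, 6)}

all_pairs = set(combinations(sorted(bots), 2))   # every bot against every other
missed = sorted(all_pairs - played)              # pairings the league never ran


def matches_for(position, pairs):
    """Count how many of the given pairs involve one league position."""
    return sum(position in pair for pair in pairs)


print(len(all_pairs), len(played), missed)
print({bots[p]: matches_for(p, played) for p in bots})
```

the second print shows how unevenly the league spread its matches across the six bots.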

in a one-on-one test of all 15 combinations of two bots ...
here is my final tally and what the ranking should be, imo:

1 - D Scarab 3 ........... W-4  L-1
2 - Carnatus Orbis ..... W-4  L-1  
3 - Vex Pefidiosos ...... W-3  L-2  
4 - HDV4 ................... W-2  L-3
5 - Devincio Eversor ... W-1  L-4  
6 - Kyushu ................. W-1  L-4  

D.Scarab over C.Orbis 'cause in their
match he prevailed.
quite a difference, eh?

so ... just trying to point out ...
to bring to attention ...
when establishing the initial league ranking ...
ie ... running a league the first time ...  
the results are going to be highly dependent upon
the starting positions of the bots ...
as this will determine who gets to go up against who ...
and I'm sorry, imo, ime ...
that does not result in a ranking that can be considered
to be 'statistically correct'.
it might look/sound good ...
but methinks we are fooling ourselves.
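
[a rough Python sketch of the order dependence. two things here are assumptions, not facts from the program: (1) the ladder rule, where each bot enters at the bottom and keeps challenging the bot directly above until it loses once, is inferred from the match list in this post and is not taken from the actual league code; (2) only D Scarab 3's four wins and its loss to Kyushu are stated outright, and the remaining one-on-one results are reconstructed from the W-L tally.]

```python
# One-on-one results: bot -> set of bots it beats (partly reconstructed, see above).
BEATS = {
    "D Scarab 3": {"Carnatus Orbis", "Vex Pefidiosos", "HDV4", "Devincio Eversor"},
    "Carnatus Orbis": {"Vex Pefidiosos", "HDV4", "Devincio Eversor", "Kyushu"},
    "Vex Pefidiosos": {"HDV4", "Devincio Eversor", "Kyushu"},
    "HDV4": {"Devincio Eversor", "Kyushu"},
    "Devincio Eversor": {"Kyushu"},
    "Kyushu": {"D Scarab 3"},
}


def run_ladder(initial_order):
    """Assumed rule: each bot enters at the bottom and keeps challenging
    the bot directly above it until it loses once."""
    ranking, matches = [], 0
    for bot in initial_order:
        spot = len(ranking)              # enter below everyone ranked so far
        while spot > 0:
            matches += 1
            if ranking[spot - 1] not in BEATS[bot]:
                break                    # defender wins, challenger stops here
            spot -= 1                    # challenger wins, moves up one place
        ranking.insert(spot, bot)
    return ranking, matches


post_order = ["Carnatus Orbis", "Kyushu", "HDV4", "Vex Pefidiosos",
              "Devincio Eversor", "D Scarab 3"]
print(run_ladder(post_order))          # same bots, order as in the post
print(run_ladder(post_order[::-1]))    # same bots, order reversed
```

under these assumptions the posted order reproduces the league's final ranking in 9 matches with D Scarab 3 last, while the reversed order takes 11 matches and puts D Scarab 3 first. same bots, same one-on-one results, different ranking.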

again, please note:
I am talking about the initial running of a league to
determine that initial ranking.
do we really want the chance placement/positioning of
bots to be what determines the ranking?
I think not.
we want the bots to be able to be what determines
who ends up where.
do we not?

Jez:

--- Quote from: EricL ---I think I've found the crash.  Will post a buddy drop later tonight.
--- End quote ---

Thanks Eric, I look forward to trying it, if you put the source code up as well I shall use it with the debugger.


--- Quote ---how long do we let it do so before understanding
that a statistical draw means a draw? so pick one already!
--- End quote ---
Using the 'defender wins if contest exceeds' option is akin to picking one. After all, whatever the score at that point, even if it looks like one bot is winning hands down, the result is no more or less than a draw.
Also don't forget that when I run the league there will be no max limit of rounds. In all the reruns I have tried so far I have only seen them break the 400 round barrier in two matches.

The reason the defender wins: consider the way the leagues have been run up till now. It has always been the newer bot as the challenger, and the newer bot is considered to have the advantage after all.

Considering the point that the initial order of the bots makes a difference to the final result, which I am not questioning, would you feel happier if the initial order of the bots was set by their age? Oldest first - newest last? Shouldn't be too hard to set up. I would be happier doing it like that than changing it so that it was most rounds won at a certain point, and it would be easier to do than a Round Robin for the whole league.

Griz:

--- Quote from: Jez ---My results for the second round of the RR, (the One v all) so far stands at:

The One v:
Callidus (lost 0/5)
--- End quote ---
???
I assume you mean The One lost 5 times?
that's what I found.


--- Quote ---Spanish C 5 crashes (error76)/ 3 nonexistent bot results/ 2 freezes
--- End quote ---
hmmmmm ....
is this the crash Eric said he thinks he has sus'd out?
I haven't seen that one in a long time ...
except when it's my own fault, failure to get the path right.

I had The One and Spanish C locked in battle ...
19 rounds to 19 rounds  ...
the 39th round going on and on and on ...
so called them a Draw
 
so I have
The One vs  
Won 7W-1L ..........  DIN
Won 5W-0L ..........  Darth
Draw 19W-19L ...... Spanish C
Lost 0W-5L ..........  Callidus
Lost 1W-7L ..........  Animal S

have additional results for all 6 ....
but have to compile them.
will put them up later today if I get a chance.
also the next 12 will follow before long.

Griz:
F1 League Experimentation
the 'first' 6 bots

Callidus vs
Won .. 5-0 ... Animal S
Won .. 5-0 ... Spanish C
Won .. 5-0 ... The One
Won .. 5-0 ... DIN
Won .. 5-0 ... Darth  

Spanish C vs
Won .. 5-0 ....  Animal S
Won .. 5-0  .... DIN
Draw 19-19 ... The One
Draw 19-19 ... Darth
Lost ... 0-5 ..... Callidus

Animal S vs
Won .. 7-1 ... The One
Won .. 5-0 ... DIN
Won .. 5-0 ... Darth
Lost ... 0-5 ... Callidus
Lost ... 0-5 ... Spanish C
 
The One vs  
Won .. 7-1 ..... DIN
Won .. 5-0 ..... Darth
Draw 19-19 ... Spanish C
Lost ... 0-5 .... Callidus
Lost ... 1-7 .... Animal S

DIN vs
Won .. 5-0 ... Darth
Lost ... 0-5 ... Callidus
Lost ... 0-5 ... Animal S
Lost ... 0-5 ... Spanish C
Lost ... 1-7 ... The One

Darth vs
Draw 19-19 .. Spanish C
Lost ... 0-5 ... Callidus
Lost ... 0-5 ... Animal S
Lost ... 0-5 ... The One
Lost ... 0-5 ... DIN

basing a 'draw' on 38 rounds ...
just because that's where these two
ended up ... with no end in sight ...
the point at which, imo:
'statistically no winner' = a draw.

and so my ranking based on W-L-D ...
W = 1, L = -1,  D = 0.5
Bot .............. W ... L ... D ....... Pts
1 - Callidus ..... 5 ... 0 ... 0 .......  5
2 - Spanish C .... 2 ... 1 ... 2 .......  2
3 - Animal S ..... 3 ... 2 ... 0 .......  1
4 - The One ...... 2 ... 2 ... 1 .......  0.5
5 - DIN .......... 1 ... 4 ... 0 ....... -3
6 - Darth ........ 0 ... 4 ... 1 ....... -3.5

which in this case happened to end up
the same as what came out of Leagues:
 
1 - Callidus  
2 - Spanish Conquistador  
3 - Animal Supremus  
4 - The One  
5 - DIN  
6 - Darth Shimazu  
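
[the W-L-D scoring above, sketched in Python. the records are the ones tallied in this post, and the league order is written with the same short bot names; the variable and function names are only for illustration.]

```python
POINTS = {"W": 1.0, "L": -1.0, "D": 0.5}   # scoring used in the post

# Win / loss / draw counts from the one-on-one results above.
records = {
    "Callidus":  {"W": 5, "L": 0, "D": 0},
    "Spanish C": {"W": 2, "L": 1, "D": 2},
    "Animal S":  {"W": 3, "L": 2, "D": 0},
    "The One":   {"W": 2, "L": 2, "D": 1},
    "DIN":       {"W": 1, "L": 4, "D": 0},
    "Darth":     {"W": 0, "L": 4, "D": 1},
}

# Final order reported by Leagues for the same six bots.
league_order = ["Callidus", "Spanish C", "Animal S", "The One", "DIN", "Darth"]


def score(record):
    """Total points for one bot's W/L/D record."""
    return sum(POINTS[kind] * count for kind, count in record.items())


points_order = sorted(records, key=lambda bot: score(records[bot]), reverse=True)
for place, bot in enumerate(points_order, start=1):
    print(place, bot, score(records[bot]))
print("same as league order:", points_order == league_order)
```

the last line is the comparison being made here: for these six bots the two orders agree.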

but this isn't always the case.
I found some ranking 'discrepancies' while
running the next two batches of 6 ...
which I'm still compiling.
the League 'ranking', imo, is not taking into
account the influence that the initial starting
order imposes upon the resultant ranking.
[see details of that in the post I made earlier
today ... #71 in this thread ....
just 2 or 3 up from here.]

I'm just sayin' ....
if we are going for statistical validity ...
let's not miss the forest for the trees, eh?
 
just raising some issues that stick out
from my perspective.

Griz:

--- Quote from: Jez ---The reason the defender wins: consider the way the leagues have been run up till now. It has always been the newer bot as the challenger, and the newer bot is considered to have the advantage after all.

Considering the point that the initial order of the bots makes a difference to the final result, which I am not questioning, would you feel happier if the initial order of the bots was set by their age? Oldest first - newest last? Shouldn't be too hard to set up. I would be happier doing it like that than changing it so that it was most rounds won at a certain point, and it would be easier to do than a Round Robin for the whole league.
--- End quote ---

you are still missing the point, Jez ...
regardless of how you set them up ...
unless every bot gets a shot at every other bot ...
the 'ranking' isn't going to be accurate or fair to
all bots, and the ranking is NOT going to be correct.
once again ...
I am talking about setting up the initial ranking in a league ...
when no order has previously been determined.
that's a little different than having some new
bot challenge an established league/ranking ...
but even then, there are some problems ...
as I found with D Scarab 3.
Kyushu stopped him ...
but that was the only match Kyushu won ...
losing to everyone else above him ...
but still effectively barring D Scarab 3 from taking
them on and defeating every one of them!
for all I know, D Scarab may rise even further.
this is where the 'ranking' as is,  falls down.
sorry ... that's a fact jack ...
and I'm sorry if it makes it more difficult ...
but if we are so concerned with being statistically correct ...
then let's actually do it correctly.  
otherwise we are just pretending to have significant findings ...
remaining caught up in the details of justifying extending the
match to run endless numbers of rounds ...
when there are glaring errors in the set up that render all
of those precise calculations moot.
like ignoring there is an elephant in the room.

well ... imo ...
ideally ...
Leagues should be separated from the main DB ...
and then set up so each bot goes up against every
other to establish the initial 'pecking order'.
how a new bot would then challenge ...
is something I haven't thought much about yet ...
but I would still lean away from the present way ...
so one bot that might have his number ...
wouldn't prevent him from challenging others.

smaller leagues would be a big help, eh?
then giving the top ranked bot a shot at the
next level up.

btw ...
running 6 bots in leagues ...
5 would be the minimum # of matches ...
[each defeating the bot ranked directly below]
to rank them ...
and I think it could go as high as 15.
I'm finding it taking 9, 10, 11 so far.

now with 6 bots, the max # of matches needed
in order for each to go against each is 15.
so if we are already running around 10 matches
on average ... and running just 5 more (1/3 of
the full 15) would eliminate all the inaccuracies ...
might it not be worth the time to do so?
if we really are interested in being accurate?
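
[a quick Python sketch of how those two numbers grow with league size. the sizes other than 6 are arbitrary examples, and the function name is made up for illustration.]

```python
from math import comb


def match_counts(n_bots):
    """Fewest ladder matches vs. a full each-against-each schedule."""
    ladder_minimum = n_bots - 1      # every bot just beats the one ranked below it
    round_robin = comb(n_bots, 2)    # n * (n - 1) / 2 distinct pairings
    return ladder_minimum, round_robin


for n in (6, 12, 18, 30):
    low, full = match_counts(n)
    print(f"{n} bots: at least {low} ladder matches, {full} for each-vs-each")
```

the each-vs-each count grows roughly with the square of the league size (15 for 6 bots, 435 for 30), so the extra cost of a full schedule is much higher in a big league than in a league of 6.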

I don't know that this ratio holds for larger leagues ...
will have to think on it and experiment a bit.
not sure just how Eric has the sequence set up ...
seems like he mentioned tweaking it a bit
recently, but I haven't found just where yet.

just how difficult would it be to set it up so
each bot played each, Eric?
I would think it might be less complicated than
it currently is.

well ...
whatever.
