Author Topic: League Problems  (Read 26862 times)

Offline Jez

  • Bot Overlord
  • ****
  • Posts: 788
League Problems
« Reply #75 on: January 02, 2007, 07:08:45 PM »
Griz, I have already changed the league far more than I intended at the start. In the interests of continuity and out of respect for all the good work PY has done in the past, it will remain a challenge league while I am running it.
I have already trod on the toes of tradition by starting a rerun. The rules that the league was set up under were that it was a challenge match, and bug-exploiting bots, once they had gained their position, stayed there until a newer bot beat all the other bots underneath them.
The very first league was just a list on the forum and was run in just the same way. I can’t/won’t change that of my own accord.
Out of respect for all the hard work Eric puts in to chasing bugs down as quickly as possible and the fact that some of the older bots will survive in the league longer if these ‘cheat’ bots are removed (plus previous complaints about non-working bots being in the league at all), I thought a quick shake up of the current standings would be acceptable, though.

The current order of things in the leagues has been determined by 4+ years of matches.
Re-entering the bots in the current shake up by order of age would mean D. Scarab (2004) would be entered before Kyushu (2006).

A Round Robin style league, once the program supports it, would make for an interesting alternative style of league. Until then, with or without the win button, it would be a manually intensive, long drawn out way to decide the results.
I know you have done a lot of work along these lines, if you want to work out what the results would be I would be happy to create a Round Robin league so you can post the results. It is no good just setting up the initial rankings this way if you don’t allow new entries the same chance after all.

I do understand the point you are making but in the context of challenge matches the results will be both accurate and fair.
« Last Edit: January 02, 2007, 07:11:46 PM by Jez »
If you try and take a cat apart to see how it works, the first thing you have in your hands is a non-working cat.
Douglas Adams

Offline Numsgil

  • Administrator
  • Bot God
  • *****
  • Posts: 7742
League Problems
« Reply #76 on: January 02, 2007, 08:25:57 PM »
About the statistical significance of the leagues, each match is a statistical test with n trials that attempts to test the null hypothesis that two bots are evenly matched.  The number of trials n increases until that null hypothesis is rejected.

A contender is given the title of victor in a league match if it wins 1/2 n + sqrt(n) rounds, rounded up.  It should be easy to see that eventually, this will reduce to: whichever bot wins the majority of rounds.  This happens when n approaches infinity (because the big O of 1/2 n + sqrt(n) = big O of n).
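
A minimal Python sketch of that rule, for anyone who wants to check how many wins a match needs at a given round count. The function name `wins_needed` is made up for illustration and is not taken from the DarwinBots source; it only assumes the formula exactly as stated above.

```python
import math

def wins_needed(rounds_played: int) -> int:
    """Wins required after `rounds_played` rounds: n/2 + sqrt(n), rounded up."""
    return math.ceil(rounds_played / 2 + math.sqrt(rounds_played))

# The required share of wins drifts down toward a simple majority as n grows.
for n in (10, 100, 240, 1000, 10000):
    needed = wins_needed(n)
    print(f"n={n:>5}  wins needed={needed:>5}  share={needed / n:.3f}")
```

If the league really applies the rule in this form, a score of 110 to 130 after 240 rounds is still short of the 136 wins needed, which would explain a match dragging on.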

Now, I don't know what sort of confidence level this is using.  The thing with statistics is that when you reject the null hypothesis you're never sure you're not making an error.  Usually a confidence level of 95% is used, which means that when the null hypothesis is actually true you'll wrongly reject it 5% of the time.  Which means that when you run the leagues over, you might get a different result, because there's still that 5% error.  Or 1%, or .001%, or whatever the confidence level happens to be.  I'm going over my stats notes now to find out what confidence level we're using.
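
A rough way to put a number on it, assuming the rule is checked once at a fixed n and that the usual normal approximation to the binomial is acceptable. Both are assumptions made for this sketch, not a description of what the league code does.

```latex
% Under the null hypothesis each round is a fair coin flip, so the number
% of wins X of one bot after n rounds is binomial:
X \sim \mathrm{Bin}\!\left(n, \tfrac{1}{2}\right), \qquad
\mathrm{E}[X] = \frac{n}{2}, \qquad
\sigma_X = \sqrt{n \cdot \tfrac{1}{2} \cdot \tfrac{1}{2}} = \frac{\sqrt{n}}{2}.

% The victory threshold therefore sits two standard deviations above the mean:
\frac{n}{2} + \sqrt{n} = \mathrm{E}[X] + 2\,\sigma_X .

% Normal approximation for one particular bot reaching it by luck alone:
P\!\left(X \ge \frac{n}{2} + \sqrt{n}\right) \approx 1 - \Phi(2) \approx 0.023 .
```

Since either bot can be the one to cross the line, that is roughly 4.6% for a single look, which is close to the usual 95% level. Because n keeps increasing until somebody crosses the line, the match is effectively re-tested after every round, so the real chance of a lucky winner is higher than this single-look figure.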

The final league standings don't represent what bot is the "best".  It should be easy to see that the only possible ranking in a round robin tournament is by groups, i.e. 0 losses, 1 loss, etc.  What a ladder represents, rather, is a somewhat arbitrary, but fast way of ranking contenders.  In general, the relative rankings represent relative strength.  But there are exceptions.  Run any real world ladder twice and you'll get 2 different results.

If we want to rank bots based on their absolute strength, we would need to have n^2 matches, where n is the number of bots.  We would need to use a chi squared test to rank them.

I'll present a well researched report on probability models to use in the leagues in a few hours.
« Last Edit: January 02, 2007, 08:26:31 PM by Numsgil »

Offline EricL

  • Administrator
  • Bot God
  • *****
  • Posts: 2266
League Problems
« Reply #77 on: January 03, 2007, 12:56:48 AM »
Well said both of you.

Without getting too far off topic, I would ultimately like to see a real-time 'league' of internet connected sims where aggregated statistics of the overall populations of each species in the distributed, connected "eco-verse" were available in real-time.  Imagine 20 (or 200, or 2000) distributed, teleporter-connected sims, running on each of our machines.  They could come and go, but some subset would always be running (a screen saver would help increase this number).  Each would report local population statistics to a central repository that everyone could access via a web page.  To compete, you simply build (or evolve) your bot, testing it off line in your own sim, and when you are ready, you connect your sim to the eco-verse.  How well your bot does, how high it places, would be a function of the overall population it achieves relative to all the others.  Now, each sim need not be running the same conditions.  There would be niches.  Some bots would not survive long in some local sims; perhaps no single species could be designed to fully dominate every corner of the eco-verse, yet some would achieve higher populations than others.  Combat would be a viable strategy but so would running and hiding and multiplying quietly.

I'm all for the tradition of 1:1 all out combat, but the real test would be to survive and compete simultaneously against the vast variety of bots in a vast variety of environmental conditions in the eco-verse.  Ohh, I get goose flesh just thinking about it!
Many beers....

Offline Griz

  • Bot Overlord
  • ****
  • Posts: 608
League Problems
« Reply #78 on: January 03, 2007, 11:51:53 AM »
Quote from: Numsgil
About the statistical significance of the leagues, each match is a statistical test with n trials that attempts to test the null hypothesis that two bots are evenly matched.  The number of trials n increases until that null hypothesis is rejected.

A contender is given the title of victor in a league match if it wins 1/2 n + sqrt(n) rounds, rounded up.  It should be easy to see that eventually, this will reduce to: whichever bot wins the majority of rounds.  This happens when n approaches infinity (because the big O of 1/2 n + sqrt(n) = big O of n).

Now, I don't know what sort of confidence level this is using.  The thing with statistics is that when you reject the null hypothesis you're never sure you're not making an error.  Usually a confidence level of 95% is used, which means that when the null hypothesis is actually true you'll wrongly reject it 5% of the time.  Which means that when you run the leagues over, you might get a different result, because there's still that 5% error.  Or 1%, or .001%, or whatever the confidence level happens to be.  I'm going over my stats notes now to find out what confidence level we're using.

The final league standings don't represent what bot is the "best".  It should be easy to see that the only possible ranking in a round robin tournament is by groups, i.e. 0 losses, 1 loss, etc.  What a ladder represents, rather, is a somewhat arbitrary, but fast way of ranking contenders.  In general, the relative rankings represent relative strength.  But there are exceptions.  Run any real world ladder twice and you'll get 2 different results.

If we want to rank bots based on their absolute strength, we would need to have n^2 matches, where n is the number of bots.  We would need to use a chi squared test to rank them.
not so ...
you only need n(n+1)/2 ...
which in the case of 30 bots, is 465.
btw, leagues of 10 is only 55, a reasonable number
to run without taking days to do so.

and the chi-sqr test wouldn't rank them? ...
its purpose is to tell you if your results fall within a
range that is acceptable, that you can be confident in.
please tell me exactly what data you would be using to
run this chi-sqr test in this case?

and once again ...
people, please hear what I am saying ...
I don't care how precise you think you are being in
calculating how many rounds it take to find a statistically
valid winner of a given match ...
when the 'arbitrary' initial order that you start the bots with ...
upon attempting to establish league standings ...
will have a much greater effect on their ranking than does
all your playing with numbers ...
unless every bot is allowed to go up against every other.
that is all I am saying.

now you can dance around that all day long ...
it won't change a thing.
don't get so caught up in the details that you miss
the bigger picture here.

and don't shoot the messenger ...
just 'cause you don't like the message.

do it however you want ....
but please ...
don't pretend it's statistically valid as it is now.
it isn't.
不知
~griz~
   "The selection of Random Numbers is too important to be left to Chance"
The Mooj  a friend to all humanity

Offline Griz

  • Bot Overlord
  • ****
  • Posts: 608
League Problems
« Reply #79 on: January 03, 2007, 12:13:09 PM »
Quote from: Jez
it would be a manually intensive, long drawn out way to decide the results.

not really ...
regardless of the what method you use to
run a league, a 30 bot league is going to take you
one hell of a long time ... IF you ever even get thru
it without a crash somewhere along the way.
[let me know if/when you ever get it completed]  lol

and as I've pointed out ...
for each bot to meet every other ...
it's n(n+1)/2 ... not n^2
so ...
30 bots = 465 matches
20 bots = 210
10 bots = 55
8 bots = 36
and I still think we would be much better off making
4 leagues of 8 bots, 32 all told ...
then allowing the top one or two in each ...
to challenge the league ranked above them.
a new bot could start off challenging the lowest league ...
and be allowed to move to others if he is good enough
to 'make the cut'
compiling the rankings in this way ...
would not only be more accurate ...
but would actually make the project manageable.
 
Quote
I do understand the point you are making but in the context of challenge matches the results will be both accurate and fair.
well ... close perhaps, but I don't see how you can say accurate.
of course one could always take a given bot that has been 'stopped' ...
and put it up against a higher ranked bot outside of leagues just
to see if it was a fluke, and if so, present the results and request
a rematch or re-evaluation.

but ... whatever.

Act as you will.
Go on as you feel.
This is the incomparable way.

I've got all those pesky real world things
that better deserve my attention anyway.

good luck.
不知
~griz~
   "The selection of Random Numbers is too important to be left to Chance"
The Mooj  a friend to all humanity

Offline Numsgil

  • Administrator
  • Bot God
  • *****
  • Posts: 7742
League Problems
« Reply #80 on: January 03, 2007, 01:05:21 PM »
n(n+1) / 2 approaches n^2 as n gets very large (actually, it approaches 1/2 n^2, but big O ignores those coefficients), which is the point I was making.  You're going to be running a lot of rounds, no matter how you slice it.  Suppose you run 1000 bots in a round robin tournament.  That's going to be something on the order of half a million matches.  That's a lot of matches.

It's okay that initial order matters in the ranking, because we have a prechosen method for the preranking: age.  Initial ranking is always going to matter, there's no getting around that in a ladder.  But using seniority for the initial ranking makes the most sense, for reasons Jez outlined above.  In your own personal ladder, feel free to choose whatever initial ordering you like.  Initial ordering does matter, but so long as your choice is arbitrary most of the league is going to be ordered correctly.

The idea isn't that the ladder is a perfect representation of the strength of the bots.  The ladder is simply a quick way to relatively sort them.  That's why they're used in sports, people aren't patient.

I'll post in suggestions forum with my stats findings.

Offline Jez

  • Bot Overlord
  • ****
  • Posts: 788
League Problems
« Reply #81 on: January 03, 2007, 01:08:05 PM »
I’m not sure exactly what good work Eric has done for this last buddy drop, I’ve been running it under VB in case of crashes but it seems to run faster than before.
The current standing, which I think has been affected by the latest changes, is as follows:
    [1]Din
    [2]Destinatus P
    [3]The One
    [4]Dominator I
    [5]Darth S
    [6]Spanish C (103 wins)
    [7]Animal S (122 wins)
    [8]Carnatus Orbis
    [9]Kyushu
    [10]Una 3.0
I don’t remember what formula is used to decide the winner now, which is annoying when I want to work out the minimum number of further rounds a bot might need to win a match. It will be interesting to see Nums’ report on probability models to use in the league.
Eric, should you ever have time to implement a better way of running the league, you have some good ideas about how it could be done, then you have my support and I would be happy to help in any way I can.
A multi-environmental test via internet linked PCs using screensavers etc. really would be the bees’ knees of competition!

Griz, I am not questioning your mathematical abilities, you far exceed my abilities in that, or probably any other, field.
It is the premise that the start order is arbitrary that I disagree with. This isn’t a reformation of the league from scratch (all existing bots would need to be run if that were the case); it is a shake up of the existing placement of the bots in the league, something that is far from arbitrary.
The comment I made about a RR being ‘a manually intensive, long drawn out way to decide the results’ is based on the fact that there is no way for the program to do it automatically at the moment and that AFAIK a RR where every bot fights every other bot is undoubtedly going to take longer than a challenge match where bots don’t always compete against all the other bots.

I had no intention of ‘shooting the messenger’; you are undoubtedly correct in what you say. I am arguing over the premise that the starting order is arbitrary.
I don’t see why, if the starting order is considered valid and the method of competition (challenge) is valid then the results don’t hold some statistical relevance.

(contest results are now Spanish C 110 – Animal S 130)
If you try and take a cat apart to see how it works, the first thing you have in your hands is a non-working cat.
Douglas Adams

Offline Griz

  • Bot Overlord
  • ****
  • Posts: 608
League Problems
« Reply #82 on: January 03, 2007, 03:18:04 PM »
Quote from: Jez
I’m not sure exactly what good work Eric has done for this last buddy drop, I’ve been running it under VB in case of crashes but it seems to run faster than before.
The current standing, which I think has been affected by the latest changes, is as follows:
    [1]Din
    [2]Destinatus P
    [3]The One
    [4]Dominator I
    [5]Darth S
    [6]Spanish C (103 wins)
    [7]Animal S (122 wins)
    [8]Carnatus Orbis
    [9]Kyushu
    [10]Una 3.0
well ... this will be drastically altered!
Quote
Griz, I am not questioning your mathematical abilities, you far exceed my abilities in that, or probably any other, field.
It is the premise that the start order is arbitrary that I disagree with. This isn’t a reformation of the league from scratch (all existing bots would need to be run if that were the case); it is a shake up of the existing placement of the bots in the league, something that is far from arbitrary.
understood. if that were indeed the case.
I don't think in this case, it is.
an example:
the list you provide above isn't going to be anywhere
near what the ranking will end up being.
Din is going to drop to around #5 ...
Dominator Invincibalis to around 8th ...
Carnatus Orbis out of the top 10 ...
Kyushu down to around 20 ... etc.
and this isn't taking into consideration
that there are other bots farther down
the line that will still come up and take
them down a bit further.
 
so while this may have been a valid ranking at one time ...
from the days of 2.37???? or did someone rerun leagues
since 2.4X? ...
I don't know that it still is.
well ...
having run a bunch of contests ... I know it isn't.

anyway ... bots behave much differently now, eh?
the physics of the program have been altered ...
and costs and lots of stuff ...
so you are going to run into the very problem I've been
attempting to draw attention to ...
I've already run into them ... and showed you one example
being that of D Scarab 3 being prevented from moving up.
[which would have happened to him in a challenge as well, btw]

in fact, even as we speak, I have him challenging the 6 bot
mini league ranked just ahead of him ... #7-#12
and so far he has taken out both Carnatus Orbis & Duplo Simpleboticus
and is working now on James 4 ...
8 rankings above where the league stuck him down at #18.
so he's still on his way up.
Quote
The comment I made about a RR being ‘a manually intensive, long drawn out way to decide the results’ is based on the fact that there is no way for the program to do it automatically at the moment and that AFAIK a RR where every bot fights every other bot is undoubtedly going to take longer than a challenge match where bots don’t always compete against all the other bots.
right. having to do this manually is a big pain.
that's why I'm suggesting we consider making some changes.

as far as challenge matches go ...
that is a slightly different animal ... but I have similar problems with it.

as far as rerunning/establishing a league ... as it is now ...
it won't usually take as long ... but there is no guarantee of that.
it is possible they would have to do as many .... possible, if not probable.
however, just as probable as it only doing the minimum number.
I would expect they would 'average' about half ... probability being what it is.
so you do a trade off ...
trading speed for accuracy ...
it's a balancing act ...
which I don't have a problem with, you do what you have to ...
but let's not pretend it is then as correct as it could be ...
or can justify using all that time for statistical calculation
re  # of rounds required to win ...
because the error introduced in the initial placement ...
is far greater than that, and will nullify all those great
efforts at being precise.
hmmmmmm ....
looking for another analogy.
ok ... like spending hours/days carefully stacking up 100,000 dominoes to
make a really cool display when you finally stand back and tip that first one ...
all that time there being a great dane running around in the room.
~~~
Quote
I had no intention of ‘shooting the messenger’; you are undoubtedly correct in what you say. I am arguing over the premise that the starting order is arbitrary.
I don’t see why, if the starting order is considered valid and the method of competition (challenge) is valid then the results don’t hold some statistical relevance.
right. and I am saying they, and your premise, are not valid.
so I guess we can agree we disagree.

Quote
(contest results are now Spanish C 110 – Animal S 130)

let me know what they are up to in another week.  lol
see ... here we are at the place I don't get ...
it is a statistical draw!!!!
why keep screwing around with it?
why not give it to Animal S ... he's got more rounds ...
and move on.
if you are concerned about the time to run leagues ...
I suggest you take a look at what is consuming much
of that time ... and determine if it is worth it or not.
I don't happen to think it is ... but ...
whatever.

hey ...
made a mistake before for # of matches required
for all bots to go up against each other  ...
had a + there instead of a -
it isn't n(n+1)/2  but n(n-1)/2
 
3 bots = 3 matches: ie  A-B, A-C, B-C
6 bots = 15
8 bots = 28
10 bots = 45
30 bots = 435  
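
A quick way to double check those counts is to let Python enumerate the pairings. This is only a sketch; the function names are invented here and have nothing to do with the league code.

```python
from itertools import combinations

def round_robin_matches(bot_count: int) -> int:
    """Number of matches when every bot meets every other bot once: n(n-1)/2."""
    return bot_count * (bot_count - 1) // 2

def round_robin_schedule(bots):
    """Explicit list of pairings, each unordered pair appearing exactly once."""
    return list(combinations(bots, 2))

print(round_robin_schedule(["A", "B", "C"]))  # the A-B, A-C, B-C case

for n in (3, 6, 8, 10, 30):
    # The formula and the enumerated schedule must agree.
    assert round_robin_matches(n) == len(round_robin_schedule(range(n)))
    print(f"{n} bots = {round_robin_matches(n)} matches")
```

By the same arithmetic, four separate leagues of 8 bots would need 4 x 28 = 112 matches, not counting any promotion challenges between leagues.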

I'm tellin' ya ... those 8 bot sub-leagues are looking
better to me all the time.
« Last Edit: January 03, 2007, 03:23:32 PM by Griz »
不知
~griz~
   "The selection of Random Numbers is too important to be left to Chance"
The Mooj  a friend to all humanity

Offline EricL

  • Administrator
  • Bot God
  • *****
  • Posts: 2266
League Problems
« Reply #83 on: January 03, 2007, 03:28:22 PM »
Quote from: Jez
I’m not sure exactly what good work Eric has done for this last buddy drop, I’ve been running it under VB in case of crashes but it seems to run faster than before.

I did make one change in 2.42.9s that probably helps performance of leagues in how unique IDs are assigned to bots.  If you look in the properties dialog of a bot, there are two numbers there.  One is the "Robot ID", which is the unique number of the bot in the sim at the moment.  It's basically the index the bot is using in the internal bot array.  That is, no other extant bot has that number currently, but numbers get re-used as bots die.  The second, "Unique Robot ID", is unique for the life of the sim.  No other bot in that sim will ever have that number again.  These unique numbers are used for ancestor trees, teleporters and other places where having unique bot IDs is required.

The old code used a really computationally intensive way of assigning these unique IDs.  I changed it to simply use a monotonically increasing counter that gets saved in sim files.
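
To make the difference between the two numbers concrete, here is a small Python sketch of the idea. It is not the actual DarwinBots code (which is VB); `BotRegistry` and its methods are hypothetical names, and the slot-reuse policy shown is just one plausible reading of "numbers get re-used as bots die".

```python
class BotRegistry:
    """Toy model: reusable array slots plus a never-repeating counter."""

    def __init__(self, next_unique_id: int = 1):
        self.slots = []                       # index in this list plays the role of "Robot ID"
        self.next_unique_id = next_unique_id  # would be saved with the sim file

    def spawn(self, name: str):
        # The unique ID comes from a counter that only ever goes up: O(1) work.
        unique_id = self.next_unique_id
        self.next_unique_id += 1
        bot = {"name": name, "unique_id": unique_id}
        # The slot index is recycled: take the first free slot if there is one.
        for index, occupant in enumerate(self.slots):
            if occupant is None:
                self.slots[index] = bot
                return index, unique_id
        self.slots.append(bot)
        return len(self.slots) - 1, unique_id

    def kill(self, robot_id: int) -> None:
        self.slots[robot_id] = None  # frees the slot, never the unique ID


registry = BotRegistry()
first = registry.spawn("alga")
second = registry.spawn("hunter")
registry.kill(first[0])
third = registry.spawn("newcomer")
print(first, second, third)  # third reuses slot 0 but gets a fresh unique ID
```

Restoring `next_unique_id` from the saved sim is what keeps IDs unique across a save and reload in this model.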

Quote from: Jez
Eric, should you ever have time to implement a better way of running the league, you have some good ideas about how it could be done, then you have my support and I would be happy to help in any way I can.
A multi-environmental test via internet linked PCs using screensavers etc. really would be the bees’ knees of competition!

Thanks.  Anything like what I suggest would be in addition to the current league functionality.  Terrarium had this and it was super cool.
Many beers....

Offline Griz

  • Bot Overlord
  • ****
  • Posts: 608
League Problems
« Reply #84 on: January 03, 2007, 04:19:50 PM »
can't seem to reply/post to Num's stat thread ...
so will put it here.
Moved to League stats tests ~~~Jez
~~~
« Last Edit: January 03, 2007, 04:57:51 PM by Jez »
不知
~griz~
   "The selection of Random Numbers is too important to be left to Chance"
The Mooj  a friend to all humanity

Offline Jez

  • Bot Overlord
  • ****
  • Posts: 788
League Problems
« Reply #85 on: January 03, 2007, 04:44:59 PM »
Quote from: Griz
this may have been a valid ranking at one time ...
from the days of 2.37???? or did someone rerun leagues
since 2.4X? ...
To the best of my knowledge the (F1) league has never had a rerun, even when PY programmed the league function into DB the existing list on the forum was used as the starter list for the league. (which wasn't rerun, bots have just been added to it since.) There may have been a rerun done, for some if not all of the bots, when the statistical analysis of rounds was introduced. ( I pressed for and researched the formula because one of my bots didn't gain the league position I knew it should get.)
It might be better to consider it as wrong of me to rerun the league at all, especially without reentering all the existing bots in the bestiary in date order...

Quote
so you do a trade off ...
trading speed for accuracy ...
it's a balancing act ...
which I don't have a problem with, you do what you have to ...
but let's not pretend it is then as correct as it could be ...
or can justify using all that time for statistical calculation
re  # of rounds required to win ...
because the error introduced in the initial placement ...
is far greater than that, and will nullify all those great
efforts at being precise.
Indeed, you are calling into question, for me, if rerunning the league will result in more accuracy than we have already, perhaps better to leave it as it is and wait for a new method of running the leagues, allowing for re-entry of all the bots and better analysis of the results.
The league as it stands is, to use your analogy, 4+ years of stacking dominoes. Is it right to change that because some dominoes are looking old?

Quote
and I am saying they, and your premise, are not valid.
so I guess we can agree we disagree.
let me know what they are up to in another week.  lol
Funnier than you think but I'll tell you about that in a min...
Quote
see ... here we are at the place I don't get ...
it is a statistical draw!!!!
why keep screwing around with it?
why not give it to Animal S ... he's got more rounds ...
and move on.
if you are concerned about the time to run leagues ...
I suggest you take a look at what is consuming much
of that time ... and determine if it is worth it or not.
I don't happen to think it is ... but ...
It's the VB debugger that's taking the time, consider that's making it 10x slower than it would be otherwise, it's only taken ~20 hrs to get this far.
Quote
hey ...
made a mistake before for # of matches required
for all bots to go up against each other  ...
had a + there instead of a -
it isn't n(n+1)/2  but n(n-1)/2
 
3 bots = 3 matches: ie  A-B, A-C, B-C
Reminds me of the 'gypsie rose' problem (same thing really) they gave the maths class once at school, took me 90 seconds to figure it out and two weeks to fail to make the answer into a formula  

Anyway; at 258 rounds, 114 for Animal S - 143 for Spanish C the bots disappeared and I'm left with a bunch of veg firing viruses at each other!
« Last Edit: January 03, 2007, 04:48:39 PM by Jez »
If you try and take a cat apart to see how it works, the first thing you have in your hands is a non-working cat.
Douglas Adams

Offline Numsgil

  • Administrator
  • Bot God
  • *****
  • Posts: 7742
League Problems
« Reply #86 on: January 03, 2007, 08:30:03 PM »
Quote from: Griz
so you are going to run into the very problem I've been
attempting to draw attention to ...
I've already run into them ... and showed you one example
being that of D Scarab 3 being prevented from moving up.
[which would have happened to him in a challenge as well, btw]

This is the difference in our opinions.  You view this as a problem, I do not.  It is mostly impossible to rank bots in a linear fashion anymore, because the program is getting to a rock-paper-scissors strategy.  It's okay if stronger bots get stuck in the bottom of the ladder.  Thems the breaks.

Quote
...which I don't have a problem with, you do what you have to ...
but lets not pretend it is then as correct as it could be ...
or can justify using all that time for statistical calculation
re  # of rounds required to win ...
because the error introduced in the initial placement ...
is far greater than that, and will nullify all those great
efforts at being precise.

This is a bias, not an error.  Run the same initial ordering of bots 10 times.  If there's any deviation in the placings, that's error.  There should be very little error.  If you run 10 different initial orderings and get 10 different results, that's bias.
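
A toy Python sketch of what bias without error looks like. Everything here is made up for illustration: the four bot names, the `BEATS` table, and the ladder rule (a newcomer enters at the bottom and climbs one place per win until its first loss), which is only a guess at how a challenge ladder behaves. Match outcomes are fully deterministic, so rerunning the same entry order always gives the same ladder, yet different entry orders give different ladders.

```python
# Hypothetical rock-paper-scissors style results: (winner, loser) pairs.
BEATS = {
    ("Strong", "WeakA"), ("Strong", "WeakB"),
    ("WeakA", "WeakB"),
    ("WeakA", "Blocker"), ("WeakB", "Blocker"),
    ("Blocker", "Strong"),   # the one bot that stops Strong
}

def wins(challenger: str, defender: str) -> bool:
    return (challenger, defender) in BEATS

def run_ladder(entry_order):
    """Each bot enters at the bottom and climbs until it loses a challenge."""
    ladder = []
    for bot in entry_order:
        ladder.append(bot)
        position = len(ladder) - 1
        while position > 0 and wins(bot, ladder[position - 1]):
            # Swap places with the defeated bot directly above.
            ladder[position - 1], ladder[position] = ladder[position], ladder[position - 1]
            position -= 1
    return ladder

order_a = ["WeakA", "WeakB", "Blocker", "Strong"]
order_b = ["WeakA", "Strong", "WeakB", "Blocker"]
print(run_ladder(order_a))  # Strong is stuck below Blocker at the bottom
print(run_ladder(order_b))  # Strong finishes on top
```

In this toy setup the same bot lands at the bottom or the top depending only on entry order, with no randomness involved at all.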

Bias is inherent in all ladders.  But that's okay, because we have a fair way of ordering the initial robots.  It's okay if a strong bot is stuck at position #25.  Bots at the top of the list should not only be strong, but capable of defeating some of the tricky bots (I believe there was some Umbra bot that stops a lot of bots from proceeding in the leagues).

I would use either bots' ages or present ranking in the league to setup the initial ranking.  Ages would be best, since most older bots rank low in the league.

..........................

If you want a fair way of running the leagues, what about something like this:
take some bots, randomly order them into an initial order.  Run the league.  Reverse the new order of the bots (so 1st place becomes last place, etc.) and rerun the league.  Keep doing this until the rankings stop fluctuating every time you run the leagues.  I'm guessing they'll never stop totally fluctuating, but you'll definitely have the case that bots at the top of the league are strong, which is what you want.

Quote
see ... here we are at the place I don't get ...
it is a statistical draw!!!!
why keep screwing around with it?
why not give it to Animal S ... he's got more rounds ...
and move on.
if you are concerned about the time to run leagues ...
I suggest you take a look at what is consuming much
of that time ... and determine if it is worth it or not.
I don't happen to think it is ... but ...
whatever.

There's no such thing as a statistical draw when you can run an arbitrary number of rounds.  Supposing for a moment that you managed to find 2 bots that are identically matched (exactly 50/50), there isn't a test for that.  So maybe we should add one.  But I warn the number of matches you need to run to determine a true statistical draw is probably in the thousands.

Try this simple experiment.  Hack into the league code, and set up a league match with a "fair" coin (assign a random winner based on a 50/50 probability).  Run the league.    It should be an enlightening experience either way.

I imagine a league winner will eventually be declared, which, come to think of it, isn't good.  We should add a catch for when the results are indicative of a true statistical draw.  But again, we're talking possibly thousands of rounds.

Just declaring a winner after 200 rounds based on who has the most isn't proper.  Imagine flipping a coin 200 times.  It's not going to end up 100/100.  Would you declare the coin unfair if it was 130/70?  That's where stats comes in.  Stats assures us that we have control over arbitrarily picking winners.
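
The coin experiment can be tried outside the program too. Below is a minimal Python sketch, assuming the victory rule is "first bot to reach n/2 + sqrt(n) wins, rounded up, checked after every round". The names `run_match` and `fair_coin_tail` and the `max_rounds` cap are inventions for this example, not part of the league code.

```python
import math
import random

def wins_needed(rounds_played: int) -> int:
    return math.ceil(rounds_played / 2 + math.sqrt(rounds_played))

def run_match(p_first_wins: float, rng: random.Random, max_rounds: int = 10_000):
    """Play rounds until one side reaches the threshold; None means no winner yet."""
    first = second = 0
    for n in range(1, max_rounds + 1):
        if rng.random() < p_first_wins:
            first += 1
        else:
            second += 1
        needed = wins_needed(n)
        if first >= needed:
            return "first", n
        if second >= needed:
            return "second", n
    return None, max_rounds

def fair_coin_tail(flips: int, heads: int) -> float:
    """Exact probability of at least `heads` heads in `flips` fair coin flips."""
    return sum(math.comb(flips, k) for k in range(heads, flips + 1)) / 2 ** flips

rng = random.Random(2007)  # fixed seed so a run can be repeated
results = [run_match(0.5, rng) for _ in range(200)]
declared = sum(1 for winner, _ in results if winner is not None)
print(f"{declared} of {len(results)} perfectly even matches still produced a 'winner'")
print(f"P(at least 130 heads in 200 fair flips) = {fair_coin_tail(200, 130):.2e}")
```

The first printed line shows how often two identical coins get a victor declared under this rule; the second gives the exact chance of a 130/70 split from a fair coin, which is the number the "is the coin unfair?" question turns on.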

Offline Light

  • Bot Destroyer
  • ***
  • Posts: 245
League Problems
« Reply #87 on: January 03, 2007, 08:46:53 PM »
Quote
Supposing for a moment that you managed to find 2 bots that are identically matched (exactly 50/50), there isn't a test for that.
Entering the same bot under 2 different names would give you the exact 50/50 scenario you're after.

Offline Griz

  • Bot Overlord
  • ****
  • Posts: 608
League Problems
« Reply #88 on: January 03, 2007, 10:14:55 PM »
Quote from: Numsgil
n(n+1) / 2 approaches n^2 as n gets very large (actually, it approaches 1/2 n^2, but big O ignores those coefficients), which is the point I was making.  You're going to be running a lot of rounds, no matter how you slice it.  Suppose you run 1000 bots in a round robin tournament.  That's going to be something on the order of half a million matches.  That's a lot of matches.
we aren't talking 1000 bots ... but 30 ... 435 matches.

Quote
It's okay that initial order matters in the ranking, because we have a prechosen method for the preranking: age.  Initial ranking is always going to matter, there's no getting around that in a ladder.  But using seniority for the initial ranking makes the most sense, for reasons Jez outlined above.  In your own personal ladder, feel free to choose whatever initial ordering you like.  Initial ordering does matter, but so long as your choice is arbitrary most of the league is going to be ordered correctly.
I'm sorry Nums ... it is not.
you are missing what I am talking about ...
as you already think you know what I'm saying, and you don't ...
so there's no room left there for you to take a look at what
I'm pointing at.
you're not only not on the same page ...
but not even in the same book.
ok. I'm tired of beating my head against the wall.
forgettabout it then.
 

Quote
The idea isn't that the ladder is a perfect representation of the strength of the bots.  The ladder is simply a quick way to relatively sort them.  That's why they're used in sports, people aren't patient.

I'll post in suggestions forum with my stats findings.
不知
~griz~
   "The selection of Random Numbers is too important to be left to Chance"
The Mooj  a friend to all humanity

Offline Griz

  • Bot Overlord
  • ****
  • Posts: 608
League Problems
« Reply #89 on: January 03, 2007, 10:17:46 PM »
Quote from: Numsgil
This is the difference in our opinions.  You view this as a problem, I do not.  It is mostly impossible to rank bots in a linear fashion anymore, because the program is getting to a rock-paper-scissors strategy.  It's okay if stronger bots get stuck in the bottom of the ladder.  Thems the breaks.
This is a bias, not an error.  Run the same initial ordering of bots 10 times.  If there's any deviation in the placings, that's error.  There should be very little error.  If you run 10 different initial orderings and get 10 different results, that's bias.

Bias is inherent in all ladders.  But that's okay, because we have a fair way of ordering the initial robots.  It's okay if a strong bot is stuck at position #25.  Bots at the top of the list should not only be strong, but capable of defeating some of the tricky bots (I believe there was some Umbra bot that stops a lot of bots from proceeding in the leagues).

I would use either bots' ages or present ranking in the league to setup the initial ranking.  Ages would be best, since most older bots rank low in the league.

..........................

If you want a fair way of running the leagues, what about something like this:
take some bots, randomly order them into an initial order.  Run the league.  Reverse the new order of the bots (so 1st place becomes last place, etc.) and rerun the league.  Keep doing this until the rankings stop fluctuating every time you run the leagues.  I'm guessing they'll never stop totally fluctuating, but you'll definitely have the case that bots at the top of the league are strong, which is what you want.
There's no such thing as a statistical draw when you can run an arbitrary number of rounds.  Supposing for a moment that you managed to find 2 bots that are identically matched (exactly 50/50), there isn't a test for that.  So maybe we should add one.  But I warn the number of matches you need to run to determine a true statistical draw is probably in the thousands.

Try this simple experiment.  Hack into the league code, and set up a league match with a "fair" coin (assign a random winner based on a 50/50 probability).  Run the league.    It should be an enlightening experience either way.

I imagine a league winner will eventually be declared, which, come to think of it, isn't good.  We should add a catch for when the results are indicative of a true statistical draw.  But again, we're talking possibly thousands of rounds.

Just declaring a winner after 200 rounds based on who has the most isn't proper.  Imagine flipping a coin 200 times.  It's not going to end up 100/100.  Would you declare the coin unfair if it was 130/70?  That's where stats comes in.  Stats assures us that we have control over arbitrarily picking winners.

bullshit.
you're talking theory ...
and I'm talking practical application.
but you already know it all  .......... as always so ...
fuck it.
不知
~griz~
   "The selection of Random Numbers is too important to be left to Chance"
The Mooj  a friend to all humanity