DB3 Leagues

Peter:

--- Quote from: Panda on April 10, 2015, 09:26:23 AM ---A specific league for an "all round bot"? I think that's a brilliant idea but I foresee problems with computation times.
--- End quote ---
Not necessarily, the fights would just have another random attribute, the map and physics. It doesn't have to take more computation time.

I expect fights not to have a definite win/lose like in db2. Continuing till you hit 95% confidence. But let the rating system take care of it, so a fight is just one fight.

--- Quote from: Panda on April 10, 2015, 09:26:23 AM ---Perhaps a system where, when a new bot reaches a certain rating (possibly above the worst bot in the "top league"), it can be entered into the "top league" and pitted against each one of the bots and then placed (or not placed) in the league accordingly? Unless that is what you were trying to say?

--- End quote ---
That is one possibility. You could also extend the "top league" by one more bot each time the ranking in the "top league" is rock solid, or even shrink it if multiple new strong bots appear.

Panda:

--- Quote from: spike43884 on April 10, 2015, 06:30:10 AM ---Your point of the entire tournament with floors having to be repeated is incorrect. A bot is entered in, then it starts only floor one, once floor one is finished it goes to floor 2 if it scored highly in floor one, and that process continues till it hits a floor which it struggles in... This stops the problem of 1 species v 1 species scenarios cropping up as well. Only once a bot is entered does a floor need to be repeated... which also allows all bots on that floor to be re-ranked... Maybe to conserve CPU resources we update the positions every 6 hours instead of every tournament, just storing scores between those intervals...

--- End quote ---
Yeah, it would be 1v1 but one loss won't mean it will be knocked out completely, it'll just lower its score a little (as it rightly should), which would be brought back up in the situation of 1 species obliterating so many others.

I would err on the side of using a statistical system since I trust the mathematics rather than your reasoning (sorry :().

--- Quote from: Peter on April 10, 2015, 09:48:47 AM ---
--- Quote from: Panda on April 10, 2015, 09:26:23 AM ---A specific league for an "all round bot"? I think that's a brilliant idea but I foresee problems with computation times.
--- End quote ---
Not necessarily, the fights would just have another random attribute, the map and physics. It doesn't have to take more computation time.

--- End quote ---
So are you saying F1 should be an "all round" league, or would it be different?

--- Quote from: Peter on April 10, 2015, 09:48:47 AM ---I expect fights not to have a definite win/lose like in db2. Continuing till you hit 95% confidence. But let the rating system take care of it, so a fight is just one fight.

--- End quote ---
Yeah, I agree that it will probably be the same and that the rating system will take care of it.

Peter:

--- Quote from: Panda on April 10, 2015, 09:26:23 AM ---
--- Quote from: Peter on April 10, 2015, 09:48:47 AM ---
--- Quote from: Panda on April 10, 2015, 09:26:23 AM ---A specific league for an "all round bot"? I think that's a brilliant idea but I foresee problems with computation times.
--- End quote ---
Not necessarily, the fights would just have another random attribute, the map and physics. It doesn't have to take more computation time.

--- End quote ---
So are you saying F1 should be an "all round" league, or would it be different?
--- End quote ---
Aye, as an F1 league. Yet, although I like the idea of having an all round F1 league, it's one of those things that may seem good in theory, but sucks in practice. But support for different league versions seems nice.

Numsgil:
I've been reviewing the literature on how Elo, etc. work. I have some observations:

1. Because we can choose which bots fight which other bots, at the simplest we could choose a random sample (with repeats) of bots for a challenger to fight a single round with. From that we can get its win rate, and a confidence bound on that win rate. So your global win rate is 75% +/- 3%, say. And we can control what the worst case +/- factor is by increasing the sample size. Also, as Peter pointed out, winning doesn't have to be binary. You could have an 80% win after 100k cycles because you control 80% of the biomass in the sim, so your overall win rate can factor that in pretty easily.

Your final win rate could be your rating, because it's basically an unbiased sample of your actual win rate if we ran an infinite number of rounds against all other bots, and that's ordered the same way that the Elo ratings would be. The disadvantage here is that the win rate will change over time as new bots are added to the league, so your rating is not constant over time. But every time a challenger is added we only need to run the rounds necessary for it to get its global win rate percentage.
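
A minimal sketch of the sampling idea above, assuming a hypothetical play_round(challenger, opponent, rng) callback that runs one round and returns the challenger's score in [0, 1] (for example its share of the biomass). The function names, the 1.96 factor (a normal approximation to a 95% interval), and the worst-case bound are illustrative assumptions, not anything DB3 actually implements:

```python
import math
import random


def estimate_win_rate(challenger, league, play_round, n_rounds, rng):
    """Fight n_rounds single rounds against opponents drawn with repeats.

    play_round is a hypothetical callback returning a score in [0, 1];
    a non-binary result such as 0.8 for holding 80% of the biomass is fine.
    Returns (mean score, approximate 95% half-width).
    """
    scores = []
    for _ in range(n_rounds):
        opponent = rng.choice(league)  # uniform sample, repeats allowed
        scores.append(play_round(challenger, opponent, rng))
    mean = sum(scores) / n_rounds
    # Sample variance of the per-round scores.
    var = sum((s - mean) ** 2 for s in scores) / (n_rounds - 1)
    half_width = 1.96 * math.sqrt(var / n_rounds)
    return mean, half_width


def rounds_needed(half_width):
    """Worst-case sample size for a target +/- half_width.

    Scores in [0, 1] have variance at most 0.25, so this is an upper bound.
    """
    return math.ceil(1.96 ** 2 * 0.25 / half_width ** 2)
```

Under these assumptions, rounds_needed(0.03) comes out at 1068, so a worst case of +/- 3% costs roughly a thousand single rounds per challenger.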

If we anchored a bot with a specific Elo (say the animal minimalis equivalent has 1000 Elo) we could probably figure out Elo ratings from the relative win percentages, I think. Something something math math.

We'd have to rerun the leagues after every new version, though, as the win rates are pulled from old matches and no longer valid in that way. But that could form the seasons Panda was talking about.

2. There are N choose 2 ways to pair off N bots. If each pairing produces a probability that A wins over B (for pair (A, B)), call it P(A > B) (which is just A's win rate for the A-B match), we can take the inverse of the CDF of the unit normal distribution (call it Phi_Inv) and get a system of N choose 2 equations, and least squares solve it for Elo ratings. What I mean is: Phi_Inv(P(A > B)) = (s1 - s2) / (sqrt(2) * Beta), where s1 and s2 are the Elo ratings of A and B respectively, and Beta is the sqrt of the variance in performance of A and B (assuming each bot has the same variance, which is a big assumption but makes the math easier). That's basically what Elo is trying to approximate. But we have the computing power to calculate it directly.
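
A rough sketch of that least squares solve, under stated assumptions: every pair has a measured win rate, all pairs are weighted equally, and Beta is the same for every bot. In that special case the least squares solution reduces to a row average of the scaled Phi_Inv values, which then gets shifted so an anchor bot sits at a chosen rating. The function name, beta=200, the anchor at 1000, and the eps clamp (to keep Phi_Inv finite for 0% or 100% win rates) are placeholders for illustration:

```python
import math
from statistics import NormalDist


def ratings_from_win_rates(p, beta=200.0, anchor=0, anchor_rating=1000.0, eps=1e-3):
    """Least squares ratings from a full table of pairwise win rates.

    p[i][j] is bot i's measured win rate against bot j, with
    p[j][i] == 1 - p[i][j]. Diagonal entries are ignored.
    """
    n = len(p)
    phi_inv = NormalDist().inv_cdf
    scale = math.sqrt(2.0) * beta
    raw = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if i == j:
                continue
            # Clamp so a 0% or 100% win rate does not map to +/- infinity.
            clamped = min(max(p[i][j], eps), 1.0 - eps)
            # Each term is the rating gap (s_i - s_j) implied by this pairing.
            total += scale * phi_inv(clamped)
        # With all pairs present and equally weighted, the zero-mean
        # least squares solution is the row sum divided by n.
        raw.append(total / n)
    shift = anchor_rating - raw[anchor]
    return [r + shift for r in raw]
```

If some pairings are missing or have very different sample sizes, the row average no longer applies and a general weighted least squares solve would be needed instead.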

I'm working through the articles on TrueSkill and Glickman right now; they might be doing something more clever. At the very least they factor in confidence intervals. But I think, because we can run exactly the rounds we want to and no more, and can choose how the matches are chosen, we can do a lot more global optimizing and a lot less incremental updating.

...

But yes, in principle I think the statistical approach is pretty compelling. I think that's the way to go for sure.

Numsgil:
Oh, one more thought:

In situations of rock-paper-scissors, where there's a (large) margin of players choosing one strategy over another, the flat win rate I mentioned above would artificially inflate for the minority strategy, I think. Would Elo have that same problem? I think not... I'd want to see some simulations, probably.
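
As a starting point for such a simulation, here is a toy calculation of the flat win rate only. Everything in it is an assumption for illustration: the three strategy names, the 0.9 chance of beating the strategy you counter, 0.5 against your own strategy, and opponents drawn uniformly from the other bots in the league. It does not say anything about how Elo would behave:

```python
# Hypothetical cycle: BEATS[a] is the strategy that a counters.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def win_prob(a, b, edge=0.9):
    """Assumed chance that strategy a beats strategy b in one round."""
    if a == b:
        return 0.5
    return edge if BEATS[a] == b else 1.0 - edge


def flat_win_rates(population, edge=0.9):
    """Expected win rate per strategy against a uniformly random other bot.

    population is a list of strategy names, one entry per bot.
    """
    rates = {}
    for i, a in enumerate(population):
        if a in rates:
            continue  # bots sharing a strategy have the same expected rate
        total = sum(
            win_prob(a, b, edge) for j, b in enumerate(population) if j != i
        )
        rates[a] = total / (len(population) - 1)
    return rates
```

With eight rock bots and one each of paper and scissors, flat_win_rates gives rock 0.5, paper about 0.81 and scissors about 0.19, even though the three strategies are symmetric: the minority strategy that counters the majority gets the inflated rate.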
