DB3 Leagues

spike43884:
I've been mulling over the Elo thing. What about scoring bots on multiple factors across the entire simulation, so that they're rated on their own performance in the simulation rather than relative to their opponents? Do it for multiple battles, maybe average out the scores, and then rank them?
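
A minimal sketch of that idea, assuming each battle yields a few per-bot factor scores already scaled to 0..1. The factor names (`survival`, `population_share`), the weights, and the sample numbers are all hypothetical; nothing in the thread specifies them.

```python
from statistics import mean

# Hypothetical weights for hypothetical factors; they sum to 1.
WEIGHTS = {"survival": 0.5, "population_share": 0.5}

# Hypothetical per-battle factor scores for two bots (two battles each).
battles = {
    "bot_a": [
        {"survival": 1.0, "population_share": 0.6},
        {"survival": 0.0, "population_share": 0.2},
    ],
    "bot_b": [
        {"survival": 1.0, "population_share": 0.2},
        {"survival": 0.0, "population_share": 0.0},
    ],
}

def battle_score(factors):
    """Weighted sum of one battle's factor scores."""
    return sum(WEIGHTS[name] * value for name, value in factors.items())

def average_scores(results_by_bot):
    """Average each bot's battle scores over all of its battles."""
    return {
        bot: mean(battle_score(f) for f in results)
        for bot, results in results_by_bot.items()
    }

def rank(results_by_bot):
    """Bots ordered from highest to lowest average score."""
    averages = average_scores(results_by_bot)
    return sorted(averages, key=averages.get, reverse=True)

ranking = rank(battles)
```

Unlike Elo, a score like this ignores who the opponent was, so it only compares bots fairly if they all face similar conditions.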

Peter:

--- Quote from: Numsgil on April 12, 2015, 04:39:12 PM ---Looks like it places them all around the same skill? I think that's what we'd want if so. It's hard to get a sense of the relative scales. Can you add in a strategy that has an 80% chance to win against any of RPS? I'd like to see if it's noticeably higher than all the others.

--- End quote ---
Added.
Well, the Rocks are placed higher than the Scissors, and the Scissors higher than the Papers, but the difference isn't large.

id     ConservativeRating    Mean         StandardDeviation      win%    wins   draws   losses   matches   type
0   22.0095591377625    25.2836566772087     1.09136584648208     16.66667%   7   26   7   42   Scissor
1   21.7821811891753    24.8081119506593     1.00864358716132     23.07692%   12   23   11   52   Scissor
2   19.2005201463236    22.5465603332125     1.11534672896296     31.37255%   16   4   28   51   Paper
3   23.1122029152767    26.3987975437787     1.095531542834     53.84615%   28   5   13   52   Rock
4   22.1717569038829    25.5748763424698     1.13437314619566     21.95122%   9   23   4   41   Scissor
5   21.9605295488926    25.3588763504315     1.13278226717963     21.95122%   9   18   8   41   Scissor
6   21.1505654852868    24.269753597743     1.03972937081875     16.66667%   8   24   12   48   Scissor
7   18.2254824660445    21.8923336948571     1.22228374293755     22.72727%   10   4   26   44   Paper
8   21.4013812078553    24.7129946001344     1.1038711307597     20.93023%   9   20   9   43   Scissor
9   22.9477399984015    26.4990987016865     1.18378623442832     51.11111%   23   5   13   45   Rock
10   23.6013905336762    26.8589475968437     1.08585235438918     73.33334%   33   0   12   45   Percent80

After a million random matches, the upper 2-sigma bound of Scissor is still higher than the lower 2-sigma bound of Rock.
id     ConservativeRating    Mean         StandardDeviation      win%    wins   draws   losses   matches   type
0   22.4396128116937    24.6603104463431     0.740232544883134     20.1069%   33217   82356   33136   165202   Scissor
1   22.3980369678499    24.6155831330575     0.739182055069193     19.99733%   32991   82504   32976   164977   Scissor
2   18.6285481147599    21.0068128521428     0.792754912460967     19.95815%   33000   16463   99321   165346   Paper
3   24.7068243962618    27.0176793425063     0.770284982081475     59.98304%   99038   16415   33058   165110   Rock
4   22.8829721831836    25.1137388028982     0.743588873238181     20.0285%   33030   82475   32812   164915   Scissor
5   23.0007569918914    25.2302209591881     0.743154655765566     19.97823%   33042   82536   33128   165390   Scissor
6   22.8817029092318    25.1019168680217     0.740071319596633     20.11042%   33330   82598   33106   165735   Scissor
7   20.4789112234694    22.7848573440528     0.768648706861115     19.92037%   32920   16463   99293   165258   Paper
8   23.2252037617677    25.4660024444581     0.746932894230129     19.98256%   33004   82413   33157   165164   Scissor
9   25.138645723613    27.4710506811453     0.777468319177446     60.21605%   99277   16415   32862   164868   Rock
10   24.5903878471413    26.8158904155278     0.741834189462153     79.99807%   132500   0   33129   165629   Percent80
Edit: these numbers are wrong; see the corrected stats in my later post.

Numsgil:
I find it interesting that after a million matches the one that wins 80% of the time is actually rated a bit lower than the second rock player. I guess the system finds its occasional losses to "weak" bots confusing. Also, the flat win rate treats a draw the same as a loss, which is probably over-penalizing the rock strategy.

Interesting results, I'll need to mull it over.
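
To illustrate the point about draws, here is a small sketch comparing the flat win rate (wins divided by matches, which is how the win% column appears to be computed) with a rate that counts a draw as half a win. The half-point convention and the sample records are assumptions for illustration, not something the thread's code does.

```python
def flat_win_rate(wins, draws, losses):
    """Wins over all matches: a draw counts the same as a loss."""
    return wins / (wins + draws + losses)

def half_point_rate(wins, draws, losses):
    """Assumed alternative: a draw counts as half a win."""
    return (wins + 0.5 * draws) / (wins + draws + losses)

# Illustrative records (wins, draws, losses), not taken from the tables.
few_draws = (60, 10, 30)
many_draws = (20, 50, 30)

flat = (flat_win_rate(*few_draws), flat_win_rate(*many_draws))
half = (half_point_rate(*few_draws), half_point_rate(*many_draws))
```

With these made-up records the two players lose the same number of games, yet the flat rate puts them at 0.6 and 0.2, while the half-point rate gives 0.65 and 0.45.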

Peter:
For reference, here are the results for a strict hierarchy (player 0 beats player 1, player 1 beats player 2, and so on) after a million games.

1. The rating differences are quite big compared with RPS + random.
2. Fights between players with a big difference in rating cause NaN errors. Games that produced NaN errors were not counted in the final results. You can see this in the matches column: the top and bottom players have fewer games registered. The NaN errors may be fixable somewhere in the settings, but this implementation, at least, doesn't like huge differences in strength. This could also be the reason for the flattened win rates, since a big portion of the games was skipped. I used the default rating settings.
id     ConservativeRating    Mean         StandardDeviation         win%       wins   draws   losses   matches
0   157.364160859048    175.093666523749     5.90983522156717     100%      93357   0   0   93357   
1   122.187090794375    137.148117096133     4.98700876725273     68.71486%   94351   0   42957   137308   
2   89.1558840881495    103.337093844302     4.72706991871741     52.63669%   95334   0   85783   181117   
3   57.9310423376475    71.3621156175018     4.47702442661809     51.34298%   95290   0   90305   185595   
4   27.50057942478       40.4213063289044     4.30690896804144     50.3262%   94266   0   93044   187310   
5   -3.32350541905488    9.68240334301635     4.33530292069041     49.87444%   93547   0   94018   187565   
6   -34.637104314333    -21.3219560268339     4.4383827624997     48.72458%   90999   0   95763   186762   
7   -67.5653127637627    -53.3291196269003     4.74539771228745     47.14106%   85512   0   95884   181396   
8   -102.079416500229    -87.1381894801877     4.98040900668033     31.22995%   43028   0   94750   137778   
9   -142.80346240494    -125.061759330556     5.91390102479457     0%      0   0   93180   93180   
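
The thread doesn't say where the NaNs come from, so the following is only a guess at one common cause in Gaussian-based rating math: a ratio of the normal pdf to the normal cdf, which becomes 0/0 once both underflow for a very lopsided matchup. The function names and the `tiny` threshold are hypothetical, and the fallback uses the asymptotic approximation pdf(t)/cdf(t) ≈ -t for very negative t.

```python
import math

def pdf(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def cdf(t):
    """Standard normal cumulative distribution."""
    return 0.5 * math.erfc(-t / math.sqrt(2.0))

def ratio_naive(t):
    """pdf/cdf with IEEE-style division: 0/0 gives NaN, x/0 gives inf.
    (Plain Python division would raise ZeroDivisionError instead.)"""
    num, den = pdf(t), cdf(t)
    if den == 0.0:
        return float("nan") if num == 0.0 else float("inf")
    return num / den

def ratio_guarded(t, tiny=1e-160):
    """Same ratio, but falls back to the asymptote -t when cdf underflows."""
    den = cdf(t)
    if den < tiny:
        return -t
    return pdf(t) / den

naive = ratio_naive(-40.0)      # both pdf and cdf underflow to 0.0 here
guarded = ratio_guarded(-40.0)  # fallback branch
```

If the NaNs in this implementation have a different origin, the same pattern (detect the underflow, substitute a finite limit) may still apply, but that would need checking against the actual rating code.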

Peter:
Uh, there was a mistake in how wins were recorded for the 80% win player: half of his wins were registered as draws. :redface:

Corrected stats:

RPS/random
id     ConservativeRating    Mean         StandardDeviation      win%    wins   draws   losses   matches   type
0   21.2025014802368    24.5072604715912     1.10158633045147     19.04762%   8   27   7   42   Scissor
1   21.1703789746567    24.2605458085435     1.03005561129561     22%   11   24   15   50   Scissor
2   19.3873917419835    22.8479451170323     1.15351779168295     38.77551%   19   3   27   49   Paper
3   23.2809913533546    26.6242733109851     1.11442731921019     58%   29   10   11   50   Rock
4   22.1327373676122    25.5369245974301     1.13472907660596     26.31579%   10   25   3   38   Scissor
5   20.8980341485925    24.3963410032841     1.16610228489721     23.07692%   9   21   9   39   Scissor
6   21.6943165763599    24.9387952612258     1.0814928949553     34.78261%   16   23   7   46   Scissor
7   17.9036343796341    22.1480159446247     1.41479385499687     40%   14   3   18   35   Paper
8   20.7780899350452    24.0535665005142     1.09182552182302     26.08696%   12   24   10   46   Scissor
9   22.3978387144379    25.7676782802645     1.12327985527554     49.01961%   25   10   16   51   Rock
10   26.8815568846738    31.1951332533136     1.43785878954662     76.19048%   32   0   10   42   Percent80

RPS/random after a million games

id     ConservativeRating    Mean         StandardDeviation      win%    wins   draws   losses   matches   type
0   21.8673281444733    24.1097779335181     0.747483263014924     27.83857%   45947   82858   36243   165048   Scissor
1   22.0131180929267    24.2333995808674     0.740093829313591     27.82333%   45948   82799   36395   165142   Scissor
2   17.8442112961662    20.2239537778002     0.793247493877987     27.95814%   46141   16516   102379   165036   Paper
3   24.4196078416116    26.7526319905644     0.777674716317609     68.05129%   112237   16466   36227   164930   Rock
4   21.8505862259689    24.0823444478475     0.743919407292882     28.11717%   46485   82358   36483   165326   Scissor
5   21.9716403885154    24.2016352220766     0.743331611187086     28.10022%   46489   82414   36537   165440   Scissor
6   21.7202167758399    23.9643481135206     0.748043779226886     28.10896%   46479   82355   36519   165353   Scissor
7   19.2396494056951    21.5913457018039     0.783898765369623     27.98824%   46360   16516   102765   165641   Paper
8   21.3340998118027    23.5937401326201     0.753213440272473     28.09844%   46422   82480   36310   165212   Scissor
9   24.2867711234825    26.617695694018     0.776974856845171     68.22602%   112824   16466   36078   165368   Rock
10   26.9820148326741    29.4765897709184     0.831524979414755     80.06544%   132638   0   33024   165662   Percent80
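
In the million-game table above, the ConservativeRating column matches Mean minus three standard deviations (for Percent80, 29.4766 - 3 × 0.8315 ≈ 26.9820). The sketch below reproduces that column for three rows and checks whether the 2-sigma ranges of the best Scissor and the lower Rock still overlap. The helper names are made up for illustration; only the numbers come from the table.

```python
# id -> (mean, standard deviation, type), copied from the table above
rows = {
    1: (24.2333995808674, 0.740093829313591, "Scissor"),
    9: (26.617695694018, 0.776974856845171, "Rock"),
    10: (29.4765897709184, 0.831524979414755, "Percent80"),
}

def conservative_rating(mean, sd, k=3.0):
    """Lower bound used for ranking: mean minus k standard deviations."""
    return mean - k * sd

def sigma_range(mean, sd, k=2.0):
    """(low, high) bounds k standard deviations around the mean."""
    return (mean - k * sd, mean + k * sd)

def ranges_overlap(a, b):
    """True if two (low, high) ranges share any point."""
    return a[0] <= b[1] and b[0] <= a[1]

conservative = {i: conservative_rating(m, sd) for i, (m, sd, _) in rows.items()}
scissor_range = sigma_range(*rows[1][:2])
rock_range = sigma_range(*rows[9][:2])
overlap = ranges_overlap(scissor_range, rock_range)
```

For these rows the 2-sigma ranges of Scissor 1 and Rock 9 still overlap, even though Rock's mean is clearly higher.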
