Author Topic: Auto-Species forking algorithm issue with big berthas  (Read 3151 times)

Offline EricL

  • Administrator
  • Bot God
  • *****
  • Posts: 2266
    • View Profile
Auto-Species forking algorithm issue with big berthas
« on: September 20, 2008, 09:18:36 PM »
The algorithm used for forking a species once the forking conditions have been met has a problem in 2.43.1M which can result in species proliferation in certain cases.   Currently, when the forking conditions are met, the code chooses a bot at random on which to base the new species and adds to the new species all of that bot's descendants and any closely related extant ancestors.  This can miss the case where the sole reason a species meets the forking criteria in the first place is due to a single, long lived "big bertha" bot whose accumulated mutations and generational distance are the sole reason for the forking in the first place.

It's probably too computationally intensive to perform a genomic or generational grouping analysis of a species in order to determine clumping of extant members, but the current algorithm has the problem of leaving the long-lived, genetically and generationally distant bot grouped in with it's distant cohorts, making it likely the species will be forked again and again.      

Not precisely sure what I'm going to do about this in 2.43.2 - I need to play around a little - but I will likely include all extant ancestors regardless of generational distance in the new species as well as potentially choosing the oldest bot in the species to form the basis of the new species.
Many beers....

Offline ikke

  • Bot Destroyer
  • ***
  • Posts: 300
    • View Profile
Auto-Species forking algorithm issue with big berthas
« Reply #1 on: September 21, 2008, 04:26:08 AM »
I think I would consider additional parameters. One mutant does not equal a new species. Say a new species needs to consist of 10 or more individuals over at least 3 generations. This will reduce the noise of genome diversity to at least an indication of a viable species.

Offline EricL

  • Administrator
  • Bot God
  • *****
  • Posts: 2266
    • View Profile
Auto-Species forking algorithm issue with big berthas
« Reply #2 on: September 21, 2008, 01:25:04 PM »
The problem is not in recognizing when a species has become sufficiently diverse to justify splitting into multiple species.  The current generational and genetic distance options provide sufficient basis for that I beleive.  Rather, the problem is how to effeciently group the extant members of the old species into two (or more) new ones once you know the old species is sufficiently diverse to justify it.

Unfortunatly, the algorithm to determine the optimal grouping based on generational or genetic distance is NP Complete and cannot be executed in polynomial time (meaning it's really really really computationally intensive).  As a viable alternative, I instead just pick a bot at random to form the basis of the new species and then try to add all the bots closely related to it into the new species.  This should work pretty well in the vast majority of cases.  The problem is with determining what is "closely related".

Imagine the extant members of a species as leaves on a binary tree (yes bots can have more then two offspring, but those births are separated in time and potentially genetic space so the brach points represent births).    If there is any significant generational or genetic distance within a population (say either is > 100) they will be grouped is space based on one or the other (or both).  They must be.   A binary tree with 50 levels (50 generations) starting with just one bot has over a thousand trillion leaves, that is if the tree is balanced.  Our trees won't be due to cancerous bots, asymetrical reproduction, and big berthas, but you get the idea.  Any two extant bots (leaves) on this 50 generation tree have a generational distance below 100.  So you see that if the generational distnace for a species grows large, there must be grouping even if all that means is that a single bot has lived through 99 generations of it's cousins.  

By picking a bot at random, I'm essentially grabbing a leave on the treee, holding it up and letting the others dangle.  The trick is to group all the "nearby" leaves together into the new species and leave the ones "far away" in the old species.  It really shouldn't matter which bot I start with.   If the bots are grouped, then all the other bots in the old species should be either nearby (in genetic and generational space) or far away but not in the middle (unless the species has really split into three new species sub groups in which case, the next forking will handle that).    

I think I actually came up with a solution last night while sleeping (see how I never stop working on DB for you all?).   I already have the code for determining genetic/generational distance.   I'm going to use a value of generational and genetic distance of perhaps 1/3 of that specified in the speciation dialog (I'll play with this) to determine nearbyness relative to the randomly chosen member I.e. what other bots to throw into the new species.   This should handle the long lived big bertha case (big berthas will get isolated out into their own species if they are the one chosen or remain as the sole member of the old species if not) as well as insure that new species have sufficent members and not just a single or small number of individuals.

Note that as much as I would like to provide a threshold on minimal number of members in the new species, determining that ahead of time requires doing all the work to fork the new species anyway.  What's more, using it as a basis for grouping makes the problem potentially NP complete again...

What I can do is predicate the forking decision on a minimal number of members in the old species.  I actually have this implimented already in the 2.43.1M code but not exposed as a user option.   Right now, it's hardcoded as 10.  That is, a species must have at least 10 members to be forked.
Many beers....