The problem is not in recognizing when a species has become sufficiently diverse to justify splitting into multiple species. The current generational and genetic distance options provide sufficient basis for that, I believe. Rather, the problem is how to efficiently group the extant members of the old species into two (or more) new ones once you know the old species is sufficiently diverse to justify it.
Unfortunately, finding the optimal grouping based on generational or genetic distance is NP-complete, so there is no known algorithm that does it in polynomial time (meaning it's really really really computationally intensive). As a viable alternative, I instead just pick a bot at random to form the basis of the new species and then try to add all the bots closely related to it into the new species. This should work pretty well in the vast majority of cases. The problem is with determining what is "closely related".
Imagine the extant members of a species as leaves on a binary tree (yes, bots can have more than two offspring, but those births are separated in time and potentially in genetic space, so the branch points represent births). If there is any significant generational or genetic distance within a population (say either is > 100), the members will be grouped in space based on one or the other (or both). They must be. A binary tree with 50 levels (50 generations) starting with just one bot has over a thousand trillion leaves, that is, if the tree is balanced. Our trees won't be, due to cancerous bots, asymmetrical reproduction, and big berthas, but you get the idea. Any two extant bots (leaves) on this 50 generation tree have a generational distance of at most 100. So you see that if the generational distance for a species grows large, there must be grouping, even if all that means is that a single bot has lived through 99 generations of its cousins.
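To put numbers on the tree argument, here is a quick Python sketch. It assumes that generational distance means the number of birth steps between two bots through their most recent common ancestor, which may not be exactly how DB computes it. The function names and the 'L'/'R' path encoding of a leaf are made up for illustration.

```python
def leaf_count(generations: int) -> int:
    """Leaves of a balanced binary tree after `generations` rounds of doubling."""
    return 2 ** generations


def generational_distance(path_a: str, path_b: str) -> int:
    """Steps between two leaves, each named by its root-to-leaf path of 'L'/'R'.

    The distance is the number of steps up from one leaf to the most recent
    common ancestor plus the number of steps back down to the other leaf.
    """
    common = 0
    for step_a, step_b in zip(path_a, path_b):
        if step_a != step_b:
            break
        common += 1
    return (len(path_a) - common) + (len(path_b) - common)


# Two leaves whose only common ancestor is the original bot (the root).
far_left = "L" * 50
far_right = "R" * 50

print(leaf_count(50))                              # number of leaves at 50 generations
print(generational_distance(far_left, far_right))  # the widest possible separation
```

Under that assumption, the widest separation in a balanced 50 generation tree is 50 steps up plus 50 steps down, so a measured distance above that limit can only come from an unbalanced tree.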
By picking a bot at random, I'm essentially grabbing a leaf on the tree, holding it up and letting the others dangle. The trick is to group all the "nearby" leaves together into the new species and leave the ones "far away" in the old species. It really shouldn't matter which bot I start with. If the bots are grouped, then all the other bots in the old species should be either nearby (in genetic and generational space) or far away, but not in the middle (unless the species has really split into three new species subgroups, in which case the next forking will handle that).
I think I actually came up with a solution last night while sleeping (see how I never stop working on DB for you all?). I already have the code for determining genetic/generational distance. I'm going to use a value of generational and genetic distance of perhaps 1/3 of that specified in the speciation dialog (I'll play with this) to determine nearness relative to the randomly chosen member, i.e. what other bots to throw into the new species. This should handle the long-lived big bertha case (big berthas will get isolated out into their own species if they are the one chosen, or remain as the sole member of the old species if not) as well as ensure that new species have sufficient members and not just a single or small number of individuals.
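The grouping step described above could look roughly like the following Python sketch. It is not the DB code: `split_species`, its parameters, and the two distance callbacks are hypothetical stand-ins for the existing distance routines, and treating a bot as "nearby" only when both distances fall within the cut is an assumption.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple, TypeVar

Bot = TypeVar("Bot")


def split_species(
    members: Sequence[Bot],
    gen_dist: Callable[[Bot, Bot], float],
    genetic_dist: Callable[[Bot, Bot], float],
    gen_threshold: float,
    genetic_threshold: float,
    fraction: float = 1 / 3,
    founder: Optional[Bot] = None,
    rng: Optional[random.Random] = None,
) -> Tuple[List[Bot], List[Bot]]:
    """Return (new_species, old_species) built around one founder bot."""
    if not members:
        raise ValueError("cannot split an empty species")
    if founder is None:
        # Pick the founder of the new species at random.
        founder = (rng or random).choice(list(members))

    # "Nearby" uses a fraction of the speciation thresholds (1/3 by default).
    gen_cut = gen_threshold * fraction
    genetic_cut = genetic_threshold * fraction

    new_species: List[Bot] = []
    old_species: List[Bot] = []
    for bot in members:
        # Assumption: a bot must be close on BOTH measures to join the founder.
        near = (gen_dist(founder, bot) <= gen_cut
                and genetic_dist(founder, bot) <= genetic_cut)
        (new_species if near else old_species).append(bot)
    return new_species, old_species


# Toy population: bots are positions on a line, distance is the gap between them.
gap = lambda a, b: abs(a - b)
no_genetic_gap = lambda a, b: 0
print(split_species([0, 1, 2, 3, 100, 101, 102], gap, no_genetic_gap, 30, 30, founder=101))
```

This is one pass over the members per split, so the cost grows linearly with species size. In the toy population, a lone outlier chosen as founder ends up alone in the new species, which mirrors the big bertha case.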
Note that as much as I would like to provide a threshold on the minimal number of members in the new species, determining that ahead of time requires doing all the work to fork the new species anyway. What's more, using it as a basis for grouping makes the problem potentially NP-complete again...
What I can do is predicate the forking decision on a minimal number of members in the old species. I actually have this implemented already in the 2.43.1M code but not exposed as a user option. Right now, it's hardcoded as 10. That is, a species must have at least 10 members to be forked.
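As a rough Python illustration of that guard (not the 2.43.1M code itself), the check might be shaped like this. `should_fork` and its parameters are hypothetical names, and forking when either distance exceeds its threshold is an assumption about how the two options combine.

```python
MIN_MEMBERS_TO_FORK = 10  # the hardcoded minimum mentioned in the post


def should_fork(
    member_count: int,
    max_gen_dist: float,
    max_genetic_dist: float,
    gen_threshold: float,
    genetic_threshold: float,
    min_members: int = MIN_MEMBERS_TO_FORK,
) -> bool:
    """Decide whether an old species is eligible to be forked."""
    # Cheap check first: small species are never forked.
    if member_count < min_members:
        return False
    # Assumption: exceeding EITHER threshold counts as diverse enough.
    return max_gen_dist > gen_threshold or max_genetic_dist > genetic_threshold


print(should_fork(9, 500, 500, 100, 100))   # too few members
print(should_fork(10, 101, 0, 100, 100))    # enough members and diverse enough
```

Because the member count is known before any grouping work is done, this test costs nothing extra, unlike a minimum on the size of the new species.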