New Mutation Paradigm
Numsgil:
Okay, I've been looking over Greven's post and how Avida does mutations.
Here's how the current DNA works (this is important):
Each unit consists of a type and a value. The type is:
0 for numbers
1 for *numbers
2 for commands (add, etc.)
3 for conditions (=, etc.)
4 for flow (cond, stop, etc.)
5 for logic (and, or, etc.)
The value determines which command or number it is.
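The type/value layout described above can be sketched as follows. This is only an illustration in Python, not the program's actual source; the names `UnitType` and `Unit`, and the value 1 standing for add, are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import IntEnum


class UnitType(IntEnum):
    """The six type codes listed in the post (names are hypothetical)."""
    NUMBER = 0       # plain numbers
    STAR_NUMBER = 1  # *numbers
    COMMAND = 2      # add, etc.
    CONDITION = 3    # =, etc.
    FLOW = 4         # cond, stop, etc.
    LOGIC = 5        # and, or, etc.


@dataclass
class Unit:
    type: int   # one of the six UnitType codes
    value: int  # which number or command within that type


# A tiny genome: the number 5, the reference *5, and a command.
# Using value 1 for "add" is an assumption for illustration only.
genome = [
    Unit(UnitType.NUMBER, 5),
    Unit(UnitType.STAR_NUMBER, 5),
    Unit(UnitType.COMMAND, 1),
]
```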
Okay, here's a new system I've come up with:
First off, we'd need to come up with a way for the user to create a custom Gaussian distribution, where he can decide what values the standard deviation covers (that is, so he can determine the range for 99% of the generated values).
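One way to read "determine the range for 99% of the generated values" is to let the user give a low and high bound and derive the standard deviation from them. The sketch below assumes that reading; `make_gaussian` is a hypothetical helper, and rounding to whole numbers is an assumption based on DNA values being integers.

```python
import random

# About 99% of a normal distribution lies within +/- 2.5758 standard deviations.
Z_99 = 2.5758


def make_gaussian(low, high, rng=random):
    """Return a sampler whose central 99% of draws fall between low and high."""
    mean = (low + high) / 2.0
    sigma = (high - low) / (2.0 * Z_99)

    def sample():
        # Round because DNA values are assumed to be whole numbers.
        return round(rng.gauss(mean, sigma))

    return sample


# Example: a curve where roughly 99% of the draws land between -10 and 10.
draw = make_gaussian(-10, 10)
values = [draw() for _ in range(5)]
```

About 1% of draws still fall outside the chosen range, so code that consumes them has to tolerate the occasional outlier.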
Mutations aren't limited to reproduction only, nor are they limited to children during reproduction.
1. Point mutations (changing of ~1 command) can occur at any time in a bot's life. A point mutation can change either the type or the value of a unit. Which one is affected is determined by a probability slider the user can set. If you set it more towards value, then type is less likely to mutate.
Changes in type always move up or down by 1 (or perhaps by some user-defined value from 1 to X). Modular arithmetic is used, so a type of 5 + 1 = 0 and 0 - 1 = 5.
Changes in value are set by 2 custom Gaussian curves: one for number and *number, and one for the rest.
The number of places affected is determined by a custom Gaussian curve.
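The point mutation rule might look roughly like this. It is a sketch under assumptions: a unit is a `(type, value)` pair, `value_bias` stands in for the slider (the probability that the value rather than the type mutates), and the two sigmas stand in for the two custom curves. The "number of places affected" curve is left out to keep the example short.

```python
import random

NUM_TYPES = 6  # type codes 0..5


def point_mutate(unit, value_bias, sigma_number, sigma_other, rng=random):
    """Mutate either the type or the value of one (type, value) unit."""
    utype, value = unit
    if rng.random() < value_bias:
        # Value mutation: numbers and *numbers (types 0 and 1) use one curve,
        # every other type uses the second curve.
        sigma = sigma_number if utype in (0, 1) else sigma_other
        value += round(rng.gauss(0, sigma))
    else:
        # Type mutation: step up or down by 1 with wraparound,
        # so 5 + 1 -> 0 and 0 - 1 -> 5. The value is left untouched.
        step = rng.choice((-1, 1))
        utype = (utype + step) % NUM_TYPES
    return (utype, value)


# With the slider fully on "type", a logic unit (5) becomes type 0 or type 4.
mutated = point_mutate((5, 7), value_bias=0.0, sigma_number=3, sigma_other=1)
```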
During reproduction, much more massive changes are possible.
2. Copy Error mutation. This is just like a point mutation in effect, but its probabilities and distributions are (or at least can be, if the user so desires) different, and it only occurs during reproduction.
3. Extra Copy Mutation. This copies a segment of DNA X units long, starting at A, and inserts it at B. A and B are each chosen as a number between 1 and the genome length. X is drawn from a custom Gaussian curve.
4. Deletion. Starting at A, a segment of DNA X units long is deleted. A is chosen randomly between 1 and the genome length. X is drawn from a custom Gaussian distribution.
5. Reversal. Starting at A, X units of DNA are reversed. That is, if the DNA is 5 6 > start, then it becomes start > 6 5. A is between 1 and the genome length, and X is drawn from a custom Gaussian distribution.
6. Insertion. Starting at A, X units of new DNA are inserted. The type of each is determined by a special 6-way slider between all the types that the user can set. The value is set by a linear distribution over all existing values for that type.
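The four structural mutations in the list above can be sketched with list slicing. Several things here are assumptions rather than part of the proposal: genomes are non-empty Python lists, positions are 0-based instead of 1-based, the segment length X is the magnitude of a Gaussian draw (at least 1), the "linear distribution" is read as a uniform pick, and X for insertion is drawn the same way as for the others. All function names are hypothetical.

```python
import random


def pick_length(sigma, rng):
    """Segment length X: magnitude of a Gaussian draw, never less than 1."""
    return max(1, abs(round(rng.gauss(0, sigma))))


def duplicate(genome, sigma, rng=random):
    """Extra copy: copy X units starting at A and insert them at B."""
    a = rng.randrange(len(genome))
    b = rng.randrange(len(genome) + 1)
    x = pick_length(sigma, rng)
    return genome[:b] + genome[a:a + x] + genome[b:]


def delete(genome, sigma, rng=random):
    """Deletion: remove X units starting at A."""
    a = rng.randrange(len(genome))
    x = pick_length(sigma, rng)
    return genome[:a] + genome[a + x:]


def reverse(genome, sigma, rng=random):
    """Reversal: reverse the order of X units starting at A."""
    a = rng.randrange(len(genome))
    x = pick_length(sigma, rng)
    return genome[:a] + genome[a:a + x][::-1] + genome[a + x:]


def insert(genome, sigma, type_weights, values_by_type, rng=random):
    """Insertion: add X new (type, value) units at A.

    type_weights plays the role of the 6-way slider; values_by_type maps
    each type code to the list of values that exist for it.
    """
    a = rng.randrange(len(genome) + 1)
    x = pick_length(sigma, rng)
    new_units = []
    for _ in range(x):
        utype = rng.choices(range(6), weights=type_weights)[0]
        new_units.append((utype, rng.choice(values_by_type[utype])))
    return genome[:a] + new_units + genome[a:]
```

Each function returns a new list, so the same routine can be applied to either the parent's or the child's copy of the genome.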
These mutations can affect either parent or child.
Mutations which result in a nonexistent command will show up as number@number (or some other code), with a toggle to turn off display of all Junk in most DNA viewing windows and recording areas.
The beauty of this is that as we add stuff to the program, we don't have to modify the mutations code. It's all set up to handle whatever we throw at it.
I'd probably create a nested command window, so a lot of the custom Gaussian curves, etc. are hidden unless you look for them.
On the surface you just set the 6 types of mutations' probabilities. That's even easier than it currently is.
There'd also probably be a way to save mutation rates separately from settings, so you can just import a settings file directly for a new species you've added.
Numsgil:
I know it's a lot to read over, but I'm kind of itchy to get started on this one (it's a fun kind of work, since it's redesigning rather than inventing).
If anyone has any thoughts one way or another...
The only thought I had was maybe using modular arithmetic for all the types. So *1001 = *1. But that might change the way things like numbers are mutated into different type commands and then back.
Carlo:
--- Quote ---First off, we'd need to come up with a way for the user to create a custom Gaussian distribution, where he can decide what values the standard deviation covers (that is, so he can determine the range for 99% of the generated values).
--- End quote ---
Ok, this should be the basis of every correct mutation system.
But I think this should only apply to immediate values and maybe pointers. I can't understand, for example, why an add command, in order to become a sqrt command, should have to pass through a sub and a mult command. I hope you get the point. Gradual change in immediate values is (often) obviously meaningful.
5 .up store should be more likely to change into
6 .up store than into
500 .up store.
(but often is not: there is no apparent relation between -1 .shoot store and -2 .shoot store or 2 .shoot store ... and this is a big problem)
The same thing doesn't apply to instructions, or even to pointers. Is aimsx more similar to dx or to shoot? Who knows? Does the question even make sense?
Another problem. Think of the eyeN values. They go from 0 to 100. Now take the refnrg value. It goes from 0 to 32000. Take the shootval value (memory shots): it makes sense from 0 to 1000. Take the up, dn etc. commands. They make sense (maybe) from -30 to 30.
This is a BIG problem. Mutations should be able to create new functionality by mixing the code at random. But what can happen if I take a .refnrg value and put it in .dn? Nothing useful, because the range of values offered by .refnrg is completely different from the range accepted by .dn.
What happens if the code "5 .up store" is mutated into "5 .repro store"? Nothing useful, again. A repro with a value of 5 is nearly useless.
I think that these problems are deep in the structure of Darwinbots.
A solution for the second problem could be this one: writing in the sysvars file the range of each var. You'd have
1 .up -20 20
2 .dn -20 20
....
5 .aimdx -600 600
...
501 .eye1 0 100
This way you would be able to normalize each value read from or written to a var to a universal range... and have much more meaningful mutations.
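A minimal sketch of that normalization, assuming the universal range is 0 to 1 and that the per-variable ranges have already been read from the sysvars file. The `RANGES` table below simply copies the example entries; the function names are hypothetical.

```python
# Ranges copied from the example sysvars entries (variable name -> low, high).
RANGES = {
    ".up": (-20, 20),
    ".dn": (-20, 20),
    ".aimdx": (-600, 600),
    ".eye1": (0, 100),
}


def to_universal(var, raw):
    """Map a raw value of var onto the assumed universal range 0..1."""
    low, high = RANGES[var]
    return (raw - low) / (high - low)


def from_universal(var, u):
    """Map a universal 0..1 value back onto var's own range."""
    low, high = RANGES[var]
    return round(low + u * (high - low))


# An .eye1 reading of 75 is three quarters of the way up its range,
# so written into .up it lands three quarters of the way up -20..20.
print(from_universal(".up", to_universal(".eye1", 75)))  # prints 10
```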
--- Quote ---Mutations aren't limited to reproduction only, nor are they limited to children during reproduction.
--- End quote ---
This, again, is OK with me. It is correct from the biological point of view. The only doubt I have about it is that it will likely introduce something like a maximum life span: if you have, say, one point mutation per 100 DNA positions per one thousand cycles on average, it becomes unlikely for a robot to live for an arbitrarily long time (death by aging was one of the possibilities I wanted to leave open to evolution).
But on the other hand, robots could evolve multiple copies of the fundamental genes to avoid this problem.
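To put rough numbers on the aging concern: the sketch below assumes the example rate means each DNA position mutates independently with probability 1 / (100 * 1000) per cycle. That reading, and the genome length and life span used, are assumptions for illustration.

```python
# Assumed reading of "one point mutation every 100 positions every 1000 cycles".
RATE_PER_POSITION_PER_CYCLE = 1.0 / (100 * 1000)


def expected_mutations(genome_length, cycles, rate=RATE_PER_POSITION_PER_CYCLE):
    """Average number of point mutations a bot accumulates over its life."""
    return genome_length * cycles * rate


def chance_position_intact(cycles, rate=RATE_PER_POSITION_PER_CYCLE):
    """Probability that one given position has never mutated."""
    return (1.0 - rate) ** cycles


# A 300-unit genome living 100,000 cycles picks up about 300 mutations,
# and any single position has only about a 37% chance of being untouched.
print(expected_mutations(300, 100_000))
print(chance_position_intact(100_000))
```

Under these assumptions a gene that exists in several copies is more likely to have at least one copy still intact, which is the escape route mentioned above.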
--- Quote ---Changes in type always move up or down by 1
--- End quote ---
This makes no sense, there is no reason for that. There is no reason for instructions to be "nearer" to pointers than immediate values are.
--- Quote ---During reproduction, much more massive changes are possible.
--- End quote ---
I agree on that and on the mutation types you proposed.
--- Quote ---Mutations which result in a nonexistent command will show up as number@number (or some other code), with a toggle to turn off display of all Junk in most DNA viewing windows and recording areas.
The beauty of this is that as we add stuff to the program, we don't have to modify the mutations code. It's all set up to handle whatever we throw at it.
--- End quote ---
No. I understand perfectly what you mean, but keep in mind that those will always be useless mutations until you insert new commands into the DNA interpreter. A useless mutation is a mutation that:
- either leaves the functionality of the DNA unchanged, so that it is in fact a non-mutation
- or makes the code non-executable or strongly dysfunctional, so that you'll waste CPU cycles just bringing the new mutated robot to its death
Moreover, you have to keep in mind that you cannot add new commands AFTER the mutations are in the DNA, because each mutation has to be positively selected by its effects. Introducing new commands after the mutations are in the DNA means abruptly changing the way DNAs are interpreted. And that's no good.
So, I think that the best thing to do is to allow the mutation routines to produce only the actual instructions. Simply code them with numbers going from 1 to max and, when you introduce the (n+1)th instruction, change the max to n+1.
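In code, that suggestion amounts to drawing instruction numbers from a closed range whose upper bound is the only thing that changes when an instruction is added. A sketch, with `MAX_INSTRUCTION` set to a made-up count:

```python
import random

# Hypothetical count of instructions the interpreter currently understands.
# Adding the (n+1)th instruction means raising this constant by one.
MAX_INSTRUCTION = 12


def random_instruction(max_instruction=MAX_INSTRUCTION, rng=random):
    """Pick an instruction code uniformly from 1..max_instruction, inclusive."""
    return rng.randint(1, max_instruction)


code = random_instruction()
```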
shvarz:
Agree with Carlo on most things.
One thing that somewhat bothers me is the mutation during life of a single bot. I know it is somewhat related to real biology, but I don't like it.
Numsgil:
You should be able to turn off any mutation types you don't want to play with.
Types only mutating by +-1 was just to help self correcting mutations. That is, there's a 25% chance of 2 type mutations going back to what they were originally.
Anyway, the range of +- should be user definable, so you can set it to whatever you want. The idea is to have so many things be user definable that you can run any kind of mutation rules you want.
You misunderstood my last statement.
--- Quote ---Mutations which result in a nonexistent command will show up as number@number (or some other code), with a toggle to turn off display of all Junk in most DNA viewing windows and recording areas.
The beauty of this is that as we add stuff to the program, we don't have to modify the mutations code. It's all set up to handle whatever we throw at it.
--- End quote ---
The first section (number@number) means that the mutations code has the ability to mutate any commands at all, even ones that don't exist. Then, when we add new commands, the mutations code within DB doesn't have to be modified by increasing the range of N.
The number@number that corresponds to the new code is a coincidence, and I agree that it doesn't make any sense from an evolutionary standpoint. It hasn't been selected for. You'd probably want to eliminate JunkDNA (in this case I mean number@number) from a bot you're porting to a newer version.
Also, number@number lets, say, a 5200 mutate into a 5200@3. Then later mutate into *5200, then maybe back to 5200. It means that type mutations don't modify the value, which is important. 5200 only makes sense as an actual number, so for it to mutate into anything worthwhile it would need to change from 5200 into, say, 9, which would take quite a few mutations of value.
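A sketch of how the number@number display and the Junk toggle could work, following the 5200 example. The `NAMES` table is a made-up stand-in for the real command list, units are assumed to be `(type, value)` pairs, and the value@type ordering is taken from the "5200@3" example.

```python
# Hypothetical lookup of known commands: type code -> {value: name}.
NAMES = {
    2: {1: "add", 2: "sub"},
    3: {1: "="},
}


def is_junk(utype, value):
    """A unit is junk if its type needs a command name and none exists."""
    return utype not in (0, 1) and value not in NAMES.get(utype, {})


def format_unit(utype, value):
    if utype == 0:
        return str(value)        # plain number
    if utype == 1:
        return f"*{value}"       # *number
    if is_junk(utype, value):
        return f"{value}@{utype}"  # no such command: show as value@type
    return NAMES[utype][value]


def show(genome, hide_junk=False):
    """Render a genome, optionally hiding junk units from the display."""
    return " ".join(
        format_unit(t, v) for t, v in genome if not (hide_junk and is_junk(t, v))
    )


# The value 5200 survives each change of type untouched.
print(format_unit(0, 5200))  # prints 5200
print(format_unit(3, 5200))  # prints 5200@3
print(format_unit(1, 5200))  # prints *5200
```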
I agree that ranges in DB are rather arbitrary. One solution is to apply modular arithmetic to all commands. So we do as you say and declare that .up can only be -20 to 20. Then a command of 1041 .up store would be the same as 1 .up store.
That way 32000 still works for different commands. It doesn't solve all problems, but it does solve quite a few.
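The wraparound can be sketched as below. To reproduce the 1041 to 1 example, the range is treated as half-open, -20 up to but not including 20, giving a span of 40; that choice is an assumption, and an inclusive range of 41 values would map 1041 elsewhere.

```python
def wrap(value, low, high):
    """Wrap value into the half-open range [low, high) using modular arithmetic."""
    span = high - low
    return low + (value - low) % span


print(wrap(1041, -20, 20))  # prints 1
```

With this convention the upper bound itself wraps around to the lower bound, so `wrap(20, -20, 20)` gives -20.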