Post not finished YET!!
This post will be a little technical.
I have criticised the DNA structure and DB in general, and I promised to come up with something, so here is my proposal. It is not fully finished! What I want is to discuss it in detail and hear ideas and criticism from all DB'ers, because this is a very big thing to begin changing.
My main goal was to create a DNA language/structure that:
- allows junk DNA.
- does not destroy any existing bots.
- makes mutations work better and makes them easier to implement (not saying they don't work now, read on!).
- preserves most of DB as it is now.
You must understand that I am not saying this is better than the current system. It is not perfect, far from it, but I would like to get opinions, and if you can argue that the current system is better, so be it. It is meant as a suggestion and discussion topic, not as an IF-IT-CANT-BE-IMPLEMENTED-AS-IS-DONT-USE-IT proposal. I am open to constructive criticism.
I will use words like:
- genome/genotype to identify an entire DNA sequence (Bot DNA)
- DNA letter or just DNA as a single instruction (like the store command)
- phenotype for what the bot's genome ends up as when it is executed (I know this is not what phenotype actually means, but it is the best word to use right now.)
- DNAS is short for DNA Structure!
I will try to argue for my point of view, but there may still be things I haven't thought of.
The genome and mutations:
The genome for a bot is currently stored in an array. In my DNAS the genome is a single string!
After a bot is born, the main DB program reads and interprets the genome, and the result ends up in an array, the phenotype.
Example:
We now have a genome looking like this:
"ABDDhFEKFLjgFHFKDADDFDFEIREUROIEr"
(By a single instruction (DNA letter) I mean 'A' or 'B' etc.)
Why: Mutations become a lot easier. Instead of having delete gene, insert gene etc., we only need 3 (maybe 4) different mutations:
- Insert
Can insert a single instruction or an entire substring into the genome. This one can work in two different ways: inserting only instructions already in the genome (and letting the point mutation introduce new instructions), or inserting any possible instruction.
- Delete
Can delete a single instruction or an entire substring from the genome.
- Point
Works on a single DNA instruction only, substituting it with another DNA instruction.
- (Revert)
Works on a substring within the genome and reverses it. Example: we have "ABCDEFGH", the substring "BCD" mutates into "DCB", and the entire genome then looks like this: "ADCBEFGH"
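The four mutations above can be sketched on a plain string. This is only a rough illustration in Python, not how DB does or would implement it; the function names, the `ALPHABET` constant and the `random_point` helper are made up for the example.

```python
import random

# Assumed instruction alphabet: A-Z and a-z, one letter per DNA instruction.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

def insert(genome, pos, fragment):
    """Insert a single instruction or a whole substring at position pos."""
    return genome[:pos] + fragment + genome[pos:]

def delete(genome, start, length=1):
    """Delete one instruction, or a substring of the given length."""
    return genome[:start] + genome[start + length:]

def point(genome, pos, letter):
    """Substitute the single instruction at pos with another instruction."""
    return genome[:pos] + letter + genome[pos + 1:]

def revert(genome, start, length):
    """Reverse the substring genome[start:start + length]."""
    end = start + length
    return genome[:start] + genome[start:end][::-1] + genome[end:]

def random_point(genome, rng=random):
    """Hypothetical helper: point-mutate one randomly chosen position."""
    pos = rng.randrange(len(genome))
    return point(genome, pos, rng.choice(ALPHABET))

print(revert("ABCDEFGH", 1, 3))  # the "BCD" -> "DCB" example from the post
```

Each function returns a new string, so the parent genome is never changed in place.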
Why: This opens up for junk DNA, or DNA we have never seen before in DB, like an ADD in the condition part of a gene, instructions outside genes, or conditions within conditions!
We also get rid of the peculiar genomes with empty spaces in them, which I have seen in DB. (I do not know if that has been fixed; I never bothered to report it, and it may have been an earlier version, I can't remember.)
We also get completely new kinds of recombination. Keep in mind that the gene itself is no longer a unit (as it is, in a somewhat arbitrary way, in the current system); it emerges from the combination of DNA instructions.
The phenotype:
The phenotype is the actual behavior, i.e. the part of the genome that is executed in the bot.
When a bot is born, the program reads through the entire genome and decides, through rules we have decided on, whether there may be a condition within a condition, whether a '=' sign is allowed within the executed part of a gene, whether DNA outside the genes should be executed, etc. This is all yet to be decided.
If we don't want something (e.g. a condition within a condition), the program just ignores it, and only the parts we want get into the phenotype (the array). The genome is not touched at all: it is the genome that is passed on to the offspring, but the phenotype that is executed.
It all ends up in an array, as it does now.
Why: This is interconnected with the genome, but it allows us to make certain rules about the execution of the bot. For example, if we don't want a store in the condition part, then it is not expressed in the phenotype of the bot, but it is still there, able to act as junk DNA.
It also makes it possible for the genome and the phenotype to differ from each other, as they do in real life. Just a single insert or delete mutation may give rise to new and interesting species of bots, because of all the junk DNA it is now possible to have.
The DNA itself:
(Everything in this section is mainly arbitrarily picked; anything can be used!)
Because DB's architecture is mainly based on direct numbers in the DNA (as opposed to other AL simulations like Avida, although it is difficult to compare DB with Avida), I could not write an entire DNA system without numbers and still live up to the goals I set for this system. Therefore the downfalls of this system lie in the DNA itself.
A sysvar is actually only a more readable name that points to a specific number, so every bot could be written without sysvars.
Say we now have the letters A-Z (26) and the letters a-z (26) as symbols for different DNA instructions.
The letters A-J are the numbers 0-9. The letter Z is the flow command cond, Y is start, X is > and n is a separator for numbers. Then we could have something like:
cond
10 50 >
start
this will end up as something like:
ZBAnFAXY
I hope you get the idea.
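To make the letter scheme concrete, here is a minimal sketch of a converter between DB-style source and the letter string. It assumes only the handful of symbols named above, takes X as the > comparison used in the example, and writes the leading cond as Z. The names `encode`, `decode`, `LETTERS`, `DIGITS` and `SEPARATOR` are made up for the illustration.

```python
# Assumed mapping, taken from the example in the post (everything is arbitrary).
LETTERS = {"cond": "Z", "start": "Y", ">": "X"}
DIGITS = "ABCDEFGHIJ"   # A-J stand for the digits 0-9
SEPARATOR = "n"         # placed between two numbers that follow each other

def encode(source):
    """Turn whitespace-separated DB tokens into a genome string."""
    out = []
    prev_was_number = False
    for token in source.split():
        if token.isdigit():
            if prev_was_number:
                out.append(SEPARATOR)
            out.append("".join(DIGITS[int(d)] for d in token))
            prev_was_number = True
        else:
            out.append(LETTERS[token])  # KeyError for symbols not mapped yet
            prev_was_number = False
    return "".join(out)

def decode(genome):
    """Turn a genome string back into DB tokens (known letters only)."""
    names = {letter: name for name, letter in LETTERS.items()}
    tokens, number = [], ""
    for ch in genome:
        if ch in DIGITS:
            number += str(DIGITS.index(ch))
            continue
        if number:              # a non-digit letter ends the current number
            tokens.append(number)
            number = ""
        if ch != SEPARATOR:
            tokens.append(names[ch])
    if number:
        tokens.append(number)
    return " ".join(tokens)

print(encode("cond 10 50 > start"))
print(decode("ZBAnFAXY"))
```

A routine along these lines is also roughly what would be needed to keep old bots working: old code goes in, a letter genome comes out.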
But say we have the following genome (in DB language):
cond
10 add 50 >
start
then the phenotype will be (again in DB language):
cond
10 50 >
start
(If we want no add instruction in the condition part)
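The step from genome to phenotype in this example could look roughly like the sketch below. It works on DB-language tokens to stay readable, and the rule set `BANNED_IN_CONDITION` is purely hypothetical, since which rules to use is exactly what is still to be decided.

```python
# Hypothetical rule: these instructions are not expressed inside a condition.
BANNED_IN_CONDITION = {"add", "store"}

def express(genome_tokens):
    """Build the phenotype (the executed array) without touching the genome."""
    phenotype = []
    in_condition = False
    for token in genome_tokens:
        if token == "cond":
            in_condition = True
        elif token == "start":
            in_condition = False
        elif in_condition and token in BANNED_IN_CONDITION:
            continue  # stays in the genome as junk DNA, but is not executed
        phenotype.append(token)
    return phenotype

genome = "cond 10 add 50 > start".split()
print(express(genome))  # phenotype: the add is skipped
print(genome)           # genome: unchanged, this is what the offspring gets
```

Because `express` builds a new list, the add is still in the genome and a later mutation can move it somewhere it is allowed to act.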
Then again, we could let a-j be the numbers *0-*9, and decide that the first letter determines what kind of number it is, so aBC is *012 = *12 etc.
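One possible reading of that rule, sketched in Python: the case of the first letter decides whether the whole number is a *-number, and the remaining letters only contribute digits. This is an interpretation, and `read_number` is a name made up for the example.

```python
UPPER = "ABCDEFGHIJ"  # A-J are the digits 0-9
LOWER = "abcdefghij"  # a-j are the same digits, marking a *-number

def read_number(letters):
    """Decode one run of digit letters, e.g. 'aBC' -> '*12'."""
    is_reference = letters[0] in LOWER  # assumption: only the first letter counts
    digits = "".join(str(UPPER.index(ch.upper())) for ch in letters)
    value = int(digits)                 # drops leading zeros: 012 -> 12
    return ("*" if is_reference else "") + str(value)

print(read_number("aBC"))
print(read_number("BA"))
```

Under this reading a point mutation on the first letter can turn a plain number into a *-number and back, while mutations on later letters only change the value.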
The downfalls:
The main downfall I can see in this system is the DNA. Numbers can change dramatically without any logic: a '0001' could mutate into '9001'. Then again, maybe this could help evolution further, I don't know; it is the job of natural selection to find the fittest! And remember that not all mutations are good for the survival of the organism.
This system is not perfect, and if implemented it could end up showing that it really sucks.
I have relied heavily on a few books I have read about the topic and on my own experience with developing AL simulations.
Overall you can still write the old code and use old bots, because we will create a small routine that converts the code to DNA instructions so the program understands it.
Please comment.