Author Topic: Personal Project -- Biogenesis and QLearning (Read 4636 times)

rwill128 · « **on:** June 26, 2013, 08:01:03 PM »

Hey all,

So I hope there's nothing wrong with me posting about another evolution sim program on here. If there is, I'll be okay with someone letting me know.

But anyway, I wanted to share the results of the project I've been working on for the last few days. It finally reached a milestone where I have a working "proof-of-concept" program, and now I want to tweak it so it looks prettier and has lots of interesting features. I've been using Biogenesis (specifically, a mod of it called "Biogenesis Color Mod": https://sourceforge.net/projects/biogenesiscolor/) and a QLearning framework I found at http://elsy.gdan.pl/

Check out some of the samples on the Elsy website. He's actually got a nifty little library there. He implemented a type of reinforcement learning that uses neural networks, and the "Wanderbot" or "Apollo Lander" examples both show what kind of tasks it's effective for.

Anyway -- I decided I wanted to combine Biogenesis, which I've always found to be a fascinating program because of its relative simplicity (but cool results), with this other guy's QLearning framework.

In Biogenesis, each creature's lines have different functions based on their color, and each creature's "genes" just store the overall shape of the creature. So more effective shapes are selected for, but movement patterns are largely unintelligent. Now, as of about 5PM today, I've got a working prototype of a Biogenesis mod where each creature is connected to its own (surprisingly effective) neural net and can make movement decisions based on the information passed to that net.
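For anyone who hasn't run into Q-learning before, here's a rough sketch of the core update rule. To be clear, this is NOT the Elsy API or the code from the mod: Elsy approximates the Q-values with a neural net, whereas this uses a plain lookup table to keep the idea visible. The state names, action names, `ALPHA`, and `GAMMA` are all made up for illustration.

```python
from collections import defaultdict

# Assumed hyperparameters; not taken from Elsy or the Biogenesis mod.
ALPHA = 0.1  # learning rate
GAMMA = 0.9  # discount factor for future reward


def q_update(q, state, action, reward, next_state, actions):
    """Apply one tabular Q-learning update and return the new Q(state, action)."""
    # Best value the creature could expect from the state it ended up in.
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + GAMMA * best_next
    # Move the old estimate a small step toward the target.
    q[(state, action)] += ALPHA * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical creature actions and states.
actions = ["turn_left", "turn_right", "thrust"]
q = defaultdict(float)  # every unseen (state, action) pair starts at 0.0
new_value = q_update(q, "food_ahead", "thrust", 1.0, "food_adjacent", actions)
```

With an empty table, a reward of 1.0 nudges the value of thrusting toward food up by `ALPHA` times the reward; repeated good outcomes keep pushing it higher.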

---

With all the evolution sim enthusiasts around here, I thought I'd come and ask if anyone's experimented with either of these programs before, and if they have, do they have any input on this concept? Any ideas on how you'd like to see it implemented? How should creatures' brains be given reward feedback? Should their "preferences" be stored as hereditary information?

Numsgil · « **Reply #1 on:** June 26, 2013, 11:45:22 PM »

Sounds cool; I haven't seen either project before. Figuring out a good reward metric is always the tricky part with any sort of directed learning optimization (whether using a neural net or genetic algorithms or anything along those lines).

rwill128 · « **Reply #2 on:** June 27, 2013, 08:47:36 AM »

I'm thinking a reward strongly weighted toward "Did you gain more health than you would have through photosynthesis over this last click?" would work well.
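Roughly, that reward could look like the sketch below. The function name and parameters are placeholders I made up, not anything from Biogenesis; it just assumes you can read a creature's health before and after a step and know how much photosynthesis alone would have given it.

```python
def health_reward(health_before, health_after, photosynthesis_gain):
    """Health gained this step beyond what photosynthesis alone would have given.

    Positive: the creature's action beat sitting still and photosynthesizing.
    Negative: the action did worse than the passive baseline.
    """
    actual_gain = health_after - health_before
    return actual_gain - photosynthesis_gain


# A creature that went from 10.0 to 13.0 health while photosynthesis
# would only have given it 1.0 gets a reward of 2.0.
example = health_reward(10.0, 13.0, 1.0)
```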

But I was also thinking about passing a bunch of different parameters to the learning framework, along with a short genetic code that describes which ones to use as a reward (and with what weights). Then just let the evolution of the simulation figure it out.

Could end up with some cool effects... for example... altruistic behavior: a creature that "wants" the health of the organism it sees if that organism has similar color segments. Could be selected for naturally, too, I'd think.
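Here's a minimal sketch of what I mean by a genetic code that picks the reward. Everything in it is hypothetical: the signal names, the dict-of-weights "genome", and the function name are just for illustration, and Biogenesis itself doesn't store genes this way.

```python
def genetic_reward(signals, reward_genes):
    """Weighted sum of the signals that the creature's reward genes select.

    signals:      measurements taken this step, keyed by name.
    reward_genes: inherited weights, keyed by signal name. A signal the
                  genes don't mention contributes nothing.
    """
    return sum(weight * signals.get(name, 0.0)
               for name, weight in reward_genes.items())


# Hypothetical measurements for one step.
signals = {
    "own_health_delta": 2.0,
    "similar_neighbor_health_delta": 4.0,  # neighbor with similar color segments
    "energy_spent": 1.0,
}

selfish = {"own_health_delta": 1.0}
altruist = {"own_health_delta": 0.5, "similar_neighbor_health_delta": 0.5}

selfish_reward = genetic_reward(signals, selfish)    # only cares about itself
altruist_reward = genetic_reward(signals, altruist)  # also "wants" kin to do well
```

The same step gets scored differently by the two genomes, so the two creatures would learn different behaviors from identical experiences, and selection decides which weighting sticks around.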

Botsareus · « **Reply #3 on:** June 28, 2013, 12:40:31 PM »

In DB2 the findbest algorithm searches through all children, all children's children, etc., for total energy and selects for the best result.
I am thinking you can modify that and select for 'better than average' or 'worse than average'.
If you need a floating-point value you can use current / average. Better should be > 1; worse should be < 1.
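A quick sketch of that ratio, in case it helps. This is not the DB2 findbest code; the function name and the choice to return 1.0 for an empty or zero-average population are my own assumptions.

```python
def relative_fitness(current, population_energies):
    """Return current / average: > 1 is better than average, < 1 is worse."""
    if not population_energies:
        return 1.0  # assumption: no population to compare against means "average"
    average = sum(population_energies) / len(population_energies)
    if average == 0:
        return 1.0  # assumption: avoid dividing by zero, treat as "average"
    return current / average


energies = [100.0, 150.0, 50.0]               # hypothetical total energies
better = relative_fitness(150.0, energies)    # above the average of 100.0
worse = relative_fitness(50.0, energies)      # below the average of 100.0
```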

Botsareus · « **Reply #4 on:** June 28, 2013, 06:53:47 PM »

Anyway, that is just an idea for one of the parameters. Let us know when you put it all together. ok?

rwill128 · « **Reply #5 on:** June 28, 2013, 09:43:04 PM »

It's definitely an interesting idea. I think the parameters for behavior reinforcement should all be relative to the creature, though. So something like "this action benefitted my health, let's do it again" type behaviors. Access to global data like that seems like it would be cheating. Maybe that's not true though. I'll have to think about it.

Another idea I like is passing several different possible values as rewards and using genetic code in the organism to decide which one(s) will be used. It might be hard to think about because the "code" in Biogenesis is much different than the code in DB.

But the idea is to allow the possibility for organisms to use a behavior reinforcement model that allows for modular and open-ended results. For example, one of the colored segments is "lilac," and it kills the organisms it touches instantly, but doesn't convert any of the organism's energy to health for the lilac-owning organism. I'd like for the behavior reinforcement model to allow the potential for an organism to develop a penchant for using its lilac segments, while that same creature's offspring might experience a mutation where their behavior reinforcement model encourages them to use some other segment, such as a red one, which doesn't kill as effectively, but which converts the organism's energy into food.

If behavior were reinforced according to some arbitrary judgment of immediate benefits to "fitness," the red segment behavior might be more encouraged, but both could actually be "fit" behaviors according to the one true standard: whether they produce another generation.
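To make the inheritance part concrete, here's a rough sketch of how an offspring's reward weights could drift away from its parent's. It's only an illustration: the gene names (`lilac_kill`, `red_feed`), the mutation rate, and the Gaussian nudge are all assumptions of mine, not how Biogenesis mutates anything.

```python
import random


def mutate_reward_genes(parent_genes, rng, rate=0.1, scale=0.2):
    """Copy the parent's reward weights, nudging each with probability `rate`.

    rng is a random.Random instance so runs can be reproduced with a seed.
    The parent's dict is left untouched.
    """
    child = {}
    for name, weight in parent_genes.items():
        if rng.random() < rate:
            weight += rng.gauss(0.0, scale)  # small random drift in preference
        child[name] = weight
    return child


# Hypothetical parent that is rewarded mostly for lilac kills.
parent = {"lilac_kill": 1.0, "red_feed": 0.1}
rng = random.Random(42)
child = mutate_reward_genes(parent, rng, rate=0.5)
```

Nothing here says which preference is "better." The only judge is whether creatures with a given set of weights manage to produce another generation.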

I'll definitely let you know when it's all put together. I'm excited to see the results!

Botsareus · « **Reply #6 on:** June 29, 2013, 03:08:13 PM »

Yea, unless you want robots to have telepathic ability, it is kinda cheating.

Darwinbots Forum
