Darwinbots Forum

Code center => Suggestions => Topic started by: Numsgil on October 08, 2005, 10:25:34 PM

Title: Distributed Programming
Post by: Numsgil on October 08, 2005, 10:25:34 PM
The way I see it, there are two theoretical ways to have Darwinbots operate cross-computers:

1.  Additional computers (nodes) means more virtual space for the bots.
2.  Additional computers (nodes) means more processing power (speed) for the simulation.

1 is relatively easy.  We already have it.  What about 2?  Has anyone studied distributed programming at all?  Can Darwinbots even be used in such a way?
Title: Distributed Programming
Post by: Zelos on October 09, 2005, 02:09:46 AM
Sounds sweet. No, I haven't read about it. DB can probably be used that way; SETI works like that.
Title: Distributed Programming
Post by: Numsgil on October 09, 2005, 02:27:38 PM
Did some research, and problems must be "parallelizable" to really get a computational speedup.  That is, the problem must be able to be broken into steps that aren't dependent on each other.

Cycles taken as a whole are not parallelizable.  But the big computational things (like what bot can see what bot) are.

So there are limits to the increase in computational speed you could get with a cluster, but you could get some.  Having two computers working together would probably give a significant increase.  The computational load on each computer could also diminish, so you could have a fast simulation running without causing a lot of slowdown on the individual computers.

I'll look into it after I move some code into C/C++  (more on that in another topic).
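
As a rough illustration of why the "what bot can see what bot" step is parallelizable, here is a minimal Python sketch. Everything in it (the names `visible_from` and `compute_vision`, the flat list of `(x, y)` positions, and the circular sight range) is an assumption for illustration and not Darwinbots code. Each bot's check only reads the shared positions and writes nothing, so the checks are independent and can be handed to separate workers.

```python
from concurrent.futures import ThreadPoolExecutor
from math import hypot


def visible_from(index, positions, sight_range):
    """Return the indices of all other bots within sight_range of bot `index`."""
    x, y = positions[index]
    return [
        j
        for j, (ox, oy) in enumerate(positions)
        if j != index and hypot(ox - x, oy - y) <= sight_range
    ]


def compute_vision(positions, sight_range, workers=4):
    """Run every bot's visibility check as an independent task.

    No task depends on the result of another, which is the property
    that makes this step parallelizable.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(
            pool.map(
                lambda i: visible_from(i, positions, sight_range),
                range(len(positions)),
            )
        )
```

Threads are used here only to show the independence of the tasks. In CPython they would not speed up this kind of CPU-bound loop; on a real cluster the same per-bot tasks would be handed to separate processes or machines.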
Title: Distributed Programming
Post by: Botsareus on October 09, 2005, 07:53:51 PM
Quote
Additional computers (nodes) means more processing power (speed) for the simulation.

You know, not too long ago I was thinking of building a poor man's supercomputer on the internet. It's a peer-to-peer supercomputer. Everyone can have a bite out of the virtual supercomputer, which is created from regular computers. People with faster computers can contribute to get the thing going. The only problem with it is this: a processor these days runs at 2 GHz and 64 bits, while an internet connection averages 128k per second. Writing to a hard drive is faster than downloading, so where is the logic here?

Btw:

What is going on with the existing internet sharing stuff DB has right now? Any news? Cool simulations? I have a feeling people don't use it.
Title: Distributed Programming
Post by: Numsgil on October 09, 2005, 08:59:56 PM
And thus we see that you haven't researched anything at all ;)

Wikipedia -> Beowulf cluster.

The DB FTP is down since I don't have internet at the moment.  What I need to do is create a version of DB that autoloads at startup and runs in idle mode, so the user doesn't have to think about it.
Title: Distributed Programming
Post by: Botsareus on October 09, 2005, 09:20:06 PM
Anyone interested in running their Windows science stuff faster should try:

http://www.cnsi.ucla.edu/beowulf/access/default.htm

It's either a dream come true or only an illusion.

That includes DB as well.


It is too good to be true, you need to get a licence to use this stuff.

Ah, this might work: ftp://ftp.ssh.com/pub/ssh. Get the new version. I don't know how it works or what it does yet; I'll find out.
Title: Distributed Programming
Post by: Numsgil on October 09, 2005, 09:36:42 PM
Glad to see you're paying attention.

You can't run VB projects in non-Windows environments.  (Beowulf clusters are by definition Linux/Unix.)
Title: Distributed Programming
Post by: PurpleYouko on October 10, 2005, 10:13:57 AM
Quote
The DB FTP is down since I don't have internet at the moment. What I need to do is create an autoload at start, run in idle mode version of DB that can be run automatically so the user doesn't have to think about it.

Why not just use a folder on the Darwinbots.com FTP site to store the files?

Even better, use the SQL server.
Title: Distributed Programming
Post by: Numsgil on October 10, 2005, 12:55:33 PM
Because the real FTP only allows one user ID and password, and isn't set up for anonymous uploading...

Some sort of database/PHP thing would probably work, but I know little more than truly basic HTML.  I wouldn't even know where to begin.
Title: Distributed Programming
Post by: Ulciscor on October 10, 2005, 01:33:55 PM
Maybe I am being slow but can you explain in more detail what you want? I might be able to help...

B)
Title: Distributed Programming
Post by: PurpleYouko on October 10, 2005, 01:54:51 PM
This is just for uploading and downloading files for use in organism sharing.

In case you haven't used this facility in DB, it allows the program to periodically log on to a database and download/upload a robot text file.

In the past this has been done by connecting a computer to an internet connection then accessing its IP address in order to move the files. Over the summer it was connected to a computer that Nums set up for this purpose. Previously Carlo had one set up in Italy but the connection was really slow.

What I would like to do is to set up a database such as SQL or maybe even Access which would sit on the Darwinbots (or some other) site and hold a bunch of DNA files. Each copy of the DB software would then act as a client and send queries to the database to either upload or download the text files. (Note: Downloading requires the removal of the file from the database or else one organism becomes many through a cloning process)

Make sense now?
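
The client/server exchange described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module in place of the SQL Server or Access database mentioned in the post, and the table name `organisms` and the functions `open_exchange`, `upload`, and `download` are hypothetical. The key point it shows is that a download removes the row, so one organism never becomes many.

```python
import sqlite3


def open_exchange(path=":memory:"):
    """Open (or create) the shared organism table. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS organisms ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, data BLOB NOT NULL)"
    )
    return conn


def upload(conn, name, data):
    """Store one organism file in the database."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO organisms (name, data) VALUES (?, ?)", (name, data)
        )


def download(conn):
    """Pick a random organism, delete it, and return (name, data).

    Returns None when the database is empty. Deleting the row on
    download is what prevents an organism from being cloned.
    """
    with conn:
        row = conn.execute(
            "SELECT id, name, data FROM organisms ORDER BY RANDOM() LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute("DELETE FROM organisms WHERE id = ?", (row[0],))
    return row[1], row[2]
```

With many clients hitting a real server at once, the select and the delete would need to happen atomically on the server side so that two clients cannot receive the same organism.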
Title: Distributed Programming
Post by: Ulciscor on October 10, 2005, 02:02:26 PM
That shouldn't be too tough at all. It could be set to check the response to see if the upload/download has worked.
Title: Distributed Programming
Post by: Numsgil on October 10, 2005, 02:22:51 PM
Good, set something up and I'll work out the Darwinbots programming details.
Title: Distributed Programming
Post by: Ulciscor on October 10, 2005, 02:26:39 PM
Access has a max field size for text so are you going to post the DNA as binary or a file?
Title: Distributed Programming
Post by: Numsgil on October 10, 2005, 02:30:10 PM
I think DB saves the organisms as .dbo (organism files).  This includes things like nrg, body, other bots (in a MB), etc.
Title: Distributed Programming
Post by: PurpleYouko on October 10, 2005, 03:32:48 PM
In an Access file you can have separate fields for all those things. The text file for the DNA can be split up into individual genes easily enough.
We should be able to send all this stuff in a single query.
Title: Distributed Programming
Post by: Numsgil on October 10, 2005, 04:08:57 PM
But remember you don't necessarily need the files stored online to be readable to humans, since the only thing making queries and downloads/uploads is the program.

Just food for thought.
Title: Distributed Programming
Post by: PurpleYouko on October 10, 2005, 04:47:49 PM
Good point. We could easily use a detokenized code based on the way the DNA is stored in the program. It would just need to be a stream of numbers.
The program can easily read it back into the right format.
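
A minimal sketch of the "stream of numbers" idea, assuming the DNA is already held in memory as a list of integers. The byte layout here (a 4-byte count followed by signed 32-bit little-endian values) and the names `encode_dna` and `decode_dna` are made up for illustration; this is not the actual .dbo format.

```python
import struct


def encode_dna(values):
    """Pack a list of integers into bytes: a count, then each value.

    Layout (hypothetical): unsigned 32-bit count, then `count`
    signed 32-bit integers, all little-endian.
    """
    return struct.pack(f"<I{len(values)}i", len(values), *values)


def decode_dna(blob):
    """Read the count back, then unpack exactly that many integers."""
    (count,) = struct.unpack_from("<I", blob, 0)
    return list(struct.unpack_from(f"<{count}i", blob, 4))
```

The result is not human-readable, which is fine here since only the program reads and writes it, and it fits in a single binary field.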
Title: Distributed Programming
Post by: Numsgil on October 10, 2005, 04:58:58 PM
Which is exactly what the .dbo files do.  Isn't it nice that Carlo thought ahead for us :P
Title: Distributed Programming
Post by: Ulciscor on October 11, 2005, 08:52:50 AM
So how is this going to work? One user's program uploads the .dbo of a bot and what other info? Does the upload get overwritten when the user sends again? How are the uploads sorted? Do they expire after a time?
Title: Distributed Programming
Post by: PurpleYouko on October 11, 2005, 09:24:10 AM
The uploads should stay on the server until they are downloaded again, then they get removed.

The old way just wrote the DBO files directly to a folder on the server PC then went in and chose a random file to download. After that it would wipe the file that it downloaded.

The database method should work the same way.

The server should end up being a repository of hundreds of DBO files at a time. Downloads and uploads are random.

The old way was really slow. I would like to see all the clients physically logged into the database whenever a sim is run with internet sharing enabled. Uploads and downloads would then be pretty much instantaneous so true multi-player sims may become possible.

Even better if we have different "channels" that can be selected in the DB program. These would point to different tables in the database.
Title: Distributed Programming
Post by: Numsgil on October 11, 2005, 10:32:22 AM
That's pretty much what I'm thinking.

I'm not sure that a 100% upload->download->delete cycle is best, since users can download some organisms and then just turn off their computer.

I can see some griefers messing with several weeks worth of work.

So maybe periodic saves are produced which can only be downloaded, that contain all the bots on the database at a specific time.
Title: Distributed Programming
Post by: PurpleYouko on October 11, 2005, 10:58:17 AM
Or another way to work it: a client uploads 1, then is only allowed to download 1 if at least a certain number (say 100) are present on the server.

We can also require that an upload happens before a download/deletion is able to occur.

A further option is instead of full deletion of the database entry, it could be transferred to a shadow database like the deleted messages in Outlook. That way a strain won't be lost forever if someone does as you say and turns off after taking the last one.

Anyone with a copy of Outlook or even Excel can set up a "copy" link to the database and read files out of it at any time. We just restrict writing and deleting to in-game activity.
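
The three safeguards above (upload before download, a minimum pool size, and a shadow copy instead of outright deletion) could be combined along these lines. This is a sketch only, again using Python's `sqlite3` as a stand-in for the real server; the table names, the per-client `credits` counter, and the `MIN_POOL` value are all assumptions.

```python
import sqlite3

MIN_POOL = 100  # hypothetical threshold: no downloads below this pool size


def open_pool():
    """Create the live table, the shadow table, and a per-client credit table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE organisms (id INTEGER PRIMARY KEY, data BLOB NOT NULL);
        CREATE TABLE shadow (id INTEGER PRIMARY KEY, data BLOB NOT NULL);
        CREATE TABLE credits (client TEXT PRIMARY KEY, balance INTEGER NOT NULL);
        """
    )
    return conn


def upload(conn, client, data):
    """Store an organism and give the uploading client one download credit."""
    with conn:
        conn.execute("INSERT INTO organisms (data) VALUES (?)", (data,))
        conn.execute(
            "INSERT OR IGNORE INTO credits (client, balance) VALUES (?, 0)",
            (client,),
        )
        conn.execute(
            "UPDATE credits SET balance = balance + 1 WHERE client = ?", (client,)
        )


def download(conn, client, min_pool=MIN_POOL):
    """Return one random organism, or None if the rules forbid a download."""
    with conn:
        row = conn.execute(
            "SELECT balance FROM credits WHERE client = ?", (client,)
        ).fetchone()
        if row is None or row[0] < 1:
            return None  # client has not uploaded anything to trade
        (count,) = conn.execute("SELECT COUNT(*) FROM organisms").fetchone()
        if count < min_pool:
            return None  # pool too small to risk losing a strain
        org_id, data = conn.execute(
            "SELECT id, data FROM organisms ORDER BY RANDOM() LIMIT 1"
        ).fetchone()
        # Keep a copy in the shadow table before removing the live entry.
        conn.execute("INSERT INTO shadow (data) VALUES (?)", (data,))
        conn.execute("DELETE FROM organisms WHERE id = ?", (org_id,))
        conn.execute(
            "UPDATE credits SET balance = balance - 1 WHERE client = ?", (client,)
        )
    return data
```

The shadow table plays the role of the "deleted messages" folder: nothing is ever truly lost, so a strain can be restored even if the last copy was taken by someone who then switched off.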
Title: Distributed Programming
Post by: Ulciscor on October 11, 2005, 12:22:08 PM
Hmm this might get awkward.
Title: Distributed Programming
Post by: PurpleYouko on October 11, 2005, 01:16:35 PM
Why might that be pray tell?

I do this kind of stuff with SQL servers at work all the time. Just never tried doing it over an internet connection.  :wacko:
Title: Distributed Programming
Post by: Numsgil on October 11, 2005, 01:36:34 PM
Would also like to point out the other sort of griefing:

Suppose you're trying to evolve a specific bot.  Say Animal Minimalis.  And some jerk/newbie unleashes One (with mutations disabled, mind you) into the database.

Doesn't take a genius to figure out what's going to happen.

Worse, what if some jerk copies One's DNA into a file called Animal Minimalis?  Then the program can't even do a quick check to determine if the fname is right.

I see periodic (say, daily) backups as the best solution.  If something goes wrong, then everyone just reverts to the backup.

I dunno, there are some problems there too.  It's sort of a huge population crash...

If anyone has any ideas for this, that would be swell.
Title: Distributed Programming
Post by: Botsareus on October 11, 2005, 01:45:59 PM
Well, if you are doing a distributed project, don't the starters select the users who would join them? Don't the existing users moderate new users?
Title: Distributed Programming
Post by: PurpleYouko on October 11, 2005, 01:54:55 PM
We aren't talking about a distributed project.

We are talking about run-of-the-mill Darwinbots programs uploading/downloading organisms to a centralized database.

Anybody who downloads and runs DarwinBots would by default have access to this feature in the software.
Title: Distributed Programming
Post by: PurpleYouko on October 11, 2005, 02:03:57 PM
We had that kind of trouble way back when I discovered tie feeding was possible in V2.1

I published the first version of Devincio Venator to the forum with a warning that he was highly prone to crashing the program by tie-feeding more than 5000 energy per turn (this has since been severely capped).

Within a few days I began to receive my robot from the server with somebody else's name on it.

As you say Num, some asshole is likely to take an innocuous name and fill it full of the deadliest DNA possible.

Possibly the client program can have built-in filters added which will clearly label different types of bot before uploading them. Then we can allow downloads to be filtered by these labels to prevent bringing something like "The One" into a delicate ecosystem.

There are certain bits of DNA code that are easily recognizable by the program so they could be automatically labelled this way.
Another possibility is to keep a constant efficiency rating with each bot file. A kind of profile based on highest feed rate, energy expenditure rate and so on. Then we send this label with it to the database.
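
The labelling idea might look something like this in outline. The rule table, the label names, and the marker tokens below are placeholders chosen for illustration; the real set of recognizable DNA fragments would have to come from the program itself.

```python
# Hypothetical rules: label -> DNA tokens whose presence triggers that label.
LABEL_RULES = {
    "tie-user": (".tie",),
    "shooter": (".shoot",),
}


def label_bot(dna_text):
    """Return the set of labels whose marker tokens appear in the DNA text."""
    tokens = set(dna_text.split())
    return {
        label
        for label, markers in LABEL_RULES.items()
        if any(marker in tokens for marker in markers)
    }


def allowed(labels, blocked):
    """A download is allowed only if the bot carries no blocked label."""
    return not (labels & blocked)
```

Because the client computes the labels from the DNA itself rather than trusting the file name, renaming a dangerous bot would not get it past the filter.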
Title: Distributed Programming
Post by: Numsgil on October 11, 2005, 02:12:35 PM
Perhaps what we need is a sort of metric:

Some function assigns a value for the distance between two robots' DNA.  Distance functions only have to fulfill the following:

distance from A to B equals distance from B to A
distance from A to A is 0
distance from A to C is <= distance from A to B + distance from B to C

So it doesn't have to really do anything at all with how we normally think of distance.  It's just a measure of the difference between two DNAs.

Then users set that they only want to download bots that are within a certain "distance" from some start bot, that is, within the "neighborhood" of some start bot.

The smaller the neighborhood, the less you're allowing dramatically different bots into your simulation, but the less likely you are to get some incredibly amazing bot whose DNA is dramatically different from its ancestor's.

Perhaps the neighborhood applies to the most populous bot in your simulation.  Or maybe it applies to the mean of the bots' DNAs.  Then you're just excluding incredibly unlikely major "jumps" in a bot's DNA.

I'd just have to think of how to construct such a metric.  Preferably it'd make the distance between two DNAs very small when, say, the DNA of one is just a rearrangement of the DNA of the other.
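
One candidate that satisfies the three properties listed above is a "bag of tokens" distance: count how often each DNA token occurs in each bot and add up the absolute differences. The sketch below assumes DNA is available as a list of tokens; `dna_distance` and `in_neighborhood` are hypothetical names, and this is only one possible choice, not a settled design.

```python
from collections import Counter


def dna_distance(a, b):
    """Sum of absolute differences in token counts between two DNAs.

    Symmetric, zero from a DNA to itself, and obeys the triangle
    inequality, because it is the L1 distance between count vectors.
    """
    count_a, count_b = Counter(a), Counter(b)
    return sum(
        abs(count_a[token] - count_b[token])
        for token in count_a.keys() | count_b.keys()
    )


def in_neighborhood(candidate, start, radius):
    """True if the candidate's DNA lies within `radius` of the start bot."""
    return dna_distance(candidate, start) <= radius
```

A pure rearrangement has distance 0 under this measure, which meets the "very small" goal. The price is that it cannot tell two different orderings apart at all, so an order-sensitive measure such as edit distance might be needed if gene order matters.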
Title: Distributed Programming
Post by: Ulciscor on October 11, 2005, 02:17:03 PM
Quote
Why might that be pray tell?
Well here's the reason.
Title: Distributed Programming
Post by: PurpleYouko on October 11, 2005, 02:41:27 PM
OH Crap!!!!

Ulc has been possessed by Bots  :huh:
Title: Distributed Programming
Post by: Ulciscor on October 11, 2005, 02:47:15 PM
What do you think about possessed?

Lol just kidding. I meant that the complicated bit is the filtering of bots and when to delete entries.
Title: Distributed Programming
Post by: Numsgil on October 11, 2005, 02:47:56 PM
Yeah PY, isn't that obvious from the post?