Re: NEWB: How to get started on writing new client?

From: Andrea Arcangeli <andrea_X_at_X_cpushare.com>
Date: Tue, 27 Nov 2007 17:47:12 +0100
To: Nic Watson <cpushare_X_at_X_nicwatson.org>
Cc: Alex Fihman <alex_fihman_X_at_X_mail.ru>, Ed Suominen <general_X_at_X_eepatents.com>, Gregory Maxwell <gmaxwell_X_at_X_gmail.com>, cpushare-discuss_X_at_X_cpushare.com
On Tue, Nov 27, 2007 at 12:19:17AM -0500, Nic Watson wrote:
> Along those lines, I hit across another solution by pure serendipity
> this weekend:
> <http://vde.sourceforge.net/>
> "VDE is an ethernet compliant virtual network that can be spawned over a
> set of physical computer over the Internet."

Yes, I also noticed this could be of help, I need to look into it.

btw, the project leader is my former operating systems professor; he's
a few km away from my place and I know him well.

However, over the last few days I took inspiration from the above and
thought that, in the long run, I could use the tun model and create my
own p2p routing infrastructure: an encrypted virtual network (perhaps
using twisted) with udp packets as transport. It would be a kind of
openvpn, but without requiring all traffic to pass through the buyer
computer as openvpn does, and with plain ssl encryption, with
attestation provided by the CPUShare servers, all originating from the
ssl private key of the cpushare protocol. I wonder if VDE already
allows something like this; perhaps I could modify it later to achieve
this. In the meantime I could start with VDE if it at least solves the
tcp over tcp problem that "-net socket" has. TCP over TCP is really
bad; all packet traffic should be udp (or without a virtual interface
at all).
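
A minimal sketch of the encapsulation idea above: each frame read from
the virtual interface travels as one self-contained UDP datagram, so a
lost packet is simply lost and the guest's own TCP takes care of
retransmission, instead of a second TCP layer underneath doing the same
job. The 8-byte header layout, the MAGIC value and the function names
are hypothetical; they are not part of the CPUShare protocol or of VDE,
and encryption is left out entirely.

```python
import struct

# Hypothetical header: 2-byte magic, 2-byte payload length, 4-byte virtual node id.
HEADER = struct.Struct("!HHI")
MAGIC = 0xC5A1  # arbitrary marker chosen for this sketch


def encapsulate(dst_node_id: int, frame: bytes) -> bytes:
    """Wrap one frame from the virtual interface into one UDP payload."""
    if len(frame) > 0xFFFF:
        raise ValueError("frame too large for the 16-bit length field")
    return HEADER.pack(MAGIC, len(frame), dst_node_id) + frame


def decapsulate(datagram: bytes) -> tuple[int, bytes]:
    """Return (dst_node_id, frame), rejecting malformed datagrams."""
    if len(datagram) < HEADER.size:
        raise ValueError("datagram shorter than header")
    magic, length, dst_node_id = HEADER.unpack_from(datagram)
    frame = datagram[HEADER.size:]
    if magic != MAGIC or length != len(frame):
        raise ValueError("bad magic or length mismatch")
    return dst_node_id, frame
```

The result of encapsulate() would be handed to a UDP socket's sendto();
nothing here retries or reorders, which is the point.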

The idea is that if any of the sellers has incoming ports open, my p2p
KVM tun virtual CPUShare network would be able to connect two sellers
together like a switch (not like a hub that sits on the buyer computer
and requires all traffic to be routed through the buyer node). This
may help increase performance for some applications and it would be
transparent to the application. The application just pings the "ip" of
another KVM virtual machine, and my p2p protocol decides whether to
send the udp packet (encapsulating the icmp packet) directly to the
destination IP's open port, or to proxy it through the buyer node. Then
I can add a switch to the buy order to filter away sellers that can't
accept incoming udp packets; that way one seller can create a fully
interconnected network over the internet, optimal for neural
network/AI.
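
The forwarding decision described above could be sketched roughly as
follows. The Peer record, its field names and both function names are
assumptions made for illustration only; the real protocol would also
need to discover which sellers are reachable in the first place.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Peer:
    virtual_ip: str               # address the guest sees on the virtual network
    public_addr: Optional[tuple]  # (ip, udp_port) if incoming UDP works, else None


def next_hop(dst: Peer, buyer_addr: tuple) -> tuple:
    """Send directly when the destination accepts incoming UDP, else relay via the buyer."""
    return dst.public_addr if dst.public_addr is not None else buyer_addr


def fully_connectable(sellers: list) -> list:
    """Buy-order filter: keep only sellers that can accept incoming UDP."""
    return [s for s in sellers if s.public_addr is not None]
```

With the filter applied, every pair of sellers can talk directly and
the buyer node never has to relay guest-to-guest traffic.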

In the long term this sounds like the best design to maximize network
performance.

The major trouble in all this, compared to using the "-net user"
model, is that to open a tun device (something that even VDE requires)
I need root privilege. At the moment the CPUShare client requires no
root privilege anywhere. OTOH opening /dev/kvm will also require some
special privilege (even if not necessarily root), so it'll be more
tricky than seccomp from a packaging point of view. But frankly my
only priority now is to make life as easy as possible for the
buyer. I didn't make that the top priority initially because I
tried to build the most secure thing possible, but it was too hard to
use. I don't regret starting with seccomp because if something goes
wrong later, I know I tried the right thing first and I verified that
it doesn't work for most buyers, so the secure thing is effectively
useless.

> One issue is that, if CPUShare initial images are anything like EC2's, a
> 300MB image is not out of the ordinary.  That would take over an hour
> for my meager DSL connection to download before I even get started on
> actual work.

The buyer will shoot himself in the foot if he doesn't shrink the
image, 300M is a lot. Keep in mind the current CPUShare livecd is 6M
and it has a full python interpreter + twisted + openssl inside
it. But the upload time isn't accounted in the transaction time, and
then the image is cached on disk. So it may work even at 300M size
(that would be a kind of windows client/server software). I really
want windows to run inside CPUShare too; all liability risk for the
license goes to the buyer as specified in the user agreement, but
technically it has to run, as well as freebsd etc... A much bigger
problem is the livecd running on tmpfs, where the size of the iso will
eat into the ram of the system... I'll have to subtract the iso size
from the "ram" parameter of the sell order in such a case.
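
To put rough numbers on the two costs above (transfer time and RAM
eaten by an iso held in tmpfs), here is a small sketch. The 512 kbit/s
link speed and the 1024 MB sell order are made-up example values, and
the function names are hypothetical.

```python
def transfer_seconds(image_mb: float, link_kbit_s: float) -> float:
    """Time to move an image over a link (1 MB = 8000 kbit, no protocol overhead)."""
    return image_mb * 8000 / link_kbit_s


def ram_left_for_guest(sell_order_ram_mb: int, iso_mb: int) -> int:
    """RAM remaining once an iso held in tmpfs is subtracted from the sell order's "ram"."""
    if iso_mb >= sell_order_ram_mb:
        raise ValueError("iso does not fit in the offered RAM")
    return sell_order_ram_mb - iso_mb


print(transfer_seconds(300, 512) / 60)  # 300M image: about 78 minutes
print(transfer_seconds(6, 512))         # 6M livecd: about 94 seconds
print(ram_left_for_guest(1024, 300))    # 724 MB left for the workload
```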

> Yep.  This has an interesting exception that if all CPUShare images are
> LiveCDs as .isos, there is no hard drive to emulate, and no snapshotting
> to worry about.  This has the advantage of preventing a hard drive DoS
> attack, for example, by the image setting the entire hard drive as swap
> space and writing back and forth from swap a lot.

They're likely all going to be readonly livecds with temporary local
storage in tmpfs and persistent storage in nfs mounted from the buyer
node. nfs will be ok if a node goes down abruptly.

> Well, you're more convinced than I am that virtualization is the right
> answer, and I'm a potential (if not large) buyer.

You're not the only one asking for pure virtualization and wondering
how to modify your application. You're a very skilled potential buyer,
but I'd like anyone to be able to run his application without having
to be a programmer, without having to change language etc... I know
other buyers walked away when they heard about seccomp, and every one
of those buyers could have contributed to making CPUShare successful.

> I think one of the issues is trying to figure out what a typical buyer
> wants in a utility computing environment.  You need a killer app to get
> buyers to start using CPUShare.  For example, getting distcc running

Agreed. I think the killer app, once ported to seccomp, will work
_best_ with seccomp. In fact I can add seccomp back later into the
system so the killer app can be implemented as efficiently as possible,
but it's now quite clear to me that I can't successfully start up
CPUShare by requiring people to modify a single line of their cluster
software. And I now feel that the killer app will materialize a lot
_after_ I successfully started seccomp. I initially hoped I could
depend on the killer app to materialize some day and to add KVM
_after_ CPUShare was successfully started, but it's quite clear I got
the priorities wrong.

> would have non-programmers (especially Gentoo users) going as buyers.
> You'd have to bill in minute increments, but I think lots of people
> would spend money, or at least trade CPU time, to speed up their builds.

I'm not entirely sure that gentoo users are the ideal users of this,
I'm unsure if they should trust bytecode coming from seller nodes
without at least compiling it twice. But the good thing is that with
the KVM model, even gentoo users would be able to use it in theory if
they bother to create the livecd that matches their current compiler
binaries. The idea is that creating an .iso image with gcc and distcc
client inside, doesn't scare people as much as having to change gcc to
only call read/write syscalls.

The model used for the KVM networking will affect this a lot though:
for something like gentoo to work, the distcc server should be
configured to work on the virtual interface, and the virtual interface
should be available live on the buyer host (not in any buyer guest),
so not like "-net socket" where the virtual interface is only
available in the guest running in the buyer node. Well, the guest
could always route traffic from the "-net socket" interface to the
host, using a second interface visible to the host. But it'd be way too
tricky to start a guest just for "emerge" to run faster. It'd be tricky
enough already to run "cpushare start/stop" before/after starting
emerge. For maximum caching, the livecd would better be shared for
each gentoo gcc release.

> Actually getting distcc going would succeed in achieving my minimum
> requirements:  handling stdin, stdout, stderr, logging to a file, and
> sending data to a single IP address.  Also, billing by minute increments
> and filtering sellers by IP latency would be nice-to-haves that distcc
> and my application would benefit from.

Filtering sellers by IP latency is already available. It'll be a less
reliable measurement later because of the p2p behavior though. But
really slow links can be filtered out trivially with that.

You want to buy in minute units, that's going to be tricky, the book
will change every minute so you'll get a lot of disconnects and I'd at
least need to avoid disconnecting a client if you buy from the same
client again.

If anything, I'd need to charge you only the percentage of the hour;
for now the accounting is so much simpler if I never go below the
finest granularity allowed by paypal. It'd be an implementation
nightmare to start accounting with euro granularity lower than
0.01. That has to be deferred. I won't need a single change to all
accounting, worldwide invoicing code, website etc... all I have to
change is in the cpushare protocol server side, and in the client.
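
A sketch of what "charge only the percentage of the hour" could look
like while staying at 0.01 euro granularity. The hourly prices and the
half-up rounding rule are assumptions for illustration, not the actual
CPUShare accounting rules.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")  # smallest unit the accounting handles, per the text above


def prorated_charge(hourly_price: Decimal, minutes_used: int) -> Decimal:
    """Charge the used fraction of an hour, rounded to whole cents."""
    if not 0 <= minutes_used <= 60:
        raise ValueError("minutes_used must be within one hour")
    exact = hourly_price * Decimal(minutes_used) / Decimal(60)
    return exact.quantize(CENT, rounding=ROUND_HALF_UP)


print(prorated_charge(Decimal("0.10"), 15))  # 0.03
print(prorated_charge(Decimal("0.10"), 1))   # 0.00
```

The second call shows why per-minute billing is awkward at this
granularity: at a low hourly price, a single minute rounds to zero.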

> I agree with your analysis, except the part that qemu is too slow for
> anything real.  For guest user space code, and a non-wacked guest OS,
> there's no overhead for pure algorithmic/no-system call code.  I think
> most buyers are going to have that kind of code.  It would be an
> interesting exercise to benchmark Jack the Ripper under QEMU.

You mean with kqemu, right? With pure qemu JtR would be 10 times slower
or more, dyngen is too slow not only in kernel space. Yes, with kqemu
JtR would have nearly native performance because the whole point of
kqemu is to avoid running dyngen on userland. I'm just not sure if it's
worth supporting the old systems that lack native virtualization
support. I'm also feeling a bit safer with real shadow pagetables,
real tlb etc... in terms of memory protection, instead of software
tweaks.

> I don't believe that's true.  While all the targets that KVM runs on
> support 64-bit, most of the time, folks are running 32-bit OSes on them.
> 
> See this for KVM support:
> 
> <http://kvm.qumranet.com/kvmwiki/Guest_Support_Status>

It'll be entirely up to the buyer whether to create a 32bit or 64bit
livecd to boot in CPUShare. As long as the application binary and
application OS are x86 (be it 32bit or 64bit) they'll always run fine,
requiring no change to the binary or OS itself. That's a design
requirement for CPUShare by now.

But the seller host will have to be an x86-64 64bit OS running on a
virtualization enabled CPU.

> I think this is a serious limitation.  I assume that most sellers are
> Windows users running CPUShare in a VM while using their computer.
> Requiring dedicated hardware for CPUShare, at least at first, IMHO is
> asking a lot.  It would suck to make the buyers happy and drive off all
> the sellers. :)  Now, this might not be a permanent issue since I hear

I think here it's really only a matter of making the buyers happy. The
moment lots of buy orders show up and stay up, people will buy new
machines to run faster with KVM, switch to linux, or figure out a way
to boot the livecd in some windows virtualization. Or they'll hack the
livecd themselves to add the kqemu module ;)

I think there will always be enough supply the moment they start making
a profit by running CPUShare; it's just that today they sit on the
sidelines because there are no buyers.

> that Virtual Box supports providing VT capability to guest OSes, and of
> course the Blue Pill hypervisor supports the same thing:
> 
> <http://www.darkreading.com/document.asp?doc_id=130663>

Then of course the KVM livecd will run too.

The only way out of this would be to fall back to kqemu if KVM fails to
load. I'll make that change the day the buy orders eat into the whole
supply available and the idle mflops of CPUShare reach 0 ;)

Well, in practice this is going to be a two-line change to add kqemu
to the livecd and fall back to it.
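
The fallback logic could be sketched like this. The function names and
the returned labels are hypothetical; the only real-world detail relied
on is that Linux lists "vmx" (Intel) or "svm" (AMD) in the flags line
of /proc/cpuinfo on virtualization-enabled CPUs.

```python
def has_hw_virt(cpuinfo_text: str) -> bool:
    """True if any 'flags' line of /proc/cpuinfo lists vmx (Intel) or svm (AMD)."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            _, _, rest = line.partition(":")
            flags = rest.split()
            if "vmx" in flags or "svm" in flags:
                return True
    return False


def pick_accelerator(cpuinfo_text: str, kvm_usable: bool, kqemu_loaded: bool) -> str:
    """Prefer KVM; fall back to kqemu; otherwise report that nothing is available."""
    if has_hw_virt(cpuinfo_text) and kvm_usable:
        return "kvm"
    if kqemu_loaded:
        return "kqemu"
    return "none"
```

On a real host, kvm_usable could come from
os.access("/dev/kvm", os.R_OK | os.W_OK); how to detect a loaded kqemu
module is left open here.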

But really, if CPUShare grows, somebody (not me) should write a real
windows client for CPUShare; continuing to boot livecds with
virtualization inside virtualization isn't too nice even if feasible
in theory.

I think I can defer the kqemu or not-kqemu issue for later. It's a
small change in implementation terms to support kqemu.

> I agree that it will be much easier to measure LiveCDs.  You'll have to
> insert your own process before the kernel load in the boot chain,
> though.  You won't be able to trust a measure made after kernel boot.

There was a knoppix livecd at some OLS presentation that implemented
that, so I think I can share whatever they did since I expect this to
be open source.

> > In short the only downside compared to a real cluster will be that the
> > seller nodes will go down at any time requiring fault tolerance in the
> > application managing the cluster and running in the buyer hardware,
> > and second that there isn't any local storage.
> 
> These are both reasonable assumptions to accept in grid computing.

I hope so! In the end, that's the only difference the buyer
application will see between a real cluster and a CPUShare cluster.

At least if an app can't deal with it, modifying it to deal with it
won't be a cpushare specific change at all, and it will benefit the
project at large.
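
As an illustration of the kind of fault tolerance meant here, a minimal
sketch of a buyer-side loop that requeues work when a seller node
disappears. The execute callback, the use of ConnectionError to signal
a lost node, and the retry limit are all assumptions for the example.

```python
from collections import deque


def run_with_requeue(tasks, nodes, execute, max_attempts=3):
    """Run each task on some node; if the node fails, requeue the task elsewhere.

    `execute(node, task)` is a hypothetical callback that returns a result
    or raises ConnectionError when the seller node has gone away.
    """
    pending = deque((task, 0) for task in tasks)
    alive = deque(nodes)
    results = {}
    while pending:
        if not alive:
            raise RuntimeError("no seller nodes left")
        task, attempts = pending.popleft()
        node = alive[0]
        alive.rotate(-1)          # simple round-robin over live nodes
        try:
            results[task] = execute(node, task)
        except ConnectionError:
            alive.remove(node)    # treat the node as gone for good
            if attempts + 1 >= max_attempts:
                raise
            pending.append((task, attempts + 1))
    return results
```

Nothing in this loop is specific to CPUShare, which is the point made
above: an application that handles node loss this way benefits on any
cluster.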

> Well, I think I'm capable of placing a small order if I figure out how
> to get a small example going.

Thanks! The only question is how hard it is to port your app to
seccomp. Because I didn't implement the malloc/free emulation for
seccomp, nor the filesystem open/read/write/close that would forward
the I/O to the buyer-storage, it may be hard. If I had done that, it
would be less hard. I thought the community could take care of this
because it's entirely a client issue that the buyer himself can solve,
and I still think it will happen, but only after I start up CPUShare.

If you manage to port your app via the seccomp route, then be sure I
won't drop the seccomp protocol from the server.

Passing through seccomp was a required step; making CPUShare fully
feature complete (at least in theory) was a great milestone for the
project and a great proof of concept. Computing performance is already
impressive with the current model. I've physically proven CPUShare can
be useful. Now we need to make it easier for people to use. Rome
wasn't built in a day.

Thanks!
-- 
cpushare-discuss_X_at_X_cpushare.com mailing list - http://www.cpushare.com/
To unsubscribe, send mail to cpushare-discuss-unsubscribe_X_at_X_cpushare.com
Received on 2007-11-27 17:47:14
