some notes on distribution
Work in progress here... Presumably the starting point is to work with what we have. But I'm listing alternatives here just so I have them all to hand.

We're a windows-only shop, right? Shame, because a lot of the free stuff out there is unix-oriented. Question: are we going to be a big old compute cluster, or is all this stuff going to be shared across the org later on in a more grid-like fashion?

application-level tools
The following is a rough taxonomy, and lines are blurred - for instance, the GrADS project is attempting to develop a compiler to generate Grid-aware code that will select appropriate execution models at runtime.
programming models
application execution environments
middleware of potential interest
timeline
1988: Condor starts.
1991: PVM version 2.0 released.
1993: PVM version 3.0 released.
1996: Globus starts. It eventually emerges as a de facto standard for Grid middleware.
1999: PVM version 3.4 released.
2000: Globus Toolkit 1 released.
2001: The Global Grid Forum (GGF) is formed, and eventually becomes a focal point for standardisation via the Open Grid Services Architecture (OGSA). Globus is co-operating with the OGSA initiative.
2002: Globus Toolkit 2 released.
2003: Globus Toolkit 3 released.
2004: PVM version 3.4.5 released.
2005: Globus Toolkit 4 released.

As far as major vendors go:
too many standards!
architecture stuff
work
tools to sort out sooner rather than later
Software:
Bring:
basic requirements
The first thing for me to understand is the problem(s) we are trying to solve, together with relevant context in the organization. Stuff includes, but is not limited to:
operating systems, getting updates out
Remote Desktop is not enough. We need to be able to upgrade everything across '000s of machines, no matter what kind of state they're in. This has got to be doable no matter what's running (or not running). The less of a custom solution here, the better - this surely must be something that's been solved elsewhere, and solved well. Can we reboot over the net off a stored image, and just switch the images - or something equivalently easy? See Leveraging HTC For UK eScience with Very Large Condor Pools, though we'll need something (a) larger and (b) more dedicated.

topology, scalability, etc
Do we want stuff on subnets in pools ("farmlets")? Probably. Event logging and job packages would then best be routed via the pool topology. How big should a pool be? Depends on your I/O. Here's the Winsock FAQ Page on I/O Strategies. They recommend Overlapped I/O for maximum throughput (Winsock 2 only, not that that should be a problem). There's a code sample here that aims for 2000+ clients with only 4 threads on a dual processor box. That's more like it!

As far as ACE goes, there is an article on artima that contrasts some possible approaches and recommends the ACE Proactor, since that's the one that can use Windows overlapped I/O (the article itself suggests an approach for writing a portable solution that can bridge the reactor and proactor approaches).

Further reading:
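
Going back to the few-threads, many-clients point above: as a rough feel for the completion-based style, here is a minimal Python sketch in which one thread serves a batch of concurrent clients through an event loop. It is a stand-in for illustration only, not the Winsock sample or the ACE Proactor; the names `handle_client`, `one_client` and `demo` are made up for this note. (On Windows, Python 3.8+ asyncio defaults to a proactor event loop built on I/O completion ports, so it is at least the same family of technique.)

```python
import asyncio


async def handle_client(reader, writer):
    # Echo each line back until the client disconnects (readline returns b"").
    while data := await reader.readline():
        writer.write(data)
        await writer.drain()
    writer.close()
    await writer.wait_closed()


async def one_client(port, i):
    # Hypothetical "worker": send one line, check that it comes back unchanged.
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    msg = f"hello from worker {i}\n".encode()
    writer.write(msg)
    await writer.drain()
    reply = await reader.readline()
    writer.close()
    await writer.wait_closed()
    return reply == msg


async def run_demo(n_clients):
    # Port 0 lets the OS pick a free port on the loopback interface.
    server = await asyncio.start_server(handle_client, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        results = await asyncio.gather(
            *(one_client(port, i) for i in range(n_clients))
        )
    return sum(results)  # number of clients that got a correct echo


def demo(n_clients=50):
    return asyncio.run(run_demo(n_clients))


if __name__ == "__main__":
    print(demo(), "clients served from a single thread")
```

This says nothing about real throughput on our hardware; it only shows that the number of connections need not be tied to the number of threads, which is the property that matters when deciding how big a pool can get.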
job allocation and pool boundaries
Are we going to allocate jobs across pool boundaries, with some newly spawned process in charge of checking for completion? We'll get fragmentation problems a la heaps if so.

distributed event logging
Good place to start is the ACE Logging Service, which is documented on the ACE Network Services page. Do we need anything more than that? Can our topology handle whatever we want, or do we need to find another way to treat event messages - TCP? UDP? Tibco?

deploying test job binaries/support files
It needs to be fast and simple to deploy test job binaries across many machines so we can check things out. This is as distinct from the real low-level stuff we have running all the time. Otherwise it'll be too hard for us to test different approaches.

high-speed networking
Getting technical. How much effort to put into this one up front? Or will it be a subject for experimentation? See the Supercomputing 2005 Bandwidth Challenge Results for a snapshot of current work in this area.

commercial price points
notes on various tools
Worth spending some time on the Windows Cluster Resource Centre at Southampton Uni.

Sadly, Globus appears to be java or unix-c only (globus toolkit 4.0.1) - though apparently there was a WinXP port attempt in 2004. The java "cog" (commodity grid) kit is probably the best option (download here).
Condor has been around for ages, has thousands of users, and the originators have been running it on 1500+ workstations for years. Supported on WinXP/2000. However, Globus and PVM support isn't yet implemented on Windows (see here), and it's not open source (although it does have its own public license). Also, the job submission is coarse-grained (executable and all data for every job).
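
To get a feel for what that coarse-grained submission could cost, here is a back-of-envelope sketch in Python. It assumes every job ships its own copy of the executable and all of its data, as described above; the job count, file sizes and link speed are made-up illustrative numbers, and `staging_seconds` is a hypothetical helper, not anything from Condor.

```python
def staging_seconds(n_jobs, exe_mb, data_mb, link_mbit_per_s):
    """Rough time spent just shipping files if every job carries
    its own copy of the executable and all of its data.

    All inputs are illustrative assumptions, not Condor measurements.
    """
    total_mb = n_jobs * (exe_mb + data_mb)   # nothing is shared or cached
    link_mb_per_s = link_mbit_per_s / 8.0    # megabits/s -> megabytes/s
    return total_mb / link_mb_per_s


if __name__ == "__main__":
    # Hypothetical workload: 10,000 jobs, 5 MB executable and 20 MB data each,
    # all pushed through a single 100 Mbit/s link from the submit machine.
    secs = staging_seconds(10_000, 5, 20, 100)
    print(f"{secs / 3600:.1f} hours of pure file transfer")
```

If the real numbers look anything like this, it would argue for keeping the executable resident on the workers and shipping only per-job inputs, but that needs measuring rather than guessing.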
bsp is not a hive of industry, seems pretty dead compared to the PVM-MPI coalition.
pvm is worth checking out - however, we need extra stuff to kick off processes on remote machines: winrshd, ataman rshd or similar. Not expensive, e.g. winrshd is only 3,500 USD for a sitewide license.
mpich2 is also worth a look. Need to try it on a .net-enabled site; don't know for sure how it creates the daemons, but one assumes an rsh-alike is required for this also. Note that the MPI site also contains mpich-g2, a globus2-enabled implementation of mpi.
Then there are commercial routes, most obviously DataSynapse.
Also, Microsoft's Windows Compute Cluster Server 2003: Beta 2, for the brave/foolhardy. 64-bit only, don't know if that's likely to be a problem.