Distributed Systems

(A very basic introduction, with real-world examples)

A presentation given to CA1 class for Topics in Computing module, 8th November 2002, @ 10am-12. Send any feedback, suggestions, questions, flames etc. to me at kpodesta(at)computing(dot)dcu(dot)ie

Dr. Martin Crane - webpage | PowerPoint slides
Karl Podesta - webpage | PowerPoint Slides



There are over 450 desktop computers in the School of Computing. 90% of the time, these machines are doing nothing. If you think about it, the vast majority of the machines, when they are used at all, are used mostly for word processing, web browsing, downloading, etc. Not exactly stuff which grinds the processors. And at night time none of them are used at all! This means that there is a huge amount of processor power going to waste in the school.

If this is the case inside the School of Computing in DCU, what about outside it? Across DCU, or Dublin, or the rest of the world? There is undeniably a huge amount of unharnessed power to be taken advantage of. At the same time there are loads of really tough problems which ordinary computers can't solve on their own. A lot of the more popular Distributed Systems work on these two premises - systems like SETI@home or RC5 split up difficult tasks and tough problems(!) to run on home PCs during screensaver time or when the processor is idle, and then collate all the information they receive back.

Distributed Systems are traditionally based on the premise that many hands make light work.

Types of Distributed Systems

Clusters

Groups of PCs (ordinary or specialised) brought together specifically to work collectively on problems.

Grids

Clusters can be combined to form a "Grid", a system of massive collective computing power which is designed to be easily used by "plugging in" to it.

Peer 2 Peer

A system whereby individual users or nodes can communicate directly with each other. An example of such a system would be Napster.

Others

The WWW is a distributed system! (of information). Strictly speaking it is mostly client-server (browsers talking to web servers) rather than peer to peer, but it is worth mentioning separately as a good example. CORBA can be used to create a distributed system of programming objects, almost like a distributed developer system.



The TeraGrid


Watch the TeraGrid Flyby (24.5 Meg download, MPEG)

The TeraGrid is a system being built in the US which will bring together the country's most powerful supercomputers into one big "Grid" computer. Kind of like the ESB's national power grid. Only for computing rather than electricity, and much bigger... and not in Ireland... and uh, totally way better.

At the moment there are 5 different sites involved (ANL, NCSA, Caltech, SDSC, and Pittsburgh), but they plan to have as many as possible.

The GRID idea is not new; there are many GRID projects around the world which combine supercomputers and clusters across different sites. The focus has mostly been on computation though. The TeraGrid wants to go a step further and include many other computing resources, like sensors, visualisation equipment, etc. In essence they are not just trying to crunch calculations, they want to do better science & research on the whole, using computers.

So this is a different kind of distributed system - the "work" that it is doing is more detailed than simple number crunching, but it is still a distributed system because the work that the system does is distributed across different sites and machines. (Sounds a bit obvious, but the distinction is there to be made.)

What are Distributed Systems used for?

Cracking Security - Competitions are set every so often by security vendors like RSA to try and decrypt various messages in the shortest amount of time, with a large cash prize for the winner. There is usually a "key" or keys which unlock the message and allow you to read it, and the work consists of trying literally billions of different keys to find out which one is the right one. Distributed.net have just completed the RC5-64 challenge, which had to search through (at most) 2^64 (18,446,744,073,709,551,616) keys to find the one that properly decrypted the message! People from all over the internet offered their spare CPU cycles to crack the code, and some teamed up in groups to try and give themselves a better chance of being the ones to find it. Learn more about RC5 & distributed.net >>
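As a rough illustration of how a key search like this can be carved up, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the cipher is a toy 16-bit XOR (it is not RC5), the names key_blocks, search_block and crack are made up, and the start of the plaintext is assumed to be known so that a candidate key can be checked. The blocks are searched one after another in a single process here, where a real project would hand each block to a different volunteer machine.

```python
from typing import Iterator, Optional, Tuple

KEY_BITS = 16  # toy keyspace of 2**16 keys; the real challenge had up to 2**64


def toy_encrypt(data: bytes, key: int) -> bytes:
    """XOR every byte with one of the key's two bytes. A toy cipher, NOT RC5."""
    key_bytes = key.to_bytes(KEY_BITS // 8, "big")
    return bytes(b ^ key_bytes[i % len(key_bytes)] for i, b in enumerate(data))


# XOR is its own inverse, so decryption is the same operation.
toy_decrypt = toy_encrypt


def key_blocks(total_keys: int, block_size: int) -> Iterator[Tuple[int, int]]:
    """Split the keyspace into (start, stop) blocks, one per work package."""
    for start in range(0, total_keys, block_size):
        yield start, min(start + block_size, total_keys)


def search_block(ciphertext: bytes, known_prefix: bytes,
                 start: int, stop: int) -> Optional[int]:
    """What one volunteer machine would do: try every key in its block."""
    head = ciphertext[:len(known_prefix)]
    for key in range(start, stop):
        if toy_decrypt(head, key) == known_prefix:
            return key
    return None


def crack(ciphertext: bytes, known_prefix: bytes,
          block_size: int = 4096) -> Optional[int]:
    """Coordinator: hand out blocks until one of them reports the key."""
    for start, stop in key_blocks(2 ** KEY_BITS, block_size):
        found = search_block(ciphertext, known_prefix, start, stop)
        if found is not None:
            return found
    return None


if __name__ == "__main__":
    secret_key = 0xBEEF
    message = toy_encrypt(b"The unknown message is: hello", secret_key)
    print(hex(crack(message, b"The unknown")))
```

The point of the sketch is that no block depends on any other, which is what makes this kind of problem so easy to spread over thousands of unrelated home PCs.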

Particle Physics - The largest particle accelerator in the world is at CERN, in Geneva. Here, physicists from all around the world study the beginnings of the universe, by jamming particles into each other at dizzyingly fast speeds and observing what happens. The LHC (Large Hadron Collider) is their newest accelerator, currently under construction, and will produce a tidal wave of information which needs to be processed and analysed. Over 10 million Gigabytes of data every year! (that's about 20 million CD-ROMs). As a result, CERN will be one of the biggest users of a distributed system anywhere in the world and is looking at GRID technology (a distributed system!) as the only way of doing its work. Learn more about LHC Grid computing @ CERN >>


A cluster at CERN, Geneva

Bioinformatics - The Human Genome Project has the job of identifying all the genes in human DNA (approx 30,000) and determining the sequences of the 3 billion chemical base pairs that make up human DNA - the explosion of data that this project has caused is massive! Distributed systems are needed both to store the data (in specialised databases) and to process it in all its tedium to try and find the needed information. Within the broader Life Sciences too, distributed computing is being applied more heavily: to find cures for diseases amongst the more than 1 million proteins which regulate bodily function, and even to study these proteins in more detail by simulating changes in their structures (which is called protein folding). Learn more about the Human Genome Project >>

Weather, Climate, & Geography - Linux clusters are often at the heart of ordinary weather forecasts; a job which needs to process satellite data, examine and match up patterns, and produce forecasts based on these patterns for anything from a few hours to years into the future. The Microwave Limb Sounder (MLS) team at NASA's Jet Propulsion Laboratory (an MLS basically detects naturally occurring microwaves in the Earth's upper atmosphere) is hooking into the TeraGrid to use its resources in the battle against global warming, which means examining weather data over the long term. Learn more about Weather Forecasting and IBM >>

Economics & Finance - Analysis & statistics have always been a big part of helping to make economic & financial decisions, examining portfolios, risk management, money, and even the stock markets - there is a huge base of financial & economic data which is starting to be examined by clusters & distributed systems.

Visualisation & Graphics - The film Toy Story (the first one) used 117 Sun SparcStations to draw (or "render") all of the 114,000 frames which made up the 77-minute long film. One computer on its own would have taken 43 years of non-stop computing to do this! The process of computer rendering takes objects in a scene and simulates how light will hit those objects. It can then calculate what colours etc. to use in drawing the picture. There are all sorts of physicsy models and stuff involved, so this is quite a lengthy process, as the Toy Story example shows. Ordinary desktop computers can render stuff pretty fast nowadays, but as more complicated effects are brought into play, the computer takes longer to model these. Hence the need for some sort of distributed computer to divide up the work. Learn more about Toy Story's system >>
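To put the Toy Story figures in perspective, here is a small back-of-the-envelope calculation in Python. It is only a sketch under some assumptions: every frame is taken to cost the same amount of time, the 117 machines are taken to share the work perfectly with no overhead, and the round-robin way of dealing out frames (frames_for_machine) is a hypothetical scheme, not how the real render farm was scheduled.

```python
MACHINES = 117             # Sun SparcStations in the render farm
FRAMES = 114_000           # frames in the film
SINGLE_MACHINE_YEARS = 43  # quoted time for one computer working alone
DAYS_PER_YEAR = 365.25


def hours_per_frame() -> float:
    """Average rendering time per frame on one machine, in hours."""
    return SINGLE_MACHINE_YEARS * DAYS_PER_YEAR * 24 / FRAMES


def ideal_farm_days(machines: int = MACHINES) -> float:
    """Best-case wall-clock time in days if the work divides perfectly."""
    return SINGLE_MACHINE_YEARS * DAYS_PER_YEAR / machines


def frames_for_machine(machine_id: int, machines: int = MACHINES,
                       frames: int = FRAMES) -> range:
    """Hypothetical round-robin split: machine 0 gets frames 0, 117, 234, ..."""
    return range(machine_id, frames, machines)


if __name__ == "__main__":
    print(f"{hours_per_frame():.1f} hours per frame on one machine")
    print(f"{ideal_farm_days():.0f} days for {MACHINES} machines, at best")
    print(f"machine 0 renders {len(frames_for_machine(0))} frames")
```

Under these assumptions each frame works out at a little over 3 hours of computing, and the whole farm would need roughly 134 days rather than 43 years. Frames are independent of each other, so dealing them out like this needs almost no communication between machines.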

The film Toy Story

Scanning for Alien Life - The Search for Extraterrestrial Intelligence (or SETI) needs to examine mountains of data, from telescopes, satellites, radio waves etc., looking for patterns or signals which could tell us that we are not alone in the universe! (ooOOooh, and other scary noises). The SETI@home project allows you to help in this search by breaking up all the data that needs to be looked through, and dishing it out in home-PC-sized chunks to people across the world who want to donate their spare CPU cycles to the cause. The freely downloadable program runs as a screensaver (which comes on when you aren't doing anything) and searches through chunks of data for possible patterns. Learn more about how SETI@home works >>
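The "break it up, dish it out, collect the answers" idea can be sketched in a few lines of Python. This is a toy under stated assumptions, not the real SETI@home analysis: the "signal" is just a list of numbers, the "pattern" is any sample above a threshold, and the names make_work_units, scan_unit and collate are made up for this example.

```python
from typing import Iterable, List, Tuple

Hit = Tuple[int, float]  # (position in the full recording, signal strength)


def make_work_units(samples: List[float],
                    unit_size: int) -> List[Tuple[int, List[float]]]:
    """Server side: cut the recording into chunks, remembering each offset."""
    return [(start, samples[start:start + unit_size])
            for start in range(0, len(samples), unit_size)]


def scan_unit(offset: int, unit: List[float], threshold: float) -> List[Hit]:
    """Volunteer side: report every sample at or above the threshold."""
    return [(offset + i, value) for i, value in enumerate(unit)
            if value >= threshold]


def collate(results: Iterable[List[Hit]]) -> List[Hit]:
    """Server side: merge everyone's reports, strongest signal first."""
    merged = [hit for unit_hits in results for hit in unit_hits]
    return sorted(merged, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    recording = [0.1] * 100
    recording[37] = 9.0   # two made-up "interesting" spikes
    recording[81] = 5.0
    units = make_work_units(recording, unit_size=16)
    hits = collate(scan_unit(offset, unit, threshold=4.0)
                   for offset, unit in units)
    print(hits)
```

Each work unit carries its own offset, so a volunteer's report can be placed back in the full recording no matter what order the results come back in.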

How do they do it?

For clusters and grids in particular, you may be wondering how exactly they can use these systems to solve the problems and do the things described above. In the vast, vast (vast) majority of cases there is no system where you can just "throw" any program at it and expect it to take advantage of the distributed nature of the system. For these cases what is required is to take the problem, examine it in fine detail, split it up as best as possible, and then write a specific program to run on the system.

For different systems, there are different methods and models you can use both to split up your program and to implement it on the system. A very popular model is called Message Passing. Split your program up into parts which can do their processing independently, and then use small messages between these parts to pass or swap any necessary information. A message passing library called MPI is very popular, and is used on the School's Linux Cluster. Basically it is a set of function calls which you can drop into your C or Java program at different places (such as MPI_Send to send a message and MPI_Recv to receive one; the real calls take a few more arguments than just the message). The hard part is working out where exactly in your program these places are!
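To give a feel for the message passing model, here is a minimal Python sketch. It is not MPI and does not use an MPI library: threads inside one program stand in for separate machines, and queues stand in for the network, so the names worker and parallel_sum and the choice of summing a list are all assumptions made for illustration. The shape is the same though: split the data, send each part out as a message, and receive the partial answers back.

```python
import queue
import threading
from typing import List


def worker(rank: int, inbox: queue.Queue, outbox: queue.Queue) -> None:
    """One 'node': receive a chunk of numbers, send back their sum."""
    numbers = inbox.get()              # blocks until a message arrives
    outbox.put((rank, sum(numbers)))   # reply with (who I am, my result)


def parallel_sum(numbers: List[int], n_workers: int = 4) -> int:
    """Coordinator: split the list, message the workers, combine replies."""
    outbox: queue.Queue = queue.Queue()
    inboxes = [queue.Queue() for _ in range(n_workers)]
    threads = [threading.Thread(target=worker,
                                args=(rank, inboxes[rank], outbox))
               for rank in range(n_workers)]
    for thread in threads:
        thread.start()

    # "Send": worker number `rank` gets every n_workers-th element.
    for rank, inbox in enumerate(inboxes):
        inbox.put(numbers[rank::n_workers])

    # "Receive": replies can arrive in any order, so just add them up.
    total = 0
    for _ in range(n_workers):
        _rank, partial = outbox.get()
        total += partial

    for thread in threads:
        thread.join()
    return total


if __name__ == "__main__":
    print(parallel_sum(list(range(1, 101))))
```

Notice that the workers never share variables; everything they know arrives in a message. That is the property that lets the same design run across separate machines once the queues are replaced by a real message passing library.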

Infrastructure for the future

No man is an island. It's turning out that computers are much the same, in that through connecting them and using them collectively, we can further the boundaries of science, build new infrastructure, and generally enable our futures, as well as doing lots of cool stuff which we couldn't do before, like making Toy Story movies. Ok, that's enough waxing philosophical for me - but in all seriousness there is barely a discipline in computer science, or even science in general, that couldn't benefit from a little distributed help. Just think of the WWW, a distributed system we wouldn't be able to live without now(!), and imagine the possibilities for others.

Handy Links

The TeraGrid project
SETI@home
Distributed.net
Redbrick, DCU Networking Society (they set up the Compapp Linux Cluster, also: shameless plug)
Compapp Linux Cluster (kind of sparse, but will be updated soon and just in case you're interested)