In reviewing some of the issues associated with getting our imaginary IEED company set up, I stumbled across a real need - and therefore a real market opportunity.
It's this: most serious science researchers now have Unix desktops - mostly Linux, with some using a BSD, Mac OS X, or Solaris - along with some access to one or more of the national computing resource centers. What they need is a simple tool that makes whatever they do on their desktops efficiently runnable on any one of those compute centers.
To the user, such a tool would present a Webmin-style interface allowing him to automate or click through the steps needed to move his files to the target machine; compile the application; connect to, transfer, and/or pre-process the data; run the thing; and drag back some specified output - roughly the sequence sketched below.
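To make the shape of that concrete, here is a minimal sketch of the click-through sequence reduced to commands, assuming an ssh/scp path to a PBS-style batch system; every host name, path, and file name in it is invented for illustration:

    #!/usr/bin/env python3
    """A minimal sketch of the automated path.  Host names, paths, and the
    'qsub' scheduler command are stand-ins; a real tool would take them from
    the web interface and a stored per-centre profile."""
    import subprocess

    REMOTE = "user@hpc.example.edu"      # hypothetical login node at the target centre
    WORKDIR = "/scratch/user/run01"      # hypothetical remote scratch directory

    def sh(*cmd: str) -> None:
        """Run one local command, raising if it fails."""
        subprocess.run(cmd, check=True)

    # 1. move the user's source tree, input data, and job script to the target
    sh("ssh", REMOTE, f"mkdir -p {WORKDIR}")
    sh("scp", "-r", "src", "input.dat", "job.pbs", f"{REMOTE}:{WORKDIR}/")

    # 2. compile on the target, with whatever toolchain the centre provides
    sh("ssh", REMOTE, f"cd {WORKDIR}/src && make")

    # 3. submit the run -- 'qsub job.pbs' stands in for the centre's scheduler
    sh("ssh", REMOTE, f"cd {WORKDIR} && qsub job.pbs")

    # 4. later: drag the specified output back to the desktop
    sh("scp", f"{REMOTE}:{WORKDIR}/output.dat", ".")

The Webmin-style front end would presumably amount to little more than a form over exactly these steps, with the per-centre details filled in from a stored profile.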
Notice that each of the national centers has something like this now - but the user hook here would be real portability coupled with ease of use and some serious, but non-intrusive, optimization for the target environment.
The first two are relatively easy to deliver: not only has much of the groundwork been done by others, but creating and testing the interfaces for data movement, compilation, and run-time submission for a variety of third-party compute resources really amounts to little more than (a lot of) grunt work.
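One way to picture that grunt work, purely as a sketch: fix a small set of operations and write one adapter per centre implementing them against whatever transfer and batch mechanisms that centre actually exposes. The class and method names below are invented for illustration, not an existing API.

    from abc import ABC, abstractmethod

    class CentreDriver(ABC):
        """One driver per compute centre; illustrative names only."""

        @abstractmethod
        def stage(self, local_path: str, remote_path: str) -> None:
            """Move source and input files to the centre."""

        @abstractmethod
        def build(self, remote_path: str) -> None:
            """Compile the application with the centre's own toolchain."""

        @abstractmethod
        def submit(self, remote_path: str, job_script: str) -> str:
            """Hand the job to the centre's scheduler and return a job id."""

        @abstractmethod
        def retrieve(self, remote_path: str, local_dir: str) -> None:
            """Drag the specified output back to the desktop."""

The portability promise then reduces to writing and testing one such adapter per centre - a lot of work, but mechanical work.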
The data connection effort is more complex, but the combination of well-known interfaces with the non-applicability of SQL to larger data sets should make it relatively easy, given data-owner co-operation, to develop collection-specific plugins for that part of the job.
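The same pattern might apply per data collection rather than per centre: a small plugin, built with the data owner's help, that knows how to resolve a query against that collection, fetch what is needed, and pre-process it for the run. Again, the interface below is invented for illustration:

    from abc import ABC, abstractmethod
    from typing import Iterable

    class CollectionPlugin(ABC):
        """One plugin per data collection; illustrative names only."""

        @abstractmethod
        def resolve(self, query: dict) -> Iterable[str]:
            """Turn a user query into the list of files or chunks it needs."""

        @abstractmethod
        def fetch(self, name: str, dest_dir: str) -> str:
            """Pull one item to dest_dir (ideally straight to the target
            centre) and return the resulting path."""

        def preprocess(self, path: str) -> str:
            """Optional hook for reformatting or subsetting before the run."""
            return path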
The last one, invisible code optimization, is much more difficult - and to my knowledge not attempted by any of the existing transfer agents. This, however, is where the real value lies for both the user and the industry, because the Linux kernel and its compilers are so heavily x86-centric that good Linux desktop code does not generally scale well even to Lintel grids and will generally perform very poorly on non-Lintel machines - regardless of whether the user properly embeds the OpenMP stuff or not.
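Much of the invisible part could begin as nothing more exotic than swapping in the target's own compilers, flags, and tuned libraries when the build runs on the remote machine. The profiles below are examples of that kind of substitution only - not tuned recipes, and the exact flag spellings vary by compiler and version:

    # Illustrative per-target build profiles -- example substitutions only.
    BUILD_PROFILES = {
        "lintel-grid": {
            "CC": "icc",                 # Intel compiler on an x86 cluster
            "CFLAGS": "-O3 -xHost -openmp",
        },
        "power-smp": {
            "CC": "xlc_r",               # IBM XL C on a Power box
            "CFLAGS": "-O4 -qarch=auto -qtune=auto -qsmp=omp",
        },
        "generic": {
            "CC": "gcc",
            "CFLAGS": "-O2 -fopenmp",
        },
    }

    def remote_build_command(target: str, remote_path: str) -> str:
        """Compose the build command the tool would run on the target machine,
        with that target's compiler and flags swapped in behind the user's back."""
        profile = BUILD_PROFILES.get(target, BUILD_PROFILES["generic"])
        return f"cd {remote_path} && make CC={profile['CC']} CFLAGS='{profile['CFLAGS']}'"

Real per-target work would go well beyond flags - tuned BLAS/LAPACK and MPI builds, and, for something like Cell, restructured kernels - but the point is that none of it should be visible to the user.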
Someone interested in developing this service could therefore build on an existing system to deliver the core functionality quickly, and then take the time needed to work his way down the list of supercomputer centers, adding target-environment optimization for each one. And, of course, the thing is perfect for the F/OSS process, because whoever puts together the first credible example could expect to recruit people from each of those centers to work on the data access and optimization problems.
Notice that part of what's going on here is a bet on the obvious: everybody's assembling Linux/x86 clusters now, but the people hatching budget proposals for 2008/9 and later are almost unanimously thinking about the vast performance gains possible with IBM's Cell - while worrying about exactly the portability problem this idea addresses.
Lifting that burden will, I think, be someone's route to success - and a significant contribution to advanced computing.