#1886 Should update openmpi in F11 prior to final
Closed: Invalid. Opened 14 years ago by dledford.

There is a file conflict between the openmpi-1.3.1 package in F11 and another package (496131). There are also several other bugs that should be fixed in openmpi prior to the F11 release (474677, 496909, 496911, 499851). Before embarking on resolving all these issues, though, a package that openmpi builds against (opensm) also needs updating, and that update has to be done in 4 steps, with each new build landing in the buildroot before the next package is rebuilt against it:

1) build the new libibcommon
2) rebuild libibumad in the buildroot against the new libibcommon
3) rebuild libibmad in the buildroot against the new libibumad
4) rebuild opensm in the buildroot against the new libibmad

and then, finally, rebuild openmpi against the new opensm.

So, before attempting to undertake this task for F11, I'm posting here to get review/approval for the intended process. The opensm update is ready to go (it has already been built from the devel branch for F12, as has openmpi). The only question is whether to resolve these issues before F11 or after F11 goes live.

One final note on this matter: the current openmpi version is 1.3.1, and when it gets updated to 1.3.2 there will be binary ABI breakage. The openmpi 1.3.2 release is now available, but wasn't when openmpi was last updated. If we go this route (i.e., update F11), we can go straight to 1.3.2, which is supposed to solve the binary-breakage-on-upgrade issue once and for all, and we avoid a binary ABI break shortly after F11 is released.

The libibcommon package is already built, but nothing else is yet because of the chain of dependencies. If this is approved, I would suggest just keeping this ticket open, and I'll update it each time the next package is ready to be moved into the F11 buildroot.

Current libibcommon package n-v-r: libibcommon-1.2.0-1.fc11


I'm not touching this one with a 10-foot pole. The openmpi package is missing a proper -devel, and anything that depends on its libraries is going to be broken by this, because it installs its libraries into subdirectories without an ld.so.conf entry.

I made an attempt to fix this, in the absence of the Fedora maintainer, and all of my changes were reverted.

Adding a pseudo Provides: openmpi-devel to the package was on the list too, just not high enough to get a separate listing in the bugs. As for not working without an ld.so.conf, that's not correct. The environment-modules module loads an LD_LIBRARY_PATH entry that solves the issue (and is how we have more than one version on the system at the same time; given that the binary ABI has broken between 1.1 and 1.2, and between 1.3 and 1.3.2, you can see why real users want that ability).

Environment modules only come into play at runtime, not when another shared library needs to link against openmpi's shared libraries. Without the ld.so.conf entry in place, that shared library cannot find the openmpi symbols.
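To make the failure mode concrete, here is a minimal sketch. The library name libsolver.so and its consumers are hypothetical, and the install path of openmpi's libraries is assumed to be a private subdirectory outside the default linker search path:

```c
/* Sketch of the load-time problem described above.  libsolver.so is an
 * invented example of a third-party shared library built against openmpi,
 * the way blacs/scalapack are. */
#include <mpi.h>

/* Exported by the hypothetical libsolver.so.  Because it references MPI
 * symbols, libsolver.so is linked against openmpi's libmpi and records a
 * DT_NEEDED entry for it. */
int solver_rank(void)
{
    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    return rank;
}

/* Any program linked against libsolver.so needs ld.so to locate libmpi at
 * startup.  With openmpi's libraries in a private subdirectory, and with no
 * ld.so.conf entry and no LD_LIBRARY_PATH (i.e. no environment module
 * loaded), the dynamic loader cannot find libmpi and the program fails to
 * start. */
```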

Perhaps I'm missing something obvious here, but is there more than one version of openmpi available in Fedora? If not, why are we targeting that case?

Environment modules come into play any time openmpi is used, whether building or running an application. You can't just run an MPI app; it must be started by the MPI runtime (which isn't available without the environment module being loaded). And you can't just link against MPI libraries; they are added to your app by mpicc (or one of the other compiler wrappers). Making a generic shared library that links against MPI libraries and is then used by a subsequent application that knows nothing about the MPI stack is a non-starter.
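For reference, a minimal sketch of the workflow being described, assuming a standard MPI C API; the module name and launcher invocation in the comments are illustrative rather than the exact Fedora packaging details:

```c
/* hello.c -- minimal MPI program.  Typical build/run cycle (illustrative):
 *     module load openmpi      # makes mpicc/mpirun and the libs visible
 *     mpicc hello.c -o hello   # the wrapper adds the MPI compile/link flags
 *     mpirun -np 4 ./hello     # the runtime starts and connects the ranks
 */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank = 0, size = 0;

    MPI_Init(&argc, &argv);                 /* required before any MPI call */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
```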

As for more than one version: you can never have more than one version in a repo at a time, but all released versions are archived via koji and you have access to them, and you have the option of telling yum to install new openmpi versions alongside the old instead of upgrading. The support for multiple versions exists in the package, not in the infrastructure, so it does take some work to get multiple versions installed, but once done, it works just fine.

When shared libraries are available, the expectation is that other libraries can depend on them without performing any magic invocations. Lapack and blacs both expect to be able to use the MPI shared libraries like this. They never need to use the mpi runtime.

One possible workaround could be to make a subpackage that just provides the ld.so.conf, so that lapack/blacs can depend on it and function appropriately.

Actually, no. Neither lapack nor blas is an MPI-using app (one, they don't even include a BuildRequires: on any MPI stack, and two, neither one has a single invocation of MPI_Init() in its source code, and without that they will never use MPI directly).

Secondly, the MPI spec requires any app to call MPI_Init() prior to using any MPI functions. MPI_Init() would have to be a forking function with some rather tricky handoffs of file descriptors in order to allow an application linked against an MPI library to run without a runtime environment. Instead, every MPI or pseudo-MPI implementation I know of, going all the way back to pvm and lam, has required a runtime for its apps. The runtime takes care of:

A) determining which machines in the cluster to run on, and how many,
B) determining which copies of the app are at which tier,
C) starting a copy of the runtime on all the target machines,
D) creating all the connections between the target machines/apps, and
E) finally forking and starting the app with all the connections already up and going, so that MPI_Init() merely needs to find out what tier it is and how many connections it has and go from there.

It is much easier to do this work before you fork off the target app than afterwards, which is why everyone does it via a runtime rather than by hooking MPI_Init() and trying to retrofit things onto an already-running app. So I don't see any MPI stack working without its runtime any time soon.

You should look at blacs (not blas), as it definitely depends on an MPI library. Scalapack (not lapack, sorry) depends on an MPI library as well. They may not be able to leverage MPI functionality without calling MPI_Init(), but they don't need to use MPI functionality to run (it does, however, need to be compiled in for it to be a valid option).

A deeper investigation will have to wait until I get back from my appointment, but blacs very adequately demonstrates why MPI stacks can't be treated the way you are talking about. It is hard linked against openmpi, and yet the changes you refer to me reverting included ongoing use of the alternatives system to select which MPI stack the system wishes to use. As you can well imagine, a user can select lam or mpich or something else as their MPI stack, and may have fully configured that stack and have it up and running, but since blacs will only pick up the openmpi libs, and openmpi may not be configured at all, it will in fact fail to operate properly. In other words, the fact that blacs links against one and only one MPI library makes the whole use of alternatives support, or environment-modules support as I use, irrelevant.

And Scalapack is an MPI app, is MPI aware, and expects to use the MPI runtime. To support more than one MPI stack under an MPI app like that, see what I did with the RHEL mpitests package. You cannot assume that an MPI app linked against one MPI stack will work against another; you have to include BuildRequires for all the MPI stacks you want to support, build against each stack, and then create packages specific to the MPI stack you built against.

So, I guess at this point we can take our pick: support more than one MPI stack in Fedora and live with the fact that some assumptions don't hold true in a multi-MPI-stack environment, or declare Fedora a one-and-only-one MPI stack distribution and let those assumptions stand.

> You cannot assume that an MPI app linked against one MPI stack will work against another; you have to include BuildRequires for all the MPI stacks you want to support, build against each stack, and then create packages specific to the MPI stack you built against.

I'm not making that assumption. Until recently, scalapack had BuildRequires: openmpi-devel (which no longer exists after you reverted my changes) and a detected Requires on openmpi's shared libraries. It is specific to the MPI stack it was built against.

The problem is that once installed, the scalapack shared libraries can't find the openmpi shared library symbols, and anything linked against scalapack will refuse to start.

Replying to [comment:9 spot]:

> I'm not making that assumption. Until recently, scalapack had BuildRequires: openmpi-devel (which no longer exists after you reverted my changes) and a detected Requires on openmpi's shared libraries. It is specific to the MPI stack it was built against.

After having had the time to sort a few things out via code inspection, I would just like to point these facts out:

1) scalapack has no direct dependency on MPI, only an indirect one via blacs
2) the only uses of MPI in scalapack are in the REDIST/TESTING directory, and in all those cases the test app is a proper MPI app
3) this is backed up by the mpiblacs_issues.ps file in blacs, which points out that the user-space app linked against it must call MPI_Init() for the library, because the library can't do it itself
4) the situation used to be one where the application was linked against openmpi and ran fine even if openmpi wasn't set up to be useful, whereas now the application must do something in order to use openmpi

So, from what I can tell, a user needs a working MPI stack to use the MPI aspect of scalapack/blacs. Just having the application find the library it links against isn't good enough. That being the case, the argument that it "just works" out of the box is incorrect. It runs, but it doesn't work until the user has a properly set up MPI environment. And it's not at all certain that every other user doesn't suffer a performance penalty from the checks to see whether MPI is initialized and, if so, to map certain blacs functions to MPI variables.
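As a rough illustration of point 3 above, a minimal sketch of the calling pattern, assuming the BLACS C-interface name Cblacs_pinfo(); the prototype below is declared here for the sketch rather than taken from a verified Fedora header:

```c
/* Sketch: the application, not blacs, owns MPI initialization. */
#include <mpi.h>

/* Assumed BLACS C-interface prototype, declared here for illustration. */
void Cblacs_pinfo(int *mypnum, int *nprocs);

int main(int argc, char **argv)
{
    int rank = 0, nprocs = 0;

    MPI_Init(&argc, &argv);          /* the app must call this itself      */
    Cblacs_pinfo(&rank, &nprocs);    /* only then can blacs/scalapack run  */

    /* ... ScaLAPACK/BLACS work would go here ... */

    MPI_Finalize();
    return 0;
}
```

And, as argued above, even this only does anything useful when launched under a configured MPI runtime.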

To me, that means the real issue is that you want it to run out of the box, linked against an MPI stack that isn't necessarily set up, so that if a person happens to want to use MPI, then they can.

However, I find it somewhat incongruous that we support multiple MPI stacks, yet we link only against this one MPI stack, and without it our application doesn't run. That makes the claim that we support multiple MPI stacks somewhat deceptive: really we support one MPI stack out of the box (sorta), and all other MPI stacks will require recompiling certain libraries to work against them.

So, why special-case openmpi? Why not just build blacs without MPI support and treat all the MPI stacks equally? If you want to run blacs on a given MPI stack, compile against it. Or create subpackages that are different builds of blacs against a given MPI stack, so that a user could install scalapack/blacs to use them without MPI, scalapack-openmpi/blacs-openmpi to use them via the openmpi stack, and so on. That would seem the most equal treatment of all. And it really shouldn't be that big a deal for users who want MPI support, considering that real MPI support does not work out of the box and does require configuration. For those people, compiling blacs against their MPI stack, or doing a yum install of the scalapack-openmpi package, is just one simple step amongst others they already have to do because things don't work out of the box.

Can we get some sort of decision on this so I can plan my time accordingly?

Irrespective of what we decide to do, it's too late and too large to land in the final tree; it could be a zero-day update.

Metadata Update from @dledford:
- Issue set to the milestone: Fedora 11 Final

7 years ago
