fix hierarchical mns w.r.t. MPI installed with a dummy toolchain#986

Merged
boegel merged 12 commits into easybuilders:develop from
boegel:fix_hierarchical_mns
Aug 26, 2014

Member

boegel commented Jul 30, 2014

Enhanced version of #983 by @kcgthb (which should have been opened targeting develop).

Member Author

This part still needs to be figured out...

I can't think of a way to determine the right path here, since at this point we have no information about which compiler(s) the MPI library will be used with.

So, we need some kind of generic path in that case?

@kcgthb: thoughts?

Member

If we have access to the easyconfig (which we do through ec, I think?), can't we look at the deps to determine the compiler?

Contributor

@boegel: yeah, I was thinking maybe the module should end up in modules/all/Core, instead of a subdirectory of modules/all/MPI or modules/all/Compiler, since it hasn't been compiled with any of those.

But then for Intel MPI, for instance, it kind of 'disconnects' it from the rest of the Intel toolchain, so I'm not really sure what the best way to do this is.

Member Author

I pinged @rtmclay about this. At TACC, they treat Intel MPI as a compiler-dependent module.

In other words, they basically install Intel MPI with a particular (non-dummy) compiler toolchain (e.g. iccifort or GCC).

That's the best solution I can think of. Applying this across the board (for all existing toolchains involving impi) requires creating new impi easyconfig files though, which use non-dummy toolchains...

I don't think that's a big issue, but it's painful.

Contributor

I think that would be a good solution.

But here is something I don't understand very well: since the intel toolchain now depends on GCC, if you decide to install impi with the intel toolchain, it means it will depend on both GCC and icc?

Member Author

boegel commented Aug 1, 2014

> But here is something I don't understand very well: since the intel toolchain now depends on GCC,
> if you decide to install impi with the intel toolchain, it means it will depend on both GCC and icc?

@kcgthb: GCC is listed as a dependency of icc and ifort. So indirectly, yes, GCC will be a dependency of impi when it is being installed with an iccifort toolchain.

@geimer (who worked intensively with me on the hierarchical module naming scheme support) pointed out that this also means that whenever icc (or impi) is loaded, the $MODULEPATH will also be extended with the Compiler/GCC/4.8.3 namespace for example, since the GCC module will be loaded as a dependency of icc (and ifort).

This is probably not what we want either.
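To make the concern concrete, here is a minimal sketch of how loading a compiler module that pulls in GCC as a dependency ends up exposing two compiler namespaces at once. The module table, the icc version and the `load_module` helper are hypothetical and only illustrate the idea; this is not how a modules tool is actually implemented.

```python
# Hypothetical module table: module -> (dependencies, $MODULEPATH extensions).
# Names and versions are illustrative only.
MODULES = {
    "GCC/4.8.3": ([], ["Compiler/GCC/4.8.3"]),
    "icc/2013.5.192": (["GCC/4.8.3"], ["Compiler/intel/2013.5.192"]),
}


def load_module(module, modulepath=None):
    """Return the $MODULEPATH entries visible after loading `module` and its dependencies."""
    if modulepath is None:
        modulepath = ["Core"]
    deps, extensions = MODULES[module]
    # Dependencies are loaded first, and each one applies its own extensions.
    for dep in deps:
        load_module(dep, modulepath)
    for ext in extensions:
        if ext not in modulepath:
            modulepath.append(ext)
    return modulepath


print(load_module("icc/2013.5.192"))
```

With this table, loading icc leaves both the Compiler/GCC/4.8.3 and the Compiler/intel namespaces on the path, which is the side effect described above.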

Member Author

@kcgthb: I updated this to throw an appropriate error whenever an MPI library installed with a dummy toolchain is encountered, since that's simply not compatible with this hierarchical module naming scheme...

I'm working on a set of easyconfig files for a new version of the intel toolchain that uses a proper impi easyconfig, i.e. one that uses an iccifort toolchain rather than dummy.
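The behaviour described here can be sketched roughly as follows. This is not the code from this PR: the function name, the exception class and the list of MPI library names are assumptions made for illustration, and the subdirectory layout simply follows the Core / Compiler / MPI split discussed in this thread.

```python
# Constants rather than magic strings; all names below are hypothetical.
CORE, COMPILER, MPI = "Core", "Compiler", "MPI"
DUMMY = "dummy"
KNOWN_MPI_LIBS = {"impi", "OpenMPI", "MPICH", "MVAPICH2"}  # illustrative list


class DummyMPIError(Exception):
    """An MPI library built with a dummy toolchain cannot be placed in the hierarchy."""


def module_subdir(software, comp=None, mpi=None):
    """
    Pick the module subdirectory for `software`.

    comp: (name, version) of the toolchain compiler, or None for a dummy toolchain
    mpi:  (name, version) of the toolchain MPI library, or None
    """
    if comp is None:
        if software in KNOWN_MPI_LIBS:
            # No compiler is known, so no Compiler/<name>/<version> path can be derived.
            raise DummyMPIError(
                "%s installed with a %s toolchain is not supported" % (software, DUMMY)
            )
        return CORE
    if mpi is None:
        return "/".join([COMPILER, comp[0], comp[1]])
    return "/".join([MPI, comp[0], comp[1], mpi[0], mpi[1]])
```

For example, `module_subdir("impi", comp=("intel", "2013.5.192"))` gives a Compiler/intel/... subdirectory, while `module_subdir("impi")` raises the error instead of guessing a path.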

Member Author

A correct set of easyconfigs for an intel toolchain in the context of a hierarchical naming scheme is available in easybuilders/easybuild-easyconfigs#1014

Member Author

boegel commented Aug 1, 2014

A possible fix for the GCC module also extending the $MODULEPATH when it's being loaded as a dependency of icc/ifort is to use a special-purpose GCC build with a particular versionsuffix (can't think of a proper one right now), for which the easyconfig includes some specification that the generated GCC module should not extend $MODULEPATH...

Contributor

kcgthb commented Aug 1, 2014

@boegel: right, so having both icc and GCC loaded when one loads impi, and all their dependent modules appearing from the $MODULEPATH extension, doesn't look too appealing to me.

Having a special-purpose, suffixed version of GCC doesn't look too good either, since it would mean having to install two identical versions of GCC, one for users to load, the other as a behind-the-scenes, semi-hidden dependency for Intel compilers... I'm not a big fan of this.

So, what was the reasoning about making the intel toolchain depend on GCC, again? I didn't find any detailed explanation in the mailing-lists archives, nor here, but I may have missed it. Because the ictce toolchain worked fine without an explicit dependency on GCC.

I understand it would be nice to have it depend on a fixed version of GCC, so nothing gets modified when the OS gcc is updated (which doesn't happen that often), but that seems to have many more implications than most people would expect to have to deal with. Especially in the hierarchical module naming scheme, which is probably what a lot of people will want to use.

Member Author

boegel commented Aug 13, 2014

I discussed this with @kcgthb outside of this issue a while ago, but for completeness' sake...

I just outlined the reasons for using GCC as a dependency for the Intel compilers in the issue that was already open for this, #635, so let's continue that discussion there.

It seems that, for lack of a better solution, a dedicated module for GCC that doesn't include $MODULEPATH extensions is the best solution possible. Currently, however, this does mean producing another build of GCC that is potentially identical to an existing one.

This can be fixed by adding support for generating multiple different modules for a single software installation, potentially using different module naming schemes. Although this would still result in two separate modules being installed, it wouldn't duplicate the actual software installation. Using an advanced modules tool like Lmod might even make it possible to hide the 'crippled' GCC module from users while still allowing it to be used as a dependency, i.e. not include it in the output of module avail, but make sure that a module load does work (@rtmclay: is this possible already? if not: feature request 😉).

Member Author

boegel commented Aug 13, 2014

Ah, hold on, we can simply add support for installing hidden modules, i.e. modules whose name is prefixed with a dot (.). I know that Lmod (and probably other module tools too) doesn't show these in module avail, but they are loadable.

@fgeorgatos
Contributor

just for the record:

the ability to provide N-to-1 modules per build may help solve an array of other issues, such as:

  • permit hierarchical and non-hierarchical (flat) module views to coexist (without redundant builds)
  • allow distinct groups to have customized views of modulefiles (e.g. use distinct license servers for the same build)
  • perhaps permit users to develop custom namespaces in user-space for builds provided at group or system level!
  • why bother as sysadmins to make a choice between lower/upper case module names? give both and be done with it!!!

I have the feeling the above comment belongs to another open issue, yet I do not recall which one...


rtmclay commented Aug 13, 2014

@boegel is correct, you can use a module where the version number has a leading dot. This module will not be reported by avail or spider. It will be listed however if loaded.

The one other thing that I want to remind you of is that the hidden module can't be named "GCC". Under the one-name rule, you can only load one module named GCC. You can call the hidden one GCC-helper or GCC-base or anything else, just not GCC. Case matters, so GCC and gcc are different as well.
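The two rules mentioned here (a leading dot in the version hides a module, and only one module per name can be loaded) can be modelled in a few lines. This is a toy model with hypothetical helper names, not Lmod's implementation; in particular, it assumes that loading a second module with the same name replaces the first one.

```python
def is_hidden(full_name):
    """A module counts as hidden when its version part starts with a dot."""
    version = full_name.rsplit("/", 1)[-1]
    return version.startswith(".")


def avail(installed):
    """List installed modules the way 'avail' would: hidden ones are left out."""
    return sorted(m for m in installed if not is_hidden(m))


def load(loaded, full_name):
    """Load a module under the one-name rule: a loaded module with the same name is dropped."""
    name = full_name.split("/", 1)[0]
    kept = [m for m in loaded if m.split("/", 1)[0] != name]
    return kept + [full_name]


installed = ["GCC/4.8.2", "GCC/.4.8.2", "icc/2013.5.192"]
print(avail(installed))
print(load(["GCC/.4.8.2", "icc/2013.5.192"], "GCC/4.8.3"))
```

In this model the hidden GCC/.4.8.2 never shows up in `avail`, but it is pushed out as soon as any other module named GCC is loaded, which is why a different name such as GCC-base is suggested.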

Member Author

boegel commented Aug 21, 2014

support for hidden modules has been added via #1009

@rtmclay: I'm aware that a hidden module named GCC would cause problems when trying to load another GCC module, but this shouldn't be done anyway. If a hidden module (e.g. GCC/.4.8.2) is already loaded as a dependency for icc, I see no reason why someone should load another module providing GCC (without unloading icc first).

@JensTimmerman

What exactly is the issue with loading GCC when the intel toolchain is loaded?
some version of gcc will be used, you'd rather be safe and show it.
I don't see how hiding it from module avail will fix anything. it will still be loaded.

So I guess the problem is that everything in that openmpi-xx-GCC etc (basically everything that's in gompi) will show up after you do

module load intel
module av

So you might want to hide everything in 'subtoolchains' like gompi,
or use the old ictce toolchain instead of intel?

Member Author

boegel commented Aug 21, 2014

@JensTimmerman: the issue with loading a 'normal' GCC module as a dependency for intel is that modules built with GCC will also appear, next to the ones built with the Intel compilers. So, you may have two different OpenMPI/1.6.4 modules available: one built with GCC, one with icc/ifort. That's bad.

Therefore, we "cripple" the GCC module we use as a dependency for icc/ifort by specifying that it should not include any extensions to $MODULEPATH by setting include_modpath_extensions to False (see easybuilders/easybuild-easyconfigs#1014). To differentiate between a crippled GCC module and a normal one, we use a versionsuffix -libs for the former.

However, that crippled GCC module will still show up in the output of module avail GCC, so users would be tempted to load it, and expect it to extend $MODULEPATH and make additional modules become available, which it won't. So, installing it as a hidden module seems like a good solution: it will be hidden in avail, but can still be loaded as a dependency for icc/ifort.
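As a rough illustration, the relevant lines of such an easyconfig might look like the fragment below. The version is only an example taken from this thread, and everything else a real GCC easyconfig needs (sources, configure options, and so on) is left out; see easybuilders/easybuild-easyconfigs#1014 for the actual files.

```python
# Sketch of an easyconfig fragment for the "crippled" GCC module (not a complete easyconfig).
name = 'GCC'
version = '4.8.2'        # illustrative version
versionsuffix = '-libs'  # tells this module apart from the regular GCC module

toolchain = {'name': 'dummy', 'version': 'dummy'}

# Do not extend $MODULEPATH when this module is loaded, so that modules
# built with GCC do not become visible next to the ones built with icc/ifort.
include_modpath_extensions = False
```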

Member Author

boegel commented Aug 21, 2014

@JensTimmerman: btw, with reviewing I meant to review this actual PR, not further discuss the usefulness of hidden modules, etc. That's barely relevant in the context of this PR, so let's move that to easybuilders/easybuild-easyconfigs#1014 or #1009.

Contributor

constants instead of magic strings?

@stdweird
Contributor

@boegel no unittests? (i have no clue if it is too difficult to test though)

Member Author

boegel commented Aug 26, 2014

@stdweird: remarks fixed, unit tests enhanced, I'll merge this in if Jenkins is happy with it

boegel added a commit that referenced this pull request Aug 26, 2014
fix hierarchical mns w.r.t. MPI installed with a dummy toolchain
boegel merged commit 1e21bc2 into easybuilders:develop Aug 26, 2014
boegel deleted the fix_hierarchical_mns branch August 26, 2014 20:20
Contributor

this is already part of the log message. no need to repeat it.

@stdweird
Contributor

@boegel please address the remarks. The fact that you use different logic to determine the compdir than the rest of the code is probably a bug

Member Author

boegel commented Sep 1, 2014

@stdweird: not a bug, see reply to your remark

Contributor

stdweird commented Sep 1, 2014

@boegel still probably a bug 😄


7 participants