add support for RPATH linking (REVIEW)#1942
Conversation
…pare_rpath_wrapper (WIP)
|
OK, time for one more update... With the current state combined with easybuilders/easybuild-easyblocks#1031 (and easybuilders/easybuild-easyblocks#1030, but I think that's mostly irrelevant), The OpenFOAM build itself is still running... So, unless further testing reveals any serious problems, this is the version that is going to go in under @wpoely86 do you mind giving this implementation a code review? Once this is merged, I'll open up an issue to track any problems that pop up with the |
easybuild/framework/easyblock.py
Outdated
| dirpaths = [ | ||
| os.path.join(self.installdir, 'bin'), | ||
| os.path.join(self.installdir, 'lib'), | ||
| os.path.join(self.installdir, 'lib64'), |
There was a problem hiding this comment.
some software (think OpenFOAM) doesn't follow this. In principle, you need to search everywhere.
There was a problem hiding this comment.
I have reworked this so sanity_check_rpath (now a public method) accepts an argument rpath_dirs so we can specialise this in easyblocks (e.g. for OpenFOAM) as needed. We can also consider adding an easyconfig parameter like sanity_check_rpath_dirs in addition.
| # ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped | ||
| if "dynamically linked" in out: | ||
| # check whether all required libraries are found via 'ldd' | ||
| out, ec = run_cmd("ldd %s" % path, simple=False) |
There was a problem hiding this comment.
it's not because a library is found, that we are using the correct one. A library can be both installed in the system and by EB (zlib, ...)
There was a problem hiding this comment.
I don't think his intention was to check this very deeply. It is just a sanity check.
I agree that a good check is needed before this goes into production. It is pretty difficult though, since EB doesn't enforce linking only within EB.
You would need to have information about each dependency library to ensure those are linked against the correct EB version and ignore ones that are OS deps. I don't think this is reliable given current information in the easyconfigs.
There was a problem hiding this comment.
Sure, but at least the library is either found via RPATH or in a default system location.
This is the best we can do in terms of a sanity check for RPATH imho, there's no guarantee here indeed...
I'm open to suggestions on how to check this more strictly (while keeping it relatively lightweight).
There was a problem hiding this comment.
I would at the minimum print a warning about linkage to non EB paths. We can whitelist all glibc things but any other library needs to be looked after. It's not necessarily an error, but it needs to be checked.
There was a problem hiding this comment.
OK, but I won't block this PR for that, we can do that as a follow-up, since it'll take a while before we get the white-listed right probably.
Also, that's sort of orthogonal to --rpath anyway, we should always be doing that to catch unintended linking to system libraries (but then we also need to take into account --filter-deps, which further complicates things).
I opened an issue for this, see #1991
| @@ -0,0 +1,80 @@ | |||
| #!/usr/bin/env python | |||
There was a problem hiding this comment.
That doesn't work, you can only pass a single argument to a shebang.
And anyway, this doesn't matter much, we call the script with python rpath_args.py ... anyway (i.e. with the python that is used to run eb, i.e. sys.executable).
There was a problem hiding this comment.
env passes the arguments. env python -E will just work.
There was a problem hiding this comment.
Hmm, ok, seems like it does work.
But still, this is irrelevant, since we call this script with python rpath_args.py, bypassing the shebang entirely...
| return res | ||
|
|
||
|
|
||
| cmd = sys.argv[1] |
easybuild/scripts/rpath_args.py
Outdated
|
|
||
| # output: statement to define $RPATH | ||
| if lib_paths: | ||
| print "export RPATH='%s=%s'" % (rpath_flag, ':'.join(sorted(lib_paths))) |
| # add -rpath flags in front | ||
| cmd_args = cmd_args_rpath + cmd_args | ||
|
|
||
| if not version_mode: |
There was a problem hiding this comment.
what's the meaning of this version mode anyway?
There was a problem hiding this comment.
see above, commands like gcc -v should not get broken by injecting -rpath flags
|
|
||
| # logging function | ||
| function log { | ||
| echo "($$) [$(date "+%%Y-%%m-%%d %%H:%%M:%%S")] $1" >> %(rpath_wrapper_log)s |
There was a problem hiding this comment.
It does seem like it might be more obvious if the file had an extension other than .sh (.sh.in, or template or something). Is there not a saner template system for python used already by EB?
There was a problem hiding this comment.
@rjeschmi I like your suggestion to name it .sh.in, I'll apply that change.
| @@ -0,0 +1,26 @@ | |||
| #!/bin/bash | |||
There was a problem hiding this comment.
why both a bash and python wrapper?
There was a problem hiding this comment.
because iterating of the command line arguments in bash is damn nasty to implement imho, it can be done way cleaner in a Python script (which is easier to unit test too).
There was a problem hiding this comment.
so again, why both a bash and python wrapper? Use a purely python one? Forks are costly.
There was a problem hiding this comment.
I'm not sure that calling the original compiler/linker command is going to work when called from within a Python shell, I suspect that may lead to problems.
Total overhead of this bash+Python wrapper approach is 10-15%, which is very acceptable imho, you'll barely notice.
| return var_defs, hits | ||
|
|
||
|
|
||
| def find_eb_script(script_name): |
There was a problem hiding this comment.
Pfft, I don't think that's worth the effort. find_eb_script is called once for the bash wrapper script, once for rpath_args.py per installation (by toolchain.prepare()). Caching isn't going to make a significant difference here (even when multiple installations are being done in one session).
| """Return list of relevant compilers for this toolchain""" | ||
|
|
||
| if self.name == DUMMY_TOOLCHAIN_NAME: | ||
| c_comps = ['gcc', 'g++'] |
There was a problem hiding this comment.
I guess we could, but if we don't the RPATH sanity check should fail on builds done with dummy, and I haven't seen that happening.
We can do this when the need arises?
We could also (always!) wrap cc and c++ with a script that breaks the compilation, so we know when these are used. Because if they are, the compilers provided by a non-dummy toolchain won't be used...
There was a problem hiding this comment.
Also fine by me. Some projects use cc/c++ before gcc/g++
rjeschmi
left a comment
There was a problem hiding this comment.
will work on some testing...
| # ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped | ||
| if "dynamically linked" in out: | ||
| # check whether all required libraries are found via 'ldd' | ||
| out, ec = run_cmd("ldd %s" % path, simple=False) |
There was a problem hiding this comment.
I don't think his intention was to check this very deeply. It is just a sanity check.
I agree that a good check is needed before this goes into production. It is pretty difficult though, since EB doesn't enforce linking only within EB.
You would need to have information about each dependency library to ensure those are linked against the correct EB version and ignore ones that are OS deps. I don't think this is reliable given current information in the easyconfigs.
|
|
||
| # logging function | ||
| function log { | ||
| echo "($$) [$(date "+%%Y-%%m-%%d %%H:%%M:%%S")] $1" >> %(rpath_wrapper_log)s |
There was a problem hiding this comment.
It does seem like it might be more obvious if the file had an extension other than .sh (.sh.in, or template or something). Is there not a saner template system for python used already by EB?
| arg = args[idx] | ||
|
|
||
| # if command is run in 'version check' mode, make sure we don't include *any* -rpath arguments | ||
| if arg in ['-v', '-V', '--version', '-dumpversion']: |
There was a problem hiding this comment.
I think these options are related to flags that some build sent to ld or ld.gold and are being caught. The version mode below I think is to catch the bug related to rpath GCC that was noticed elsewhere.
…owing @rjeschmi's suggestion
|
@wpoely86 I consider this good to go now. This is certainly not 100% finished, but that's exactly why we've pushed this under It will allow people to easily provide feedback on what we have in place now though, since it will be included in the upcoming EasyBuild v3.0. Please submit an approved review, so this can be merged? |
easybuild/scripts/rpath_args.py
Outdated
| rpath_filter = rpath_filter.split(',') | ||
| else: | ||
| # don't RPATH system library locations by default | ||
| rpath_filter = ['/lib.*', '/usr/lib.*'] |
There was a problem hiding this comment.
Remember that on a multiarch system (Debian, Ubuntu), system libraries may also be located in /usr/lib/x86_64-linux-gnu (and i386-linux-gnu, and aarch64-linux-gnu, and armhf-linux-gnu, and ...).
There was a problem hiding this comment.
@geimer that's fine, those paths will match the /usr/lib* regex, so we don't inject -rpath flags for those locations, which is as intended
…rpath filter in options.py
…lter in toolchain.prepare
|
@wpoely86 now updated to included build dir and tmpdir in RPATH filter (passed down from |
easybuild/framework/easyblock.py
Outdated
|
|
||
| # list of paths to include in RPATH filter; | ||
| # only include builddir if we're not building in installation directory | ||
| if os.getenv('TMPDIR'): |
There was a problem hiding this comment.
shouldn't we better use tempfile.tempdir?
There was a problem hiding this comment.
I think you mean tempfile.gettempdir(), but sure, good point.
|
Hello, maybee it will looks a really dumb comment, but why don't include LIBRARY_PATH and LD_RUN_PATH export in modulefiles we use this kind of approach on our current module setup. those environment varaiables are natively used by the linker. loading boost and compiling versus this one provide binaries with accurate boost rpath NB there is a "corner case" using this approach, when some rpath instructions are set in upstream compilation mechanism. as ld in this case ignores LD_RUN_PATH and LD_LIBRARY_PATH (man ld explain it pretty well) edit: typo |
|
@EricDeveaud let's follow up on the |
(continuation of the work started in #1613 by @rjeschmi)
With
--rpathenabled, EasyBuild will generate abashwrapper script for each compiler and linker command (e.g.icc,icpc,ifort,ldandld.gold) during the toolchainpreparephase, where a couple of things are hardcoded into:rpath_args.py)rpath_args.pytakes care of determining which-rpathoption should be passed down to the original command$RPATH), based on the specified command line arguments for that command.It may also filter/tweak the specified command line arguments that will actually be passed down to the original command (
$CMD_ARGS).The
bashwrapper script will runrpath_args.py, evaluate the statements it prints out (which define$RPATHand$CMD_ARGS), and then call the original command using$RPATHand$CMD_ARGS).The location of these wrapper scripts is prepended to
$PATHto they get preference over the compiler/linker commands they wrap around.The sanity check was extended to check the output of
lddandreadelf -don all dynamically linked executables and libraries (that reside inbin,libandlib64subdirectories) in case--rpathwas used.TODO:
[ ] more testing
$LD_LIBRARY_PATHis defined by generated modules (TODO)[ ] fix the
FIXMEs inrpath_args.py[ ] end-to-end test for
--rpath[ ] documentation
Feedback welcome!
cc @geimer, @pescobar