[FEATURE] Enable dynamic linking with MKL and compiler based OpenMP #20474

akarbown · 2021-07-30T11:16:49Z

OneMKL 2021.3 fixed linking OpenMP while using SDL and MKL_THREADING_LAYER set to GNU.

Description

OneMKL 2021.3 fixes the issue described here. Thus, it enables linking with MKL dynamic libraries without having multiple OpenMP runtimes in a single process. This is possible by linking MXNet with the oneMKL Single Dynamic Library (SDL) and then setting the appropriate threading layer at run time by calling mkl_set_threading_layer() (or through the environment variable MKL_THREADING_LAYER).

Connected with: [#19610], [#18255] and [#17794].

Changes

Add oneMKL 2021.3 to ubuntu docker images.
Enable MKL SDL (MKL_USE_SINGLE_DYNAMIC_LIBRARY) as the default linking mode when the MKL version is greater than 2021.2 and static linking is turned off (bug no. MKLD-11109, oneMKL release notes).
Otherwise, the MKL static libraries are used to build the MXNet library.
Add support for the new oneMKL file structure to the FindBLAS.cmake file (the fix comes from CMake 3.20: #6210).
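
The default-linking rule in the list above can be read as a small predicate. The following is only a minimal C++ sketch of that decision, not the CMake logic from this PR; `MklVersion`, `IsNewerThan` and `UseSingleDynamicLibrary` are hypothetical names introduced for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for an MKL release such as 2021.3
// (major_version = 2021, update = 3).
struct MklVersion {
  int major_version;
  int update;
};

// True when `a` is a strictly newer release than `b`.
constexpr bool IsNewerThan(MklVersion a, MklVersion b) {
  return a.major_version != b.major_version ? a.major_version > b.major_version
                                            : a.update > b.update;
}

// SDL is the default only for MKL newer than 2021.2 with static linking off;
// in every other case the static MKL libraries are used.
constexpr bool UseSingleDynamicLibrary(MklVersion version, bool static_link) {
  return !static_link && IsNewerThan(version, MklVersion{2021, 2});
}
```

Under these assumptions, 2021.3 with static linking off selects SDL, while 2021.2, or any version with static linking requested, falls back to the static libraries.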

Comments

Should the documentation mention oneMKL 2021.3 as the recommended version?

mxnet-bot · 2021-07-30T11:16:54Z

Hey @akarbown , Thanks for submitting the PR
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

To trigger all jobs: @mxnet-bot run ci [all]
To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [edge, sanity, miscellaneous, centos-cpu, windows-cpu, unix-cpu, unix-gpu, clang, website, centos-gpu, windows-gpu]

Note:
Only following 3 categories can trigger CI :PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.

leezu · 2021-07-30T17:19:29Z

src/initialize.cc

+  #if defined( __INTEL_LLVM_COMPILER)
+    mkl_set_threading_layer(MKL_THREADING_INTEL);
+  #else
+    mkl_set_threading_layer(MKL_THREADING_GNU);


Does this work with Windows? Intel developer's reference states "for GNU threading on Linux* operating system only" https://software.intel.com/content/www/us/en/develop/documentation/onemkl-developer-reference-c/top/support-functions/single-dynamic-library-control/mkl-set-threading-layer.html

Good point! I forgot about Windows! I'll exclude it in a moment.
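
One way the platform exclusion discussed above could look is sketched below. It is an illustration under stated assumptions, not the code that was merged: the `ThreadingLayer` enum, `ChooseThreadingLayer`, `ApplyThreadingLayer` and the opt-in macro `EXAMPLE_USE_MKL` are hypothetical, and restricting the GNU layer to `__linux__` is one reading of the "Linux only" note quoted from the developer reference. Only `mkl_set_threading_layer`, `MKL_THREADING_INTEL` and `MKL_THREADING_GNU` come from oneMKL.

```cpp
#include <cassert>

#ifdef EXAMPLE_USE_MKL  // hypothetical macro: define only when building against mkl_rt
#include <mkl.h>
#endif

// Hypothetical stand-in for the layers mentioned in the review thread.
enum class ThreadingLayer { kIntel, kGnu, kLeaveDefault };

// Picks a layer at compile time. GNU threading is requested on Linux only;
// on other platforms (e.g. Windows) the MKL default is left untouched.
constexpr ThreadingLayer ChooseThreadingLayer() {
#if defined(__INTEL_LLVM_COMPILER)
  return ThreadingLayer::kIntel;
#elif defined(__linux__)
  return ThreadingLayer::kGnu;
#else
  return ThreadingLayer::kLeaveDefault;
#endif
}

// Forwards the choice to oneMKL when EXAMPLE_USE_MKL is defined; otherwise
// this is a no-op so the sketch still builds without MKL installed.
inline void ApplyThreadingLayer(ThreadingLayer layer) {
#ifdef EXAMPLE_USE_MKL
  if (layer == ThreadingLayer::kIntel) {
    mkl_set_threading_layer(MKL_THREADING_INTEL);
  } else if (layer == ThreadingLayer::kGnu) {
    mkl_set_threading_layer(MKL_THREADING_GNU);
  }
#else
  (void)layer;  // nothing to configure without MKL
#endif
}
```

Calling `ApplyThreadingLayer(ChooseThreadingLayer())` once during library initialization would mirror where the snippet under review sits in src/initialize.cc.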

akarbown · 2021-08-02T13:23:57Z

@mxnet-bot run ci[sanity]

mxnet-bot · 2021-08-02T13:24:04Z

Jenkins CI successfully triggered : [sanity]

akarbown · 2021-08-02T14:02:41Z

@mxnet-bot run ci [all]

akarbown · 2021-08-02T14:44:38Z

@mxnet-bot run ci[all]

akarbown · 2021-08-02T15:05:13Z

@leezu, could you help me with rerunning 'sanity' check in this PR? I've checked it locally and I don't see any issues. I suppose that this sanity check failed because of the timeout (SIGTERM). Is it possible?

leezu · 2021-08-02T20:54:58Z

@josephevans can we extend the max time for sanity? This PR triggers a rebuild of the Docker image used for sanity, which apparently causes the timeout.

akarbown · 2021-08-03T09:16:42Z

@mxnet-bot run ci [unix-cpu]

akarbown · 2021-08-05T20:22:42Z

@mxnet-bot run ci[website]

mxnet-bot · 2021-08-05T20:22:49Z

Jenkins CI successfully triggered : [website]

akarbown · 2021-08-05T21:07:34Z

@mxnet-bot run ci[clang, miscellaneous, unix-cpu, unix-gpu]

mxnet-bot · 2021-08-05T21:07:41Z

Jenkins CI successfully triggered : [unix-gpu, clang, miscellaneous, unix-cpu]

akarbown · 2021-08-06T06:10:37Z

@mxnet-bot run ci[unix-cpu]

mxnet-bot · 2021-08-06T06:10:43Z

Jenkins CI successfully triggered : [unix-cpu]

akarbown · 2021-08-06T08:20:15Z

@mxnet-bot run ci[unix-gpu, miscellaneous]

mxnet-bot · 2021-08-06T08:20:21Z

Jenkins CI successfully triggered : [unix-gpu, miscellaneous]

akarbown · 2021-08-06T11:07:43Z

@mxnet-bot run ci[miscellaneous]

mxnet-bot · 2021-08-06T11:07:48Z

Jenkins CI successfully triggered : [miscellaneous]

akarbown · 2021-08-06T13:41:47Z

@mxnet-bot run ci[miscellaneous]

mxnet-bot · 2021-08-06T13:41:51Z

Jenkins CI successfully triggered : [miscellaneous]

akarbown · 2021-08-09T09:01:14Z

@mxnet-bot run ci[miscellaneous]

mxnet-bot · 2021-08-09T09:01:21Z

Jenkins CI successfully triggered : [miscellaneous]

akarbown · 2021-08-09T15:59:54Z

@mxnet-bot run ci[unix-cpu, unix-gpu]

mxnet-bot · 2021-08-09T16:00:01Z

Jenkins CI successfully triggered : [unix-cpu, unix-gpu]

akarbown · 2021-08-09T18:22:01Z

@mxnet-bot run ci[unix-cpu]

mxnet-bot · 2021-08-09T18:22:05Z

Jenkins CI successfully triggered : [unix-cpu]

akarbown · 2021-10-13T06:13:04Z

@mxnet-bot run ci[unix-cpu, unix-gpu, centos-gpu]

mxnet-bot · 2021-10-13T06:13:11Z

Jenkins CI successfully triggered : [unix-cpu, unix-gpu, centos-gpu]

akarbown · 2021-10-13T11:20:44Z

@mxnet-bot run ci[centos-gpu]

mxnet-bot · 2021-10-13T11:20:47Z

Jenkins CI successfully triggered : [centos-gpu]

mozga-intel

LGTM! Thanks!

leezu · 2021-10-13T15:26:03Z

Thank you @akarbown!

mseth10 added the pr-work-in-progress label Jul 30, 2021

akarbown changed the title ~~[FEATURE] Enables dynamic linking with MKL and compiler based OpenMP~~ [FEATURE] Enable dynamic linking with MKL and compiler based OpenMP Jul 30, 2021

leezu reviewed Jul 30, 2021

akarbown force-pushed the compiler-based-openmp2 branch from 56c1404 to b3c66c6 Compare July 30, 2021 19:56

akarbown added 10 commits October 12, 2021 22:03

Cleaning MKL find_path cmake directories

6f628a7

[WIP] Adding github runner for MAC OS to check MKL specific changes

ae6686b

This is a temporary change to check if adding MKL runtime support won't crash MacOS.

clang format + mkl workflow rename

59db10d

Fixing some formatting + installing patchelf

32aa4cf

setting up Mac OS rpath for MKL libraries

dde35e4

Run only mkl tests

4b80278

Fix for finding MKL libraries on MacOs by FindBLAS.cmake

30f90b5

Turn off SDL for MKL on MacOS as it need fixes.

Enable linking MxNET with MKL static libraries on MacOS

5195792

Add proper mkl_threading flags for Mac Os. Enable all tests that are for MacOS + MKL tests. Rebuild numpy with MKL BLAS (instead of OpenBLAS).

Excluding MKL bf16 tests as CI MacOs machines seems not to have avx512 support.

ca1bcbe

Remove forcing rpath and some unnecessary comments

3cd9340

akarbown force-pushed the compiler-based-openmp2 branch from 1041620 to 3cd9340 Compare October 12, 2021 20:10

mseth10 added the pr-awaiting-testing and pr-work-in-progress labels and removed the pr-work-in-progress and pr-awaiting-testing labels Oct 12, 2021

mseth10 added the pr-awaiting-testing and pr-work-in-progress labels and removed the pr-work-in-progress and pr-awaiting-testing labels Oct 13, 2021

mseth10 added the pr-awaiting-testing and pr-awaiting-review labels and removed the pr-work-in-progress and pr-awaiting-testing labels Oct 13, 2021

mozga-intel approved these changes Oct 13, 2021

leezu merged commit abd293f into apache:master Oct 13, 2021
