[FEATURE] Integrate oneDNN binary primitive support for forward add, subtract, multiply, divide. #20713
Conversation
Hey @agrabows, thanks for submitting the PR.
CI supported jobs: [centos-gpu, sanity, centos-cpu, windows-gpu, edge, website, miscellaneous, unix-cpu, clang, unix-gpu, windows-cpu]
@mxnet-bot run ci [unix-cpu]
Jenkins CI successfully triggered: [unix-cpu]
@mxnet-bot run ci [windows-gpu]
Jenkins CI successfully triggered: [windows-gpu]
static MX_THREAD_LOCAL binary_op_fwd_map fwds;
#endif
OpSignature key;
key.AddSign(static_cast<int>(alg));
What about "attrs" in the key?
I think we should either add attrs to the key or remove them from the DNNLBinaryOpFwd constructor parameters.
attrs removed where possible.
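For illustration, the thread-local primitive cache pattern under discussion can be sketched in plain C++ as follows. `OpSignature`, `BinaryFwd`, and `GetCachedFwd` here are simplified stand-ins for MXNet's actual types, showing why everything that distinguishes one primitive from another (algorithm kind, shapes, and any relevant attrs) must be folded into the key:

```cpp
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <vector>

// Simplified stand-in for MXNet's OpSignature: collects integers that
// identify a primitive configuration, so cached primitives are reused
// only for identical (alg, shape, ...) combinations.
struct OpSignature {
  std::vector<int64_t> signs;
  void AddSign(int64_t v) { signs.push_back(v); }
  bool operator==(const OpSignature& o) const { return signs == o.signs; }
};

struct OpSignatureHash {
  size_t operator()(const OpSignature& s) const {
    size_t h = 0;
    for (int64_t v : s.signs)
      h = h * 31 + std::hash<int64_t>()(v);
    return h;
  }
};

// Placeholder for the cached forward primitive (DNNLBinaryOpFwd).
struct BinaryFwd {
  int alg;
};

// Thread-local cache: one primitive per unique signature. A lookup with
// the same alg and shapes returns the already-constructed entry.
BinaryFwd& GetCachedFwd(int alg, const std::vector<int64_t>& lhs_shape) {
  static thread_local std::unordered_map<OpSignature, BinaryFwd, OpSignatureHash> fwds;
  OpSignature key;
  key.AddSign(alg);
  for (int64_t d : lhs_shape)
    key.AddSign(d);
  auto it = fwds.find(key);
  if (it == fwds.end())
    it = fwds.emplace(key, BinaryFwd{alg}).first;
  return it->second;
}
```

If attrs stayed a constructor parameter without entering the key, two calls differing only in attrs would silently share one cached primitive, which is why the reviewer asks for one or the other.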
bool dispatched = false;
if (!dispatched && common::ContainsOnlyStorage(*in_attrs, kDefaultStorage)) {
#if MXNET_USE_ONEDNN == 1
  if (dev_mask == mshadow::cpu::kDevMask)
Does disabling oneDNN at runtime still work?
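The concern here is that the dispatch path should still honor a runtime kill switch. A minimal sketch of that dispatch shape, assuming an environment-variable switch (the variable name `DEMO_ONEDNN_ENABLED` and `kCpuDevMask` value are illustrative, not MXNet's actual names):

```cpp
#include <cstdlib>
#include <cstring>

// Stand-in for mshadow::cpu::kDevMask.
constexpr int kCpuDevMask = 1;

// Runtime switch: enabled unless the environment variable is set to "0",
// mirroring the spirit of a flag like MXNET_ONEDNN_ENABLED.
bool OneDNNEnabled() {
  const char* v = std::getenv("DEMO_ONEDNN_ENABLED");
  return v == nullptr || std::strcmp(v, "0") != 0;
}

// Dispatch to the oneDNN path only on CPU and only when not disabled
// at runtime; otherwise the caller falls back to the default kernel.
bool DispatchToOneDNN(int dev_mask) {
  return dev_mask == kCpuDevMask && OneDNNEnabled();
}
```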
[[8, 1, 6, 1], [7, 1, 5]], [[5, 4], [1]],
[[256, 256, 3], [3]], [[5, 4], [4]],
[[15, 3, 5], [3, 5]], [[15, 3, 5], [1, 5]],
[[15, 3, 5], [3, 1]]])
tests/python/unittest/test_numpy_op.py::test_np_binary_funcs
op::mshadow_op::mixed_minus,
op::mshadow_op::mixed_rminus>)
#if MXNET_USE_ONEDNN == 1
.set_attr<FComputeEx>("FComputeEx<cpu>", NumpyBinaryOperatorComputeExCPU<op::mshadow_op::minus>)
What about the mixed version? Does it work properly on GPU when oneDNN is enabled (the default configuration)? Could you check whether there is a test for it?
oneDNN dispatch is only considered after the dev_mask == mshadow::cpu::kDevMask condition is met, so the GPU workflow is not affected.
namespace mxnet {
namespace op {

using binary_op_fwd_t = dnnl::binary;
binary_op_fwd_t => binary_fwd_t ?
src/operator/nn/dnnl/dnnl_binary.cc (outdated)
auto engine = mxnet::CpuEngine::Get()->get_engine();
auto src0 = inputs[0].GetDNNLData();
auto src1 = inputs[1].GetDNNLData();
dnnl_output_t out_mem = CreateDNNLMem(outputs[0], fwd_pd->dst_desc(), req[0], &inputs[0]);
Either inputs[0] or inputs[1] can be in-place; it may be worth checking which input is actually used as the output in the in-place case.
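The reviewer's point is that hard-coding `&inputs[0]` assumes the first input always aliases the output. A minimal sketch of detecting which input (if any) shares the output buffer, using a placeholder `Array` type rather than MXNet's NDArray:

```cpp
#include <cstddef>

// Placeholder for an array handle; only the data pointer matters here.
struct Array {
  const void* data;
};

// Returns 0 or 1 for whichever input shares the output's buffer
// (the in-place case), or -1 when the output is a fresh allocation.
int InplaceInput(const Array& out, const Array& in0, const Array& in1) {
  if (out.data == in0.data) return 0;
  if (out.data == in1.data) return 1;
  return -1;
}
```

The result of such a check could then be used to pass the correct input to the in-place memory path instead of always passing the first one.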
[[8, 1, 6, 1], [7, 1, 5]], [[5, 4], [1]],
[[256, 256, 3], [3]], [[5, 4], [4]],
[[15, 3, 5], [3, 5]], [[15, 3, 5], [1, 5]],
[[15, 3, 5], [3, 1]]])
Please check whether it works when the rhs shape has more dimensions than the lhs, e.g. [15, 3] and [4, 15, 3].
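For reference, the NumPy-style broadcasting rule the tests exercise aligns shapes on their trailing axes, so it is symmetric in which operand has more dimensions. A self-contained sketch (not MXNet's actual shape-inference code):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// NumPy-style broadcast: align shapes on the trailing axes; each dim pair
// must be equal or contain a 1. Returns {} when the shapes are incompatible.
// Works whichever side has more dimensions, e.g. [15,3] with [4,15,3].
std::vector<int64_t> BroadcastShape(std::vector<int64_t> a,
                                    std::vector<int64_t> b) {
  if (a.size() < b.size())
    std::swap(a, b);  // make `a` the longer shape
  std::vector<int64_t> out(a.size());
  const size_t offset = a.size() - b.size();
  for (size_t i = 0; i < a.size(); ++i) {
    const int64_t da = a[i];
    const int64_t db = (i < offset) ? 1 : b[i - offset];  // pad lhs with 1s
    if (da != db && da != 1 && db != 1)
      return {};  // incompatible
    out[i] = std::max(da, db);
  }
  return out;
}
```

Under this rule the reviewer's case [15, 3] with [4, 15, 3] broadcasts to [4, 15, 3], so the test parametrization above should cover an rhs-longer-than-lhs pair as well.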
@mxnet-bot run ci [windows-gpu]
Jenkins CI successfully triggered: [windows-gpu]
szha left a comment:
LGTM. Leaving it open for a bit for other reviewers to take a look at the revisions.
auto ndim_1 = inputs[1].shape().ndim();
return ndim_0 >= 1 && ndim_0 <= 6 && ndim_1 >= 1 && ndim_1 <= 6 &&
       inputs[0].shape().Size() != 0 && inputs[1].shape().Size() != 0 &&
       dtype == mshadow::kFloat32 && dtype == inputs[1].dtype();
Please check whether oneDNN supports bfloat16; if it does, please create a separate PR for it.
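The support predicate being reviewed can be sketched as a standalone check: both inputs need 1 to 6 dimensions, non-empty shapes, and matching float32 dtypes. `TensorDesc` and the `kFloat32` value are simplified stand-ins, not mshadow's actual types:

```cpp
#include <cstdint>
#include <vector>

// Stand-in for mshadow::kFloat32.
constexpr int kFloat32 = 0;

// Minimal tensor descriptor: shape plus dtype tag.
struct TensorDesc {
  std::vector<int64_t> shape;
  int dtype;
  int64_t Size() const {
    if (shape.empty()) return 0;
    int64_t s = 1;
    for (int64_t d : shape) s *= d;
    return s;
  }
};

// Mirrors the condition above: 1 <= ndim <= 6 on both sides, no empty
// tensors, and both dtypes equal to float32.
bool SupportDNNLBinary(const TensorDesc& a, const TensorDesc& b) {
  auto ok_ndim = [](const TensorDesc& t) {
    return t.shape.size() >= 1 && t.shape.size() <= 6;
  };
  return ok_ndim(a) && ok_ndim(b) && a.Size() != 0 && b.Size() != 0 &&
         a.dtype == kFloat32 && a.dtype == b.dtype;
}
```

Extending such a predicate with a bfloat16 branch would be the natural shape of the follow-up PR the reviewer suggests, if oneDNN supports it.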
Description
Binary broadcast operators such as add, subtract, multiply, and divide are implemented in both the NDArray and NumPy modules, but no oneDNN support existed for them. The goal of this task was to dispatch execution of these operators to the oneDNN binary primitive.
Checklist
Essentials
Changes
Comments
A speedup was observed in all cases, up to ~350%.
