Unifying post-quantization properties#20724
Conversation
|
Hey @DominikaJedynak , Thanks for submitting the PR
CI supported jobs: [windows-gpu, unix-cpu, sanity, centos-cpu, clang, unix-gpu, miscellaneous, website, windows-cpu, edge, centos-gpu] Note: |
|
@mxnet-bot run ci [centos-gpu, unix-cpu] |
|
Jenkins CI successfully triggered : [unix-cpu, centos-gpu] |
|
@mxnet-bot run ci [centos-gpu] |
|
Jenkins CI successfully triggered : [centos-gpu] |
|
@mxnet-bot run ci [unix-cpu] |
|
Jenkins CI successfully triggered : [unix-cpu] |
bgawrych
left a comment
There was a problem hiding this comment.
Overall LGTM - one comment below
| support_requantize_fusion_op_name.insert("_sg_onednn_conv"); | ||
| support_requantize_fusion_op_name.insert("_contrib_quantized_elemwise_add"); | ||
| support_requantize_fusion_op_name.insert("_contrib_quantized_npi_add"); | ||
| disable_fuse_all = dmlc::GetEnv("MXNET_DISABLE_ONEDNN_FUSE_ALL", false); |
There was a problem hiding this comment.
What about changing this flag to something less general? I mean this flag doesn't disable all fuses (only the ones in this file). So better would be something like "MXNET_DISABLE_ONEDNN_FUSE_REQUANTIZE" and "MXNET_DISABLE_ONEDNN_FUSE_DEQUANTIZE". We also can negate default value to call it "MXNET_ONEDNN_FUSE_REQUANTIZE" and "MXNET_ONEDNN_FUSE_DEQUANTIZE".
+@anko-intel
There was a problem hiding this comment.
Fully agree. "MXNET_ONEDNN_FUSE_REQUANTIZE" and "MXNET_ONEDNN_FUSE_DEQUANTIZE" are good proposals
| class SgDNNLPostQuantizeSelector : public SubgraphSelectorV2 { | ||
| public: | ||
| /*! \brief pattern match status */ | ||
| enum SelectStatus { |
There was a problem hiding this comment.
[Future consideration] It could be fine to replace plain enum with class enum - to avoid implicitly conversion to other types (like another enum or int)
| #include <string> | ||
| #include <vector> | ||
|
|
||
| #include "../../nn/dnnl/dnnl_convolution-inl.h" |
There was a problem hiding this comment.
[Future consideration] + @anko-intel
| #include "../../nn/dnnl/dnnl_convolution-inl.h" | |
| #include "./dnnl_convolution-inl.h" |
There was a problem hiding this comment.
This particular file is in nn/dnnl, not subgraph/dnnl, but I can change all paths in a way that avoids dots if you consider it more transparent
| class SgDNNLPostQuantizeSelector : public SubgraphSelectorV2 { | ||
| public: | ||
| /*! \brief pattern match status */ | ||
| enum SelectStatus { |
There was a problem hiding this comment.
[Future consideration] Why the enum (enum SelectStatus) is public?
There was a problem hiding this comment.
It was not part of my changes but you are right, it is better to make it private :)
| support_requantize_fusion_op_name.insert("_contrib_quantized_npi_add"); | ||
| explicit SgDNNLPostQuantizeSelector(const bool dis_fuse_all, const bool dis_float_output) | ||
| : disable_fuse_all(dis_fuse_all), disable_float_output(dis_float_output) { | ||
| support_requantize_fusion_op_name = support_req_fusion_op; |
There was a problem hiding this comment.
What are the pros and cons of using this assignment? Roughly, why the set is global and then it's inexplicably assigned?
There was a problem hiding this comment.
It is used in two different classes and is private in both, and I wanted to avoid duplicating. I will place this set inside anonymous namespace, as we discussed.
| if (n.outputs.size() > 1) { | ||
| // check if requantize have other outputs than dequantize | ||
| // if it has we can't fuse dequantize | ||
| for (auto kv : n.outputs) { |
There was a problem hiding this comment.
How about? If an object is const-> then kv is const, otherwise, the kv might be non-const.
| for (auto kv : n.outputs) { | |
| for (const auto kv : n.outputs) { |
There was a problem hiding this comment.
kv.first, which is the only place where kv is used, is marked const, but I can make kv const as well for clarity
| support_requantize_fusion_op_name.count(raw_node->op()->name)) { | ||
| status = kStart; | ||
| matched_list.clear(); | ||
| matched_list.push_back(&n); |
There was a problem hiding this comment.
If we clear the vector here, the capacity is the same as before; Then it might be better to use emplace_back(&n) instead of push_back(n) ~ I hope to avoid creating temp object before.
How about using?
| matched_list.push_back(&n); | |
| matched_list.emplace_back(&n); |
| namespace op { | ||
|
|
||
| class SgDNNLPostQuantizeSelector : public SubgraphSelector { | ||
| const std::set<std::string> support_req_fusion_op = {"_contrib_quantized_elemwise_add", |
There was a problem hiding this comment.
This global set is likely enabled in all TU. What is an advantage of using it in this place?
There was a problem hiding this comment.
Commented above
09a17c1 to
1041fed
Compare
|
@mxnet-bot run ci [unix-cpu, miscellaneous ] |
|
Jenkins CI successfully triggered : [unix-cpu, miscellaneous] |
|
@mxnet-bot run ci [unix-cpu] |
|
Jenkins CI successfully triggered : [unix-cpu] |
|
@mxnet-bot run ci [unix-cpu] |
|
Jenkins CI successfully triggered : [unix-cpu] |
Description
Unifying requantize/dequantize post-quantization fuses for different operators into single property.