[operator] Add Mish Activation Function #20320
Hey @Adnios, thanks for submitting the PR.
CI supported jobs: [centos-gpu, unix-cpu, windows-gpu, miscellaneous, clang, website, windows-cpu, sanity, edge, centos-cpu, unix-gpu]
Signed-off-by: Adnios <[email protected]>
3a11899 to 47a52f3
szha left a comment:
Thanks for the contribution! Going forward in 2.0, as we will mainly use np/npx instead of sym, could you also include a test for it with npx.activation?
Sure. Thanks for the advice.
@mxnet-bot run ci [centos-gpu, unix-gpu]
Jenkins CI successfully triggered: [centos-gpu, unix-gpu]
@szha Please help review.
@Adnios merged. Thank you for the contribution!
template <typename DType>
__device__ inline DType mish(const DType val) {
  if (type_util::has_double_or_integral<DType>::value) {
    return val * ::tanh(::log(1 + ::exp(val)));
One thing that could be improved here (I did not notice this PR earlier, sorry for the late feedback) is the numerical stability of the softrelu part: see the implementation of softrelu, which switches to softrelu(x) = x for large values of x to avoid overflow. @Adnios could you open another PR changing e.g. this function to
return val * op::tanh(op::softrelu(val));
(the double vs float is handled in op::tanh and op::softrelu anyway so this one will also be simpler as a result) and similarly backward?
Yes, agreed. Softplus implementations usually apply a threshold of 20, above which they return x directly.
Yes. Thanks for your advice.
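To make the suggestion in this thread concrete, here is a minimal host-side C++ sketch of a Mish forward and backward pass built on a thresholded softrelu. It is not the MXNet code: `softrelu`, `mish`, `mish_grad` and `check_mish` are hypothetical stand-ins for illustration, and the cutoff of 20 is taken from the comment above rather than from the actual `op::softrelu` source.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the softrelu helper discussed in the review.
// Assumption: a cutoff of 20, above which softrelu(x) is treated as x so that
// exp(x) is never evaluated for large inputs.
template <typename T>
inline T softrelu(T x) {
  const T kThreshold = T(20);
  return x > kThreshold ? x : std::log1p(std::exp(x));
}

// Forward pass: mish(x) = x * tanh(softrelu(x)).
template <typename T>
inline T mish(T x) {
  return x * std::tanh(softrelu(x));
}

// Backward pass: d/dx [x * tanh(sp(x))]
//   = tanh(sp(x)) + x * (1 - tanh(sp(x))^2) * sigmoid(x),
// using sp'(x) = sigmoid(x).
template <typename T>
inline T mish_grad(T x) {
  const T t = std::tanh(softrelu(x));
  const T sig = T(1) / (T(1) + std::exp(-x));
  return t + x * (T(1) - t * t) * sig;
}

// Small sanity check showing intended usage.
inline void check_mish() {
  // At zero the output is zero and the slope is tanh(ln 2) = 0.6.
  assert(mish(0.0) == 0.0);
  assert(std::fabs(mish_grad(0.0) - 0.6) < 1e-12);
  // For large inputs the function behaves like the identity.
  assert(mish(100.0f) == 100.0f);
  assert(std::isfinite(mish_grad(1000.0f)));
}
```

Routing both passes through one guarded helper keeps the large-input behaviour in a single place, which seems to be the simplification the reviewer is after. A device version would presumably add the `__device__` qualifier and call the existing `op::` helpers instead.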
Description
Add Mish Activation Function.
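For reference, the function and its derivative can be written as follows. This is the standard Mish definition and matches the expression `val * tanh(log(1 + exp(val)))` in the diff; the derivative is worked out here for convenience and is not quoted from the PR.

```latex
\operatorname{mish}(x) = x \,\tanh\bigl(\operatorname{softplus}(x)\bigr),
\qquad
\operatorname{softplus}(x) = \ln\bigl(1 + e^{x}\bigr)

\frac{d}{dx}\operatorname{mish}(x)
  = \tanh\bigl(\operatorname{softplus}(x)\bigr)
  + x \,\Bigl(1 - \tanh^{2}\bigl(\operatorname{softplus}(x)\bigr)\Bigr)\,\sigma(x),
\qquad
\sigma(x) = \frac{1}{1 + e^{-x}}
```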
Related issue: #16841
The earlier PR (#17696) seems to be dead.