Cast: should get the round result for decimal to a decimal with smaller scale #3139

Merged
tustvold merged 2 commits into apache:master from liukun4515:decimal_round_#3137 on Nov 25, 2022

Conversation

@liukun4515

Contributor

Which issue does this PR close?

Closes #3137

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

@liukun4515 liukun4515 requested a review from viirya November 19, 2022 07:58
@github-actions github-actions Bot added the arrow Changes to the arrow crate label Nov 19, 2022
@liukun4515 liukun4515 requested a review from alamb November 19, 2022 07:59
@liukun4515

liukun4515 commented Nov 19, 2022

Contributor Author

For now this only implements the decimal128-to-decimal128 case.
If the approach looks good to everyone, I will fill in the other cases and add more test cases.

cc @viirya @tustvold

@tustvold tustvold left a comment

Contributor

I think this should consistently use wrapping or checked add, neg, div, rem, etc... This not only is consistent with other kernels, but avoids differences between release and debug builds
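
To make the debug/release point concrete, here is a minimal sketch on plain `i128` (the helper names are hypothetical and are not part of arrow-rs). With default build profiles, a bare `-v` on `i128::MIN` panics in a debug build and silently wraps in a release build, whereas the explicit `wrapping_*` and `checked_*` methods behave the same in both.

```rust
/// Negate with explicit wrapping semantics: the result is the same in
/// debug and release builds, even for `i128::MIN`.
fn negate_wrapping(v: i128) -> i128 {
    v.wrapping_neg()
}

/// Negate with explicit checked semantics: overflow is reported as `None`
/// instead of panicking or wrapping.
fn negate_checked(v: i128) -> Option<i128> {
    v.checked_neg()
}

/// Add one, reporting overflow as `None`.
fn increment_checked(v: i128) -> Option<i128> {
    v.checked_add(1)
}
```

Which family to use is a design choice: `wrapping_*` suits places where overflow has already been ruled out, while `checked_*` suits places where overflow should surface as an error.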

Comment thread arrow-cast/src/cast.rs Outdated
Comment on lines 1968 to 1969

Contributor

Suggested change
- let d = v / div;
- let r = v % div;
+ let d = v.wrapping_div(div);
+ let r = v.wrapping_rem(div);

Comment thread arrow-cast/src/cast.rs Outdated

Contributor

Suggested change
- d + 1
+ d.wrapping_add(1)

Comment thread arrow-cast/src/cast.rs Outdated

Contributor

Suggested change
- d - 1
+ d.wrapping_sub(1)

Comment thread arrow-cast/src/cast.rs Outdated

Contributor

Suggested change
- let neg_half = half.neg();
+ let neg_half = half.wrapping_neg();

As we've divided by 2 this can't overflow
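
Putting the suggested fragments together, the rounding step might look roughly like the sketch below. This is a reconstruction from the review comments, not the code in `arrow-cast/src/cast.rs`: the function name, the `scale_diff` parameter, and the round-half-away-from-zero rule are assumptions, and it uses plain `i128` rather than Arrow's decimal types.

```rust
/// Hypothetical helper (not the arrow-rs function): divide `v` by
/// `10^scale_diff` and round half away from zero, using only explicit
/// wrapping operations so debug and release builds agree.
///
/// Assumes 1 <= scale_diff <= 38 so that `div` fits in an i128 and
/// `half` is non-zero.
fn round_to_smaller_scale(v: i128, scale_diff: u32) -> i128 {
    let div = 10_i128.pow(scale_diff);
    // `half` is at most div / 2, so negating it cannot overflow.
    let half = div / 2;
    let neg_half = half.wrapping_neg();

    // Truncating quotient and remainder; the remainder has the sign of `v`.
    let d = v.wrapping_div(div);
    let r = v.wrapping_rem(div);

    if v >= 0 && r >= half {
        // Positive value at or past the midpoint: round up.
        d.wrapping_add(1)
    } else if v < 0 && r <= neg_half {
        // Negative value at or past the midpoint: round down (away from zero).
        d.wrapping_sub(1)
    } else {
        d
    }
}
```

For example, under these assumptions an unscaled value of 12350 reduced by two digits of scale rounds to 124, while 12345 rounds to 123.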

Comment thread arrow-cast/src/cast.rs Outdated
Comment on lines 2010 to 2011

Contributor

Suggested change
- // TODO: it's better to implement the neg
- let neg_half = half * i256::from_i128(-1);
+ let neg_half = half.wrapping_neg();

@liukun4515

liukun4515 commented Nov 22, 2022

Contributor Author

> I think this should consistently use wrapping or checked add, neg, div, rem, etc... This not only is consistent with other kernels, but avoids differences between release and debug builds

The changes I have made will not overflow.
Still, it is good to keep the behaviour consistent between debug and release builds.

@liukun4515 liukun4515 requested a review from tustvold November 23, 2022 13:38
@tustvold

Contributor

Do you intend to switch to explicitly using wrapping / checked operations to ensure consistent behaviour across debug and release, and to be consistent with the other kernels?

@liukun4515

Contributor Author

> Do you intend to switch to explicitly using wrapping / checked operations to ensure consistent behaviour across debug and release, and to be consistent with the other kernels?

@tustvold

Sorry for the late reply, I forgot to push the changes.

@tustvold tustvold left a comment

Contributor

I think there is a logical conflict with one of the tests for negative scales

@tustvold tustvold merged commit 187bf61 into apache:master Nov 25, 2022
@ursabot

ursabot commented Nov 25, 2022


Benchmark runs are scheduled for baseline = 2c86895 and contender = 187bf61. 187bf61 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@liukun4515

Contributor Author

> I think there is a logical conflict with one of the tests for negative scales

Hi @tustvold, can you give an example that shows the conflict?

From #3152, I learned that negative scale is supported in Arrow.
Before that, I did not know how negative scale was used.

@liukun4515 liukun4515 deleted the decimal_round_#3137 branch November 26, 2022 00:52
@liukun4515

liukun4515 commented Nov 26, 2022

Contributor Author

Maybe I understand what you meant from this commit: 2abbf89

But I need time to check how other systems behave when casting with a negative scale.

@liukun4515

Contributor Author

Take a decimal(10,-1) whose 128-bit integer is 123, so the string form of the value is 1230. If we cast it to decimal(10,-2), what should the 128-bit integer of the result be? @tustvold @viirya

@tustvold

Contributor

123

@liukun4515

Contributor Author

> 123

I am confused by this. If the data type is decimal(10,-2) and the 128-bit integer is 123, it represents the value 12300, so the value has changed after casting.

I think the 128-bit integer should be 12 after casting to decimal(10,-2).

From the doc: https://arrow.apache.org/docs/python/generated/pyarrow.decimal128.html#pyarrow-decimal128

decimal128(5, -3) can exactly represent the number 12345000 (encoded internally as the 128-bit integer 12345), but neither 123450000 nor 1234500.
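
The quoted rule can be checked with a small sketch. The function below is hypothetical (it is not a pyarrow or arrow-rs API) and only handles scales of zero or below: it returns the unscaled integer that would encode `value`, or `None` when the value is not a multiple of `10^(-scale)` or needs more than `precision` digits.

```rust
/// Hypothetical helper: find the unscaled integer that encodes `value`
/// in a decimal with the given precision and a non-positive scale.
/// For scale <= 0 the represented value is `unscaled * 10^(-scale)`.
fn unscaled_for(value: i128, precision: u32, scale: i8) -> Option<i128> {
    if scale > 0 {
        // Positive scales are out of scope for this illustration.
        return None;
    }
    let factor = 10_i128.checked_pow(u32::from(scale.unsigned_abs()))?;
    if value % factor != 0 {
        // Not a multiple of 10^(-scale), so not exactly representable.
        return None;
    }
    let unscaled = value / factor;
    // The unscaled integer must have at most `precision` digits.
    let limit = 10_i128.checked_pow(precision)?;
    if unscaled > -limit && unscaled < limit {
        Some(unscaled)
    } else {
        None
    }
}
```

With precision 5 and scale -3 this yields 12345 for the value 12345000, and `None` for both 123450000 (six digits needed) and 1234500 (not a multiple of 1000), which matches the three cases in the quote.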

@tustvold

Contributor

Apologies, I misread your example. If the integer value was 1230, casting would yield an integer value of 123, with the same string value. Casting an integer value of 123 with a corresponding string value of 1230 I would expect to result in an error, although #3203 would suggest something isn't quite right here yet.
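
The two positions in this exchange can be compared side by side with a sketch on plain `i128`. Both functions are hypothetical and do not describe what arrow-rs does; they only cover casts where the target scale is not larger than the source scale. `rescale_exact` follows the "error if the value would change" expectation, and `rescale_rounded` follows the "round to the nearest representable value" expectation.

```rust
/// Hypothetical: rescale and fail if the value is not exactly representable.
fn rescale_exact(unscaled: i128, from_scale: i8, to_scale: i8) -> Result<i128, String> {
    let diff = i32::from(from_scale) - i32::from(to_scale);
    if diff < 0 {
        return Err("only casts to a smaller or equal scale are sketched".to_string());
    }
    let div = 10_i128
        .checked_pow(diff as u32)
        .ok_or_else(|| "scale difference too large".to_string())?;
    if unscaled % div != 0 {
        return Err(format!("{unscaled} is not a multiple of {div}"));
    }
    Ok(unscaled / div)
}

/// Hypothetical: rescale and round half away from zero instead of failing.
fn rescale_rounded(unscaled: i128, from_scale: i8, to_scale: i8) -> Option<i128> {
    let diff = i32::from(from_scale) - i32::from(to_scale);
    if diff < 0 {
        return None;
    }
    if diff == 0 {
        return Some(unscaled);
    }
    let div = 10_i128.checked_pow(diff as u32)?;
    let half = div / 2;
    let d = unscaled.wrapping_div(div);
    let r = unscaled.wrapping_rem(div);
    if unscaled >= 0 && r >= half {
        Some(d.wrapping_add(1))
    } else if unscaled < 0 && r <= half.wrapping_neg() {
        Some(d.wrapping_sub(1))
    } else {
        Some(d)
    }
}
```

Casting decimal(10,-1) to decimal(10,-2), an unscaled 1230 becomes 123 under both functions. An unscaled 123 is rejected by `rescale_exact` and becomes 12 under `rescale_rounded`.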

Development

Successfully merging this pull request may close these issues.

Should be the rounding vs truncation when cast decimal to smaller scale