Fancy slicing with lists #26
(and also support nested tasks)
Fixes #22
Also @shoyer, this brings us to the point where

    In [19]: np.array(d[[5, 3, 0]].sum(axis=0))
    Out[19]: array([ 80, 83, 86, 89, 92, 95, 98, 101, 104])

Which, I think, is likely sufficient for your common use cases.
Do you really want to explicitly restrict array indexing to lists?
Assuming numpy is a hard dep of dask (which I think it is?), I would rather cast anything that isn't an integer or a slice to an ndarray, and then allow only 1D arrays of integers. For large arrays, using lists is going to be a bottleneck.
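
A minimal sketch of that suggestion, in plain NumPy. The helper name `normalize_index_element` and its error messages are hypothetical and are not part of dask; it only illustrates "pass integers and slices through, cast everything else to a 1D integer ndarray".

```python
import numpy as np


def normalize_index_element(ind):
    """Hypothetical helper (not dask API): validate one element of an index."""
    # Integers and slices are passed through untouched.
    if isinstance(ind, (int, np.integer, slice)):
        return ind
    # Anything else (list, tuple, ndarray, ...) is cast to an ndarray.
    arr = np.asarray(ind)
    if arr.ndim != 1:
        raise IndexError("only 1D index arrays are supported")
    if arr.size == 0:
        # np.asarray([]) defaults to float64; treat it as an empty integer index.
        arr = arr.astype(np.intp)
    if not np.issubdtype(arr.dtype, np.integer):
        raise IndexError("index arrays must contain integers")
    return arr


idx = normalize_index_element([5, 3, 0])  # a list becomes a 1D integer ndarray
```

Working on an ndarray rather than a Python list keeps later steps (bounds checks, sorting, splitting across blocks) vectorized, which is the bottleneck concern raised above.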
We can do both. I was just at about my limit for complexity while I was building this and didn't want to think about other cases. Both of those sound good though.
Handling 1D boolean arrays is also pretty easy -- you can just convert them into integer arrays with `np.nonzero`.
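
For illustration, here is that conversion on a plain NumPy array (the array `x` and the mask are made up for the example; nothing here is dask-specific):

```python
import numpy as np

x = np.arange(10) * 10           # [0, 10, 20, ..., 90]
mask = (x % 20) == 0             # 1D boolean array, same length as x

# np.nonzero returns a tuple with one index array per dimension,
# so a 1D mask yields a 1-tuple that we unpack here.
(int_index,) = np.nonzero(mask)

# Indexing with the integer array selects the same elements as the mask.
same = np.array_equal(x[mask], x[int_index])
```

Once the mask is an integer array, it can go through whatever code path already handles lists of integers.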
I've handled the edge cases (I think). Merging. This doesn't yet handle multiple lists or things like slicing with arrays.
OK, so we do a dual approach to achieve fancy indexing.
Given an index, like
We first do the normal `dask_slice` solution on the array with the list replaced by an empty slice. We then follow with the final list. I suspect that we could repeat these for multiple lists and achieve MATLAB-style orthogonal indexing.
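
The two-step idea can be sketched on an in-memory NumPy array. This is only an illustration under stated assumptions: `two_step_getitem` is a hypothetical name, the example index `x[:, [1, 2, 3], 5:10]` is made up, and the real `dask_slice` builds a task graph over blocks rather than indexing an array directly.

```python
import numpy as np


def two_step_getitem(x, index):
    """Illustrative only (not dask code): slices and integers first, lists last."""
    # Step 1: replace every list with a full slice and do ordinary slicing.
    without_lists = tuple(
        slice(None) if isinstance(i, list) else i for i in index
    )
    out = x[without_lists]

    # Step 2: apply each list along its own axis, one list at a time.
    axis = 0
    for i in index:
        if isinstance(i, (int, np.integer)):
            continue  # an integer index already removed this axis in step 1
        if isinstance(i, list):
            out = np.take(out, i, axis=axis)
        axis += 1
    return out


x = np.arange(4 * 6 * 12).reshape(4, 6, 12)

# One list: matches NumPy's own result for the same index.
single = two_step_getitem(x, (slice(None), [1, 2, 3], slice(5, 10)))

# Two lists: applied one after another this gives orthogonal (outer)
# indexing, which in NumPy is spelled with np.ix_.
multi = two_step_getitem(x, ([0, 2], [1, 2, 3], slice(5, 10)))
```

Note that with several lists the result differs from NumPy's default fancy indexing, which pairs the lists element by element; the repeated-list approach selects the full cross product instead.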
It mostly works
Example
The actual dask looks like the following
Some known problems
`d[0]` fails

cc @nevermindewe @shoyer