[Python.NET] Efficient copy of .NET Array to ctypes or numpy array.

Discussion:

Dave Cook

2014-05-21 07:21:01 UTC

I need to copy a .NET Array (e.g. Double[] or Byte[]) to a numpy array, but
it seems the only way to do so is element by element, which is very slow.
Since we are copying a lot of data in real time, it creates a real
bottleneck.

Alternatively, efficient conversion of the .NET array to a Python style
byte string would allow numpy.fromstring() to be used for creating the
numpy array.

(I see a similar question went unanswered on the list in August 2011, but I
was hoping someone may have figured it out by now.)

Thanks,
Dave Cook
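
As a rough illustration of why a byte string would help, here is a minimal sketch using only the standard library. It makes two assumptions for the sake of a runnable example: `array.array` stands in for the numpy array, and `struct.pack` stands in for whatever would produce the raw bytes on the .NET side. Neither is part of Python.NET. The point is only that reinterpreting raw bytes is a single bulk call rather than a per-element loop.

```python
import array
import struct

# Stand-in for the raw bytes of a Double[]: native-endian 64-bit floats.
values = [0.5, 1.25, -2.0, 3.75]
raw = struct.pack("=%dd" % len(values), *values)

# Bulk reinterpretation: one call, no per-element Python loop.
bulk = array.array("d")
bulk.frombytes(raw)

# Element-by-element equivalent, shown only for comparison.
slow = array.array(
    "d",
    (struct.unpack_from("=d", raw, i * 8)[0] for i in range(len(values))),
)
```

In current numpy the analogous bulk call on such a byte string would be `np.frombuffer(raw, dtype=np.float64)`.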

Brad Friedman

2014-05-21 14:32:23 UTC

An aside that may be useful:

.NET will skip array bounds checking within simple for-loops as an optimization, but only if the binaries are built with optimizations turned on. A debug build has them turned off. There is a huge speedup for iterating over an array when these optimizations are used, so make sure you are not looking at a compiler optimization configuration problem.

_________________________________________________
Python.NET mailing list
https://mail.python.org/mailman/listinfo/pythondotnet

Jeffrey Bush

2014-05-21 17:58:42 UTC

You could write a .NET function to do this with fixed pointers and "memcpy"
from the .NET array to the numpy data (the raw data). This would be the
absolute fastest way, but does involve a number of assumptions (for
example that the data in the two arrays are laid out in the same way). If
you want I could probably write something up real quick.

Jeff

Jeffrey Bush

2014-05-21 19:24:21 UTC

I was tempted to code it up, and it turns out you can do it in pure Python.

I thought of four ways to copy the data: a for loop (like you did),
numpy.fromiter, numpy.fromstring, and Marshal.Copy.

Obviously the for loop is the slowest. numpy.fromiter is still slow, but
~2.5x faster than the for loop (it still incurs all the .NET array checks,
since the iteration happens in Python and the indexers cannot be optimized
away). The last two do direct memory copies (fromstring uses the memory
pointer of the .NET array, while Marshal.Copy uses the memory pointer of
the NumPy array). Both are MUCH faster than a for loop, especially for
larger arrays (>200x faster). numpy.fromstring is faster for smaller
arrays, but by the time I got to 10,000,000 doubles it took twice as long
as Marshal.Copy.

Here is the code:

import clr
from System import Array, Double, IntPtr, Random
import numpy as np
import time

def check_arrays(a, b):
    if len(a) != len(b): print("Arrays are different size!")
    if any(A != B for A,B in zip(a, b)): print("Arrays have different values!")

print("Creating source...")
r = Random()
src = Array.CreateInstance(Double, 10000000)
for i in xrange(len(src)): src[i] = r.NextDouble()

print('Copy using for loop'),
start = time.clock()
dest = np.empty(len(src))
for i in xrange(len(src)): dest[i] = src[i]
end = time.clock()
print('in %f sec' % (end-start))
check_arrays(src, dest)

print('Copy using fromiter'),
start = time.clock()
dest = np.fromiter(src, float)
end = time.clock()
print('in %f sec' % (end-start))
check_arrays(src, dest)

print('Copy using fromstring'),
from ctypes import string_at
from System.Runtime.InteropServices import GCHandle, GCHandleType
start = time.clock()
src_hndl = GCHandle.Alloc(src, GCHandleType.Pinned)
try:
    src_ptr = src_hndl.AddrOfPinnedObject().ToInt32()
    dest = np.fromstring(string_at(src_ptr, len(src)*8)) # note: 8 is size of double...
finally:
    if src_hndl.IsAllocated: src_hndl.Free()
end = time.clock()
print('in %f sec' % (end-start))
check_arrays(src, dest)

print('Copy using Marshal.Copy'),
from System.Runtime.InteropServices import Marshal
start = time.clock()
dest = np.empty(len(src))
Marshal.Copy(src, 0, IntPtr.__overloads__[int](dest.__array_interface__['data'][0]), len(src))
end = time.clock()
print('in %f sec' % (end-start))
check_arrays(src, dest)

Jeff

Dave Cook

2014-05-23 10:34:00 UTC

(Sorry for screwing up the thread; I messed up my list subscription.
This is a response to
https://mail.python.org/pipermail/pythondotnet/2014-May/001525.html )

Thanks, Jeffrey, that's awesome. Since the pointer can be directly
accessed, np.frombuffer() can be used to avoid a copy.

src_hndl = GCHandle.Alloc(src, GCHandleType.Pinned)
try:
    src_ptr = src_hndl.AddrOfPinnedObject().ToInt32()
    bufType = ctypes.c_double*len(src)
    cbuf = bufType.from_address(src_ptr)
    dest = np.frombuffer(cbuf, dtype=cbuf._type_)
finally:
    if src_hndl.IsAllocated: src_hndl.Free()
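
To make the difference between copying from a pointer and viewing through it concrete, here is a minimal sketch that uses only ctypes. A ctypes array owned by Python stands in for the pinned .NET memory; that is an assumption made so the example runs without Python.NET, and the names `source`, `snapshot`, and `view` are illustrative only.

```python
import ctypes

# Stand-in for pinned .NET memory: a ctypes array we own (an assumption
# made so the sketch runs without Python.NET).
source = (ctypes.c_double * 3)(1.0, 2.0, 3.0)
address = ctypes.addressof(source)
nbytes = ctypes.sizeof(source)

# Copy: string_at() reads nbytes from the address into a new bytes object.
snapshot = ctypes.string_at(address, nbytes)

# View: from_address() wraps the same memory without copying it.
view = (ctypes.c_double * 3).from_address(address)

# A later write to the source is visible through the view, not the copy.
source[0] = 99.0
copied = (ctypes.c_double * 3).from_buffer_copy(snapshot)
```

Here `view[0]` follows the write to `source`, while `copied[0]` keeps the value that was present when `string_at()` ran.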

Dave Cook

Jeffrey Bush

2014-05-23 16:27:25 UTC

The problem with your current code is that as soon as you call
src_hndl.Free() the pointer is not necessarily valid any more! .NET is
allowed to move the memory contents of objects at will unless they are
pinned (which is what the first line does). Once the GCHandle is freed,
the array is no longer pinned and is liable to be moved or freed as the
garbage collector compacts memory. Also, in 64-bit Python, you will need
to call ToInt64() instead of ToInt32(). In fact, it may always be
reasonable to call ToInt64(), since Python's integer type will likely
handle the value whether it is 32-bit or 64-bit.

So all in all:

- Use ToInt64() instead of ToInt32(), at least in 64-bit Python (e.g.
x.ToInt64() if ctypes.sizeof(ctypes.c_void_p) == 8 else x.ToInt32())
- Do not call GCHandle.Free() until you are completely done with the
memory pointer, but make sure you definitely call it after you are done
with the memory pointer because otherwise the garbage collector can never
free or move the memory that is pinned (resulting in memory leaks or
fragmentation)
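
The two rules above can be sketched roughly as follows. This is not Python.NET code: `FakePinnedHandle` and `pinned()` are hypothetical stand-ins invented for illustration (a ctypes buffer plays the role of the pinned array) so that the pattern can run anywhere. The shape is what matters: check the pointer width once, copy while the handle is pinned, and free it in a `finally` so it is always released.

```python
import ctypes
from contextlib import contextmanager

def pointer_bits():
    """Return 32 or 64 depending on the running Python's pointer size."""
    return ctypes.sizeof(ctypes.c_void_p) * 8

class FakePinnedHandle(object):
    """Hypothetical stand-in for a pinned GCHandle; not a Python.NET class."""
    def __init__(self, values):
        self._buf = (ctypes.c_double * len(values))(*values)
        self.is_allocated = True

    def address(self):
        return ctypes.addressof(self._buf)

    def free(self):
        self.is_allocated = False

@contextmanager
def pinned(values):
    """Yield a handle and always free it afterwards, even on error."""
    handle = FakePinnedHandle(values)
    try:
        yield handle
    finally:
        if handle.is_allocated:
            handle.free()

values = [1.5, 2.5, 3.5]
with pinned(values) as h:
    # Copy while the handle is still pinned; only the copy leaves the block.
    raw = ctypes.string_at(h.address(), len(values) * 8)

result = list((ctypes.c_double * len(values)).from_buffer_copy(raw))
```

With the real API, the body of the `with` block is where AddrOfPinnedObject() would be read (via ToInt64() when `pointer_bits()` is 64), and nothing that still points into the pinned memory should escape the block.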

Jeff
