9

By rotating 2 lists either from left to right, Find the smallest possible sum of the absolute value of the differences between each corresponding item in the two lists given they're the same length.

Rotation Sample:

List [0, 1, 2, 3, 4, 5] rotated to the left = [1, 2, 3, 4, 5, 0]

List [0, 1, 2, 3, 4, 5] rotated to the right= [5, 0, 1, 2, 3, 4]

Sum of Absolute Difference:

List 1 = [1, 2, 3, 4]
List 2 = [5, 6, 7, 8]

Sum of Abs. Diff. = |1-5| + |2-6| + |3-7| + |4-8| = 16

Once again, for any arbitrary length of list and integer values, task is to look for the least possible sum by simply rotating to the left/right of either or both lists.

I had no problem with the rotation and acquiring of the minimum sum of absolute difference. I just want to know the smarter approach since my algorithm checks for every possible combination which is quite slow.

Here is my bruteforce approach:

list1 = [45, 21, 64, 33, 49]
list2 = [90, 12, 77, 52, 28]
choices = []                # Put all possible sums into a list to find the minimum value.
for j in range(len(list1)):  # List1 does a full rotation
    total = 0
    for k in range(len(list1)):
        total += abs(list1[k] - list2[k])
    list1.append(list1.pop(0))
    choices.append(total)
print(min(choices))

What's a smarter approach? I would appreciate a shorter code and time complexity as well.

I managed to make it faster by applying generators. Credits to @kuriboh for the idea! But since I'm still new in generator implementation, just want to know if this is the best way of implementing it to reduce my time complexity especially for my loop. Can we still go faster than this configuration?

list1 = [45, 21, 64, 33, 49]
list2 = [90, 12, 77, 52, 28]
choices = []
l = len(list1)
for j in range(l):
    total = sum([abs(int(list1[k])-int(list2[k])) for k in range(l)])
    list1.append(list1.pop(0))
    choices.append(total)
print(min(choices))
11
  • Hint: use the differences bewteen elements within a list. For instance, your list1 above has diffs of [-24, 43, -31, 16, -4]. Look for "fit" patterns with the list2 diffs. Note that the improvement works only if you have very large lists; otherwise, the overhead of pattern matching eats up the gains over your O(N^2) algorithm. Commented Oct 20, 2020 at 0:39
  • 1
    @Prune I'm not quite sure if I completely understood your comment, why are we getting the differences of adjacent integers in list1? Commented Oct 20, 2020 at 0:53
  • 1
    Did you create this problem yourself, or why can't you tell us where it's from? Commented Oct 20, 2020 at 0:55
  • 4
    @Prune I don't see how to take advantage of that, would like to see it :-) Commented Oct 20, 2020 at 1:20
  • 1
    @muw My list2 is just a permutation of your list. When using your list2, the best rotation is [52, 28, 90, 12, 77], which gives a diff list of [-24, 62, -78, 65, -25]. That diff list makes Prune's idea seem promising. My goal was to permute list2 in a way that creates a more challenging test case. Commented Oct 20, 2020 at 2:05

4 Answers 4

3

Since Python treats negative indexes as counting from the right end, you could sum the absolute value of list1 minus (list2 shifted by k) where 0 ≤ k < len(list1) as

sum(abs(list1[i] - list2[i - k]) for i in range(len(list1)))

If you want the minimum of all of these values

length = len(list1)
min(sum(abs(list1[i] - list2[i - k]) for i in range(length))
    for k in range(length))

This code is still O(n^2), but there is a lot less pushing and popping going on.

I really can't think of any way to make the algorithm faster than O(n^2).

Sign up to request clarification or add additional context in comments.

18 Comments

Minus is done with - :-P. And I'm pretty sure those index subtractions cost much more time than the pop+append.
@superbrain the index operations definitely do not take more time than the pop and append. The pop(0) call is O(n), as it is removing the first element of the list.
@EvilTak They definitely do. Overall, pop is done only n times, while the index operations are done n^2 times. So they're doing the same "amount of work", but the pops do so at a far faster lower level.
@superbrain normally I would disagree with you, but given that this is Python, there is a chance that you are right. Still find it hard to believe that the overhead incurred by n extra subtractions outweighs the cost of the memcpy for pop. I'd definitely recommend that OP profile both solutions and pick which one works best for them. Regardless, this answer does do away with the redundant choices list, appending to which at every iteration would incur quite a bit of overhead.
@user3386109 I was hoping you saw that! The edit summary abuse was totally worth it. You wouldn't believe my euphoria at being able to reply to you via another potential edit after seeing your edit summary...
|
3

An optimized mix of your original and Frank's accepted answer:

min(list1.append(list1.pop(0)) or
    sum(abs(x - y) for x, y in zip(list1, list2))
    for _ in list1)

Bit dirty to have the rotation in there like that, but hey, you're asking for "Fastest" :-)

Benchmark with lists of length 1000:

    original     Frank_Yellin   superb_rain  
     127 ms         164 ms         125 ms    
     140 ms         170 ms         117 ms    
     134 ms         166 ms         116 ms    
     124 ms         161 ms         126 ms    
     135 ms         164 ms         126 ms    

Benchmark code:

from timeit import repeat
from random import shuffle

def original(list1, list2):
    choices = []                # Put all possible sums into a list to find the minimum value.
    for j in range(len(list1)):  # List1 does a full rotation
        total = 0
        for k in range(len(list1)):
            total += abs(list1[k] - list2[k])
        list1.append(list1.pop(0))
        choices.append(total)
    return min(choices)

def Frank_Yellin(list1, list2):
    length = len(list1)
    return min(sum(abs(list1[i] - list2[i - k]) for i in range(length))
    for k in range(length))

def superb_rain(list1, list2):
    return min(list1.append(list1.pop(0)) or
               sum(abs(x - y) for x, y in zip(list1, list2))
               for _ in list1)

funcs = [
    (10, original),
    (10, Frank_Yellin),
    (10, superb_rain),
    ]

list1 = list(range(1000))
list2 = list1.copy()
shuffle(list2)

for _, f in funcs:
    print(f(list1, list2))

for _, f in funcs:
    print(f.__name__.center(15), end='')
print()

for _ in range(5):
    for number, f in funcs:
        t = min(repeat(lambda: f(list1, list2), number=number)) / number
        print('%8d ms    ' % (t * 1e3), end='')
    print()

4 Comments

Love this. I thought my code was optimized. I kinda missed the fact that I'm typecasting the lists into integer EVERY iteration. What you did is better. Well done!
@muw Ha, I didn't even realize you did that. What was the point of that? Btw, I think David's answer is the most promising and only interesting one.
I honestly don't understand his process yet, but if you think his idea is better than yours, then I'll pick his answer.
@muw I don't understand his solution, either, but the idea is good and the tests convince me and benchmarks show it's very fast (I'll post another answer about that).
2

I haven't cracked the full problem, but in the special case where the input values are all 0 or 1 (or any two different values, or any of O(1) different values, but we'll need another idea to get much further than that), we can get an O(n log n)-time algorithm by applying fast convolution.

The idea is to compute all of the sums of absolute differences as List1 * reverse(1 - List2) + (1 - List1) * reverse(List2) where 1 - List means doing that operation point-wise and * denotes circular convolution (computable in time O(n log n) using a pair of FFTs). The definition of circular convolution here is

             n-1
             __
             \
(f * g)(i) = /_  f(j) g((i - j) mod n).
             j=0

Substituting List1 for f and reverse(1 - List2) for g, we get

                                  n-1
                                  __
                                  \
(List1 * reverse(1 - List2))(i) = /_ List1(j) (1 - List2((n-1-(i-j)) mod n))
                                  j=0

                                  n-1
                                  __
                                  \
                                = /_ List1(j) (1 - List2((j-(i+1)) mod n)).
                                  j=0

The product List1(j) (1 - List2((j-(i+1)) mod n)) is 1 if and only if List1(j) = 1 and List2((j-(i+1)) mod n) = 0, and 0 otherwise. Thus the i value of the convolution counts the number of places where List1 has a 1 offset i+1 circularly to the left of where List2 has a 0. The other convolution counts 0s corresponding to 1s. Given our input restrictions, this is the sum of absolute differences.

Code:

import numpy


def convolve_circularly(a1, a2):
    return numpy.round(numpy.abs(numpy.fft.ifft(numpy.fft.fft(a1) * numpy.fft.fft(a2))))


def min_sum_abs_diff(a1, a2):
    a1 = numpy.array(a1)
    a2 = numpy.array(a2)[::-1]
    return numpy.min(convolve_circularly(a1, 1 - a2) + convolve_circularly(1 - a1, a2))


def slow_min_sum_abs_diff(a1, a2):
    return min(
        sum(abs(a1[i] - a2[i - k]) for i in range(len(a1))) for k in range(len(a2))
    )


def main():
    n = 100
    for r in range(100000):
        a1 = numpy.random.randint(2, size=n)
        a2 = numpy.random.randint(2, size=n)
        r = min_sum_abs_diff(a1, a2)
        slow_r = slow_min_sum_abs_diff(a1, a2)
        if r != slow_r:
            print(a1, a2, r, slow_r)
            break


if __name__ == "__main__":
    main()

3 Comments

Good idea to try a simpler problem. Sadly I don't understand the math. Would the Python implementation be simple? Not sure what to do with your equations. Would be nice to see it, and then it could be tested.
@superbrain I can try to hack one together using NumPy in a while.
Great, thanks. Looks cool and is indeed very fast (I did some benchmarks in a new answer).
2

Benchmarks with David Eisenstat's solution and two NumPy solutions from me.

500 random ints 0 or 1:

  189.62414 ms  slow_min_sum_abs_diff          # Like Frank's
   49.75403 ms  less_slow_min_sum_abs_diff     # My NumPy
   10.13092 ms  lesser_slow_min_sum_abs_diff   # My Numpy
    0.85030 ms  min_sum_abs_diff               # David's
    0.27434 ms  array_conversion               # for comparison

1000 random ints 0 or 1:

  857.02381 ms  slow_min_sum_abs_diff
  100.26820 ms  less_slow_min_sum_abs_diff
   28.55692 ms  lesser_slow_min_sum_abs_diff
    1.67077 ms  min_sum_abs_diff
    0.49301 ms  array_conversion

1000 random ints from -106 to 106 (without David's, as it's not made for that and would produce wrong results):

  829.18451 ms  slow_min_sum_abs_diff
   89.97418 ms  less_slow_min_sum_abs_diff
   22.69516 ms  lesser_slow_min_sum_abs_diff

I don't understand David's solution, but his own verification is convincing and I did some more myself, comparing each faster solution with the next-slower one (that's actually why I wrote my NumPy solutions, so I could test David's with larger inputs):

passed: less_slow_min_sum_abs_diff (220 tests with n=100)
passed: lesser_slow_min_sum_abs_diff (89 tests with n=300)
passed: min_sum_abs_diff (146 tests with n=1000)
passed: min_sum_abs_diff (5 tests with n=10000)

Benchmarks done on repl.it's Python 3.8.2 64-bit.

The code:

import numpy
from timeit import repeat, default_timer as timer


def array_conversion(a1, a2):
    a1 = numpy.array(a1)
    a2 = numpy.array(a2)


def convolve_circularly(a1, a2):
    return numpy.round(numpy.abs(numpy.fft.ifft(numpy.fft.fft(a1) * numpy.fft.fft(a2))))


def min_sum_abs_diff(a1, a2):
    a1 = numpy.array(a1)
    a2 = numpy.array(a2)[::-1]
    return numpy.min(convolve_circularly(a1, 1 - a2) + convolve_circularly(1 - a1, a2))


def slow_min_sum_abs_diff(a1, a2):
    return min(
        sum(abs(a1[i] - a2[i - k]) for i in range(len(a1))) for k in range(len(a2))
    )


def less_slow_min_sum_abs_diff(a1, a2):
    a1 = numpy.array(a1)
    a2 = numpy.array(a2)
    return min(
        numpy.abs(a1 - numpy.roll(a2, k)).sum()
        for k in range(len(a2))
    )


def lesser_slow_min_sum_abs_diff(a1, a2):
    n = len(a2)
    a1 = numpy.array(a1)
    a2 = numpy.concatenate((a2, a2))
    return min(
        numpy.abs(a1 - a2[k:k+n]).sum()
        for k in range(n)
    )


def random_arrays(n):
    a1 = numpy.random.randint(2, size=n).tolist()
    a2 = numpy.random.randint(2, size=n).tolist()
    return a1, a2


def verify(candidate, reference, n, timelimit=1):
    t0 = timer()
    count = 0
    while timer() - t0 < timelimit:
        a1_orig, a2_orig = random_arrays(n)
        a1, a2 = a1_orig.copy(), a2_orig.copy()
        expect = reference(a1, a2)
        assert a1 == a1_orig and a2 == a2_orig
        result = candidate(a1, a2)
        assert a1 == a1_orig and a2 == a2_orig
        if result != expect:
            print('wrong:')
            print('  expected', expect, 'by', reference.__name__)
            print('  got', result, 'by', candidate.__name__)
            print('  a1:', a1)
            print('  a2:', a2)
            break
        count += 1
    else:
        print('passed:', candidate.__name__, f'({count} tests with {n=})')


def main():
    if 1:
        verify(less_slow_min_sum_abs_diff, slow_min_sum_abs_diff, 100)
        verify(lesser_slow_min_sum_abs_diff, less_slow_min_sum_abs_diff, 300)
        verify(min_sum_abs_diff, lesser_slow_min_sum_abs_diff, 1000)
        verify(min_sum_abs_diff, lesser_slow_min_sum_abs_diff, 10000)
        print()

    funcs = [
        (10, slow_min_sum_abs_diff),
        (100, less_slow_min_sum_abs_diff),
        (100, lesser_slow_min_sum_abs_diff),
        (1000, min_sum_abs_diff),
        (1000, array_conversion),
    ]

    a1, a2 = random_arrays(1000)

    for _ in range(3):
        for number, func in funcs:
            t = min(repeat(lambda: func(a1, a2), number=number)) / number
            print('%11.5f ms ' % (t * 1e3), func.__name__)
        print()


if __name__ == "__main__":
    main()
    print('done')

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.