Distinct sub sequences summing to given number in an array

During my current preparation for interview, I encountered a question for which I am having some difficulty to get optimal solution,
We are given an array A and an integer Sum , we need to find all distinct sub sequences of A whose sum equals Sum .
For eg. A={1,2,3,5,6} Sum=6 then answer should be
{1,2,3}
{1,5}
{6}

Presently I can think of two ways of doing this,

  • Use Recursion ( which I suppose should be last thing to consider for an interview question)
  • Use Integer Partitioning to partition Sum and check whether the elements of partition are present in A
  • Please guide my thoughts.


    I agree with Jason. This solution comes to mind:
    (complexity is O(sum*|A|) if you represent the map as an array)

  • Call the input set A and the target sum sum
  • Have a map of elements B, with each element being x:y , where x (the map key) is the sum and y (the map value) is the number of ways to get to it.
  • Starting of, add 0:1 to the map - there is 1 way to get to 0 (obviously by using no elements)
  • For each element a in A, consider each element x:y in B.
  • If x+a > sum , don't do anything.
  • If an element with the key x+a already exists in B, say that element is x+a:z , modify it to x+a:y+z .
  • If an element with the key doesn't exist, simply add x+a:y to the set.
  • Look up the element with key sum , thus sum:x - x is our desired value.
  • If B is sorted (or an array), you can simply skip the rest of the elements in B during the "don't do anything" step.

    Tracing it back:

    The above just gives the count, this will modify it to give the actual subsequences.

    At each element in B, instead of the sum, store all the source elements and the elements used to get there (so have a list of pairs at each element in B).

    For 0:1 there is no source elements.

    For x+a:y , the source element is x and the element to get there is a .

    During the above process, if an element with the key already exists, enqueue the pair x/a to the element x+a (enqueue is an O(1) operation).

    If an element with the key doesn't exist, simply create a list with one pair x/a at the element x+a .

    To reconstruct, simply start at sum and recursively trace your way back.

    We have to be careful of duplicate sequences (do we?) and sequences with duplicate elements here.

    Example - not tracing it back:

    A= {1,2,3,5,6}
    sum = 6

    B = 0:1

    Consider 1
    Add 0+1
    B = 0:1, 1:1

    Consider 2
    Add 0+2:1 , 1+2:1
    B = 0:1, 1:1, 2:1, 3:1

    Consider 3
    Add 0+3:1 (already exists -> add 1 to it), 1+3:1 , 2+1:1 , 3+1:1
    B = 0:1, 1:1, 2:1, 3:2, 4:1, 5:1, 6:1

    Consider 5
    B = 0:1, 1:1, 2:1, 3:2, 4:1, 5:2, 6:2
    Generated sums thrown away = 7:1, 8:2, 9:1, 10:1, 11:1

    Consider 6
    B = 0:1, 1:1, 2:1, 3:2, 4:1, 5:2, 6:3
    Generated sums thrown away = 7:1, 8:1, 9:2, 10:1, 11:2, 12:2

    Then, from 6:3 , we know we have 3 ways to get to 6.

    Example - tracing it back:

    A= {1,2,3,5,6}
    sum = 6

    B = 0:{}

    Consider 1
    B = 0:{}, 1:{0/1}

    Consider 2
    B = 0:{}, 1:{0/1}, 2:{0/2}, 3:{1/2}

    Consider 3
    B = 0:{}, 1:{0/1}, 2:{0/2}, 3:{1/2,0/3}, 4:{1/3}, 5:{2/3}, 6:{3/3}

    Consider 5
    B = 0:{}, 1:{0/1}, 2:{0/2}, 3:{1/2,0/3}, 4:{1/3}, 5:{2/3,0/5}, 6:{3/3,1/5} Generated sums thrown away = 7, 8, 9, 10, 11

    Consider 6
    B = 0:{}, 1:{0/1}, 2:{0/2}, 3:{1/2,0/3}, 4:{1/3}, 5:{2/3,0/5}, 6:{3/3,1/5,0/6} Generated sums thrown away = 7, 8, 9, 10, 11, 12

    Then, tracing back from 6 : (not in {} means an actual element, in {} means a map entry)

    {6}
      {3}+3
        {1}+2+3
          {0}+1+2+3
          1+2+3
          Output {1,2,3}
        {0}+3+3
          3+3
          Invalid - 3 is duplicate
      {1}+5
        {0}+1+5
          1+5
          Output {1,5}
      {0}+6
        6
          Output {6}
    

    This is a variant of the subset-sum problem. The subset-sum problem asks if there is a subset that sums to given a value. You are asking for all of the subsets that sum to a given value.

    The subset-sum problem is hard (more precisely, it's NP-Complete) which means that your variant is hard too (it's not NP-Complete, because it's not a decision problem, but it is NP-Hard).

    The classic approach to the subset-sum problem is either recursion or dynamic programming. It's obvious how to modify the recursive solution to the subset-sum problem to answer your variant. I suggest that you also take a look at the dynamic programming solution to subset-sum and see if you can modify it for your variant (tbc: I do not know if this is actually possible). That would certainly be a very valuable learning exercise whether or not is possible as it would certainly enhance your understanding of dynamic programming either way.

    It would surprise me though, if the expected answer to your question is anything but the recursive solution. It's easy to come up with, and an acceptable approach to the problem. Asking for the dynamic programming solution on-the-fly is a bit much to ask.

    You did, however, neglect to mention a very naïve approach to this problem: generate all subsets, and for each subset check if it sums to the given value or not. Obviously that's exponential, but it does solve the problem.


    I assumed that given array contains distinct numbers. Let's define function f(i, s) - which means we used some numbers in range [1, i] and the sum of used numbers is s.

    Let's store all values in 2 dimensional matrix ie in cell (i, j) we will have value for f(i, j). Now if have already calculated values for cells which are located upper or lefter the cell (i, s) we can calculate value for f(i, s) ie f(i, s) = f(i - 1, s);(not to take i indexed number) and if(s >= a[i]) f(i, s) += f(i - 1, s - a[i]). And we can use bottom-up approach to fill all the matrix, setting [f(0, 0) = 1; f(0, i) = 0; 1 <= i <= s], [f(i, 0) = 1;1<=i<=n;]. If we calculated all the matrix then we have answer in cell f(n,S); Thus we have total time complexity O(n*s) and memory complexity O(n*s);

    We can improve memory complexity if we note that in every iteration we need only information from previous row, it means that we can store matrix of size 2xS not nxS. We reduced memory complexity up to linear to S. This problem is NP complete thus we don't have polynomial algorithm for this and this approach is the best thing.

    链接地址: http://www.djcxy.com/p/39984.html

    上一篇: Java循环编译错误

    下一篇: 不同的子序列在数组中求和给定的数字