2^n Itertools combinations with advanced filtering
I know that I can use itertools to pump out combinations, and define the size of the combination group, like so:
import itertools
print list(itertools.combinations(['V','M','T','O','Q','K','D','R'], 4))
The output of this would be like a list of tuples, each of length 4 in this case.
From here, what I'd like to do is enforce 2 parameters - 1)exclude any combinations/tuples that contain certain pairs - both V and M for instance, or Q and K. 2)Force each tuple to contain only 1 instance of a letter. I believe that itertools is already doing #2.
What should remain are only those tuple groups that do not contain any of these pre-determined "false" pairs. So if I excluded a group that contained V and M, the group ('V','M','Q','D')
would be invalid, but ('V','R','Q','D')
would be valid.
What's the best way for me to do this?
you can define a validate function and filter with that
>>> import itertools
>>> def is_valid(data):
if 'V' in data and 'M' in data:
return False
if 'Q' in data and 'K' in data:
return False
return True
>>> filter(is_valid,itertools.combinations('VMTOQKDR', 4))
[('V', 'T', 'O', 'Q'), ('V', 'T', 'O', 'K'), ('V', 'T', 'O', 'D'), ('V', 'T', 'O', 'R'), ('V', 'T', 'Q', 'D'), ('V', 'T', 'Q', 'R'), ('V', 'T', 'K', 'D'), ('V', 'T', 'K', 'R'), ('V', 'T', 'D', 'R'), ('V', 'O', 'Q', 'D'), ('V', 'O', 'Q', 'R'), ('V', 'O', 'K', 'D'), ('V', 'O', 'K', 'R'), ('V', 'O', 'D', 'R'), ('V', 'Q', 'D', 'R'), ('V', 'K', 'D', 'R'), ('M', 'T', 'O', 'Q'), ('M', 'T', 'O', 'K'), ('M', 'T', 'O', 'D'), ('M', 'T', 'O', 'R'), ('M', 'T', 'Q', 'D'), ('M', 'T', 'Q', 'R'), ('M', 'T', 'K', 'D'), ('M', 'T', 'K', 'R'), ('M', 'T', 'D', 'R'), ('M', 'O', 'Q', 'D'), ('M', 'O', 'Q', 'R'), ('M', 'O', 'K', 'D'), ('M', 'O', 'K', 'R'), ('M', 'O', 'D', 'R'), ('M', 'Q', 'D', 'R'), ('M', 'K', 'D', 'R'), ('T', 'O', 'Q', 'D'), ('T', 'O', 'Q', 'R'), ('T', 'O', 'K', 'D'), ('T', 'O', 'K', 'R'), ('T', 'O', 'D', 'R'), ('T', 'Q', 'D', 'R'), ('T', 'K', 'D', 'R'), ('O', 'Q', 'D', 'R'), ('O', 'K', 'D', 'R')]
>>>
EDIT
To make it even more flexible, you can make a function that make validation functions, and mix it with the idea of @PadraicCunningham of using set
>>> import itertools
>>> def make_validator(*checks):
checker=[set(x) for x in checks]
def validator(data):
return not any( st.issubset(data) for st in checker)
return validator
>>> is_valid = make_validator("VM","QK")
>>> filter(is_valid, itertools.combinations('VMTOQKDR', 4))
[('V', 'T', 'O', 'Q'), ('V', 'T', 'O', 'K'), ('V', 'T', 'O', 'D'), ('V', 'T', 'O', 'R'), ('V', 'T', 'Q', 'D'), ('V', 'T', 'Q', 'R'), ('V', 'T', 'K', 'D'), ('V', 'T', 'K', 'R'), ('V', 'T', 'D', 'R'), ('V', 'O', 'Q', 'D'), ('V', 'O', 'Q', 'R'), ('V', 'O', 'K', 'D'), ('V', 'O', 'K', 'R'), ('V', 'O', 'D', 'R'), ('V', 'Q', 'D', 'R'), ('V', 'K', 'D', 'R'), ('M', 'T', 'O', 'Q'), ('M', 'T', 'O', 'K'), ('M', 'T', 'O', 'D'), ('M', 'T', 'O', 'R'), ('M', 'T', 'Q', 'D'), ('M', 'T', 'Q', 'R'), ('M', 'T', 'K', 'D'), ('M', 'T', 'K', 'R'), ('M', 'T', 'D', 'R'), ('M', 'O', 'Q', 'D'), ('M', 'O', 'Q', 'R'), ('M', 'O', 'K', 'D'), ('M', 'O', 'K', 'R'), ('M', 'O', 'D', 'R'), ('M', 'Q', 'D', 'R'), ('M', 'K', 'D', 'R'), ('T', 'O', 'Q', 'D'), ('T', 'O', 'Q', 'R'), ('T', 'O', 'K', 'D'), ('T', 'O', 'K', 'R'), ('T', 'O', 'D', 'R'), ('T', 'Q', 'D', 'R'), ('T', 'K', 'D', 'R'), ('O', 'Q', 'D', 'R'), ('O', 'K', 'D', 'R')]
>>> filter(make_validator("VM","QK",'MT',"DR"), itertools.combinations('VMTOQKDR', 4))
[('V', 'T', 'O', 'Q'), ('V', 'T', 'O', 'K'), ('V', 'T', 'O', 'D'), ('V', 'T', 'O', 'R'), ('V', 'T', 'Q', 'D'), ('V', 'T', 'Q', 'R'), ('V', 'T', 'K', 'D'), ('V', 'T', 'K', 'R'), ('V', 'O', 'Q', 'D'), ('V', 'O', 'Q', 'R'), ('V', 'O', 'K', 'D'), ('V', 'O', 'K', 'R'), ('M', 'O', 'Q', 'D'), ('M', 'O', 'Q', 'R'), ('M', 'O', 'K', 'D'), ('M', 'O', 'K', 'R'), ('T', 'O', 'Q', 'D'), ('T', 'O', 'Q', 'R'), ('T', 'O', 'K', 'D'), ('T', 'O', 'K', 'R')]
>>>
the make_validator
take as arguments any number of the exclusion combinations that you don't want and make a function that do the check and return it; you can keep the result in a variable or use it directly
I would filter with a set:
import itertools
c = itertools.combinations(['V','M','T','O','Q','K','D','R'], 4)
st = {"V","M"}
print([co for co in c if not st.issubset(co)])
If you want to filter on a two:
st1 = {"V","M"}
st2 = {"Q","K"}
print([co for co in c if not st1.issubset(co) and not st2.issubset(co)])
If you have more than two it would be probably nicer to use any
:
sts = [{"V","M"},{"V","R"},{"T","O"}]
print([co for co in c if not any(st.issubset(co) for st in sts)])
Bar you roll your own combination logic, you cannot avoid creating all the combinations and filtering, even if you do roll your own doing it in pure python would probably be slower bar you had a large dataset
You can use a list comprehension with an if
condition:
>>> [x for x in itertools.combinations('VMTOQKDR', 4)
if not (('V' in x and 'M' in x) or ('Q' in x and 'K' in x))]
[('V', 'T', 'O', 'Q'),
('V', 'T', 'O', 'K'),
... 37 more ...
('O', 'Q', 'D', 'R'),
('O', 'K', 'D', 'R')]
Note, however, that this will still go through all the combinations and just "throw away" the invalid ones. Eg, if the first two elements are ('V','M')
it will continue generating `('V','M','O','R') and throw it away, and so on. For the number of combinations generated in this case that is not a problem. For larger combinations, you might want to throw out invalid partial results earlier, using a custom recursive algorithm.