A nice way to generate all subsets within python.

Disclaimer: this post has some lousy code and a serious factual error. Throughout, I am speaking about a method of generating all permutations of a set, while saying that I’m speaking about generating all subsets of a set! I have written a newer post which integrates content from a comment by Owen Jacobson, along with a short response explaining just what the heck I was thinking. This post has been left intact aside from this disclaimer for the sake of posterity!

Recently on hackernews, I saw this amazing bunch of vimfu (learned some nifty tricks from it too!), and so I went to check out Jim Dennis’ other answers. Nothing quite so popular, but some worthy reads about “what qualifies a person as a kernel developer” and “can I run a .so compiled in red hat under ubuntu?” – the one that inspired this post though, is the classic “how to generate all subsets of a set?

Jim suggests using python’s itertools module (in addition to bitmasking on the subset-representation), and I’ve never had cause to look at itertools before. Boy, how I wish I had! I quickly whipped up the following (after reading about argument packing/unpacking:

There’s probably reasons why it’s ugly, and certainly ways to do it better. What I like about this though is that it handles pretty much anything you throw at it, aside from a single non-iterable object I suppose! So calling “all_subsets(1,2,3,4)”, or “all_subsets(‘funny’)”, or “all_subsets([1,2,3,4])” or “all_subsets({‘a’:1,’b’:2})”, or even all_subsets(1) all work and spit out a list of tuples of the permutations.

A possible extension would be to convert back into the form originally provided – but I think that a standard form of return may be more useful. Regardless, this is a neat tool that I will keep in my box.

Edit: I made a modification to the above, so that it can work with any iterable or non-iterable object:

3 thoughts on “A nice way to generate all subsets within python.

  1. Three points:

    1. from imports in functions have gone away in Python 3; I’d put those at module scope, instead. The other alternative is to use import collections as c within the function and to qualify references, as with c.Iterable.

    (I know, imports within functions are an attractive idea, but the resulting semantics are very odd. The first call to a function might, or might not, alter the interpreter’s global state depending on what other imports have already run, and might, or might not, take significantly longer than usual depending on how quickly imports are resolved! Not fun, as I learned to my detriment.)

    2. This seems like a great place to use generators and the yield expression. It’d put the decision about whether or not to hold n! permutations in memory or simultaneously or not out in the caller’s context, where there’s more information to make that decision against, and it’d replace four lines of code with one: return c.permute(args).

    3. Neither function actually returns all of the subsets of the set formed by their respective argument lists; they return all permutations, instead. Compare the output of either function with the output of

    def all_subsets(*elements):
    # ensure elements is indexable and contains only unique values
    elements = list(set(elements))

    # Each bit pattern in an n-bit bitfield corresponds to a unique subset
    # of an n-element set. Iterate over all such bitfields where
    # n = len(elements).
    for permutation_bits in xrange(2**len(elements)):
    indexes = [
    i
    for i in xrange(len(elements))
    if (1 << i) & permutation_bits != 0
    ]
    yield set(elements[i] for i in indexes)

    You may want to use list(all_subsets(1, 2, 3, 4)) to unpack the resulting generator.

    (Here’s hoping WordPress didn’t mangle my Python…)

Leave a Reply

Your email address will not be published.