mapfilter

March 3, 2009 at 5:06 pm (erlang, programming) (, , , )

Have you ever mapped a list, then filtered it? Or filtered first, then mapped? Why not do it all in one pass with mapfilter?

mapfilter?

mapfilter is a function that combines the traditional map & filter of functional programming by using the following logic:

  1. if your function returns false, then the element is discarded
  2. any other return value is mapped into the list

Why?

Doing a map and then a filter is O(2N), whereas mapfilter is O(N). That’s twice as a fast! If you are dealing with large lists, this can be a huge time saver. And for the case where a large list contains small IDs for looking up a larger data structure, then using mapfilter can result in half the number of database lookups.

Obviously, mapfilter won’t work if you want to produce a list of boolean values, as it would filter out all the false values. But why would you want to map to a list booleans?

Erlang Code

Here’s some erlang code I’ve been using for a while:

mapfilter(F, List) -> lists:reverse(mapfilter(F, List, [])).

mapfilter(_, [], Results) ->
        Results;
mapfilter(F, [Item | Rest], Results) ->
        case F(Item) of
                false -> mapfilter(F, Rest, Results);
                Term -> mapfilter(F, Rest, [Term | Results])
        end.

Has anyone else done this for themselves? Does mapfilter exist in any programming language? If so, please leave a comment. I think mapfilter is a very simple & useful concept that should be a included in the standard library of every (functional) programming language. Erlang already has mapfoldl (map-reduce in one pass), so why not also have mapfilter?

5 Comments

  1. Gleb Peregud said,

    Have you looked at lists:mapflat/2? mapfilter/2 is equivalent to mapflat/2 with Fun returning [] or [X]. Though the former could be a little bit faster

  2. Jacob said,

    I think you mean flatmap/2, but no, I wasn’t aware of it. I’d think mapfilter/2 would still be faster, but perhaps I’ll run some benchmarks to be sure. Thanks for pointing that out.

  3. willdampier said,

    In python I’ve found generators to be amazingly useful. I do something like this:
    def mapfilter(KEY_FUN, LIST):
    for this_item in LIST:
    res = KEY_FUN(this_item)
    if res:
    yield res
    This also removes any instances where KEY_FUN returns None as well. This also works well if LIST is another generator since it only evaluates the function as generator.next() is called. It really helps if my intermediate objects start causing memory errors.

  4. lamvak said,

    just mind that the better way to lay out the matches would be smth like:
    mapfilter(F, [Item | Rest], Results) ->
    case F(Item) of
    {ok, Term} -> mapfilter(F, Rest, [Term | Results]);
    skip -> mapfilter(F, Rest, Results)
    end.
    the skip is actually unimportant; could be _ if the “true” clause is beforehand;
    but the {ok, Term} is essential, because this way one could return both {ok, false} and even {ok, skip} meaning: this is the actual result and it’s false or skip – respectively
    now one should only respect this in funs and we don’t loose the flexibility

  5. Le Viet Bach said,

    lists:reverse adds another pass through the whole list.
    Still, your code creates less garbage.
    Usually when I need mapfilter, I just use list comprehension.

Leave a comment