Vectorized “in” function in Julia?

抹茶落季 2020-11-28 13:53

I often want to loop over a long array or column of a dataframe, and for each item, see if it is a member of another array. Rather than doing

giant_list =          


        
5 Answers
  •  孤街浪徒
    2020-11-28 14:38

    The indexin function does something similar to what you want:

    indexin(a, b)

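    In older Julia versions (before 1.0), it returns a vector containing the highest index in b for each value in a that is a member of b; the output vector contains 0 wherever a value of a is not a member of b. Since Julia 1.0, indexin instead returns the first matching index and uses nothing for non-members, so the .> 0 comparison below applies only to the older behavior.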

    Since you want a boolean for each element in your giant_list (instead of the index in good_letters), you can simply do:

    julia> indexin(giant_list, good_letters) .> 0
    3-element BitArray{1}:
      true
     false
     false
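
For Julia 1.1 or later, here is a minimal sketch of the same boolean mask, where indexin marks non-members with nothing rather than 0. The sample values are hypothetical, since the question's giant_list and good_letters are not shown:

```julia
# Hypothetical sample data; the original values are not given in the question.
giant_list = ["a", "q", "z"]
good_letters = ["a", "b", "c"]

# indexin gives the position in good_letters, or `nothing` for non-members.
idx = indexin(giant_list, good_letters)

# Broadcast isnothing over idx and negate it: one Bool per element of giant_list.
mask = .!isnothing.(idx)
```

With this data, mask is true only for the first element, which matches the shape of the output shown above.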
    

    The implementation of indexin is very straightforward, and points the way to how you might optimize this if you don't care about the indices in b:

    function vectorin(a, b)
        bset = Set(b)
        [i in bset for i in a]
    end
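
A short usage sketch for vectorin, with the definition repeated so the snippet runs on its own. The data is hypothetical. Building the Set once means each membership test is a hashed lookup instead of a scan of b, which is likely to matter when a is long:

```julia
# Same definition as above, with comments added.
function vectorin(a, b)
    bset = Set(b)               # build the lookup set once
    [i in bset for i in a]      # one hashed membership test per element of a
end

# Hypothetical sample data.
giant_list = ["a", "q", "z"]
good_letters = ["a", "b", "c"]

mask = vectorin(giant_list, good_letters)

# The Bool vector can be used directly for logical indexing.
kept = giant_list[mask]
```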
    

    Only a limited set of names may be used as infix operators, so vectorin itself cannot be used as an infix operator.
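
The built-in in and its operator form ∈ are among the names that can be written infix. On Julia 1.x they can also be broadcast with dot syntax. The sketch below uses hypothetical data and wraps the collection in Ref so that broadcasting treats it as a single value rather than iterating over it:

```julia
# Hypothetical sample data.
giant_list = ["a", "q", "z"]
good_letters = ["a", "b", "c"]

# Function-call form: test each element of giant_list against good_letters.
mask_call = in.(giant_list, Ref(good_letters))

# Infix form of the same broadcast.
mask_infix = giant_list .∈ Ref(good_letters)

# For a long giant_list, a Set keeps each lookup hashed.
mask_set = in.(giant_list, Ref(Set(good_letters)))
```

All three masks are equal here. The Set variant follows the same idea as vectorin.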
