ClojureDocs

partition-by

clojure.core

Available since 1.2 (source)
  • (partition-by f)
  • (partition-by f coll)
Applies f to each value in coll, splitting it each time f returns a
 new value.  Returns a lazy seq of partitions.  Returns a stateful
 transducer when no collection is provided.
8 Examples
user=> (partition-by #(= 3 %) [1 2 3 4 5])
((1 2) (3) (4 5))
user=> (partition-by odd? [1 1 1 2 2 3 3])
((1 1 1) (2 2) (3 3))

user=> (partition-by even? [1 1 1 2 2 3 3])
((1 1 1) (2 2) (3 3))
;; (this is part of a solution from 4clojure.com/problem 30)
user=> (partition-by identity "Leeeeeerrroyyy")
((\L) (\e \e \e \e \e \e) (\r \r \r) (\o) (\y \y \y))
;; Note that previously created 'bins' are not used when same value is seen again
user=> (partition-by identity "ABBA")
((\A) (\B \B) (\A))

;; That is why you use the group-by function if you want all the same values in the same 'bins'.
;; It gives you a map, but you can extract the values from it if you need to.

(group-by identity "ABBA")
=> {\A [\A \A], \B [\B \B]}
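If only the bins are needed, the map's values can be pulled out with vals. A small sketch (the name bins is just for illustration); the bins come back in the map's iteration order, which should not be relied on in general:

```clojure
;; group-by returns a map from key to a vector of matching elements;
;; vals drops the keys and keeps only the bins.
(def bins (vals (group-by identity "ABBA")))
;; bins holds the two vectors [\A \A] and [\B \B]
```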
;; Arbitrary partitioning
(let [seen (atom true)]
  (partition-by #(cond
                  (#{1} %) (reset! seen (not @seen))
                  (or (and (string? %)
                           (< (count %) 2))
                      (char? %)) "letter"
                  (string? %) "string"
                  (#{0} %) 0
                  (vector? %) (count %)
                  :else "rest")
                [1 1 1 2 3 nil "a" \l 0 4 5 {:a 1} "bc" "aa" "k" [0] [1 1] [2 2]]))
;;=> ((1) (1) (1) (2 3 nil) ("a" \l) (0) (4 5 {:a 1}) ("bc" "aa") ("k") ([0]) ([1 1] [2 2]))
user=> (partition-by count ["a" "b" "ab" "ac" "c"])
;;=> (("a" "b") ("ab" "ac") ("c"))
user=> (partition-by identity [1 1 1 1 2 2 3])
;;=> ((1 1 1 1) (2 2) (3))
I think this is a better (more agnostic and more declarative/functional)
way to partition arbitrarily. The ratio of true to false signals can be
tweaked to get a longer average partition length.

(defn arbitrarily-partition [coll]
  (let [signals (take (count coll) (cycle [true false]))
        shuffled (shuffle signals)
        zipped (map vector shuffled coll)
        partitioned (partition-by first zipped)]
    (for [c partitioned]
      (map second c))))

;; One possible result; the output varies between runs because of shuffle.
user=> (arbitrarily-partition (range 100))
((0 1) (2) (3) (4 5) (6 7) (8 9 10) (11) (12) (13 14 15) (16) (17 18) (19) (20) (21) (22 23 24 25 26) (27 28) (29) (30 31) (32) (33 34 35 36 37 38) (39 40 41) (42) (43 44 45 46 47) (48 49) (50) (51 52 53) (54) (55 56) (57 58) (59 60 61) (62) (63) (64) (65 66) (67 68) (69) (70) (71) (72 73 74 75) (76) (77 78) (79) (80) (81 82) (83 84 85) (86) (87 88) (89 90 91 92 93) (94) (95 96 97) (98) (99))
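The docstring notes that calling partition-by without a collection returns a stateful transducer. A minimal sketch of that arity, assuming Clojure 1.7 or later (where transducers were introduced); the names by-parity and run-lengths and the sample inputs are made up for illustration:

```clojure
;; Transducer arity: no collection argument, applied here via into.
;; Note that the transducer emits each partition as a vector.
(def by-parity
  (into [] (partition-by odd?) [1 1 2 2 3]))
;; by-parity is [[1 1] [2 2] [3]]

;; It composes with other transducers; here each run is mapped to its length.
(def run-lengths
  (into [] (comp (partition-by identity) (map count)) "ABBA"))
;; run-lengths is [1 2 1]
```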
See Also

clojure.core/partition
Returns a lazy sequence of lists of n items each, at offsets step apart. If step is not supplied, ...

Added by boxie

clojure.core/partition-all
Returns a lazy sequence of lists like partition, but may include partitions with fewer than n item...

Added by boxie

clojure.core/group-by
Returns a map of the elements of coll keyed by the result of f on each element. The value at each ...

Added by abrooks

clojure.core/dedupe
Returns a lazy sequence removing consecutive duplicates in coll. Returns a transducer when no coll...

Added by stellingsimon

clojure.core/sort
Returns a sorted sequence of the items in coll. If no comparator is supplied, uses compare. compa...

Added by MicahElliott

clojure.core/split-with
Returns a vector of [(take-while pred coll) (drop-while pred coll)]

Added by mars0i
5 Notes
    By , created 12.6 years ago, updated 12.6 years ago

    It's worth mentioning that (partition-by identity …) is equivalent to the Data.List.group function in Haskell:

     
    (defn group [coll]
      (partition-by identity coll))
    

    Which proves to be an interesting idiom:

    user=> (apply str 
             (for [ch (group "fffffffuuuuuuuuuuuu")] 
               (str (first ch) (count ch))))
    ⇒ "f7u12"
    
    By , created 7.1 years ago

    Many other programming languages, such as Kotlin and Haskell, define partition slightly differently. They split the given collection into two collections: the first contains all elements for which the predicate is truthy, and the second all elements for which it is falsy. This function does that:

    (defn partition-2
      "Partitions the collection into exactly two collections:
       [[all-truthy] [all-falsy]]."
      [pred coll]
      (mapv persistent!
        (reduce
          (fn [[t f] x]
            (if (pred x)
              [(conj! t x) f]
              [t (conj! f x)]))
          [(transient []) (transient [])]
          coll)))
    (partition-2 odd? (range 5))
    ;;=> [[1 3] [0 2 4]]
    
    By , created 7.0 years ago

    I tried this implementation of your Kotlin/Haskell partition, which is simpler but somewhat slower (less than 2x):

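    (defn partition-3
      "Partitions the collection into exactly two collections:
       [[all-truthy] [all-falsy]]."
      [pred coll]
      ;; Coerce with boolean so that truthy non-boolean results (e.g. when a set
      ;; is used as pred) land under the key true; default to [] for an empty side.
      (let [m (group-by (comp boolean pred) coll)]
        [(m true []) (m false [])]))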
    
    By , created 5.1 years ago, updated 5.1 years ago

    A third implementation which in my limited testing in cljs is the fastest so far:

    (defn split-by
      "Effectively though non-lazily splits the `coll`ection using `pred`,
      essentially like `[(filter pred coll) (remove pred coll)]`"
      [pred coll]
      (let [match (transient [])
            no-match (transient [])]
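        ;; Caution: this discards the return value of conj!, which the Clojure
        ;; transients documentation advises against relying on.
        (doseq [v coll]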
          (if (pred v)
            (conj! match v)
            (conj! no-match v)))
        [(persistent! match) (persistent! no-match)]))
    

    Using simple-benchmark in cljs, these are the results:

    [r (range 1000)], (partition-2 odd? r), 1000 runs, 167 msecs
    [r (range 1000)], (partition-3 odd? r), 1000 runs, 364 msecs
    [r (range 1000)], (split-by odd? r), 1000 runs, 60 msecs
    

    It's worth noting that all implementations of the java-esque partition in this thread are non-lazy.

    On big collections where you don't want to realize the whole list, this is the fastest:

    [(filter odd? r) (filter (complement odd?) r)]
    

    Can also be written as:

    ((juxt filter remove) odd? r)
    

    (taken from: http://blog.jayfields.com/2011/08/clojure-partition-by-split-with-group.html)
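Because filter and remove are lazy, the juxt form also works on an infinite sequence, as long as only a finite prefix of each half is consumed. A minimal sketch (the names halves, odds and evens are just for illustration):

```clojure
(def halves
  (let [[odds evens] ((juxt filter remove) odd? (range))]
    ;; (range) is infinite; take realizes only a finite prefix of each half.
    [(take 3 odds) (take 3 evens)]))
;; halves is [(1 3 5) (0 2 4)]
```

The eager versions above (partition-2, partition-3, split-by) would never return on such an input, since they have to walk the whole collection first.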

    By , created 3.7 years ago

    Perhaps this will be of help to someone trying to find the regions denoted by partitions:

    (defn partition-at
        "Like partition-by but will start a new run when f returns true"
        [f coll]
        (lazy-seq
          (when-let [s (seq coll)]
            (let [run (cons (first s) (take-while #(not (f %)) (rest s)))]
              (cons run (partition-at f (drop (count run) s)))))))
    

    (taken from: http://cninja.blogspot.com/2011/02/clojure-partition-at.html#comments)
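To make the difference from partition-by concrete, here is a small usage sketch of partition-at. The function is repeated from the note above so the snippet runs on its own; the sample data and the name runs are made up for illustration:

```clojure
;; partition-at as given in the note above.
(defn partition-at
  "Like partition-by but will start a new run when f returns true"
  [f coll]
  (lazy-seq
    (when-let [s (seq coll)]
      (let [run (cons (first s) (take-while #(not (f %)) (rest s)))]
        (cons run (partition-at f (drop (count run) s)))))))

;; Every even number opens a new run.
(def runs (partition-at even? [1 3 2 5 7 4 4 1]))
;; runs is ((1 3) (2 5 7) (4) (4 1))
```

By contrast, (partition-by even? [1 3 2 5 7 4 4 1]) gives ((1 3) (2) (5 7) (4 4) (1)), since it splits whenever the result of even? changes rather than at each element where it is true.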