Monday, August 6, 2012

Scala-IO Core: Long Traversable

The LongTraversable trait is one of the most important objects in Scala IO. Input provides a uniform way of creating views on the data (as a string or byte array or LongTraversable of something like bytes.)

LongTraversable is a scala.collection.Traversable with some extra capabilities. A few of the salient points of LongTraversable are:
  • It is a lazy/non-strict collection similar to Stream. In other words, you can perform operations like map, flatmap, filter, collect, etc... without accessing the resource
  • Methods like slice and drop will (if possible for the resource) skip the dropped bytes without reading them
  • Each usage of the LongTraversable will typically open and close the underlying resource.
  • Has methods that one typically finds in Seq.  For example: zip, apply, containsSlice
  • Has methods that take or return Longs instead of Ints like ldrop, lslice, ltake, lsize
  • Has limitFold method that allows fold like behaviour with extra features like skip and early termination
  • Can be converted to an AsyncLongTraversable which has methods that return Futures instead and won't block the program
  • Can be converted to a Process object for advanced data processing pipelines
Example usage:

The limitFold method can be quite useful to process only a portion of the file if you don't know ahead of time what the indices of the portion are:

1 comment:

  1. I think
    // print out second 5 characters
    should be:
    // print out second 5 lines

    ReplyDelete