Thursday, December 10, 2009

Xml Transformation 1

Unlike most Java XML APIs, the Scala XML object model consists of immutable objects. This has two major consequences:
  • There is no reference to the parent node, because keeping one would make transformations very expensive: any change to a node would force everything that points to it, and so the whole tree, to be rebuilt
  • Transforming the XML requires creating new nodes rather than changing the existing nodes

Both points tend to make programmers who are new to functional programming a little uneasy, but in practice only the first restriction causes any real discomfort.
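A minimal sketch of both consequences follows. It assumes a current Scala toolchain, where scala-xml is a separate module (the `//> using dep` line is a Scala CLI directive and the version is an assumption; in 2009 the library shipped with Scala itself). The `parentOf` helper is hypothetical, not part of the library: it only illustrates that, without a parent pointer, finding a parent means searching down from the root.

```scala
//> using dep "org.scala-lang.modules::scala-xml:2.1.0"
import scala.xml._

val video: Elem = <video type="dvd">Seven</video>
val library: Elem = <library><videos>{video}</videos></library>

// "Changing" a node really means building a new one; the original is untouched.
val renamed: Elem = video.copy(label = "film")
println(video)    // <video type="dvd">Seven</video>
println(renamed)  // <film type="dvd">Seven</film>

// Hypothetical helper: nodes have no parent pointer, so walk down from the
// root and return the first node that has `target` among its children.
// Node equality is structural, so this matches the first equal-looking child.
def parentOf(root: Node, target: Node): Option[Node] =
  root.descendant_or_self.find(_.child.contains(target))

println(parentOf(library, video).map(_.label))  // Some(videos)
```

The search is linear in the size of the tree, which is why the missing parent reference is the restriction that tends to hurt in practice.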

Two methods for XML transformation will be demonstrated in this and the next topic.
    scala> val xml = <library>
         | <videos>
         | <video type="dvd">Seven</video>
         | <video type="blue-ray">The fifth element</video>
         | <video type="hardcover">Gardens of the moon</video>
         | </videos>
         | <books>
         | <book type="softcover">Memories of Ice</book>
         | </books>
         | </library>
    xml: scala.xml.Elem = 
    <library>
           <videos>
           <video type="dvd">Seven</video>
           <video type="blue-ray">The fifth element</video>
           <video type="hardcover">Gardens of the moon</video>
           </videos>
           <books>
           <book type="softcover">Memories of Ice</book>
           </books>
           </library>
    scala> import scala.xml._
    import scala.xml._
    scala> import scala.xml.transform._
    import scala.xml.transform._
    // Some of the books are labelled as videos
    // not books, so let's select those elements
    scala> val mislabelledBooks = xml \\ "video" filter {e => (e \\ "@type").text == "hardcover"}
    mislabelledBooks: scala.xml.NodeSeq = <video type="hardcover">Gardens of the moon</video>
    // we can create a rule that will remove all the
    // selected elements (returning an empty sequence drops the node)
    scala> object RemoveMislabelledBooks extends RewriteRule {
         | override def transform(n: Node): Seq[Node] ={ 
         | if (mislabelledBooks contains n) NodeSeq.Empty
         | else n
         | }
         | }
    defined module RemoveMislabelledBooks
    // a quick test to make sure the elements are removed
    scala> new RuleTransformer(RemoveMislabelledBooks)(xml)
    res1: scala.xml.Node = 
    <library>
           <videos>
           <video type="dvd">Seven</video>
           <video type="blue-ray">The fifth element</video>

           </videos>
           <books>
           <book type="softcover">Memories of Ice</book>
           </books>
           </library>
    // Now another rule to add them back, relabelled as
    // book elements, as children of the books element
    scala> object AddToBooks extends RewriteRule {
         | override def transform(n: Node): Seq[Node] = n match {
         | case e:Elem if(e.label == "books") =>
         |   val newBooks = mislabelledBooks map { case mB: Elem => mB.copy(label="book") }
         |   e.copy(child = e.child ++ newBooks)
         | case _ => n
         | }
         | }
    defined module AddToBooks
    // voila, done: the rules are applied in the order given
    scala> new RuleTransformer(RemoveMislabelledBooks, AddToBooks)(xml) 
    res4: scala.xml.Node = 
    <library>
           <videos>
           <video type="dvd">Seven</video>
           <video type="blue-ray">The fifth element</video>
           </videos>
           <books>
           <book type="softcover">Memories of Ice</book>
           <book type="hardcover">Gardens of the moon</book></books>
           </library>
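The session above can also be condensed into a standalone script, which makes it easy to check that the input document is left untouched. This is only a sketch under a few assumptions: the `//> using dep` line is a Scala CLI directive for the separately packaged scala-xml module (the version is an assumption), the names `library`, `mislabelled`, `RemoveMislabelled` and `fixed` are illustrative, and the sample document is shortened. It uses `collect` instead of `map` so that a non-element node in the selection would be skipped rather than cause a `MatchError`.

```scala
//> using dep "org.scala-lang.modules::scala-xml:2.1.0"
import scala.xml._
import scala.xml.transform._

val library: Elem =
  <library>
    <videos>
      <video type="dvd">Seven</video>
      <video type="hardcover">Gardens of the moon</video>
    </videos>
    <books>
      <book type="softcover">Memories of Ice</book>
    </books>
  </library>

// Select the video elements whose type attribute marks them as books.
val mislabelled: Seq[Node] =
  (library \\ "video").filter(v => (v \ "@type").text == "hardcover")

// Rule 1: returning an empty sequence removes the node from the result.
object RemoveMislabelled extends RewriteRule {
  override def transform(n: Node): Seq[Node] =
    if (mislabelled.contains(n)) NodeSeq.Empty else n
}

// Rule 2: rebuild the books element with the relabelled nodes appended.
object AddToBooks extends RewriteRule {
  override def transform(n: Node): Seq[Node] = n match {
    case books: Elem if books.label == "books" =>
      val relabelled = mislabelled.collect { case v: Elem => v.copy(label = "book") }
      books.copy(child = books.child ++ relabelled)
    case other => other
  }
}

val fixed: Node = new RuleTransformer(RemoveMislabelled, AddToBooks)(library)

println((fixed \\ "video").map(_.text).mkString(", "))  // Seven
println((fixed \\ "book").map(_.text).mkString(", "))   // Memories of Ice, Gardens of the moon
println((library \\ "video").length)                    // 2 (the original is unchanged)
```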

2 comments:

  1. You missed an "import scala.xml.transform._". Also, I'd write one line inside AddToBooks in a different manner:

    val newBooks = mislabelledBooks map { case mB: Elem => mB.copy(label="book") }

  2. Thanks for both pointers. I have incorporated them into the post
