Tuesday, January 5, 2010

Symbols

Scala has what are called symbolic literals. A symbol is very similar to a String, except that symbols are cached (interned): the symbol 'hi is the same object as a 'hi declared a second time. In addition, there is a special syntax for creating symbols that requires only a single leading quote.
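
To make the caching claim concrete, here is a minimal sketch that compares object identity for symbols and for strings built at runtime. The names SymbolIdentity, sameSymbol and sameString are made up for illustration. It uses the Symbol("...") factory rather than the quote syntax so that it also compiles on newer Scala versions, where the literal syntax is deprecated or unavailable.

```scala
object SymbolIdentity {
  /** True when two independently requested symbols with the same name are one object. */
  def sameSymbol(name: String): Boolean = Symbol(name) eq Symbol(name)

  /** True when two strings built at runtime from the same characters are one object. */
  def sameString(text: String): Boolean = new String(text) eq new String(text)

  def main(args: Array[String]): Unit = {
    println(sameSymbol("hi")) // true: both lookups return the cached instance
    println(sameString("hi")) // false: each `new String` allocates a fresh object
  }
}
```

Note that `eq` compares references; `==` on the two strings would still report equal contents.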

As odd as this may appear at first glance, there are some good use cases for symbols:
  • Save memory. Each distinct symbol is stored only once.
  • Extra semantics. Two occurrences of 'hi are always the same object, whereas "hi" is not necessarily the same object as another "hi", so 'hi carries the extra meaning of being the one and only 'hi object.
  • Shorter syntax. This can be useful when designing DSLs.
  • Identifier syntax. With the short syntax the symbol must be a valid Scala identifier, so it is a good fit for referring to methods or variables, perhaps in a heavily reflection-based framework.
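
As one possible illustration of the DSL and identifier use cases above, the sketch below uses symbols as field names in a small record-like map. The Record alias, the person value and the field helper are hypothetical names invented for this example and do not come from any framework.

```scala
object SymbolKeys {
  // Hypothetical record type: field names are symbols rather than strings.
  type Record = Map[Symbol, Any]

  val person: Record = Map(Symbol("name") -> "Ann", Symbol("age") -> 30)

  /** Looks up a field by symbol; returns None for fields the record does not have. */
  def field(record: Record, key: Symbol): Option[Any] = record.get(key)

  def main(args: Array[String]): Unit = {
    println(field(person, Symbol("name")))  // Some(Ann)
    println(field(person, Symbol("email"))) // None
  }
}
```

On Scala versions that accept the quote syntax, the keys could be written as 'name and 'age, which is where the shorter syntax pays off.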


I hope that makes sense :)

    // This is the long way to create a symbol object
    scala> Symbol("hi")
    res10: Symbol = 'hi
    // This is the short way.  The name must be a legal Scala identifier (like a method name or value/variable name)
    scala> 'hi
    res11: Symbol = 'hi
    // If you *need* characters in a symbol that are not legal in Scala identifiers, the Symbol
    // object has a factory method for that purpose
    scala> Symbol("hi there")
    res12: Symbol = 'hi there
    // Not legal: only 'hi is read as the symbol, and there is parsed as a method call on it
    scala> 'hi there
    <console>:5: error: value there is not a member of Symbol
           'hi there
               ^
    // Quotes are not legal in identifiers
    scala> '"hi there"
    <console>:1: error: unclosed character literal
           '"hi there"
           ^
    <console>:1: error: unclosed string literal
           '"hi there"
                     ^
    // Escaping the space does not work either
    scala> 'hi\ there
    <console>:5: error: value \ is not a member of Symbol
           'hi\ there
           ^
    // You can extract the string from the symbol quite easily if desired
    scala> 'hi match { case Symbol(b) => b }
    res14: String = hi
    // A simpler way to get the string
    scala> 'hi.toString drop 1
    res0: String = hi
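
The two ways of reading a symbol's name can also be wrapped in small helpers, sketched below. The names SymbolNames, viaAccessor and viaPattern are invented for illustration. The sketch deliberately avoids the toString-based trick, because that relies on the printed form starting with a single quote, which as far as I know is not the case on every Scala version.

```scala
object SymbolNames {
  /** Reads the name through the `name` accessor on Symbol. */
  def viaAccessor(s: Symbol): String = s.name

  /** Reads the name through the Symbol extractor, as in the match shown above. */
  def viaPattern(s: Symbol): String = s match { case Symbol(n) => n }

  def main(args: Array[String]): Unit = {
    val spaced = Symbol("hi there") // a name that is not a legal identifier
    println(viaAccessor(spaced))
    println(viaPattern(spaced))
  }
}
```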

3 comments:

  1. An even simpler way to get the string:
    'hi.name

  2. Literal String instances are also cached by the compiler. The intern method lets you intern strings created at runtime to get a unique reference.

    So I think that on the JVM symbols are useless. Perhaps they were introduced for .NET's sake?

  3. One thing to keep in mind: symbols in Scala pre-2.8 are implemented as weak references in a hashmap. This could cause performance problems.

    One also ought to be careful with frameworks which use reflection to construct objects (like XStream) as this could possibly cause consistency issues when two different symbol objects with the same string exist:

    val s = 'symbol
    val ctor = s.getClass.getDeclaredConstructors.head // also finds a non-public constructor
    ctor.setAccessible(true)
    println(s == ctor.newInstance("symbol"))
