Literals and Special Syntactic Rules

Integers

Integers can be written in various forms and number bases:

Examples                       Explanation
1234 -123                      Regular decimal notation
#b0 #b10101                    Binary notation
#*0 #*10101                    Binary notation (alternative form)
#o377 #o-111                   Octal notation
#d123456789 #d+123             Explicitly decimal notation
#xc0ffe #x-01                  Hexadecimal notation
#2r1010 #8r377 #36rhelloworld  Notation with explicit base (up to 36)
#\a #\$ #\ä #\🐭                Character notation (the value is the Unicode code point of the character)
#\x1f42d;                      Character notation with the value in hexadecimal

In all these forms, the case of the indicating letter is not significant, i.e. #b1010 and #B1010 are identical as are #16rf00 and #16Rf00.

Similarly, case is not significant for digits beyond 9 (i.e. 'a', 'b', 'c', … in number bases larger than 10): #xabcd is the same as #xABCD. Upper and lower case can even be mixed within one number, so #36rHelloWorld is valid and denotes the same number as #36Rhelloworld and #36rHELLOWORLD.

The character notation using a hexadecimal code (#\x....;) is essentially the same as the regular hexadecimal notation #x...., except that it signals to the reader that a character is intended and it sanity-checks the value (e.g. negative numbers and values outside the Unicode range are not permitted).
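The equivalences described above can be tried out in the LFE REPL. This is a minimal sketch using only literals from this section plus the =:= comparison; each expression is expected to evaluate to true.

```lfe
;; Every literal on the left denotes the same integer as the value on the right.
(=:= #b1010 10)        ; binary notation
(=:= #o377 255)        ; octal notation
(=:= #d+123 123)       ; explicitly decimal notation
(=:= #xff 255)         ; hexadecimal notation
(=:= #2r1010 #b1010)   ; explicit base 2 is the same as binary notation
(=:= #xabcd #xABCD)    ; digit case is not significant
(=:= #\a 97)           ; character notation yields the code point
(=:= #\x61; #\a)       ; hexadecimal character notation
```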

Floating point numbers

There is only one type of floating point number, and literals are written in the usual way. These are all valid floating point numbers:

1.0 +1.0 -1.0 1.0e10 1.111e-10

The one thing to watch out for is that you cannot omit the part before or after the decimal point if it is zero. E.g. the following are not valid forms: 100. or .125 (write 100.0 and 0.125 instead).

Strings

There are two forms of strings: list strings and binary strings.

List Strings

List strings are just lists of integers (where each value has to be from the set of numbers that are considered valid characters), but they have their own syntax for literals: "any text between double quotes where \" and other special characters like \n can be escaped". The same syntax is also used as the output representation of an integer list if its contents look like they are meant to be a string.

As a special case you can also write out the character number in the form \xHHH; (where "HHH" is an integer in hexadecimal notation), e.g. "\x61;\x62;\x63;" is a complicated way of writing "abc". This can be convenient when writing Unicode letters not easily typeable or viewable with regular fonts. E.g. "Cat: \x1f639;" might be easier to type (and view on output devices without a Unicode font) than "Cat: 😹".

Binary Strings

Binary strings are just like list strings but they are represented differently in the virtual machine. The simple syntax is #"...", e.g. #"This is a binary string \n with some \"escaped\" and quoted (\x1f639;) characters"

You can also use the general format for creating binaries (#B(...), described below), e.g. #B("a"), #"a", and #B(97) are all the same binary string.
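As a small illustration of the two string forms, the following sketch can be entered in the LFE REPL. It assumes only the standard Erlang guard functions is_list and is_binary and the comparisons =:= and =/=; each expression is expected to evaluate to true.

```lfe
;; A list string is an ordinary list of character codes.
(is_list "abc")
(=:= "abc" (list 97 98 99))
(=:= "abc" "\x61;\x62;\x63;")

;; A binary string is a binary, not a list.
(is_binary #"abc")
(=:= #"a" #B(97))
(=:= #"a" #B("a"))

;; The two forms hold the same text but are different terms.
(=/= "abc" #"abc")
```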

Character Escaping

Certain control characters can be more readably included by using their escaped name:

Escaped name Character
\b Backspace
\t Tab
\n Newline
\v Vertical tab
\f Form Feed
\r Carriage Return
\e Escape
\s Space
\d Delete

Alternatively you can also use the hexadecimal character encoding, e.g. "a\nb" and "a\x0a;b" are the same string.
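The escapes in the table can be checked against their character codes. This is a minimal sketch for the LFE REPL; the numeric codes (32, 27, 127) are the usual ASCII values for the named characters, and each expression is expected to evaluate to true.

```lfe
(=:= "a\nb" "a\x0a;b")    ; named escape vs. hexadecimal escape
(=:= "\s" " ")            ; \s is a space
(=:= "\s" (list 32))
(=:= "\e" (list 27))      ; \e is the escape character
(=:= "\d" (list 127))     ; \d is the delete character
(=:= #"a\tb" #B(97 9 98)) ; escapes work in binary strings too
```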

Binaries

We have already seen binary strings, but the #B(...) syntax can be used to create binaries with any contents. Unless the contents is a simple integer you need to annotate it with a type and/or size.

Example invocations that show the various annotations:

Expression                                                              Result
#B(42 (42 (size 16)) (42 (size 32)))                                    #B(42 0 42 0 0 0 42)
#B(-42 111 (-42 (size 16)) 111 (-42 (size 32)))                         #B(214 111 255 214 111 255 255 255 214)
#B((42 (size 32) big-endian) (42 (size 32) little-endian))              #B(0 0 0 42 42 0 0 0)
#B((1.23 float) 111 (1.23 (size 32) float) 111 (1.23 (size 64) float))  #B(63 243 174 20 122 225 71 174 111 63 157 112 164 111 63 243 174 20 122 225 71 174)
#B((#"a" binary) (#"b" binary))                                         #"ab"
#B("Cat:" #\ (128569 utf-8))                                            #"Cat: 😹"

Learn more about "segments" of binary data in the Erlang teaching book.
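The size and endianness annotations can be explored one segment at a time. The sketch below is meant for the LFE REPL and uses only the standard byte_size function in addition to the syntax from this section; each expression is expected to evaluate to true.

```lfe
;; A plain integer occupies one byte, (size 16) makes it two.
(=:= #B(42 (42 (size 16))) #B(42 0 42))

;; The same 32-bit value in both byte orders.
(=:= #B((42 (size 32) big-endian)) #B(0 0 0 42))
(=:= #B((42 (size 32) little-endian)) #B(42 0 0 0))

;; Floats take 8 bytes unless a size of 32 bits is requested.
(=:= (byte_size #B((1.23 float))) 8)
(=:= (byte_size #B((1.23 (size 32) float))) 4)

;; Binary segments are concatenated.
(=:= #B((#"a" binary) (#"b" binary)) #"ab")
```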

Lists

Lists are formed either as ( ... ) or [ ... ] where the optional elements of the list are separated by some form of whitespace.

E.g. () (the empty list), (foo bar baz), or

(foo
bar
baz)

Tuples

Tuples are written as #(value1 value2 ...). The empty tuple #() is also valid.

Maps

Maps are written as #M(key1 value1 key2 value2 ...) (again, the empty map is also valid and written as #M()).
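The list, tuple and map literals can be compared with the corresponding constructor calls. This is a minimal sketch for the LFE REPL, assuming the core forms list, tuple and map and the standard functions tuple_size and map_size; each expression is expected to evaluate to true.

```lfe
;; Both bracket styles read as the same list.
(=:= '(foo bar baz) '[foo bar baz])
(=:= '(foo bar baz) (list 'foo 'bar 'baz))

;; Tuple and map literals versus their constructors.
(=:= #(a 1) (tuple 'a 1))
(=:= #M(a 1 b 2) (map 'a 1 'b 2))

;; The empty forms are valid too.
(=:= (tuple_size #()) 0)
(=:= (map_size #M()) 0)
```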

Symbols

Things that cannot be parsed as any of the above are usually treated as symbols.

Simple examples are foo, Foo, foo-bar, :foo. Somewhat surprisingly, 123foo and 1.23e4extra are symbols too (but note that illegal digits don't make a number a symbol when using the explicit number base notation, e.g. #b10foo gives an error).

Symbol names can contain a surprising breadth of characters:

!, #, $, %, &, ', *, +, ,, -, ., /, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, :, <, =, >, ?, @, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z, \, ^, _, ` , a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z, |, ~, \, ¡, ¢, £, ¤, ¥, ¦, §, ¨, ©, ª, «, ¬, , ®, ¯, °, ±, ², ³, ´, µ, , ·, ¸, ¹, º, », ¼, ½, ¾, ¿, À, Á, Â, Ã, Ä, Å, Æ, Ç, È, É, Ê, Ë, Ì, Í, Î, Ï, Ð, Ñ, Ò, Ó, Ô, Õ, Ö, ×, Ø, Ù, Ú, Û, Ü, Ý, Þ, ß, à, á, â, ã, ä, å, æ, ç, è, é, ê, ë, ì, í, î, ï, ð, ñ, ò, ó, ô, õ, ö, ÷, ø, ù, ú, û, ü, ý, þ, ÿ

(This is basically all of the Latin-1 character set without control characters, whitespace, the various brackets, double quotes and semicolon).

Of these, only |, \', ', ,, and # may not be the first character of a symbol's name (but they are allowed as subsequent characters).

I.e. these are all legal symbols: foo, foo,, µ#, ±1, 451°F.

Symbols can be explicitly constructed by wrapping their name in vertical bars, e.g. |foo|, |symbol name with spaces|. In this case the name can contain any character in the range from 0 to 255 (or even none, i.e. || is a valid symbol). A vertical bar in the symbol name needs to be escaped: |symbol with a vertical bar \| in its name| (similarly, you have to escape the escape character itself as well).
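The symbol rules can be checked in the LFE REPL. The sketch below assumes the standard functions is_atom and atom_to_list; each expression is expected to evaluate to true.

```lfe
;; Vertical bars are only delimiters, not part of the name.
(=:= 'foo '|foo|)

;; These read as symbols, not as numbers.
(is_atom '123foo)
(is_atom '1.23e4extra)

;; Quoted symbols may contain spaces, or nothing at all.
(is_atom '|symbol name with spaces|)
(=:= (atom_to_list '||) "")

;; An escaped vertical bar becomes part of the name.
(=:= (atom_to_list '|a \| b|) "a | b")
```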

Comments

Comments come in two forms: line comments and block comments.

Line comments start with a semicolon (;) and finish with the end of the line.

Block comments are written as #| comment text |# where the comment text may span multiple lines but may not contain another block comment, i.e. it may not contain the character sequence #|.
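Both comment forms in context, as a short illustrative fragment (the expressions themselves are arbitrary):

```lfe
; A line comment runs to the end of the line.
(+ 1 2)   ; it can also follow code on the same line

#| A block comment
   can span several lines. |#
(+ 3 4)
```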

Evaluation While Reading

The form #.(... some expression ...) evaluates the expression while it is being read. E.g. in '#.(+ 1 1) the (+ 1 1) is evaluated while the expression is read, so the result is effectively '2.
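A minimal sketch of read-time evaluation for the LFE REPL. The second expression assumes that #. behaves the same way when nested inside a quoted list; each expression is expected to evaluate to true.

```lfe
;; The quote sees the already computed value, not the (+ 1 1) form.
(=:= '#.(+ 1 1) 2)

;; Inside quoted data the computed value is spliced in as a literal.
(=:= '(a #.(* 2 3) c) '(a 6 c))
```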