This is the 10th post in our series on Symfony2 components, and it covers the latest component added to Symfony: the ExpressionLanguage component. This component was added in version 2.4 and provides a way to add dynamic behavior to static configuration. For example, it can be used to evaluate expressions in configuration files, create a DSL, or build a business rules engine.

The ExpressionLanguage component adds a bit of “color” to static data

Installation

Like the other components, you can install it using Composer:

{
    "require": {
        "symfony/expression-language": "2.4.*"
    }
}

First time using Composer? Check out our Composer 101 post.

Simple example

Imagine we want to create a blog system where users can create their own blogs. We would also like to give users some flexibility by letting them define whether a given article is featured or not based on almost anything. It could be based on the number of visits the article has received, its category, or even something as weird as the current time. The expression that determines at runtime whether a given article is featured would be saved in the database too.

Doing this in a classic way would be cumbersome: we would need to define a fixed set of rules and force users to choose one of them… unless we use eval().

// get an article from the blog
$article = $blog->getArticle(1);

// check the values
var_dump($article->getVisits()); // int(15)
var_dump($article->getFeatured()); // bool(false)
var_dump($blog->getFeaturedExpression()); // string(26) "$article->getVisits() > 10"

// calculate whether it is a featured article
$article->setFeatured(eval('return ' . $blog->getFeaturedExpression() . ';'));

// featured changed to true
var_dump($article->getFeatured()); // bool(true)

// render the article
...

The article has 15 visits and the expression that makes it featured is “$article->getVisits() > 10”, so it evaluates to true. The problem with this approach is that we are using eval(), and we all know that eval is evil because it allows the execution of arbitrary PHP code. In this example eval() works fine: by prepending a return statement we get the result of the comparison “15 > 10”. That will not always be the case, though. Since we are letting users define their own expressions, which are then executed by the PHP engine, a malicious user could configure their blog with something like “exec(‘rm -fr *’)”.

To quote Rasmus Lerdorf, “if eval() is the answer, you’re almost certainly asking the wrong question”.

The ExpressionLanguage component elegantly solves this issue. Since it has its own engine, no raw PHP is ever executed. The only operations that work are those that are defined and whitelisted. This is the same example, now using the ExpressionLanguage component:


use Symfony\Component\ExpressionLanguage\ExpressionLanguage;

// get an article from the blog
$article = $blog->getArticle(1);

// object values
var_dump($article->getVisits()); // int(15)
var_dump($article->getFeatured()); // bool(false)
var_dump($blog->getFeaturedExpression()); // string(24) "article.getVisits() > 10"

// calculate whether it is a featured article
$language = new ExpressionLanguage();
$article->setFeatured($language->evaluate($blog->getFeaturedExpression(), array(
    'article' => $article
)));

// featured changed to true
var_dump($article->getFeatured()); // bool(true)

// render the article
...

We created an instance of the ExpressionLanguage class to safely evaluate the expression “article.getVisits() > 10”. The evaluate() method evaluates the expression and optionally accepts an array of input parameters. The engine only has access to the parameters passed to it, which avoids one of the problems of eval(): it has access to the whole scope in which it is executed. It also solves the potential security problem of code execution, as the component does not execute PHP code but a limited, sandboxed pseudo-language.
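To make the sandboxing concrete, here is a minimal sketch that feeds the engine three expressions. The variable name visits, the $secret variable and the list of expressions are made up for this illustration, and the script assumes the component is installed through Composer as shown earlier. The first expression should evaluate normally, while the other two should be rejected with a SyntaxError because they refer to a variable that was not passed in and to a function that was never registered.

```php
<?php

use Symfony\Component\ExpressionLanguage\ExpressionLanguage;
use Symfony\Component\ExpressionLanguage\SyntaxError;

require_once __DIR__.'/vendor/autoload.php';

$language = new ExpressionLanguage();

// A PHP variable in the current scope; expressions cannot see it
$secret = 'not visible to expressions';

// Hypothetical user-supplied expressions, for illustration only
$expressions = array(
    'visits > 10',      // uses only the variable we pass in
    'secret',           // a PHP variable that was never passed to the engine
    'exec("rm -fr *")', // a PHP function that was never registered
);

foreach ($expressions as $expression) {
    try {
        var_dump($language->evaluate($expression, array('visits' => 15)));
    } catch (SyntaxError $e) {
        // Unknown variables and functions are refused while parsing
        echo 'Rejected: ', $e->getMessage(), "\n";
    }
}
```

Nothing from the surrounding PHP scope leaks into the expression: only the keys of the array passed to evaluate() exist as variables.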

Evaluate != compile

The ExpressionLanguage class provides two methods to deal with expressions: evaluate and compile.

The evaluate method evaluates the expression and returns its value. The return value can be a PHP variable of any type, even objects:

var_dump($language->evaluate('value**2', array('value' => 5)));
var_dump($language->evaluate('article.getVisits() > 10', array('article' => $article)));
var_dump(get_class($language->evaluate('article', array('article' => $article))));

The output would be:

int(25)
bool(true)
string(7) "Article"

The compile method, in contrast, converts an expression into PHP code, so it can be cached and evaluated later:

var_dump($language->compile('value**2', array('value')));
var_dump($language->compile('article.getVisits() > 10', array('article')));

The output would be:

string(14) "pow($value, 2)"
string(28) "($article->getVisits() > 10)"
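One way to take advantage of the compiled code is to store it in a PHP file and load it later without involving the expression engine at all. The following is only a sketch of that idea: the cache file name and the closure wrapper are our own assumptions, not something the component provides, and the compiled string is copied by hand from the output above so that the script runs on its own.

```php
<?php

// PHP code as returned by compile('value**2', array('value')) in the example above
$compiled = 'pow($value, 2)';

// Hypothetical cache location; any writable path would do
$cacheFile = sys_get_temp_dir().'/expression_'.sha1($compiled).'.php';

if (!is_file($cacheFile)) {
    // Wrap the compiled expression in a closure that receives the variable as an argument
    $code = sprintf('<?php return function ($value) { return %s; };', $compiled);
    file_put_contents($cacheFile, $code);
}

// Later, possibly in another request: load the closure without parsing the expression again
$square = include $cacheFile;

var_dump($square(5)); // int(25)
```

Keep in mind that the cached file is plain PHP, so it should only ever contain code generated by the component and must be stored where users cannot tamper with it.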

Syntax

The syntax is available in the official documentation. There you can find all the literals, operators and accessors available. Just a quick summary:

  • Literals: strings (e.g. 'hello' or "hello"), numbers (e.g. 10), arrays (e.g. [1, 2, 3]), hashes (e.g. { name: 'Raul' }), booleans (true/false) and null.
  • Operators: arithmetic (+, -, *, /, %, **), bitwise (&, |, ^), comparison (==, ===, !=, !==, <, >, <=, >=, matches for regular expressions), ternary (a ? b : c, a ?: b, a ? b), logical (not, !, and, &&, or, ||), string (~ for concatenation), array (in, not in) and numeric (.. for ranges).
  • Accessors: object public properties/methods (e.g. article.title, article.getTitle()) and arrays (e.g. articles[0]).
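A few of these constructs in action. This is a small sketch using a made-up $article object (a plain stdClass with title and tags properties, invented for this example); it assumes the component is installed through Composer.

```php
<?php

use Symfony\Component\ExpressionLanguage\ExpressionLanguage;

require_once __DIR__.'/vendor/autoload.php';

$language = new ExpressionLanguage();

// Hypothetical data, used only for this illustration
$article = new stdClass();
$article->title = 'Hello';
$article->tags = array('php', 'symfony');

$values = array('article' => $article);

// Arithmetic with the usual precedence
var_dump($language->evaluate('1 + 2 * 3'));                         // int(7)

// Property accessor and string concatenation
var_dump($language->evaluate('article.title ~ " world"', $values)); // string(11) "Hello world"

// Array operator and array accessor
var_dump($language->evaluate('"php" in article.tags', $values));    // bool(true)
var_dump($language->evaluate('article.tags[1]', $values));          // string(7) "symfony"

// Range operator
var_dump($language->evaluate('5 in 1..10'));                        // bool(true)
```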

Functions

By default there is only one function available for use in our expressions: constant. This function wraps PHP’s constant() function, which returns the value of the given constant:

var_dump($language->evaluate('constant("PHP_INT_MAX")')); // int(9223372036854775807)

We can add our own functions to the engine quite easily:

$language = new ExpressionLanguage();

$language->register('sum_digits', function ($str) {
    // compiler: $str is already PHP code; return a single PHP expression
    return sprintf('(is_string(%1$s) ? array_sum(str_split(%1$s)) : %1$s)', $str);
}, function ($arguments, $str) {
    // evaluator: $arguments holds the expression variables, $str the actual value
    if (!is_string($str)) {
        return $str;
    }

    return array_sum(str_split($str));
});

OK, it may be confusing that we seem to “repeat” the function body. The register() method takes three arguments: the name of the function and two closures, one for compiling the function (converting it into PHP code) and another for evaluating it. Note that the compiler closure receives its arguments as already-compiled PHP code and should return a single PHP expression rather than statements, so that the result can be embedded in a larger compiled expression. We defined the sum_digits function, which calculates the sum of the digits of a string, and it works as expected:

// int(15)
var_dump($language->evaluate('sum_digits("12345")'));

// string(62) "(is_string($values) ? array_sum(str_split($values)) : $values)"
var_dump($language->compile('sum_digits(values)', array('values')));

Using this idea, and since we are a hosting company, we could provide a way to configure actions based on a server’s status:

server.memory_usage > 70 ? send_mail_warning("raul@servergrove.com")
server.memory_usage > 90 ? send_mail_critical("pablo@servergrove.com")
server.disk_usage > 85 ? upgrade(server)
server.php_version < repository.php_version ? upgrade_php(server)
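Here is a rough sketch of how a couple of these rules could be wired up. Everything specific in it is an assumption made for illustration: the send_mail_warning function only records the address in an array instead of sending anything, and the server is a plain stdClass with a made-up memory_usage value. It assumes the component is installed through Composer.

```php
<?php

use Symfony\Component\ExpressionLanguage\ExpressionLanguage;

require_once __DIR__.'/vendor/autoload.php';

$language = new ExpressionLanguage();

// Collect notifications in memory instead of sending real e-mails
$sent = array();

// send_mail_warning() is a hypothetical function for this illustration
$language->register('send_mail_warning', function ($email) {
    // Compiler: PHP code for the call; assumes a PHP function with this name exists
    return sprintf('send_mail_warning(%s)', $email);
}, function ($arguments, $email) use (&$sent) {
    // Evaluator: the first argument holds the expression variables
    $sent[] = $email;

    return true;
});

// Hypothetical server status
$server = new stdClass();
$server->memory_usage = 75;

$rules = array(
    'server.memory_usage > 70 ? send_mail_warning("raul@servergrove.com")',
    'server.memory_usage > 90 ? send_mail_warning("pablo@servergrove.com")',
);

foreach ($rules as $rule) {
    $language->evaluate($rule, array('server' => $server));
}

// With memory usage at 75, only the first rule fires
var_dump($sent);
```

Since the rules are plain strings, they could be stored per customer in a database, just like the featured expression in the blog example.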

Caching

Parsing expressions can be slow, so the component adds a cache layer to save parsed expressions (ParsedExpression). This way the same expressions are not parsed twice in the same request. This is achieved by the parser cache: ArrayParserCache, which caches parsed expressions in an array.

These parsed expressions can also be persisted to be used between requests. We can implement our own cache layer by implementing the ParserCacheInterface, which has the methods save() and fetch(). For example, to create a simple file cache:

namespace RaulFraile\ExpressionLanguage;

use Symfony\Component\ExpressionLanguage\ParsedExpression;
use Symfony\Component\ExpressionLanguage\ParserCache\ParserCacheInterface;

class FileParserCache implements ParserCacheInterface
{

    protected function getPath($key)
    {
        return sys_get_temp_dir() . '/' . sha1($key);
    }

    public function save($key, ParsedExpression $expression)
    {
        file_put_contents($this->getPath($key), serialize($expression));
    }

    public function fetch($key)
    {
        $path = $this->getPath($key);

        return is_readable($path) ? unserialize(file_get_contents($path)) : null;
    }

}

The save() method stores the serialized parsed expression in a file, while fetch() checks whether the file is readable and, if so, reads and unserializes its contents. As the key may not be suitable as a file name, we hash it with sha1() and use the hash as the file name.

To use this parser cache instead of the default one, we inject it when creating the engine object. Both evaluate() and compile() accept strings (which is what we have passed so far) or ParsedExpression instances:

use Symfony\Component\ExpressionLanguage\ExpressionLanguage;
use RaulFraile\ExpressionLanguage\FileParserCache;

$cache = new FileParserCache();
$language = new ExpressionLanguage($cache);

// parse checks if the expression is cached
$parsedExpression = $language->parse('article.getVisits() > 10', array('article'));

$value = $language->evaluate($parsedExpression, array(
    'article' => $article
));

Internals

Internally, the component is not too different from a usual compiler or interpreter. Let’s dive into it a bit…

Lexer/Tokenizer

A Lexer instance tokenizes an expression, converting a string into a TokenStream, which is basically an array of tokens. Each token has a type and a value.


use Symfony\Component\ExpressionLanguage\ExpressionLanguage;
use Symfony\Component\ExpressionLanguage\Lexer;

require_once __DIR__.'/vendor/autoload.php';

$lexer = new Lexer();
$tokenStream = $lexer->tokenize('1 + 2');

while (!$tokenStream->isEOF()) {
    var_dump($tokenStream->current);

    $tokenStream->next();
}

We get as output the three tokens (instances of Token), two for the numbers and one for the operator:

class Symfony\Component\ExpressionLanguage\Token#3 (3) {
  public $value => int(1)
  public $type => string(6) "number"
  public $cursor => int(1)
}
class Symfony\Component\ExpressionLanguage\Token#4 (3) {
  public $value => string(1) "+"
  public $type => string(8) "operator"
  public $cursor => int(3)
}
class Symfony\Component\ExpressionLanguage\Token#5 (3) {
  public $value => int(2)
  public $type => string(6) "number"
  public $cursor => int(5)
}

Parser

Once the engine has the list of tokens, they must be parsed to build a node tree. The component ships with an operator precedence parser (a bottom-up parser that interprets an operator-precedence grammar). The method used is known as precedence climbing.

The basic idea behind the parser is that it converts a sequence of tokens into a node tree, taking into account how operators work and relate to each other (unary/binary, associativity and precedence). For example, the following expressions are equivalent:

  • “1 + 2 * 3” == “1 + (2 * 3)” (* precedence = 60, + precedence = 30)
  • “a or b and c” == “a or (b and c)” (or precedence = 10, and precedence = 15)
  • “a**3 + 1” == “(a * a * a) + 1” (** precedence = 200, ** associativity = right, + precedence = 30)

Finally, this tree is used to evaluate the concrete expression:

use Symfony\Component\ExpressionLanguage\ExpressionLanguage;
use Symfony\Component\ExpressionLanguage\Lexer;
use Symfony\Component\ExpressionLanguage\Parser;

require_once __DIR__.'/vendor/autoload.php';

$expression = '1 + 2';

$lexer = new Lexer();
$tokenStream = $lexer->tokenize((string) $expression);

$parser = new Parser(array());
$nodes = $parser->parse($tokenStream);

var_dump($nodes->evaluate(array(), array())); // int(3)
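To get a feeling for how precedence climbing works, here is a tiny standalone sketch in plain PHP. It is not the component's code: the function names are made up, it only understands integers, parentheses and five binary operators, and it computes the value directly instead of building a node tree. The precedence numbers are chosen for illustration; only their relative order matters.

```php
<?php

// Binary operators; a higher precedence binds tighter (values chosen for illustration)
$operators = array(
    '+'  => array('precedence' => 30,  'right' => false),
    '-'  => array('precedence' => 30,  'right' => false),
    '*'  => array('precedence' => 60,  'right' => false),
    '/'  => array('precedence' => 60,  'right' => false),
    '**' => array('precedence' => 200, 'right' => true),
);

// Split the input into integers, operators and parentheses
function tokenize($expression)
{
    preg_match_all('/\d+|\*\*|[-+*\/()]/', $expression, $matches);

    return $matches[0];
}

// Parse a number or a parenthesized sub-expression
function parsePrimary(array &$tokens, array $operators)
{
    $token = array_shift($tokens);

    if ('(' === $token) {
        $value = parseExpression($tokens, $operators, 0);
        array_shift($tokens); // discard the closing ")"

        return $value;
    }

    return (int) $token;
}

// Precedence climbing: consume operators that bind at least as tightly as $minPrecedence
function parseExpression(array &$tokens, array $operators, $minPrecedence)
{
    $left = parsePrimary($tokens, $operators);

    while ($tokens && isset($operators[$tokens[0]]) && $operators[$tokens[0]]['precedence'] >= $minPrecedence) {
        $operator = array_shift($tokens);
        $info = $operators[$operator];

        // Left-associative operators need a strictly higher precedence on their right side
        $nextPrecedence = $info['right'] ? $info['precedence'] : $info['precedence'] + 1;
        $right = parseExpression($tokens, $operators, $nextPrecedence);

        switch ($operator) {
            case '+':  $left = $left + $right; break;
            case '-':  $left = $left - $right; break;
            case '*':  $left = $left * $right; break;
            case '/':  $left = $left / $right; break;
            case '**': $left = pow($left, $right); break;
        }
    }

    return $left;
}

$tokens = tokenize('1 + 2 * 3');
var_dump(parseExpression($tokens, $operators, 0)); // int(7)

$tokens = tokenize('2 ** 3 ** 2');
var_dump(parseExpression($tokens, $operators, 0)); // int(512), because ** is right-associative

$tokens = tokenize('(1 + 2) * 3');
var_dump(parseExpression($tokens, $operators, 0)); // int(9)
```

The recursive call with a raised minimum precedence is what makes “2 * 3” group before the addition, and passing the same precedence for right-associative operators is what makes “3 ** 2” group before the outer power.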

compile() is faster than evaluate()

It may not be obvious, but actually, compile() is faster than evaluate(). Both methods need to tokenize and parse the expression, but compile() just returns the string containing the PHP code while evaluate() loops through the tree nodes to evaluate them on the fly.

Who’s using it?

The Symfony2 full-stack framework, as of version 2.4, uses expressions extensively in service definitions, access control rules, caching, routing and validation. But as the component is quite new, there are not many projects using it yet. Here are a few:

Photo: Camera Shy, by Hair-Flick