Jason Sultana

Follow this space for writings (and ramblings) about interesting things related to software development.

Multi-statement queries in Wordpress

26 Sep 2020 » php, wordpress

G’day guys!

Today I wanted to talk a little bit about a conundrum I ran into while working on a Wordpress site. I had a fairly complicated query to run, which involved creating a temporary table, inserting values into the temporary table and then selecting the results. In other words, my procedure if you will actually consisted of multiple individual queries. And this is where Wordpress runs into trouble.

Wordpress!? Yuck!

I know, I know. I did just write a blog post a couple of weeks ago arguing that you should probably be using a static site generator instead of a CMS for your developer blog. But this is a freelance project that I was working on for a client, so it’s a bit of a different story.

So, what’s the problem?

If you’ve worked on a Wordpress project before, you’re probably familiar with the wpdb class that Wordpress provides, which is almost always used to interact with a database on a Wordpress site. This class was based on ezsql, which was written by Justin Vincent many eons ago. If you look at the Wordpress source (specifically in wp-includes/wp-db.php), you’ll notice that it’s basically just a wrapper of the out-of-the-box mysqli php functons, with some Wordpress-specific helper functions and sugar. Unfortunately, wpdb only supports the mysqli_query function, which is limited to executing one query at a time. In order to run multiple queries (eg: a CREATE TEMPORARY TABLE and a SELECT FROM), we need to be using mysqli_multi_query.

Wait…what?

I wouldn’t blame you if you’ve never heard of that function. I hadn’t until recently either, although I’ll admit that I don’t use PHP often these days. Anyway though, the general consensus appears to be that mysqli_query will not execute multiple queries in an attempt to prevent SQL injections. For example, if you had a login form with an SQL injection vulnerability, you might be executing something like:

Original Query

SELECT * FROM Users WHERE username = '$userName' AND password = '$password';

but this would be susceptible to executing something like:

Hacked Query

SELECT * FROM Users WHERE username = ''; DROP TABLE USERS; --' AND password = '';

So by preventing executing multiple queries in one go, they’re preventing situations like this. But that’s actually not enough, since it doesn’t cater for situations like:

Hacked Query

SELECT * FROM Users WHERE username = '' OR 1 == 1; --' AND password = '';

So preventing multiple queries alone doesn’t protect against SQL injection. But we can at least understand why out-of-the-box, PHP’s database query functions don’t support multiple queries in one go, and wpdb doesn’t either (since it’s just a wrapper for mysqli, as mentioned).

Okay, so what are our options?

Well, you could just use the mysqli_multi_query function directly, without using wpdb at all. Though I’m actually not too sure to be honest how that function would work with prepared statements, which is even more important to do considering that multiple queries are allowed, so our earlier example of the DROP TABLE injection would work if the query was unprotected. Also, it’s worth noting that this function does also have some interesting caveats, like running asynchronously, requiring you to flush the results before running individual queries, etc. You’ll need to take these into consideration before going down this road.

Personally, I was in a bit of a time pinch when I ran into this problem, so I ended up just writing my own helper class that supports executing multiple queries, including prepared statements. This might be useful for somebody else out there running into this problem (especially if you’re stuck in the Wordpress ecosystem) and even for myself potentially somewhere down the line, so here it is. Backlinks are appreciated, but no attribution is required :)

<?php
    /**
        * Given an SQL query containing a WHERE IN (@collection) clause,
        * replace @collection with %d for each item in the collection. Eg:
        * 
        * If @collection = [1,2,3], WHERE IN (@collection) = WHERE IN (%d,%d,%d) 
        */
    public static function parse_in_clause($sql, $placeholder, $values) {
        // Strip the comments out of the sql. We can probably just take the sql after */
        $queryStartPos = strpos($sql, '*/') + 2;
        $sql = substr($sql, $queryStartPos);

        $in  = str_repeat('%d,', count($values) - 1) . '%d'; // assume the values are all integers
        $sql = str_replace($placeholder, $in, $sql);

        return $sql;
    }
    
    /**
        * Given an SQL query containing a WHERE IN (@collection) clause,
        * replace @collection with [@collection0], [@collection1]...[n]. Eg:
        * 
        * If @collection = [1,2,3], WHERE IN (@collection) = WHERE IN ([@collection0],[@collection1],[@collection2])
        */
    public static function explode_in_clause($sql, $placeholder, $values) {
        // Strip the comments out of the sql. We can probably just take the sql after */
        $queryStartPos = strpos($sql, '*/') + 2;
        $sql = substr($sql, $queryStartPos);

        // Transform IN (@id) into IN ([@id1], [@id2], [@id]...n) for each value
        // We need to contain each value in square brackets so that @id1 doesn't get matched against @id11 later.
        $in = '';
        for ($i = 0; $i < sizeof($values); $i++) {
            $in .= '[' . $placeholder . $i . ']';

            $isLastItem = ($i == sizeof($values) - 1);
            if (!$isLastItem) {
                $in .= ',';
            }
        }

        $sql = str_replace($placeholder, $in, $sql);

        return $sql;            
    }
    
    /**
        * Given an array of $values, return an associative array
        * that can be prepared with the result of explode_in_clause.
        * 
        * i.e - if $placeholder = @someValue and $values = [1,2,3],
        * return ['[@someValue0]' = 1, '[@someValue1]' = 2, '[@someValue2]' = 3]
        */
    public static function explode_args($placeholder, $values) {
        $res = array();

        for ($i = 0; $i < sizeof($values); $i++) {
            // We need to enclose the key in square brackets so that @id1 doesn't get
            // mixed up with @id11
            $key = '[' . $placeholder . $i . ']';
            $res[$key] = $values[$i];
        }

        return $res;
    }
    
    /**
        * Execute an $sql query composed of multiple parts (statements),
        * which is not supported natively by wpdb.
        * 
        * The query will be prepared using values from $args and print-f formats
        * from $formats, which should be parralel associative arrays.
        */
    public static function execute_multipart_query($sql, $args, $formats) {
        // WPDB doesn't support multi-statement queries out-of-the-box.
        // Split by semicolon and execute one-by-one.
        // Return the results of the last query.
        $parts = explode(';', $sql);

        $query;
        $statement;
        foreach($parts as $part) {
            // If the subquery is empty, skip it. Most likely to happen on the last iteration if
            // there is a trailing semicolon.
            if (strlen(trim($part)) == 0) {
                continue;
            }

            // get an array of args for each placeholder in the query
            $prepared_args = prepare_args($part, $args); 

            // get a query where variables are replaced with placeholders
            $prepared_sql = prepare_sql($part, $formats); 

            global $wpdb;

            // If there are any args for this subquery, prepare it. Otherwise, run it raw.
            $statement = sizeof($prepared_args) > 0 ?
                $wpdb->prepare($prepared_sql, $prepared_args) :
                $prepared_sql;

            $query = $wpdb->query($statement);
        }

        if (!$query) {
            error_log("wpdb last error: " . $wpdb->last_error);
            return array();
        }
        else {
            $results = $wpdb->get_results($statement);
            return $results;
        }
    }

    private static function prepare_args($sql, $args) {
        $indexes = array(); // hold "index" => "value" values
                            // eg: "27" => "value1", "35" => "value2", "88" => "value1"
    
        foreach ($args as $key => $value) {
            $instancePositions = strpos_recursive($sql, $key);
    
            // Add the positions to the indexes
            foreach ($instancePositions as $pos) {
                $indexes[$pos] = $value;
            }
        }
    
        ksort($indexes); // sort the array by keys
        return $indexes;
    }

    private static function prepare_sql($sql, $args) {
        foreach ($args as $key => $value) {
            // Replace all instances of the variable (key) with the supplied placeholder (value)
            $sql = str_replace($key, $value, $sql);	
        }
    
        return $sql;
    }

    private static function strpos_recursive($haystack, $needle, $offset = 0, &$results = array()) {               
        $offset = strpos($haystack, $needle, $offset);
        if($offset === false) {
            return $results;           
        } else {
            $results[] = $offset;
            return strpos_recursive($haystack, $needle, ($offset + 1), $results);
        }
    }

Whew! There is a fair bit going on here, but the usage is pretty simple. For example:

<?php
    $sql = 'SOME_STRING_WITH_MULTIPLE_QUERIES';
    $results = execute_multipart_query($sql, array(
        '@arg1' => $arg1,
        '@arg2' => $arg2,
        '@arg3' => $arg3
    ), array(
        '@arg1' => '%f',
        '@arg2' => '%f',
        '@arg3' => '%d'
    ));

Basically you just need to supply a multipart sql query string, supply an associative array of placeholders and their corresponding arguments (which will be properly prepared) and their formats.

WHERE IN (...) clauses need to be handled a little bit differently. Here’s another example:

<?php
    $sql = 'SOME_STRING_WITH_MULTIPLE_QUERIES';
    
    // Explode the IN clause, so you get `WHERE IN (@placeholder1, @placeholder...n)`.
    $sql = explode_in_clause($sql, '@placeholder', $values);
    
    // Explode the args, so you get an associative array of `['@placeholder1' => x, '@placeholder2' => y...]`
    $args = explode_args('@placeholder', $values);
    
    // Get an associative array of formats to use. Assumes that they're all integers.
    $formats = $args; // copy the array (this is by value by default)
    $keys = array_keys($args);
    foreach($keys as $key) {
        $formats[$key] = '%d';
    }

    // Run the query
    $results = execute_multipart_query($sql, $args, $formats);

In these cases, we’ll need to dynamically construct the IN (...) part of the SQL query, populate an associative array that can be used to prepare the query, and then execute it.

Anyway, that’s all from me for today. Please let me know in the comments if you found this useful :)

Catch ya!

Further reading

  • https://stackoverflow.com/questions/10924127/why-cant-i-run-two-mysqli-queries-the-second-one-fails
  • https://stackoverflow.com/questions/22531943/speed-best-practice-flushing-mysqli-multi-query
  • https://wordpress.org/support/topic/wpdb-gtquery-fails-when-multiple-update-statements-are-used/
  • https://wordpress.stackexchange.com/questions/304981/how-to-execute-mulitple-statement-sql-queries-using-wpdb-query