SonarSQL Functionality

SonarSQL primarily translates MySQL SELECT statements.

SELECT Syntax

SonarSQL can translate statements with the following syntax:

SELECT
  [DISTINCT]
  select_expr [, select_expr ...]
  [FROM table_references
    [WHERE where_condition]
    [GROUP BY {col_name | expr | position}, ...]
    [HAVING where_condition]
    [ORDER BY {col_name | position} [ASC | DESC], ...]
    [LIMIT {[offset,] row_count | row_count OFFSET offset}]]

select_expr

Using * is generally supported. However, mixing * with other items is not supported even if it is qualified e.g. tblname.*. It is also not supported in statements involving UNION.

[AS] alias is generally supported. Using an existing column name as an alias for a different column is unsupported.

FROM table_references

Selecting from DUAL is supported.

Table names can have an alias specified:

tbl_name [[AS] alias]

Qualified table names (with a database name) are unsupported. The qualifier is simply ignored and the default/current database will be assumed.

A single table subquery is supported, however the statement cannot involve UNION syntax, or any joins.

See also JOIN Syntax.

GROUP BY and ORDER BY

Column name, column aliases, and column positions are generally supported. While column positions can be used when * is specified in the select list, the order of columns are returned is alphabetical rather than schema-defined.

Columns from a dimension table (see JOIN Syntax) may appear in both GROUP BY and ORDER BY.

If GROUP BY is used without ORDER BY, the output will not be sorted according to columns in GROUP BY.

JOIN Syntax

The syntax translatable in table_references is:

table_references:
    table_factor
  | table_factor, table_factor
  | table_factor [INNER | CROSS] JOIN table_factor [ON conditional_expr]
  | table_factor LEFT [OUTER] JOIN table_factor [ON conditional_expr]
  | (subquery) alias

table_factor:
  table_name [[AS alias]]

When there is a join, SonarSQL automatically determines the fact table and the dimension table. Joins often are between a large table (fact table/collection), and a small lookup table (dimension table/collection). Whichever collection has more documents will be considered the fact table.

SonarSQL attempts to determine the table to which an unqualified column belongs. However, it is strongly recommended that all columns whether from the fact or dimension table be qualified throughout a query involving a join. If an alias is defined for the table, the column must be qualified using the alias rather than the table name.

The underlying collection in SonarW that corresponds to a dimension table cannot contain any arrays.

Unrestricted cross joins are not supported, i.e. at least one join condition must be specified, whether using JOIN...ON syntax or in the WHERE clause. Each side of the condition may be simply a field or an expression involving fields from the same table. The comparison operator must be one of =, <, <=, >, >=, or BETWEEN. Multiple conditions may be specified with the use of AND.

A single subquery is supported in the FROM clause.

Supported usage of columns from the dimension table in various clauses (GROUP, HAVING, ORDER, LIMIT) is basic.

UNION Syntax

Supported in the same syntax as in MySQL.

As opposed to MySQL, with SonarSQL, a statement cannot involve both UNION ALL and UNION DISTINCT. Specifying a single UNION DISTINCT will apply distinct to the entire union.

INSERT...SELECT Syntax

SonarSQL can create a new collection populated by a SELECT statement:

INSERT
  [INTO] tbl_name
  SELECT ...

In the SELECT statement here, UNION is not supported.

Window Functions

Window functions provide the ability to perform calculations across sets of documents that are related to the current document. This is comparable to the type of calculation that can be done with an aggregate function. But unlike regular aggregate functions, use of a window function does not cause documents to become grouped into a single output document — the documents retain their separate identities. Behind the scenes, the window function is able to access more than just the current document.

Window Function Examples

The following examples make use of this sample database. Insert these json rows into the SonarW database directly, prior to issuing sonarSQL queries:

db.sales.insert([
{ "_id": 1, "item" : 21, "price": 2020.59, "name" : "Laptop MC2010", "cust" : 1},
{ "_id": 2, "item" : 35, "price": 4068.23, "name" : "Desktop UU4200", "cust" : 2},
{ "_id": 3, "item" : 12, "price": 899.99, "name" : "Monitor EG880", "cust" : 2},
{ "_id": 4, "item" : 76, "price": 25.67,  "name" : "Keyboard PW100", "cust" : 3}
])

db.customers.insert([
{ "_id": 1,  "name" : "Jill Brand", "customer_no" : 1},
{ "_id": 2,  "name" : "Jim Hill", "customer_no" : 2},
{ "_id": 3,  "name" : "Joe Smith", "customer_no" : 3}
])

In the first example, we’ll calculate the average price, as a running average of all rows before the current row:

MySQL [test]> SELECT name, item, AVG(price)  OVER(ORDER BY name)  AS AVG_PRICE_ALIAS FROM sales;
+----------------+------+-----------------+
| name           | item | AVG_PRICE_ALIAS |
+----------------+------+-----------------+
| Desktop UU4200 |   35 |         4068.23 |
| Keyboard PW100 |   76 |         2046.95 |
| Laptop MC2010  |   21 |    2038.1633333 |
| Monitor EG880  |   12 |         1753.62 |
+----------------+------+-----------------+
4 rows in set (0.06 sec)

Next, we reverse the order of the rows processed, changing the order to descending:

MySQL [test]> SELECT name, item, AVG(price)  OVER(ORDER BY name DESC)  AS AVG_PRICE_ALIAS FROM sales;
+----------------+------+-----------------+
| name           | item | AVG_PRICE_ALIAS |
+----------------+------+-----------------+
| Monitor EG880  |   12 |          899.99 |
| Laptop MC2010  |   21 |         1460.29 |
| Keyboard PW100 |   76 |    982.08333333 |
| Desktop UU4200 |   35 |         1753.62 |
+----------------+------+-----------------+
4 rows in set (0.07 sec)

Next, lets add a PARTITION BY clause: each average is computed just for the sales of one customer:

MySQL [test]> SELECT name, cust, AVG(price)  OVER(PARTITION BY cust ORDER BY name)  AS AVG_PRICE_ALIAS FROM sales;
+----------------+------+-----------------+
| name           | cust | AVG_PRICE_ALIAS |
+----------------+------+-----------------+
| Desktop UU4200 |    2 |         4068.23 |
| Monitor EG880  |    2 |         2484.11 |
| Laptop MC2010  |    1 |         2020.59 |
| Keyboard PW100 |    3 |           25.67 |
+----------------+------+-----------------+
4 rows in set (0.07 sec)

We’ll compute the average, considering only one preceding an one following row:

MySQL [test]> SELECT name, item, AVG(price)  OVER(ORDER BY name  ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING )  AS AVG_PRICE_ALIAS FROM sales;
+----------------+------+-----------------+
| name           | item | AVG_PRICE_ALIAS |
+----------------+------+-----------------+
| Desktop UU4200 |   35 |         2046.95 |
| Keyboard PW100 |   76 |    2038.1633333 |
| Laptop MC2010  |   21 |    982.08333333 |
| Monitor EG880  |   12 |         1460.29 |
+----------------+------+-----------------+
4 rows in set (0.06 sec)

Now, a more complete example using a join and computing the standard deviation:

MySQL [test]> SELECT c.name,s.name, s.item, STD(s.price)  OVER(ORDER BY c.name)  AS STD_PRICE_ALIAS FROM sales s,customers c WHERE s.cust=c._id;
+------------+----------------+------+-----------------+
| name       | name           | item | STD_PRICE_ALIAS |
+------------+----------------+------+-----------------+
| Jill Brand | Laptop MC2010  |   21 |               0 |
| Jim Hill   | Desktop UU4200 |   35 |    1311.7553391 |
| Jim Hill   | Monitor EG880  |   12 |    1311.7553391 |
| Joe Smith  | Keyboard PW100 |   76 |    1511.8852315 |
+------------+----------------+------+-----------------+
4 rows in set (0.11 sec)

Finally, lets look at the ranking of different sales, ordered by price:

MySQL [test]> SELECT name, cust, PERCENT_RANK()  OVER(ORDER BY price)  AS PERCENT_RANK FROM sales;
+----------------+------+-----------------+
| name           | cust |   PERCENT_RANK  |
+----------------+------+-----------------+
| Keyboard PW100 |    3 |               0 |
| Monitor EG880  |    2 |   0.33333333333 |
| Laptop MC2010  |    1 |   0.66666666667 |
| Desktop UU4200 |    2 |               1 |
+----------------+------+-----------------+
4 rows in set (0.06 sec)

Window Function Syntax

The syntax translatable in analytic_function is:

window_function:
  analytic_function([ arguments ]) OVER (analytic_clause)

analytic_clause:
  [ query_partition_clause ] [ order_by_clause [ windowing_clause ] ]

query_partition_clause:
  PARTITION BY { expr[, expr ]... | ( expr[, expr ]... ) }

order_by_clause:  ORDER BY { expr | position | c_alias } [ ASC | DESC ] [,...]

windowing_clause: { ROWS | RANGE }
  { BETWEEN
      { UNBOUNDED PRECEDING | CURRENT ROW | value_expr { PRECEDING | FOLLOWING } }
    AND
      { UNBOUNDED FOLLOWING | CURRENT ROW | value_expr { PRECEDING | FOLLOWING } }
  }
  | { UNBOUNDED PRECEDING | CURRENT ROW | value_expr PRECEDING }

Analytic Functions

Analytic functions followed by an asterisk (*) allow the use of a windowing_clause. The windowing_clause is otherwise ignored.

SonarSQL supports the following analytic functions:

AVG*

The syntax translatable in avg_analytic is:

avg_analytic:
    AVG( expr) [ OVER(analytic_clause) ]

Computes the average value of expr. expr may be any expression of any numeric data type.

The function returns a value of type float.

MIN*

The syntax translatable in min_analytic is:

min_analytic:
    MIN( expr) [ OVER(analytic_clause) ]

Computes the minimum value of expr.

Returns the same type as the type of the expression expr.

MAX*

The syntax translatable in max_analytic is:

max_analytic:
   MAX( expr) [ OVER(analytic_clause) ]

Computes the maximum value of expr.

Returns the same type as the type of the expression expr.

STDDEV*

The syntax translatable in stddev_analytic is:

stddev_analytic:
   STDDEV( expr) [ OVER(analytic_clause) ]

Computes the standard deviation (population) value of expr.

The function returns a value of type float.

STDDEV_SAMP*

The syntax translatable in stddev_samp_analytic is:

stddev_samp_analytic:
   STDDEV_SAMP( expr) [ OVER(analytic_clause) ]

Computes the standard deviation (sample) value of expr.

The function returns a value of type float.

STDDEV_POP*

The syntax translatable in stddev_pop_analytic is:

stddev_pop_analytic:
   STDDEV_POP( expr) [ OVER(analytic_clause) ]

Computes the standard deviation (population) value of expr.

The function returns a value of type float.

VARIANCE*

The syntax translatable in variance_analytic is:

variance_analytic:
   VARIANCE( expr) [ OVER(analytic_clause) ]

Computes the standard deviation (population) value of expr.

The function returns a value of type float.

VAR_SAMP*

The syntax translatable in var_samp_analytic is:

var_samp_analytic:
   VAR_SAMP( expr) [ OVER(analytic_clause) ]

Computes the variance (sample) value of expr.

The function returns a value of type float.

VAR_POP*

The syntax translatable in var_pop_analytic is:

var_pop_analytic:
   VAR_POP( expr) [ OVER(analytic_clause) ]

Computes the variance (population) value of expr.

The function returns a value of type float.

ROW_NUMBER

The syntax translatable in row_number_analytic is:

row_number_analytic:
   ROW_NUMBER( )
       OVER ([ query_partition_clause ] order_by_clause)

Assigns a unique number to each row, beginning with 1 and incrementing by one for each row.

The function returns a value of type int.

RANK

The syntax translatable in rank_analytic is:

rank_analytic:
   RANK( )
       OVER ([ query_partition_clause ] order_by_clause)

Computes the rank of each row within all rows within the window and sorted based on the ORDER BY clause.

The function returns a value of type float.

PERCENT_RANK

The syntax translatable in percent_rank_analytic is:

percent_rank_analytic:
   PERCENT_RANK( )
       OVER ([ query_partition_clause ] order_by_clause)

Returns a number between 0 and 1. The first row has a PERCENT_RANK of 0 and the last row 1.

The function returns a value of type float.

CUME_DIST

The syntax translatable in cume_dist_analytic is:

cume_dist_analytic:
   CUME_DIST() OVER ([ query_partition_clause ] order_by_clause)

Calculates the cumulative distribution of a value in a group of values. Returns a number larger than 0 and smaller or equal to 1.

The function returns a value of type float.

GROUP_CONCAT

The syntax translatable in group_concat_analytic is:

group_concat_analytic:
   GROUP_CONCAT(measure_expr [, 'delimiter'])
   WITHIN GROUP (order_by_clause) [OVER query_partition_clause]

Concatenate all values of the measure_expr in the window into a single list per row.

The type of the list elements is the same type as the original measure_expr.

Limitations

SonarSQL supports basic window functions with the following restrictions and
limitations:
  • At most one window function per query or sub-query.
  • A window function must appear in the SELECT clause.
  • Window functions follow GROUP and HAVING and preceed ORDER BY.
  • A sub query is necessary in order to filter on the result of a window function.
  • SonarSQL does not support an ORDER BY containing either a NULLS FIRST or NULLS LAST clause.
  • SonarSQL does not support an ORDER BY containing a SIBLINGS clause.
  • SonarSQL does not support the use of DISTINCT or ALL in any analytic function.

Functions and Operators

Type Conversion in Expression Evaluation

Different data types are generally not converted implicitly. Some conversion can be explicitly done using CAST() or CONVERT().

Comparison Functions and Operators

  • BETWEEN ... AND ..., NOT BETWEEN ... AND ...

    Arguments can be numbers, characters, strings, or datetime types. In the case of characters and strings, the comparison is case-sensitive.

  • COALESCE()

  • =

  • >=

  • >

  • IN(), NOT IN()

    Can only be used in a condition in a WHERE or HAVING clause or in the SELECT list. In a condition, the argument may be an array or a subselect. When used in the SELECT list, argument must be an array.

  • IS NULL, IS NOT NULL, ISNULL()

  • <=

  • <

  • LIKE, NOT LIKE

    Can only be used with string arguments.

  • !=, <>

Logical Operators

  • AND, &&
  • OR, ||

Control Flow Functions

  • CASE

  • IF()

  • IFNULL()

    See general differences between MySQL and SonarSQL regarding NULL value handling.

String Functions

  • CONCAT() Arguments can be strings or numbers.
  • INSTR()
  • LEFT() If any argument is NULL returns empty string.
  • LENGTH(), OCTET_LENGTH() Argument can be a string.
  • LOCATE(), POSITION()
  • LOWER(), LCASE()
  • LTRIM()
  • RIGHT() If any argument is NULL, returns empty string.
  • RTRIM()
  • SUBSTRING(), MID(), SUBSTR() Position argument must be > 0.
  • TRIM() Only translates stripping whitespace, ignores remstr if specified.
  • UPPER(), UCASE()

Numeric Functions and Operators

  • /
  • - (binary and unary)
  • %, MOD, MOD()
  • +
  • * (multiplication)
  • PI()
  • ROUND()

  • TRUNCATE()

    Both ROUND() and TRUNCATE work with expressions, but not literals. (SNRD-2376)

Date functions

If a translated function does not require () in MySQL, then the unparenthesized version is also translated. Generally, only dates beyond January 1 1970 are supported. Milliseconds are generally unsupported.

  • CURDATE(), CURRENT_DATE() Returns string ‘YYYY-MM-DD’
  • CURTIME(), CURRENT_TIME() Returns string ‘HH:MM:SS’
  • DATE() Returns string ‘YYYY-MM-DD’ if used in select list at the top level with a non-literal argument, otherwise an internal datetime with 00:00:00 time
  • DATE_DIFF()
  • DATE_ADD(), DATE_SUB(), ADDDATE(), SUBDATE() Returns string ‘YYYY-MM-DD HH:MM:SS’. Supported for units:
    • SECOND
    • MINUTE
    • HOUR
    • DAY
    • WEEK
    • MONTH
    • QUARTER
    • YEAR
    • YEAR_MONTH interval given as ‘YEARS-MONTHS’
  • DATE_FORMAT() Format specifier characters
Specifier Description
%a converted to %w; day of week, numeric (0=Sunday..6=Saturday)
%b converted to %m; month, numeric (00..12)
%c converted to %m; month, numeric (00..12)
%D converted to %d; day of month (00..31)
%d day of month (00..31)
%e converted to %d; day of month (00..31)
%f converted to returning milliseconds (000..999)
%H hour (00..23)
%h converted to %h; hour (00..23)
%I converted to %h; hour (00..23)
%i minutes (00..59)
%j day of year (001..366)
%k converted to %h; hour (00..23)
%l converted to %h; hour (00..23)
%M converted to %m; month, numeric (00..12)
%m month, numeric (00..12)
%p unsupported, empty string
%r converted to %T; time, 24-hour (hh:mm:ss)
%S seconds (00..59)
%s seconds (00..59)
%T time, 24-hour (hh:mm:ss)
%U week (00..53), where Sunday is the first day of the week
%u converted to %U; week (00..53), where Sunday is the first day of the week
%V converted to %U; week (00..53), where Sunday is the first day of the week
%v converted to %U; week (00..53), where Sunday is the first day of the week
%W converted to %w; day of week, numeric (0=Sunday..6=Saturday)
%w day of week, numeric (0=Sunday..6=Saturday)
%X unsupported, empty string
%x unsupported, empty string
%Y year, four digits
%y converted to %Y; year, four digits
  • DAYOFMONTH(), DAY() If day part of date is 0, returns 1
  • DAYOFWEEK()
  • DAYOFYEAR()
  • EXTRACT() Supported for units:
    • YEAR
    • QUARTER
    • MONTH
    • WEEK
    • DAY
    • HOUR
    • MINUTE
    • SECOND
    • YEAR_MONTH
    • DAY_HOUR
    • DAY_MINUTE
    • DAY_SECOND
    • HOUR_MINUTE
    • HOUR_SECOND
    • MINUTE_SECOND
  • HOUR() Input is a datetime <bug number here>
  • MINUTE() Input is a datetime <bug number here>
  • MONTH() If month part of date is 0, returns 1
  • NOW(), CURRENT_TIMESTAMP(), LOCALTIME(), LOCALTIMESTAMP() Returns string ‘YYYY-MM-DD HH:MM:SS’.
  • QUARTER() If month part of date is 0, returns 1
  • SECOND() Input is a datetime <bug number here>
  • STR_TO_DATE() Only handles date formats, not time; returns ISODate displayed as string ‘YYYY-MM-DD 00:00:00’.
  • TIMESTAMPADD() Returns string ‘YYYY-MM-DD HH:MM:SS’. Supported for units:
    • SECOND
    • MINUTE
    • HOUR
    • DAY
    • WEEK
    • MONTH
    • QUARTER
    • YEAR
  • TIME() Returns string ‘HH:MM:SS’ if used in select list at the top level with a non-literal argument, otherwise an internal datetime with 1970-01-01 date part
  • TIMEDIFF()
  • TIMESTAMPDIFF() Supported for same units as TIMESTAMPADD()
  • UTC_DATE() Returns string ‘YYYY-MM-DD’
  • UTC_TIME() Returns string ‘HH:MM:SS’
  • UTC_TIMESTAMP Returns string ‘YYYY-MM-DD HH:MM:SS’
  • WEEK() Functions in mode 0, does not accept mode argument
  • YEAR()

Cast functions

  • CAST(expr AS type), CONVERT(expr, type) Supported types:
  • CHAR without length
  • DATE Returns datetime with 0 time part. Only basic string formats supported as input.
  • DATETIME
  • DECIMAL without lengths
  • SIGNED, INTEGER, INT Rounds decimal places rather than truncating. Where MySQL returns 0 on input that cannot be converted, SonarSQL returns NULL.
  • TIME Returns datetime with 1970-01-01 date part.

Information Functions

For these functions, the parentheses are required.

  • CONNECTION_ID()
  • CURRENT_USER() Returns username only
  • DATABASE(), SCHEMA()
  • USER(), SESSION_USER(), ``SYSTEM_USER() Behaves the same as CURRENT_USER()
  • VERSION()

Aggregate Functions

  • AVG()
  • COUNT()
  • COUNT(DISTINCT)
  • MAX()
  • MIN()
  • SUM()

Miscellaneous Functions

  • EXPLAIN - show SonarW pipeline information per query.
  • SHOW TABLES
  • SHOW DATABASES
  • SHOW FIELDS
  • SHOW PROCESSLIST
  • CREATE USER

Variables

  • @@version_comment
  • @maxTimeMS - the maximum time allowed for a query in SonarW

Information Schema

The information_schema database is a system database that allows users to gather information about the system.

SonarW does not have such a concept, but these tables are emulated by SonarSQL to some degree.