Contents
SonarSQL primarily translates MySQL SELECT statements.
SonarSQL can translate statements with the following syntax:
SELECT
[DISTINCT]
select_expr [, select_expr ...]
[FROM table_references
[WHERE where_condition]
[GROUP BY {col_name | expr | position}, ...]
[HAVING where_condition]
[ORDER BY {col_name | position} [ASC | DESC], ...]
[LIMIT {[offset,] row_count | row_count OFFSET offset}]]
Using * is generally supported. However, mixing * with other items is not supported even if it is qualified e.g. tblname.*. It is also not supported in statements involving UNION.
[AS] alias is generally supported. Using an existing column name as an alias for a different column is unsupported.
Selecting from DUAL is supported.
Table names can have an alias specified:
tbl_name [[AS] alias]
Qualified table names (with a database name) are unsupported. The qualifier is simply ignored and the default/current database will be assumed.
A single table subquery is supported, however the statement cannot involve UNION syntax, or any joins.
See also JOIN Syntax.
Column name, column aliases, and column positions are generally supported. While column positions can be used when * is specified in the select list, the order of columns are returned is alphabetical rather than schema-defined.
Columns from a dimension table (see JOIN Syntax) may appear in both GROUP BY and ORDER BY.
If GROUP BY is used without ORDER BY, the output will not be sorted according to columns in GROUP BY.
The syntax translatable in table_references is:
table_references:
table_factor
| table_factor, table_factor
| table_factor [INNER | CROSS] JOIN table_factor [ON conditional_expr]
| table_factor LEFT [OUTER] JOIN table_factor [ON conditional_expr]
| (subquery) alias
table_factor:
table_name [[AS alias]]
When there is a join, SonarSQL automatically determines the fact table and the dimension table. Joins often are between a large table (fact table/collection), and a small lookup table (dimension table/collection). Whichever collection has more documents will be considered the fact table.
SonarSQL attempts to determine the table to which an unqualified column belongs. However, it is strongly recommended that all columns whether from the fact or dimension table be qualified throughout a query involving a join. If an alias is defined for the table, the column must be qualified using the alias rather than the table name.
The underlying collection in SonarW that corresponds to a dimension table cannot contain any arrays.
Unrestricted cross joins are not supported, i.e. at least one join condition must be specified, whether using JOIN...ON syntax or in the WHERE clause. Each side of the condition may be simply a field or an expression involving fields from the same table. The comparison operator must be one of =, <, <=, >, >=, or BETWEEN. Multiple conditions may be specified with the use of AND.
A single subquery is supported in the FROM clause.
Supported usage of columns from the dimension table in various clauses (GROUP, HAVING, ORDER, LIMIT) is basic.
Supported in the same syntax as in MySQL.
As opposed to MySQL, with SonarSQL, a statement cannot involve both UNION ALL and UNION DISTINCT. Specifying a single UNION DISTINCT will apply distinct to the entire union.
SonarSQL can create a new collection populated by a SELECT statement:
INSERT
[INTO] tbl_name
SELECT ...
In the SELECT statement here, UNION is not supported.
Window functions provide the ability to perform calculations across sets of documents that are related to the current document. This is comparable to the type of calculation that can be done with an aggregate function. But unlike regular aggregate functions, use of a window function does not cause documents to become grouped into a single output document — the documents retain their separate identities. Behind the scenes, the window function is able to access more than just the current document.
The following examples make use of this sample database. Insert these json rows into the SonarW database directly, prior to issuing sonarSQL queries:
db.sales.insert([
{ "_id": 1, "item" : 21, "price": 2020.59, "name" : "Laptop MC2010", "cust" : 1},
{ "_id": 2, "item" : 35, "price": 4068.23, "name" : "Desktop UU4200", "cust" : 2},
{ "_id": 3, "item" : 12, "price": 899.99, "name" : "Monitor EG880", "cust" : 2},
{ "_id": 4, "item" : 76, "price": 25.67, "name" : "Keyboard PW100", "cust" : 3}
])
db.customers.insert([
{ "_id": 1, "name" : "Jill Brand", "customer_no" : 1},
{ "_id": 2, "name" : "Jim Hill", "customer_no" : 2},
{ "_id": 3, "name" : "Joe Smith", "customer_no" : 3}
])
In the first example, we’ll calculate the average price, as a running average of all rows before the current row:
MySQL [test]> SELECT name, item, AVG(price) OVER(ORDER BY name) AS AVG_PRICE_ALIAS FROM sales;
+----------------+------+-----------------+
| name | item | AVG_PRICE_ALIAS |
+----------------+------+-----------------+
| Desktop UU4200 | 35 | 4068.23 |
| Keyboard PW100 | 76 | 2046.95 |
| Laptop MC2010 | 21 | 2038.1633333 |
| Monitor EG880 | 12 | 1753.62 |
+----------------+------+-----------------+
4 rows in set (0.06 sec)
Next, we reverse the order of the rows processed, changing the order to descending:
MySQL [test]> SELECT name, item, AVG(price) OVER(ORDER BY name DESC) AS AVG_PRICE_ALIAS FROM sales;
+----------------+------+-----------------+
| name | item | AVG_PRICE_ALIAS |
+----------------+------+-----------------+
| Monitor EG880 | 12 | 899.99 |
| Laptop MC2010 | 21 | 1460.29 |
| Keyboard PW100 | 76 | 982.08333333 |
| Desktop UU4200 | 35 | 1753.62 |
+----------------+------+-----------------+
4 rows in set (0.07 sec)
Next, lets add a PARTITION BY clause: each average is computed just for the sales of one customer:
MySQL [test]> SELECT name, cust, AVG(price) OVER(PARTITION BY cust ORDER BY name) AS AVG_PRICE_ALIAS FROM sales;
+----------------+------+-----------------+
| name | cust | AVG_PRICE_ALIAS |
+----------------+------+-----------------+
| Desktop UU4200 | 2 | 4068.23 |
| Monitor EG880 | 2 | 2484.11 |
| Laptop MC2010 | 1 | 2020.59 |
| Keyboard PW100 | 3 | 25.67 |
+----------------+------+-----------------+
4 rows in set (0.07 sec)
We’ll compute the average, considering only one preceding an one following row:
MySQL [test]> SELECT name, item, AVG(price) OVER(ORDER BY name ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING ) AS AVG_PRICE_ALIAS FROM sales;
+----------------+------+-----------------+
| name | item | AVG_PRICE_ALIAS |
+----------------+------+-----------------+
| Desktop UU4200 | 35 | 2046.95 |
| Keyboard PW100 | 76 | 2038.1633333 |
| Laptop MC2010 | 21 | 982.08333333 |
| Monitor EG880 | 12 | 1460.29 |
+----------------+------+-----------------+
4 rows in set (0.06 sec)
Now, a more complete example using a join and computing the standard deviation:
MySQL [test]> SELECT c.name,s.name, s.item, STD(s.price) OVER(ORDER BY c.name) AS STD_PRICE_ALIAS FROM sales s,customers c WHERE s.cust=c._id;
+------------+----------------+------+-----------------+
| name | name | item | STD_PRICE_ALIAS |
+------------+----------------+------+-----------------+
| Jill Brand | Laptop MC2010 | 21 | 0 |
| Jim Hill | Desktop UU4200 | 35 | 1311.7553391 |
| Jim Hill | Monitor EG880 | 12 | 1311.7553391 |
| Joe Smith | Keyboard PW100 | 76 | 1511.8852315 |
+------------+----------------+------+-----------------+
4 rows in set (0.11 sec)
Finally, lets look at the ranking of different sales, ordered by price:
MySQL [test]> SELECT name, cust, PERCENT_RANK() OVER(ORDER BY price) AS PERCENT_RANK FROM sales;
+----------------+------+-----------------+
| name | cust | PERCENT_RANK |
+----------------+------+-----------------+
| Keyboard PW100 | 3 | 0 |
| Monitor EG880 | 2 | 0.33333333333 |
| Laptop MC2010 | 1 | 0.66666666667 |
| Desktop UU4200 | 2 | 1 |
+----------------+------+-----------------+
4 rows in set (0.06 sec)
The syntax translatable in analytic_function is:
window_function:
analytic_function([ arguments ]) OVER (analytic_clause)
analytic_clause:
[ query_partition_clause ] [ order_by_clause [ windowing_clause ] ]
query_partition_clause:
PARTITION BY { expr[, expr ]... | ( expr[, expr ]... ) }
order_by_clause: ORDER BY { expr | position | c_alias } [ ASC | DESC ] [,...]
windowing_clause: { ROWS | RANGE }
{ BETWEEN
{ UNBOUNDED PRECEDING | CURRENT ROW | value_expr { PRECEDING | FOLLOWING } }
AND
{ UNBOUNDED FOLLOWING | CURRENT ROW | value_expr { PRECEDING | FOLLOWING } }
}
| { UNBOUNDED PRECEDING | CURRENT ROW | value_expr PRECEDING }
Analytic functions followed by an asterisk (*) allow the use of a windowing_clause. The windowing_clause is otherwise ignored.
SonarSQL supports the following analytic functions:
The syntax translatable in avg_analytic is:
avg_analytic:
AVG( expr) [ OVER(analytic_clause) ]
Computes the average value of expr. expr may be any expression of any numeric data type.
The function returns a value of type float.
The syntax translatable in min_analytic is:
min_analytic:
MIN( expr) [ OVER(analytic_clause) ]
Computes the minimum value of expr.
Returns the same type as the type of the expression expr.
The syntax translatable in max_analytic is:
max_analytic:
MAX( expr) [ OVER(analytic_clause) ]
Computes the maximum value of expr.
Returns the same type as the type of the expression expr.
The syntax translatable in stddev_analytic is:
stddev_analytic:
STDDEV( expr) [ OVER(analytic_clause) ]
Computes the standard deviation (population) value of expr.
The function returns a value of type float.
The syntax translatable in stddev_samp_analytic is:
stddev_samp_analytic:
STDDEV_SAMP( expr) [ OVER(analytic_clause) ]
Computes the standard deviation (sample) value of expr.
The function returns a value of type float.
The syntax translatable in stddev_pop_analytic is:
stddev_pop_analytic:
STDDEV_POP( expr) [ OVER(analytic_clause) ]
Computes the standard deviation (population) value of expr.
The function returns a value of type float.
The syntax translatable in variance_analytic is:
variance_analytic:
VARIANCE( expr) [ OVER(analytic_clause) ]
Computes the standard deviation (population) value of expr.
The function returns a value of type float.
The syntax translatable in var_samp_analytic is:
var_samp_analytic:
VAR_SAMP( expr) [ OVER(analytic_clause) ]
Computes the variance (sample) value of expr.
The function returns a value of type float.
The syntax translatable in var_pop_analytic is:
var_pop_analytic:
VAR_POP( expr) [ OVER(analytic_clause) ]
Computes the variance (population) value of expr.
The function returns a value of type float.
The syntax translatable in row_number_analytic is:
row_number_analytic:
ROW_NUMBER( )
OVER ([ query_partition_clause ] order_by_clause)
Assigns a unique number to each row, beginning with 1 and incrementing by one for each row.
The function returns a value of type int.
The syntax translatable in rank_analytic is:
rank_analytic:
RANK( )
OVER ([ query_partition_clause ] order_by_clause)
Computes the rank of each row within all rows within the window and sorted based on the ORDER BY clause.
The function returns a value of type float.
The syntax translatable in percent_rank_analytic is:
percent_rank_analytic:
PERCENT_RANK( )
OVER ([ query_partition_clause ] order_by_clause)
Returns a number between 0 and 1. The first row has a PERCENT_RANK of 0 and the last row 1.
The function returns a value of type float.
The syntax translatable in cume_dist_analytic is:
cume_dist_analytic:
CUME_DIST() OVER ([ query_partition_clause ] order_by_clause)
Calculates the cumulative distribution of a value in a group of values. Returns a number larger than 0 and smaller or equal to 1.
The function returns a value of type float.
The syntax translatable in group_concat_analytic is:
group_concat_analytic:
GROUP_CONCAT(measure_expr [, 'delimiter'])
WITHIN GROUP (order_by_clause) [OVER query_partition_clause]
Concatenate all values of the measure_expr in the window into a single list per row.
The type of the list elements is the same type as the original measure_expr.
Different data types are generally not converted implicitly. Some conversion can be explicitly done using CAST() or CONVERT().
BETWEEN ... AND ..., NOT BETWEEN ... AND ...
Arguments can be numbers, characters, strings, or datetime types. In the case of characters and strings, the comparison is case-sensitive.
COALESCE()
=
>=
>
IN(), NOT IN()
Can only be used in a condition in a WHERE or HAVING clause or in the SELECT list. In a condition, the argument may be an array or a subselect. When used in the SELECT list, argument must be an array.
IS NULL, IS NOT NULL, ISNULL()
<=
<
LIKE, NOT LIKE
Can only be used with string arguments.
!=, <>
CASE
IF()
IFNULL()
See general differences between MySQL and SonarSQL regarding NULL value handling.
ROUND()
TRUNCATE()
Both ROUND() and TRUNCATE work with expressions, but not literals. (SNRD-2376)
If a translated function does not require () in MySQL, then the unparenthesized version is also translated. Generally, only dates beyond January 1 1970 are supported. Milliseconds are generally unsupported.
Specifier | Description |
---|---|
%a | converted to %w; day of week, numeric (0=Sunday..6=Saturday) |
%b | converted to %m; month, numeric (00..12) |
%c | converted to %m; month, numeric (00..12) |
%D | converted to %d; day of month (00..31) |
%d | day of month (00..31) |
%e | converted to %d; day of month (00..31) |
%f | converted to returning milliseconds (000..999) |
%H | hour (00..23) |
%h | converted to %h; hour (00..23) |
%I | converted to %h; hour (00..23) |
%i | minutes (00..59) |
%j | day of year (001..366) |
%k | converted to %h; hour (00..23) |
%l | converted to %h; hour (00..23) |
%M | converted to %m; month, numeric (00..12) |
%m | month, numeric (00..12) |
%p | unsupported, empty string |
%r | converted to %T; time, 24-hour (hh:mm:ss) |
%S | seconds (00..59) |
%s | seconds (00..59) |
%T | time, 24-hour (hh:mm:ss) |
%U | week (00..53), where Sunday is the first day of the week |
%u | converted to %U; week (00..53), where Sunday is the first day of the week |
%V | converted to %U; week (00..53), where Sunday is the first day of the week |
%v | converted to %U; week (00..53), where Sunday is the first day of the week |
%W | converted to %w; day of week, numeric (0=Sunday..6=Saturday) |
%w | day of week, numeric (0=Sunday..6=Saturday) |
%X | unsupported, empty string |
%x | unsupported, empty string |
%Y | year, four digits |
%y | converted to %Y; year, four digits |
- CHAR without length
- DATE Returns datetime with 0 time part. Only basic string formats supported as input.
- DATETIME
- DECIMAL without lengths
- SIGNED, INTEGER, INT Rounds decimal places rather than truncating. Where MySQL returns 0 on input that cannot be converted, SonarSQL returns NULL.
- TIME Returns datetime with 1970-01-01 date part.
For these functions, the parentheses are required.
The information_schema database is a system database that allows users to gather information about the system.
SonarW does not have such a concept, but these tables are emulated by SonarSQL to some degree.