====================== SonarSQL Functionality ====================== .. contents:: :depth: 2 *SonarSQL* primarily translates *MySQL* ``SELECT`` statements. ``SELECT`` Syntax ================= *SonarSQL* can translate statements with the following syntax:: SELECT [DISTINCT] select_expr [, select_expr ...] [FROM table_references [WHERE where_condition] [GROUP BY {col_name | expr | position}, ...] [HAVING where_condition] [ORDER BY {col_name | position} [ASC | DESC], ...] [LIMIT {[offset,] row_count | row_count OFFSET offset}]] select_expr ----------- Using ``*`` is generally supported. However, mixing ``*`` with other items is not supported even if it is qualified e.g. ``tblname.*``. It is also not supported in statements involving ``UNION``. ``[AS] alias`` is generally supported. Using an existing column name as an alias for a different column is unsupported. FROM table_references --------------------- Selecting from ``DUAL`` is supported. Table names can have an alias specified:: tbl_name [[AS] alias] Qualified table names (with a database name) are unsupported. The qualifier is simply ignored and the default/current database will be assumed. A single table subquery is supported, however the statement cannot involve ``UNION`` syntax, or any joins. See also `JOIN Syntax`_. ``GROUP BY`` and ``ORDER BY`` ============================= Column name, column aliases, and column positions are generally supported. While column positions can be used when ``*`` is specified in the select list, the order of columns are returned is alphabetical rather than schema-defined. Columns from a dimension table (see `JOIN Syntax`_) may appear in both ``GROUP BY`` and ``ORDER BY``. If ``GROUP BY`` is used without ``ORDER BY``, the output will not be sorted according to columns in ``GROUP BY``. ``JOIN`` Syntax =============== The syntax translatable in ``table_references`` is:: table_references: table_factor | table_factor, table_factor | table_factor [INNER | CROSS] JOIN table_factor [ON conditional_expr] | table_factor LEFT [OUTER] JOIN table_factor [ON conditional_expr] | (subquery) alias table_factor: table_name [[AS alias]] When there is a join, *SonarSQL* automatically determines the fact table and the dimension table. Joins often are between a large table (fact table/collection), and a small lookup table (dimension table/collection). Whichever collection has more documents will be considered the fact table. *SonarSQL* attempts to determine the table to which an unqualified column belongs. However, it is strongly recommended that all columns whether from the fact or dimension table be qualified throughout a query involving a join. If an alias is defined for the table, the column must be qualified using the alias rather than the table name. The underlying collection in *SonarW* that corresponds to a dimension table cannot contain any arrays. Unrestricted cross joins are not supported, i.e. at least one join condition must be specified, whether using ``JOIN...ON`` syntax or in the ``WHERE`` clause. Each side of the condition may be simply a field or an expression involving fields from the same table. The comparison operator must be one of ``=``, ``<``, ``<=``, ``>``, ``>=``, or ``BETWEEN``. Multiple conditions may be specified with the use of ``AND``. A single subquery is supported in the ``FROM`` clause. Supported usage of columns from the dimension table in various clauses (``GROUP``, ``HAVING``, ``ORDER``, ``LIMIT``) is basic. .. what can we actually support in the various clauses? ``UNION`` Syntax ================ Supported in the same syntax as in *MySQL*. As opposed to *MySQL*, with *SonarSQL*, a statement cannot involve both ``UNION ALL`` and ``UNION DISTINCT``. Specifying a single ``UNION DISTINCT`` will apply distinct to the entire union. ``INSERT...SELECT`` Syntax ========================== *SonarSQL* can create a new collection populated by a ``SELECT`` statement:: INSERT [INTO] tbl_name SELECT ... In the ``SELECT`` statement here, ``UNION`` is not supported. Window Functions ================ Window functions provide the ability to perform calculations across sets of documents that are related to the current document. This is comparable to the type of calculation that can be done with an aggregate function. But unlike regular aggregate functions, use of a window function does not cause documents to become grouped into a single output document — the documents retain their separate identities. Behind the scenes, the window function is able to access more than just the current document. Window Function Examples ------------------------ The following examples make use of this sample database. Insert these json rows into the SonarW database directly, prior to issuing sonarSQL queries:: db.sales.insert([ { "_id": 1, "item" : 21, "price": 2020.59, "name" : "Laptop MC2010", "cust" : 1}, { "_id": 2, "item" : 35, "price": 4068.23, "name" : "Desktop UU4200", "cust" : 2}, { "_id": 3, "item" : 12, "price": 899.99, "name" : "Monitor EG880", "cust" : 2}, { "_id": 4, "item" : 76, "price": 25.67, "name" : "Keyboard PW100", "cust" : 3} ]) db.customers.insert([ { "_id": 1, "name" : "Jill Brand", "customer_no" : 1}, { "_id": 2, "name" : "Jim Hill", "customer_no" : 2}, { "_id": 3, "name" : "Joe Smith", "customer_no" : 3} ]) In the first example, we'll calculate the average price, as a running average of all rows before the current row:: MySQL [test]> SELECT name, item, AVG(price) OVER(ORDER BY name) AS AVG_PRICE_ALIAS FROM sales; +----------------+------+-----------------+ | name | item | AVG_PRICE_ALIAS | +----------------+------+-----------------+ | Desktop UU4200 | 35 | 4068.23 | | Keyboard PW100 | 76 | 2046.95 | | Laptop MC2010 | 21 | 2038.1633333 | | Monitor EG880 | 12 | 1753.62 | +----------------+------+-----------------+ 4 rows in set (0.06 sec) Next, we reverse the order of the rows processed, changing the order to descending:: MySQL [test]> SELECT name, item, AVG(price) OVER(ORDER BY name DESC) AS AVG_PRICE_ALIAS FROM sales; +----------------+------+-----------------+ | name | item | AVG_PRICE_ALIAS | +----------------+------+-----------------+ | Monitor EG880 | 12 | 899.99 | | Laptop MC2010 | 21 | 1460.29 | | Keyboard PW100 | 76 | 982.08333333 | | Desktop UU4200 | 35 | 1753.62 | +----------------+------+-----------------+ 4 rows in set (0.07 sec) Next, lets add a PARTITION BY clause: each average is computed just for the sales of one customer:: MySQL [test]> SELECT name, cust, AVG(price) OVER(PARTITION BY cust ORDER BY name) AS AVG_PRICE_ALIAS FROM sales; +----------------+------+-----------------+ | name | cust | AVG_PRICE_ALIAS | +----------------+------+-----------------+ | Desktop UU4200 | 2 | 4068.23 | | Monitor EG880 | 2 | 2484.11 | | Laptop MC2010 | 1 | 2020.59 | | Keyboard PW100 | 3 | 25.67 | +----------------+------+-----------------+ 4 rows in set (0.07 sec) We'll compute the average, considering only one preceding an one following row:: MySQL [test]> SELECT name, item, AVG(price) OVER(ORDER BY name ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING ) AS AVG_PRICE_ALIAS FROM sales; +----------------+------+-----------------+ | name | item | AVG_PRICE_ALIAS | +----------------+------+-----------------+ | Desktop UU4200 | 35 | 2046.95 | | Keyboard PW100 | 76 | 2038.1633333 | | Laptop MC2010 | 21 | 982.08333333 | | Monitor EG880 | 12 | 1460.29 | +----------------+------+-----------------+ 4 rows in set (0.06 sec) Now, a more complete example using a join and computing the standard deviation:: MySQL [test]> SELECT c.name,s.name, s.item, STD(s.price) OVER(ORDER BY c.name) AS STD_PRICE_ALIAS FROM sales s,customers c WHERE s.cust=c._id; +------------+----------------+------+-----------------+ | name | name | item | STD_PRICE_ALIAS | +------------+----------------+------+-----------------+ | Jill Brand | Laptop MC2010 | 21 | 0 | | Jim Hill | Desktop UU4200 | 35 | 1311.7553391 | | Jim Hill | Monitor EG880 | 12 | 1311.7553391 | | Joe Smith | Keyboard PW100 | 76 | 1511.8852315 | +------------+----------------+------+-----------------+ 4 rows in set (0.11 sec) Finally, lets look at the ranking of different sales, ordered by price:: MySQL [test]> SELECT name, cust, PERCENT_RANK() OVER(ORDER BY price) AS PERCENT_RANK FROM sales; +----------------+------+-----------------+ | name | cust | PERCENT_RANK | +----------------+------+-----------------+ | Keyboard PW100 | 3 | 0 | | Monitor EG880 | 2 | 0.33333333333 | | Laptop MC2010 | 1 | 0.66666666667 | | Desktop UU4200 | 2 | 1 | +----------------+------+-----------------+ 4 rows in set (0.06 sec) Window Function Syntax ---------------------- The syntax translatable in ``analytic_function`` is:: window_function: analytic_function([ arguments ]) OVER (analytic_clause) analytic_clause: [ query_partition_clause ] [ order_by_clause [ windowing_clause ] ] query_partition_clause: PARTITION BY { expr[, expr ]... | ( expr[, expr ]... ) } order_by_clause: ORDER BY { expr | position | c_alias } [ ASC | DESC ] [,...] windowing_clause: { ROWS | RANGE } { BETWEEN { UNBOUNDED PRECEDING | CURRENT ROW | value_expr { PRECEDING | FOLLOWING } } AND { UNBOUNDED FOLLOWING | CURRENT ROW | value_expr { PRECEDING | FOLLOWING } } } | { UNBOUNDED PRECEDING | CURRENT ROW | value_expr PRECEDING } Analytic Functions ------------------ Analytic functions followed by an asterisk (``*``) allow the use of a ``windowing_clause``. The ``windowing_clause`` is otherwise ignored. *SonarSQL* supports the following analytic functions: AVG* ~~~~ The syntax translatable in ``avg_analytic`` is:: avg_analytic: AVG( expr) [ OVER(analytic_clause) ] Computes the average value of expr. expr may be any expression of any numeric data type. The function returns a value of type float. MIN* ~~~~ The syntax translatable in ``min_analytic`` is:: min_analytic: MIN( expr) [ OVER(analytic_clause) ] Computes the minimum value of expr. Returns the same type as the type of the expression expr. MAX* ~~~~ The syntax translatable in ``max_analytic`` is:: max_analytic: MAX( expr) [ OVER(analytic_clause) ] Computes the maximum value of expr. Returns the same type as the type of the expression expr. STDDEV* ~~~~~~~ The syntax translatable in ``stddev_analytic`` is:: stddev_analytic: STDDEV( expr) [ OVER(analytic_clause) ] Computes the standard deviation (population) value of expr. The function returns a value of type float. STDDEV_SAMP* ~~~~~~~~~~~~ The syntax translatable in ``stddev_samp_analytic`` is:: stddev_samp_analytic: STDDEV_SAMP( expr) [ OVER(analytic_clause) ] Computes the standard deviation (sample) value of expr. The function returns a value of type float. STDDEV_POP* ~~~~~~~~~~~ The syntax translatable in ``stddev_pop_analytic`` is:: stddev_pop_analytic: STDDEV_POP( expr) [ OVER(analytic_clause) ] Computes the standard deviation (population) value of expr. The function returns a value of type float. VARIANCE* ~~~~~~~~~ The syntax translatable in ``variance_analytic`` is:: variance_analytic: VARIANCE( expr) [ OVER(analytic_clause) ] Computes the standard deviation (population) value of expr. The function returns a value of type float. VAR_SAMP* ~~~~~~~~~ The syntax translatable in ``var_samp_analytic`` is:: var_samp_analytic: VAR_SAMP( expr) [ OVER(analytic_clause) ] Computes the variance (sample) value of expr. The function returns a value of type float. VAR_POP* ~~~~~~~~ The syntax translatable in ``var_pop_analytic`` is:: var_pop_analytic: VAR_POP( expr) [ OVER(analytic_clause) ] Computes the variance (population) value of expr. The function returns a value of type float. ROW_NUMBER ~~~~~~~~~~ The syntax translatable in ``row_number_analytic`` is:: row_number_analytic: ROW_NUMBER( ) OVER ([ query_partition_clause ] order_by_clause) Assigns a unique number to each row, beginning with 1 and incrementing by one for each row. The function returns a value of type int. RANK ~~~~ The syntax translatable in ``rank_analytic`` is:: rank_analytic: RANK( ) OVER ([ query_partition_clause ] order_by_clause) Computes the rank of each row within all rows within the window and sorted based on the ``ORDER BY`` clause. The function returns a value of type float. PERCENT_RANK ~~~~~~~~~~~~ The syntax translatable in ``percent_rank_analytic`` is:: percent_rank_analytic: PERCENT_RANK( ) OVER ([ query_partition_clause ] order_by_clause) Returns a number between 0 and 1. The first row has a ``PERCENT_RANK`` of 0 and the last row 1. The function returns a value of type float. CUME_DIST ~~~~~~~~~ The syntax translatable in ``cume_dist_analytic`` is:: cume_dist_analytic: CUME_DIST() OVER ([ query_partition_clause ] order_by_clause) Calculates the cumulative distribution of a value in a group of values. Returns a number larger than 0 and smaller or equal to 1. The function returns a value of type float. GROUP_CONCAT ~~~~~~~~~~~~ The syntax translatable in ``group_concat_analytic`` is:: group_concat_analytic: GROUP_CONCAT(measure_expr [, 'delimiter']) WITHIN GROUP (order_by_clause) [OVER query_partition_clause] Concatenate all values of the ``measure_expr`` in the window into a single list per row. The type of the list elements is the same type as the original measure_expr. Limitations ----------- *SonarSQL* supports basic window functions with the following restrictions and limitations: * At most one window function per query or sub-query. * A window function must appear in the ``SELECT`` clause. * Window functions follow ``GROUP`` and ``HAVING`` and preceed ``ORDER BY``. * A sub query is necessary in order to filter on the result of a window function. * *SonarSQL* does not support an ``ORDER BY`` containing either a ``NULLS FIRST`` or ``NULLS LAST`` clause. * *SonarSQL* does not support an ``ORDER BY`` containing a ``SIBLINGS`` clause. * *SonarSQL* does not support the use of ``DISTINCT`` or ``ALL`` in any analytic function. Functions and Operators ======================= Type Conversion in Expression Evaluation ---------------------------------------- Different data types are generally not converted implicitly. Some conversion can be explicitly done using ``CAST()`` or ``CONVERT()``. .. specify the conversions that are/aren't done? and what to do instead? Comparison Functions and Operators ---------------------------------- * ``BETWEEN ... AND ...``, ``NOT BETWEEN ... AND ...`` Arguments can be numbers, characters, strings, or datetime types. In the case of characters and strings, the comparison is case-sensitive. * ``COALESCE()`` * ``=`` * ``>=`` * ``>`` * ``IN()``, ``NOT IN()`` Can only be used in a condition in a ``WHERE`` or ``HAVING`` clause or in the ``SELECT`` list. In a condition, the argument may be an array or a subselect. When used in the ``SELECT`` list, argument must be an array. * ``IS NULL``, ``IS NOT NULL``, ``ISNULL()`` * ``<=`` * ``<`` * ``LIKE``, ``NOT LIKE`` Can only be used with string arguments. * ``!=``, ``<>`` Logical Operators ----------------- * ``AND``, ``&&`` * ``OR``, ``||`` .. this is too thinly supported * ``NOT``, ``!`` .. where it is used with operators, it's usually explicitly listed Control Flow Functions ---------------------- * ``CASE`` * ``IF()`` * ``IFNULL()`` See general differences between *MySQL* and *SonarSQL* regarding ``NULL`` value handling. String Functions ---------------- * ``CONCAT()`` Arguments can be strings or numbers. * ``INSTR()`` * ``LEFT()`` If any argument is ``NULL`` returns empty string. * ``LENGTH()``, ``OCTET_LENGTH()`` Argument can be a string. * ``LOCATE()``, ``POSITION()`` * ``LOWER()``, ``LCASE()`` * ``LTRIM()`` * ``RIGHT()`` If any argument is ``NULL``, returns empty string. * ``RTRIM()`` * ``SUBSTRING()``, ``MID()``, ``SUBSTR()`` Position argument must be > 0. * ``TRIM()`` Only translates stripping whitespace, ignores ``remstr`` if specified. * ``UPPER()``, ``UCASE()`` Numeric Functions and Operators ------------------------------- * ``/`` * ``-`` (binary and unary) * ``%``, ``MOD``, ``MOD()`` * ``+`` * ``*`` (multiplication) * ``PI()`` .. something about how many digits of PI we get? * ``ROUND()`` * ``TRUNCATE()`` Both ``ROUND()`` and ``TRUNCATE`` work with expressions, but not literals. (`SNRD-2376`) Date functions -------------- If a translated function does not require ``()`` in *MySQL*, then the unparenthesized version is also translated. Generally, only dates beyond January 1 1970 are supported. Milliseconds are generally unsupported. * ``CURDATE()``, ``CURRENT_DATE()`` Returns string 'YYYY-MM-DD' * ``CURTIME()``, ``CURRENT_TIME()`` Returns string 'HH:MM:SS' * ``DATE()`` Returns string 'YYYY-MM-DD' if used in select list at the top level with a non-literal argument, otherwise an internal datetime with 00:00:00 time * ``DATE_DIFF()`` * ``DATE_ADD()``, ``DATE_SUB()``, ``ADDDATE()``, ``SUBDATE()`` Returns string 'YYYY-MM-DD HH:MM:SS'. Supported for units: + ``SECOND`` + ``MINUTE`` + ``HOUR`` + ``DAY`` + ``WEEK`` + ``MONTH`` + ``QUARTER`` + ``YEAR`` + ``YEAR_MONTH`` interval given as 'YEARS-MONTHS' * ``DATE_FORMAT()`` Format specifier characters ========= =========== Specifier Description ========= =========== %a converted to %w; day of week, numeric (0=Sunday..6=Saturday) %b converted to %m; month, numeric (00..12) %c converted to %m; month, numeric (00..12) %D converted to %d; day of month (00..31) %d day of month (00..31) %e converted to %d; day of month (00..31) %f converted to returning milliseconds (000..999) %H hour (00..23) %h converted to %h; hour (00..23) %I converted to %h; hour (00..23) %i minutes (00..59) %j day of year (001..366) %k converted to %h; hour (00..23) %l converted to %h; hour (00..23) %M converted to %m; month, numeric (00..12) %m month, numeric (00..12) %p unsupported, empty string %r converted to %T; time, 24-hour (hh:mm:ss) %S seconds (00..59) %s seconds (00..59) %T time, 24-hour (hh:mm:ss) %U week (00..53), where Sunday is the first day of the week %u converted to %U; week (00..53), where Sunday is the first day of the week %V converted to %U; week (00..53), where Sunday is the first day of the week %v converted to %U; week (00..53), where Sunday is the first day of the week %W converted to %w; day of week, numeric (0=Sunday..6=Saturday) %w day of week, numeric (0=Sunday..6=Saturday) %X unsupported, empty string %x unsupported, empty string %Y year, four digits %y converted to %Y; year, four digits ========= =========== .. The above table is duplicated in mysql.rst - changes need to be duplicated * ``DAYOFMONTH()``, ``DAY()`` If day part of date is 0, returns 1 * ``DAYOFWEEK()`` * ``DAYOFYEAR()`` * ``EXTRACT()`` Supported for units: + ``YEAR`` + ``QUARTER`` + ``MONTH`` + ``WEEK`` + ``DAY`` + ``HOUR`` + ``MINUTE`` + ``SECOND`` + ``YEAR_MONTH`` + ``DAY_HOUR`` + ``DAY_MINUTE`` + ``DAY_SECOND`` + ``HOUR_MINUTE`` + ``HOUR_SECOND`` + ``MINUTE_SECOND`` * ``HOUR()`` Input is a datetime * ``MINUTE()`` Input is a datetime * ``MONTH()`` If month part of date is 0, returns 1 * ``NOW()``, ``CURRENT_TIMESTAMP()``, ``LOCALTIME()``, ``LOCALTIMESTAMP()`` Returns string 'YYYY-MM-DD HH:MM:SS'. * ``QUARTER()`` If month part of date is 0, returns 1 * ``SECOND()`` Input is a datetime * ``STR_TO_DATE()`` Only handles date formats, not time; returns ISODate displayed as string 'YYYY-MM-DD 00:00:00'. * ``TIMESTAMPADD()`` Returns string 'YYYY-MM-DD HH:MM:SS'. Supported for units: + ``SECOND`` + ``MINUTE`` + ``HOUR`` + ``DAY`` + ``WEEK`` + ``MONTH`` + ``QUARTER`` + ``YEAR`` * ``TIME()`` Returns string 'HH:MM:SS' if used in select list at the top level with a non-literal argument, otherwise an internal datetime with 1970-01-01 date part * ``TIMEDIFF()`` * ``TIMESTAMPDIFF()`` Supported for same units as ``TIMESTAMPADD()`` * ``UTC_DATE()`` Returns string 'YYYY-MM-DD' * ``UTC_TIME()`` Returns string 'HH:MM:SS' * ``UTC_TIMESTAMP`` Returns string 'YYYY-MM-DD HH:MM:SS' * ``WEEK()`` Functions in mode 0, does not accept mode argument * ``YEAR()`` Cast functions -------------- * ``CAST(expr AS type)``, ``CONVERT(expr, type)`` Supported types: + ``CHAR`` without length + ``DATE`` Returns datetime with 0 time part. Only basic string formats supported as input. + ``DATETIME`` + ``DECIMAL`` without lengths + ``SIGNED``, ``INTEGER``, ``INT`` Rounds decimal places rather than truncating. Where *MySQL* returns 0 on input that cannot be converted, *SonarSQL* returns ``NULL``. + ``TIME`` Returns datetime with 1970-01-01 date part. .. Crude notes on translation of each allowed `type`: which we translate using $toNumber, $toBool, $toDate, $toInt, $toLong, $toString * BINARY[(N)] - no sonar equivalent; not translating * CHAR[(N)] - translated using $toString fine; not handling (N) * DATE - translated using $toDate, however the sonar output is a datetime with 0:00:00 time, and the behaviour of CAST(expr AS DATE) is just plain different from $toDate * DATETIME - translated using $toDateTime, same issues as DATE * DECIMAL[(M[,D])] - translated using $toNumber; not handling M or D parameters * SIGNED [INTEGER], INTEGER, INT - translated using $toInt; sonar returns null on inputs that mysql would return 0 * TIME - no sonar equivalent (at time this was written); not translating * UNSIGNED [INTEGER] - no sonar equivalent; not translating Information Functions --------------------- For these functions, the parentheses are required. * ``CONNECTION_ID()`` * ``CURRENT_USER()`` Returns username only * ``DATABASE()``, ``SCHEMA()`` * ``USER()``, ``SESSION_USER(), ``SYSTEM_USER()`` Behaves the same as ``CURRENT_USER()`` * ``VERSION()`` Aggregate Functions ------------------- * ``AVG()`` * ``COUNT()`` * ``COUNT(DISTINCT)`` * ``MAX()`` * ``MIN()`` * ``SUM()`` Miscellaneous Functions ----------------------- * ``EXPLAIN`` - show *SonarW* pipeline information per query. * ``SHOW TABLES`` * ``SHOW DATABASES`` * ``SHOW FIELDS`` * ``SHOW PROCESSLIST`` * ``CREATE USER`` Variables --------- * ``@@version_comment`` * ``@maxTimeMS`` - the maximum time allowed for a query in SonarW Information Schema ================== The ``information_schema`` database is a system database that allows users to gather information about the system. *SonarW* does not have such a concept, but these tables are emulated by *SonarSQL* to some degree.