druid.io - SQL

Search Preview

Title: Druid |

URL: druid.io


SEO audit: Content analysis

Language Error! No language localisation was found.
Title Druid |
Text / HTML ratio 67 %
Frame Excellent! The website does not use iFrame solutions.
Flash Excellent! The website does not have any Flash content.
Keywords cloud Druid query SQL queries time zone functions column expr timestamp types broker aggregation number = string clause GROUP table false
Keywords consistency
Keyword Content Title Description Headings
Druid 57
query 45
SQL 44
queries 38
time 36
zone 25
Headings
H1 H2 H3 H4 H5 H6
1 5 15 1 0 0
Images We found 0 images on this web page.

SEO Keywords (Single)

Keyword Occurrence Density
Druid 57 2.85 %
query 45 2.25 %
SQL 44 2.20 %
queries 38 1.90 %
time 36 1.80 %
zone 25 1.25 %
functions 21 1.05 %
column 20 1.00 %
expr 20 1.00 %
timestamp 19 0.95 %
types 18 0.90 %
broker 18 0.90 %
aggregation 16 0.80 %
number 16 0.80 %
= 15 0.75 %
string 15 0.75 %
clause 15 0.75 %
GROUP 14 0.70 %
table 14 0.70 %
false 14 0.70 %

SEO Keywords (Two Word)

Keyword Occurrence Density
time zone 25 1.25 %
can be 21 1.05 %
Druid SQL 16 0.80 %
in the 16 0.80 %
number of 14 0.70 %
will be 14 0.70 %
GROUP BY 14 0.70 %
as a 13 0.65 %
ORDER BY 13 0.65 %
on the 12 0.60 %
True if 12 0.60 %
if x 12 0.60 %
be used 10 0.50 %
the broker 10 0.50 %
a time 10 0.50 %
is not 10 0.50 %
x is 10 0.50 %
of the 10 0.50 %
a timestamp 9 0.45 %
Whether to 9 0.45 %

SEO Keywords (Three Word)

Keyword Occurrence Density Possible Spam
True if x 12 0.60 % No
if x is 10 0.50 % No
it as a 8 0.40 % No
a time zone 8 0.40 % No
returning it as 7 0.35 % No
on the broker 7 0.35 % No
like AmericaLos_Angeles or 7 0.35 % No
offset like 0800 7 0.35 % No
should be a 6 0.30 % No
or offset like 6 0.30 % No
AmericaLos_Angeles or offset 6 0.30 % No
name like AmericaLos_Angeles 6 0.30 % No
zone name like 6 0.30 % No
time zone name 6 0.30 % No
be a time 6 0.30 % No
provided should be 5 0.25 % No
can be used 5 0.25 % No
clause refers to 5 0.25 % No
as a new 5 0.25 % No
a new timestamp 5 0.25 % No

SEO Keywords (Four Word)

Keyword Occurrence Density Possible Spam
True if x is 10 0.50 % No
returning it as a 7 0.35 % No
time zone name like 6 0.30 % No
a time zone name 6 0.30 % No
be a time zone 6 0.30 % No
zone name like AmericaLos_Angeles 6 0.30 % No
name like AmericaLos_Angeles or 6 0.30 % No
like AmericaLos_Angeles or offset 6 0.30 % No
AmericaLos_Angeles or offset like 6 0.30 % No
or offset like 0800 6 0.30 % No
should be a time 5 0.25 % No
it as a new 5 0.25 % No
as a new timestamp 5 0.25 % No
provided should be a 5 0.25 % No
if provided should be 5 0.25 % No
zone if provided should 5 0.25 % No
time zone if provided 5 0.25 % No
The time zone if 5 0.25 % No
timestamp returning it as 4 0.20 % No
SECOND MINUTE HOUR DAY 4 0.20 % No

Druid.io Page Content


SQL

Built-in SQL is an experimental feature. The API described here is subject to change.

Druid SQL is a built-in SQL layer and an alternative to Druid's native JSON-based query language, and is powered by a parser and planner based on Apache Calcite. Druid SQL translates SQL into native Druid queries on the query broker (the first node you query), which are then passed down to data nodes as native Druid queries. Other than the (slight) overhead of translating SQL on the broker, there isn't an extra performance penalty versus native queries.

To enable Druid SQL, make sure you have set druid.sql.enable = true either in your common.runtime.properties or your broker's runtime.properties.

Query syntax

Each Druid datasource appears as a table in the "druid" schema. This is also the default schema, so Druid datasources can be referenced as either druid.dataSourceName or simply dataSourceName.

Identifiers like datasource and column names can optionally be quoted using double quotes. To escape a double quote inside an identifier, use another double quote, like "My ""very own"" identifier". All identifiers are case-sensitive and no implicit case conversions are performed.

Literal strings should be quoted with single quotes, like 'foo'. Literal strings with Unicode escapes can be written like U&'fo\00F6', where character codes in hex are prefixed by a backslash. Literal numbers can be written in forms like 100 (denoting an integer), 100.0 (denoting a floating point value), or 1.0e5 (scientific notation). Literal timestamps can be written like TIMESTAMP '2000-01-01 00:00:00'. Literal intervals, used for time arithmetic, can be written like INTERVAL '1' HOUR, INTERVAL '1 02:03' DAY TO MINUTE, INTERVAL '1-2' YEAR TO MONTH, and so on.

Druid SQL supports SELECT queries with the following structure:

[ EXPLAIN PLAN FOR ]
[ WITH tableName [ ( column1, column2, ... ) ] AS ( query ) ]
SELECT [ ALL | DISTINCT ] { * | exprs }
FROM table
[ WHERE expr ]
[ GROUP BY exprs ]
[ HAVING expr ]
[ ORDER BY expr [ ASC | DESC ], expr [ ASC | DESC ], ... ]
[ LIMIT limit ]

The FROM clause refers to either a Druid datasource, like druid.foo, an INFORMATION_SCHEMA table, a subquery, or a common-table-expression provided in the WITH clause. If the FROM clause references a subquery or a common-table-expression, and both levels of queries are aggregations that cannot be combined into a single level of aggregation, the overall query will be executed as a nested GroupBy.

The WHERE clause refers to columns in the FROM table, and will be translated to native filters. The WHERE clause can also reference a subquery, like WHERE col1 IN (SELECT foo FROM ...). Queries like this are executed as semi-joins, described below.

The GROUP BY clause refers to columns in the FROM table. Using GROUP BY, DISTINCT, or any aggregation functions will trigger an aggregation query using one of Druid's three native aggregation query types. GROUP BY can refer to an expression or a select clause ordinal position (like GROUP BY 2 to group by the second selected column).

The HAVING clause refers to columns that are present after execution of GROUP BY. It can be used to filter on either grouping expressions or aggregated values. It can only be used together with GROUP BY.

The ORDER BY clause refers to columns that are present after execution of GROUP BY. It can be used to order the results based on either grouping expressions or aggregated values. ORDER BY can refer to an expression or a select clause ordinal position (like ORDER BY 2 to order by the second selected column). For non-aggregation queries, ORDER BY can only order by the __time column. For aggregation queries, ORDER BY can order by any column.
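
Putting these clauses together, a minimal sketch of a typical aggregation query follows. The "wikipedia" datasource and its "page" and "added" columns are hypothetical, used only for illustration:

SELECT page, SUM(added) AS total_added
FROM wikipedia
WHERE __time >= TIMESTAMP '2000-01-01 00:00:00'
GROUP BY page
HAVING SUM(added) > 100
ORDER BY 2 DESC
LIMIT 10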

The LIMIT clause can be used to limit the number of rows returned. It can be used with any query type. It is pushed down to data nodes for queries that run with the native TopN query type, but not the native GroupBy query type. Future versions of Druid will support pushing down limits using the native GroupBy query type as well. If you notice that adding a limit doesn't change performance very much, then it's likely that Druid didn't push down the limit for your query.

Add "EXPLAIN PLAN FOR" to the beginning of any query to see how it would be run as a native Druid query. In this case, the query will not actually be executed.

Aggregation functions

Aggregation functions can appear in the SELECT clause of any query. Any aggregator can be filtered using syntax like AGG(expr) FILTER(WHERE whereExpr). Filtered aggregators will only aggregate rows that match their filter. It's possible for two aggregators in the same SQL query to have different filters. Only the COUNT aggregation can accept DISTINCT.

COUNT(*): Counts the number of rows.
COUNT(DISTINCT expr): Counts distinct values of expr, which can be string, numeric, or hyperUnique. By default this is approximate, using a variant of HyperLogLog. To get exact counts set "useApproximateCountDistinct" to "false". If you do this, expr must be string or numeric, since exact counts are not possible using hyperUnique columns. See also APPROX_COUNT_DISTINCT(expr). In exact mode, only one distinct count per query is permitted.
SUM(expr): Sums numbers.
MIN(expr): Takes the minimum of numbers.
MAX(expr): Takes the maximum of numbers.
AVG(expr): Averages numbers.
APPROX_COUNT_DISTINCT(expr): Counts distinct values of expr, which can be a regular column or a hyperUnique column. This is always approximate, regardless of the value of "useApproximateCountDistinct". See also COUNT(DISTINCT expr).
APPROX_QUANTILE(expr, probability, [resolution]): Computes approximate quantiles on numeric or approxHistogram exprs. The "probability" should be between 0 and 1 (exclusive). The "resolution" is the number of centroids to use for the computation. Higher resolutions will give more precise results but also have higher overhead. If not provided, the default resolution is 50. The approximate histogram extension must be loaded to use this function.
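
As a sketch of the FILTER syntax described above (the "wikipedia" datasource and its "channel", "isRobot", and "added" columns are again hypothetical), two aggregators in one query can carry different filters:

SELECT
  COUNT(*) AS total_rows,
  COUNT(*) FILTER(WHERE channel = '#en.wikipedia') AS en_rows,
  SUM(added) FILTER(WHERE isRobot = 'false') AS human_added
FROM wikipedia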

Numeric functions

Numeric functions will return 64 bit integers or 64 bit floats, depending on their inputs.

ABS(expr): Absolute value.
CEIL(expr): Ceiling.
EXP(expr): e to the power of expr.
FLOOR(expr): Floor.
LN(expr): Logarithm (base e).
LOG10(expr): Logarithm (base 10).
POWER(expr, power): expr to a power.
SQRT(expr): Square root.
TRUNCATE(expr[, digits]): Truncate expr to a specific number of decimal digits. If digits is negative, then this truncates that many places to the left of the decimal point. Digits defaults to zero if not specified.
TRUNC(expr[, digits]): Synonym for TRUNCATE.
x + y: Addition.
x - y: Subtraction.
x * y: Multiplication.
x / y: Division.
MOD(x, y): Modulo (remainder of x divided by y).

String functions

String functions accept strings, and return a type appropriate to the function.

x || y: Concatenates strings x and y.
LENGTH(expr): Length of expr in UTF-16 code units.
CHAR_LENGTH(expr): Synonym for LENGTH.
CHARACTER_LENGTH(expr): Synonym for LENGTH.
STRLEN(expr): Synonym for LENGTH.
LOOKUP(expr, lookupName): Look up expr in a registered query-time lookup table.
LOWER(expr): Returns expr in all lowercase.
REGEXP_EXTRACT(expr, pattern, [index]): Apply regular expression pattern and extract a capture group, or null if there is no match. If index is unspecified or zero, returns the substring that matched the pattern.
REPLACE(expr, pattern, replacement): Replaces pattern with replacement in expr, and returns the result.
STRPOS(haystack, needle): Returns the index of needle within haystack, starting from 1. If the needle is not found, returns 0.
SUBSTRING(expr, index, [length]): Returns a substring of expr starting at index, with a max length, both measured in UTF-16 code units.
SUBSTR(expr, index, [length]): Synonym for SUBSTRING.
TRIM([BOTH | LEADING | TRAILING] [<chars> FROM] expr): Returns expr with leading and/or trailing characters in <chars> removed.
BTRIM(expr[, chars]): Alternate form of TRIM(BOTH <chars> FROM <expr>).
LTRIM(expr[, chars]): Alternate form of TRIM(LEADING <chars> FROM <expr>).
RTRIM(expr[, chars]): Alternate form of TRIM(TRAILING <chars> FROM <expr>).
UPPER(expr): Returns expr in all uppercase.
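
A short sketch combining several of these string functions (hypothetical "wikipedia" datasource and columns):

SELECT
  LOWER(page) AS page_lower,
  SUBSTRING(page, 1, 10) AS page_prefix,
  REPLACE(page, '_', ' ') AS page_readable,
  page || ' / ' || channel AS page_channel
FROM wikipedia
LIMIT 5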

Time functions

Time functions can be used with Druid's __time column, with any column storing millisecond timestamps through use of the MILLIS_TO_TIMESTAMP function, or with any column storing string timestamps through use of the TIME_PARSE function. By default, time operations use the UTC time zone. You can change the time zone by setting the connection context parameter "sqlTimeZone" to the name of another time zone, like "America/Los_Angeles", or to an offset like "-08:00". If you need to mix multiple time zones in the same query, or if you need to use a time zone other than the connection time zone, some functions also accept time zones as parameters. These parameters always take precedence over the connection time zone.

CURRENT_TIMESTAMP: Current timestamp in the connection's time zone.
CURRENT_DATE: Current date in the connection's time zone.
DATE_TRUNC(<unit>, <timestamp_expr>): Rounds down a timestamp, returning it as a new timestamp. Unit can be 'milliseconds', 'second', 'minute', 'hour', 'day', 'week', 'month', 'quarter', 'year', 'decade', 'century', or 'millennium'.
TIME_FLOOR(<timestamp_expr>, <period>, [<origin>, [<timezone>]]): Rounds down a timestamp, returning it as a new timestamp. Period can be any ISO8601 period, like P3M (quarters) or PT12H (half-days). The time zone, if provided, should be a time zone name like "America/Los_Angeles" or offset like "-08:00". This function is similar to FLOOR but is more flexible.
TIME_SHIFT(<timestamp_expr>, <period>, <step>, [<timezone>]): Shifts a timestamp by a period (step times), returning it as a new timestamp. Period can be any ISO8601 period. Step may be negative. The time zone, if provided, should be a time zone name like "America/Los_Angeles" or offset like "-08:00".
TIME_EXTRACT(<timestamp_expr>, [<unit>, [<timezone>]]): Extracts a time part from expr, returning it as a number. Unit can be EPOCH, SECOND, MINUTE, HOUR, DAY (day of month), DOW (day of week), DOY (day of year), WEEK (week of week year), MONTH (1 through 12), QUARTER (1 through 4), or YEAR. The time zone, if provided, should be a time zone name like "America/Los_Angeles" or offset like "-08:00". This function is similar to EXTRACT but is more flexible. Unit and time zone must be literals, and must be provided quoted, like TIME_EXTRACT(__time, 'HOUR') or TIME_EXTRACT(__time, 'HOUR', 'America/Los_Angeles').
TIME_PARSE(<string_expr>, [<pattern>, [<timezone>]]): Parses a string into a timestamp using a given Joda DateTimeFormat pattern, or ISO8601 (e.g. 2000-01-02T03:04:05Z) if the pattern is not provided. The time zone, if provided, should be a time zone name like "America/Los_Angeles" or offset like "-08:00", and will be used as the time zone for strings that do not include a time zone offset. Pattern and time zone must be literals. Strings that cannot be parsed as timestamps will be returned as NULL.
TIME_FORMAT(<timestamp_expr>, [<pattern>, [<timezone>]]): Formats a timestamp as a string with a given Joda DateTimeFormat pattern, or ISO8601 (e.g. 2000-01-02T03:04:05Z) if the pattern is not provided. The time zone, if provided, should be a time zone name like "America/Los_Angeles" or offset like "-08:00". Pattern and time zone must be literals.
MILLIS_TO_TIMESTAMP(millis_expr): Converts a number of milliseconds since the epoch into a timestamp.
TIMESTAMP_TO_MILLIS(timestamp_expr): Converts a timestamp into a number of milliseconds since the epoch.
EXTRACT(<unit> FROM timestamp_expr): Extracts a time part from expr, returning it as a number. Unit can be EPOCH, SECOND, MINUTE, HOUR, DAY (day of month), DOW (day of week), DOY (day of year), WEEK (week of year), MONTH, QUARTER, or YEAR. Units must be provided unquoted, like EXTRACT(HOUR FROM __time).
FLOOR(timestamp_expr TO <unit>): Rounds down a timestamp, returning it as a new timestamp. Unit can be SECOND, MINUTE, HOUR, DAY, WEEK, MONTH, QUARTER, or YEAR.
CEIL(timestamp_expr TO <unit>): Rounds up a timestamp, returning it as a new timestamp. Unit can be SECOND, MINUTE, HOUR, DAY, WEEK, MONTH, QUARTER, or YEAR.
TIMESTAMPADD(<unit>, <count>, <timestamp>): Equivalent to timestamp + count * INTERVAL '1' UNIT.
timestamp_expr { + | - } <interval_expr>: Adds an amount of time to, or subtracts it from, a timestamp.

Comparison operators

x = y: Equals.
x <> y: Not-equals.
x > y: Greater than.
x >= y: Greater than or equal to.
x < y: Less than.
x <= y: Less than or equal to.
x BETWEEN y AND z: Equivalent to x >= y AND x <= z.
x NOT BETWEEN y AND z: Equivalent to x < y OR x > z.
x LIKE pattern [ESCAPE esc]: True if x matches a SQL LIKE pattern (with an optional escape).
x NOT LIKE pattern [ESCAPE esc]: True if x does not match a SQL LIKE pattern (with an optional escape).
x IS NULL: True if x is NULL or empty string.
x IS NOT NULL: True if x is neither NULL nor empty string.
x IS TRUE: True if x is true.
x IS NOT TRUE: True if x is not true.
x IS FALSE: True if x is false.
x IS NOT FALSE: True if x is not false.
x IN (values): True if x is one of the listed values.
x NOT IN (values): True if x is not one of the listed values.
x IN (subquery): True if x is returned by the subquery. See the semi-join discussion under "Query execution" below for details about how Druid SQL handles IN (subquery).
x NOT IN (subquery): True if x is not returned by the subquery. See "Query execution" below for details about how Druid SQL handles IN (subquery).
x AND y: Boolean AND.
x OR y: Boolean OR.
NOT x: Boolean NOT.

Other functions

CAST(value AS TYPE): Cast value to another type. See "Data types and casts" for details about how Druid SQL handles CAST.
CASE expr WHEN value1 THEN result1 [ WHEN value2 THEN result2 ... ] [ ELSE resultN ] END: Simple CASE.
CASE WHEN boolean_expr1 THEN result1 [ WHEN boolean_expr2 THEN result2 ... ] [ ELSE resultN ] END: Searched CASE.
NULLIF(value1, value2): Returns NULL if value1 and value2 match, else returns value1.
COALESCE(value1, value2, ...): Returns the first value that is neither NULL nor empty string.
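
A sketch mixing time functions with the interval arithmetic from the syntax section (hypothetical "wikipedia" datasource):

SELECT
  TIME_FLOOR(__time, 'PT1H') AS hour_bucket,
  TIME_EXTRACT(__time, 'HOUR', 'America/Los_Angeles') AS hour_in_la,
  COUNT(*) AS edits
FROM wikipedia
WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '1' DAY
GROUP BY 1, 2
ORDER BY 1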

Unsupported features

Druid does not support all SQL features, including:
OVER clauses, and analytic functions such as LAG and LEAD.
JOIN clauses, other than semi-joins as described above.
OFFSET clauses.
DDL and DML.

Additionally, some Druid features are not supported by the SQL language. Some unsupported Druid features include:
Multi-value dimensions.
DataSketches aggregators.

Data types and casts

Druid natively supports five basic column types: "long" (64 bit signed int), "float" (32 bit float), "double" (64 bit float), "string" (UTF-8 encoded strings), and "complex" (catch-all for more exotic data types like hyperUnique and approxHistogram columns). Timestamps (including the __time column) are stored as longs, with the value being the number of milliseconds since 1 January 1970 UTC.

At runtime, Druid may widen 32-bit floats to 64-bit for certain operators, like SUM aggregators. The reverse will not happen: 64-bit floats are not narrowed to 32-bit.

Druid often treats NULLs and empty strings interchangeably, rather than according to the SQL standard. As such, Druid SQL only has partial support for NULLs. For example, the expressions col IS NULL and col = '' are equivalent, and both will evaluate to true if col contains an empty string. Similarly, the expression COALESCE(col1, col2) will return col2 if col1 is an empty string. While the COUNT(*) aggregator counts all rows, the COUNT(expr) aggregator will count the number of rows where expr is neither null nor the empty string. String columns in Druid are NULLable. Numeric columns are NOT NULL; if you query a numeric column that is not present in all segments of your Druid datasource, then it will be treated as zero for rows from those segments.

For mathematical operations, Druid SQL will use integer math if all operands involved in an expression are integers. Otherwise, Druid will switch to floating point math. You can force this to happen by casting one of your operands to FLOAT.

The following table describes how SQL types map onto Druid types during query runtime. Casts between two SQL types that have the same Druid runtime type will have no effect, other than exceptions noted in the table. Casts between two SQL types that have different Druid runtime types will generate a runtime cast in Druid. If a value cannot be properly cast to another value, as in CAST('foo' AS BIGINT), the runtime will substitute a default value. NULL values cast to non-nullable types will also be substituted with a default value (for example, nulls cast to numbers will be converted to zeroes).

SQL type | Druid runtime type | Default value | Notes
CHAR | STRING | '' |
VARCHAR | STRING | '' | Druid STRING columns are reported as VARCHAR
DECIMAL | DOUBLE | 0.0 | DECIMAL uses floating point, not fixed point math
FLOAT | FLOAT | 0.0 | Druid FLOAT columns are reported as FLOAT
REAL | DOUBLE | 0.0 |
DOUBLE | DOUBLE | 0.0 | Druid DOUBLE columns are reported as DOUBLE
BOOLEAN | LONG | false |
TINYINT | LONG | 0 |
SMALLINT | LONG | 0 |
INTEGER | LONG | 0 |
BIGINT | LONG | 0 | Druid LONG columns (except __time) are reported as BIGINT
TIMESTAMP | LONG | 0, meaning 1970-01-01 00:00:00 UTC | Druid's __time column is reported as TIMESTAMP. Casts between string and timestamp types assume standard SQL formatting, e.g. 2000-01-02 03:04:05, not ISO8601 formatting. For handling other formats, use one of the time functions.
DATE | LONG | 0, meaning 1970-01-01 | Casting TIMESTAMP to DATE rounds down the timestamp to the nearest day. Casts between string and date types assume standard SQL formatting, e.g. 2000-01-02. For handling other formats, use one of the time functions.
OTHER | COMPLEX | none | May represent various Druid column types such as hyperUnique, approxHistogram, etc.
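
To make the cast rules concrete, a small sketch (hypothetical "wikipedia" datasource; the comments restate the behavior described above):

SELECT
  CAST(added AS FLOAT) / delta AS ratio,    -- casting one operand forces floating point math
  CAST(__time AS VARCHAR) AS time_string,   -- standard SQL format, e.g. 2000-01-02 03:04:05
  CAST('foo' AS BIGINT) AS unparseable      -- cannot be cast, so the default value 0 is substituted
FROM wikipedia
LIMIT 5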

Query execution

Queries without aggregations will use Druid's Scan or Select native query types. Scan is used whenever possible, as it is generally higher performance and more efficient than Select. However, Select is used in one case: when the query includes an ORDER BY __time, since Scan does not have a sorting feature.

Aggregation queries (using GROUP BY, DISTINCT, or any aggregation functions) will use one of Druid's three native aggregation query types. Two (Timeseries and TopN) are specialized for specific types of aggregations, whereas the other (GroupBy) is general-purpose.

Timeseries is used for queries that GROUP BY FLOOR(__time TO <unit>) or TIME_FLOOR(__time, period), have no other grouping expressions, no HAVING or LIMIT clauses, no nesting, and either no ORDER BY, or an ORDER BY that orders by the same expression as present in GROUP BY. Druid also uses Timeseries for "grand total" queries that have aggregation functions but no GROUP BY. This query type takes advantage of the fact that Druid segments are sorted by time.

TopN is used by default for queries that group by a single expression, do have ORDER BY and LIMIT clauses, do not have HAVING clauses, and are not nested. However, the TopN query type will deliver approximate ranking and results in some cases; if you want to avoid this, set "useApproximateTopN" to "false". TopN results are always computed in memory. See the TopN documentation for more details.

GroupBy is used for all other aggregations, including any nested aggregation queries. Druid's GroupBy is a traditional aggregation engine: it delivers exact results and rankings and supports a wide variety of features. GroupBy aggregates in memory if it can, but it may spill to disk if it doesn't have enough memory to complete your query. Results are streamed back from data nodes through the broker if you ORDER BY the same expressions in your GROUP BY clause, or if you don't have an ORDER BY at all. If your query has an ORDER BY referencing expressions that don't appear in the GROUP BY clause (like aggregation functions) then the broker will materialize a list of results in memory, up to a max of your LIMIT, if any. See the GroupBy documentation for details about tuning performance and memory use.

If your query does nested aggregations (an aggregation subquery in your FROM clause) then Druid will execute it as a nested GroupBy. In nested GroupBys, the innermost aggregation is distributed, but all outer aggregations beyond that take place locally on the query broker.

Semi-join queries containing WHERE clauses like col IN (SELECT expr FROM ...) are executed with a special process. The broker will first translate the subquery into a GroupBy to find distinct values of expr. Then, the broker will rewrite the subquery to a literal filter, like col IN (val1, val2, ...), and run the outer query. The configuration parameter druid.sql.planner.maxSemiJoinRowsInMemory controls the maximum number of values that will be materialized for this kind of plan.

For all native query types, filters on the __time column will be translated into top-level query "intervals" whenever possible, which allows Druid to use its global time index to quickly prune the set of data that must be scanned. In addition, Druid will use indexes local to each data node to further speed up WHERE evaluation. This can typically be done for filters that involve boolean combinations of references to and functions of single columns, like WHERE col1 = 'a' AND col2 = 'b', but not WHERE col1 = col2.
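
Two sketches of this planning behavior (datasources and columns hypothetical). The first filter can be translated into top-level query intervals; the second triggers the semi-join rewrite:

-- Prunable: the __time bounds become the native query's "intervals",
-- so only segments covering January 2000 are scanned.
SELECT COUNT(*)
FROM wikipedia
WHERE __time >= TIMESTAMP '2000-01-01 00:00:00'
  AND __time < TIMESTAMP '2000-02-01 00:00:00'

-- Semi-join: the subquery runs first as a GroupBy over distinct values,
-- which are then rewritten into a literal filter on the outer query.
SELECT COUNT(*)
FROM wikipedia
WHERE cityName IN (SELECT cityName FROM city_allowlist)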

Approximate algorithms

Druid SQL will use approximate algorithms in some situations:

The COUNT(DISTINCT col) aggregation function by default uses a variant of HyperLogLog, a fast approximate distinct counting algorithm. Druid SQL will switch to exact distinct counts if you set "useApproximateCountDistinct" to "false", either through query context or through broker configuration.
GROUP BY queries over a single column with ORDER BY and LIMIT may be executed using the TopN engine, which uses an approximate algorithm. Druid SQL will switch to an exact grouping algorithm if you set "useApproximateTopN" to "false", either through query context or through broker configuration.
The APPROX_COUNT_DISTINCT and APPROX_QUANTILE aggregation functions always use approximate algorithms, regardless of configuration.
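
A sketch contrasting the two distinct-count paths described above (hypothetical datasource and column):

-- Approximate by default (HyperLogLog variant); exact only if
-- "useApproximateCountDistinct" is set to "false".
SELECT COUNT(DISTINCT cityName) FROM wikipedia

-- Always approximate, regardless of configuration.
SELECT APPROX_COUNT_DISTINCT(cityName) FROM wikipedia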

Client APIs

JSON over HTTP

You can make Druid SQL queries using JSON over HTTP by posting to the endpoint /druid/v2/sql/. The request should be a JSON object with a "query" field, like {"query" : "SELECT COUNT(*) FROM data_source WHERE foo = 'bar'"}. Results are available in two formats: "object" (the default; a JSON array of JSON objects), and "array" (a JSON array of JSON arrays). In "object" form, each row's field names will match the column names from your SQL query. In "array" form, each row's values are returned in the order specified in your SQL query.

You can use curl to send SQL queries from the command-line:

$ cat query.json
{"query":"SELECT COUNT(*) AS TheCount FROM data_source"}
$ curl -XPOST -H'Content-Type: application/json' http://BROKER:8082/druid/v2/sql/ -d @query.json
[{"TheCount":24433}]

Metadata is available over the HTTP API by querying the "INFORMATION_SCHEMA" tables. Finally, you can also provide connection context parameters by adding a "context" map, like:

{
  "query" : "SELECT COUNT(*) FROM data_source WHERE foo = 'bar' AND __time > TIMESTAMP '2000-01-01 00:00:00'",
  "context" : { "sqlTimeZone" : "America/Los_Angeles" }
}

JDBC

You can make Druid SQL queries using the Avatica JDBC driver. Once you've downloaded the Avatica client jar, add it to your classpath and use the connect string jdbc:avatica:remote:url=http://BROKER:8082/druid/v2/sql/avatica/. Example code:

// Connect to /druid/v2/sql/avatica/ on your broker.
String url = "jdbc:avatica:remote:url=http://localhost:8082/druid/v2/sql/avatica/";

// Set any connection context parameters you need here (see "Connection context" below).
// Or leave empty for default behavior.
Properties connectionProperties = new Properties();

try (Connection connection = DriverManager.getConnection(url, connectionProperties)) {
  try (
      final Statement statement = connection.createStatement();
      final ResultSet resultSet = statement.executeQuery(query)
  ) {
    while (resultSet.next()) {
      // Do something
    }
  }
}

Table metadata is available over JDBC using connection.getMetaData() or by querying the "INFORMATION_SCHEMA" tables. Parameterized queries (using ? or other placeholders) don't work properly, so avoid those.

Connection Stickiness

Druid's JDBC server does not share connection state between brokers. This means that if you're using JDBC and have multiple Druid brokers, you should either connect to a specific broker, or use a load balancer with sticky sessions enabled. The Druid Router node provides connection stickiness when balancing JDBC requests. Please see the Router documentation for more details. Note that the non-JDBC JSON over HTTP API is stateless and does not require stickiness.

Connection context

Druid SQL supports setting connection parameters on the client. The parameters in the table below affect SQL planning. All other context parameters you provide will be attached to Druid queries and can affect how they run. See Query context for details on the possible options. Connection context can be specified as JDBC connection properties or as a "context" object in the JSON API.

Parameter | Description | Default value
sqlTimeZone | Sets the time zone for this connection, which will affect how time functions and timestamp literals behave. Should be a time zone name like "America/Los_Angeles" or offset like "-08:00". | UTC
useApproximateCountDistinct | Whether to use an approximate cardinality algorithm for COUNT(DISTINCT foo). | druid.sql.planner.useApproximateCountDistinct on the broker
useApproximateTopN | Whether to use approximate TopN queries when a SQL query could be expressed as one. If false, exact GroupBy queries will be used instead. | druid.sql.planner.useApproximateTopN on the broker
useFallback | Whether to evaluate operations on the broker when they cannot be expressed as Druid queries. This option is not recommended for production since it can generate unscalable query plans. If false, SQL queries that cannot be translated to Druid queries will fail. | druid.sql.planner.useFallback on the broker

Retrieving metadata

Druid brokers infer table and column metadata for each dataSource from segments loaded in the cluster, and use this to plan SQL queries. This metadata is cached on broker startup and also updated periodically in the background through SegmentMetadata queries. Background metadata refreshing is triggered by segments entering and exiting the cluster, and can also be throttled through configuration.

You can access table and column metadata through JDBC using connection.getMetaData(), or through the INFORMATION_SCHEMA tables described below. For example, to retrieve metadata for the Druid datasource "foo", use the query:

SELECT * FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_SCHEMA = 'druid' AND TABLE_NAME = 'foo'

SCHEMATA table

Column | Notes
CATALOG_NAME | Unused
SCHEMA_NAME |
SCHEMA_OWNER | Unused
DEFAULT_CHARACTER_SET_CATALOG | Unused
DEFAULT_CHARACTER_SET_SCHEMA | Unused
DEFAULT_CHARACTER_SET_NAME | Unused
SQL_PATH | Unused

TABLES table

Column | Notes
TABLE_CATALOG | Unused
TABLE_SCHEMA |
TABLE_NAME |
TABLE_TYPE | "TABLE" or "SYSTEM_TABLE"

COLUMNS table

Column | Notes
TABLE_CATALOG | Unused
TABLE_SCHEMA |
TABLE_NAME |
COLUMN_NAME |
ORDINAL_POSITION |
COLUMN_DEFAULT | Unused
IS_NULLABLE |
DATA_TYPE |
CHARACTER_MAXIMUM_LENGTH | Unused
CHARACTER_OCTET_LENGTH | Unused
NUMERIC_PRECISION |
NUMERIC_PRECISION_RADIX |
NUMERIC_SCALE |
DATETIME_PRECISION |
CHARACTER_SET_NAME |
COLLATION_NAME |
JDBC_TYPE | Type code from java.sql.Types (Druid extension)
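
Similarly, a sketch that lists the datasources visible to Druid SQL using the TABLES table documented above:

SELECT TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = 'druid'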

Server configuration

The Druid SQL server is configured through the following properties on the broker.

Property | Description | Default
druid.sql.enable | Whether to enable SQL at all, including background metadata fetching. If false, this overrides all other SQL-related properties and disables SQL metadata, serving, and planning completely. | false
druid.sql.avatica.enable | Whether to enable JDBC querying at /druid/v2/sql/avatica/. | true
druid.sql.avatica.maxConnections | Maximum number of open connections for the Avatica server. These are not HTTP connections, but are logical client connections that may span multiple HTTP connections. | 50
druid.sql.avatica.maxRowsPerFrame | Maximum number of rows to return in a single JDBC frame. Setting this property to -1 indicates that no row limit should be applied. Clients can optionally specify a row limit in their requests; if a client specifies a row limit, the lesser value of the client-provided limit and maxRowsPerFrame will be used. | 100,000
druid.sql.avatica.maxStatementsPerConnection | Maximum number of simultaneous open statements per Avatica client connection. | 1
druid.sql.avatica.connectionIdleTimeout | Avatica client connection idle timeout. | PT5M
druid.sql.http.enable | Whether to enable JSON over HTTP querying at /druid/v2/sql/. | true
druid.sql.planner.maxQueryCount | Maximum number of queries to issue, including nested queries. Set to 1 to disable sub-queries, or set to 0 for unlimited. | 8
druid.sql.planner.maxSemiJoinRowsInMemory | Maximum number of rows to keep in memory for executing two-stage semi-join queries like SELECT * FROM Employee WHERE DeptName IN (SELECT DeptName FROM Dept). | 100000
druid.sql.planner.maxTopNLimit | Maximum threshold for a TopN query. Higher limits will be planned as GroupBy queries instead. | 100000
druid.sql.planner.metadataRefreshPeriod | Throttle for metadata refreshes. | PT1M
druid.sql.planner.selectPageSize | Page size threshold for Select queries. Select queries for larger result sets will be issued consecutively using pagination. | 1000
druid.sql.planner.useApproximateCountDistinct | Whether to use an approximate cardinality algorithm for COUNT(DISTINCT foo). | true
druid.sql.planner.useApproximateTopN | Whether to use approximate TopN queries when a SQL query could be expressed as one. If false, exact GroupBy queries will be used instead. | true
druid.sql.planner.useFallback | Whether to evaluate operations on the broker when they cannot be expressed as Druid queries. This option is not recommended for production since it can generate unscalable query plans. If false, SQL queries that cannot be translated to Druid queries will fail. | false

Except where otherwise noted, licensed under CC BY-SA 4.0