Functions
Edit
Coalesce
The COALESCE()
function returns the first non-null value in a list.
Syntax: COALESCE(val1, val2, ...., val_n)
Technical Details
Works in: | SQL Server (starting with 2008), Azure SQL Database, Azure SQL Data Warehouse, Parallel Data Warehouse |
---|---|
- If we want to substitute a value from another column to use for the missing value in a different column, then we would use
COALESCE
. - Let’s say we want to look at product name, but if that data is missing we want it to substitute product code for the product name, and we’ll put the new data in a new column
SELECTCOALESCE(product,product_code)AS product_info
FROM MyTable
- If you remember the example in Merge: what we are asking is, if the value of warehouse is
NULL
then substitute it with ‘All warehouses’ - As you see in the image that’s what happened in the total row
SELECT COALESCE(warehouse, 'All warehouses') AS warehouse
- Here is another example
COALESCE(NULL, 'Pick Me', 'No Pick ME') AS answer
SELECT
[OUTPUT] Pick Me
Reduce
Trim
The TRIM() function removes the space character OR other specified characters from the start or end of a string.
By default, the TRIM() function removes leading and trailing spaces from a string.
Syntax
TRIM([characters FROM ]string)
Parameter Values
Parameter | Description |
---|---|
characters FROM | Optional. Specific characters to remove |
string | Required. The string to remove spaces or characters from |
SELECT TRIM(' SQL Tutorial ') AS trimmed_string
FROM
WHERE TRIM(order_state) = "PA"
- Specify the characters to be trimmed: [#! ] being # and ! and < space >
TRIM('#! ' FROM ' #SQL Tutorial! ') AS TrimmedString; SELECT
Ltrim & Rtrim
The LTRIM()
function removes leading spaces from a string.
The RTRIM()
function removes trailing spaces from a string.
Measure & Aggregate
Min & Max
What’s convenient with SQL is you can query the data using calculations and criteria, instead of extracting the data and then manipulating it, you can start data manipulation prior to connecting to the database.
- Here for example we can extract the
Min
andMax
values of a column and set them to anALIAS
right at the start
SELECT MIN (used_date) AS min_date,
MAX (used_date) AS max_date
FROM TableName;
Avg
- Calculates the
AVG
of the argument
AVG (seasonal_sales) AS avg_sales_sea SELECT
Avg/Per
- Let’s say we want to calculate the average cost per paperback book sold
- So we will need the total cost/ total units(books)
- So it’s a combination of two functions
/quantity)
SELECT AVG(cost
FROM Books_table= "paperback" WHERE book_type
Sum
- Calculates the
SUM
of the argument
SUM (seasonal_sales) AS sum_sales_sea
SELECT
# ---- OR with WHERE ----
SUM(seasonal_sales) >= projections_sales WHERE
Count
- Sums up the occurrence of the argument, or counts the number of rows that matches a specified criterion
#_____ gives us the total number of products in the table
COUNT(*) AS inventory_count
SELECT
FROM ProductTable
COUNT(order_states) AS num_states
SELECT
COUNT(games = "INDOOR") AS indoor_games
SELECT
# ---- OR WITH WHERE -----
COUNT(games) AS indoor_games
SELECT = "INDOOR"
WHERE games
# --- ANOTHER EXAMPLE WITH 2 CONDITIONALS
COUNT(games = "INDOOR") > MIN(required_games) AS min_indoor_games
SELECT
#---- OR with WHERE ------
SELECT games AS min_indoor_gamesCOUNT(games = "INDOOR") > MIN(required_games) WHERE
Count Distinct
The COUNT
function returns the number of records that are returned by a query that summed the occurrence of a variable. COUNT DISTINCT
performs the same function but removes both duplicate rows of the same data and null values from the result set.
#___ Here we don't want to count the number of orders but
#___ the distinct number of accounts that placed an order
SELECTCOUNT(DISTINCT order_id) AS num_orders
Convert
Cast
- Convert any data type to another
- Syntax:
CAST(expression AS datatype(length)
)
expression | Required. The value to convert |
datatype | Required. The datatype to convert expression to. Can be one of the following: bigint, int, smallint, tinyint, bit, decimal, numeric, money, smallmoney, float, real, datetime, smalldatetime, char, varchar, text, nchar, nvarchar, ntext, binary, varbinary, or image |
(length) | Optional. The length of the resulting data type (for char, varchar, nchar, nvarchar, binary and varbinary) |
Works in: | SQL Server (starting with 2008), Azure SQL Database, Azure SQL Data Warehouse, Parallel Data Warehouse |
---|---|
#___ change price to a floating number, ride to a string of variable length
SELECT CAST (purchase_price AS FLOAT64)
CAST (ride AS VARCHAR)
#___ change order_date from datetime to date but filter out specific date range
WHERECAST (order_date AS DATE) BETWEEN '2020-12-01' AND '2020-12-31'
#___ change and sort price as a floating numeric
ORDER BY CAST (purchase_price AS FLOAT64) DESC
Cast - BigQuery Specific
In BigQuery SQL any alteration using ALTER TABLE
is stored into a temporary table, and if you change the destination location you are given an DLL error so the only other option is to use CAST
, as seen below.
IMPORTANT: you have to select all columns in the table including the column you want to change data type. If you just choose the one column and set the destination of the query to override the existing table it will create a new table with JUST ONE COLUMN the column you
CAST
upon.
So always SELECT *
and then CAST
certain column(s)
SELECT *,
CAST(ride_id AS VARCHAR)
FROM TableName
Safe_cast
Using the CAST
function in a query that fails returns an error in BigQuery. To avoid errors in the event of a failed query, use the SAFE_CAST
function instead. The SAFE_CAST
function returns a value of Null instead of an error when a query fails.
The syntax for SAFE_CAST
is the same as for CAST. Simply substitute the function directly in your queries. The following SAFE_CAST
statement returns a string from a date.
SAFE_CAST(MyDate AS STRING)
SELECT FROM MyTable
Scalar
Scalars perform operations on every input value
Round
ROUND(number, decimals, operation)
Parameter | Description |
---|---|
number | Required. The number to be rounded |
decimals | Required. The number of decimal places to round number to |
operation | Optional. If 0, it rounds the result to the number of decimal. If another value than 0, it truncates the result to the number of decimals. Default value is 0 |
Rounds the first argument to the # of decimals specified
ROUND(235.415, 2) AS round_value
SELECT 235.420
[OUTPUT]
#___ any number other than 0 will truncate the output
ROUND(235.415, 2,1) AS round_operate
SELECT 235.410 [OUTPUT]
Round Col Values
- Of course we can round values before we filter them
- Let’s say we want to
round
the cost column in a table
SELECT ROUND(cost) FROM Petrescue_table
Length
Returns the length of a string (in bytes).
- Let’s say a field is designed to be of 5 character length, but it appears that some were more so we can search for the fields that exceeded expectations by checking the
LENGTH
of the field - They work on string type values like
VAR
andVARCHAR
SELECT customer_name
FROM CustomersLENGTH (customer_name) > 5 WHERE
Ucase & Lcase
- We can convert strings to U or L case values
SELECT UCASE(animal_type) FROM Petrescue_table
- Scalar can be used in a
WHERE
clause - In the event that we don’t know if the values are upper, lower or a mixture of the two we should convert the entire column first then use the
WHERE
clause on it
*
SELECT
FROM Petrescue_table= 'dog' WHERE LCASE(animal_type)
Chain Together
- We can have a scalar inside of another function
- If we want to know the unique type of animals in the dataset we can do this
SELECT DISTINCT(UCASE(animal_type)) FROM Petrescue_table
Summary
This is a summary table of 4 sections, so it’ll be at the end of each of these sections:
- Functions
- Subqueries
- Implicit Joins
- DateTime
Command | Syntax (MySQL/DB2) | Description | Example (MySQL/DB2) |
---|---|---|---|
COUNT | SELECT COUNT(column_name) FROM table_name WHERE condition; |
COUNT function returns the number of rows that match a specified criterion. |
SELECT COUNT(dep_id) FROM employees; |
AVG | SELECT AVG(column_name) FROM table_name WHERE condition; |
AVG function returns the average value of a numeric column. |
SELECT AVG(salary) FROM employees; |
SUM | SELECT SUM(column_name) FROM table_name WHERE condition; |
SUM function returns the total sum of a numeric column. |
SELECT SUM(salary) FROM employees; |
MIN | SELECT MIN(column_name) FROM table_name WHERE condition; |
MIN function returns the smallest value of the SELECTED column. |
SELECT MIN(salary) FROM employees; |
MAX | SELECT MAX(column_name) FROM table_name WHERE condition; |
MAX function returns the largest value of the SELECTED column. |
SELECT MAX(salary) FROM employees; |
ROUND | SELECT ROUND(2number, decimals, operation) AS RoundValue; |
ROUND function rounds a number to a specified number of decimal places. |
SELECT ROUND(salary) FROM employees; |
LENGTH | SELECT LENGTH(column_name) FROM table; |
LENGTH function returns the length of a string (in bytes). |
SELECT LENGTH(f_name) FROM employees; |
UCASE | SELECT UCASE(column_name) FROM table; |
UCASE function displays the column name in each table in uppercase. |
SELECT UCASE(f_name) FROM employees; |
LCASE | SELECT LCASE(column_name) FROM table; |
LCASE function displays the column name in each table in lowercase. |
SELECT LCASE(f_name) FROM employees; |
DISTINCT | SELECT DISTINCT column_name FROM table; |
DISTINCT function is used to display data without duplicates. |
SELECT DISTINCT UCASE(f_name) FROM employees; |
DAY | SELECT DAY(column_name) FROM table |
DAY function returns the day of the month for a given date. |
SELECT DAY(b_date) FROM employees where emp_id = 'E1002'; |
CURRENT_DATE | SELECT CURRENT_DATE; |
CURRENT_DATE is used to display the current date. |
SELECT CURRENT_DATE; |
DATEDIFF() | SELECT DATEDIFF(date1, date2); |
DATEDIFF() is used to calculate the difference between two dates or time stamps. The default value generated is the difference in number of days. |
SELECT DATEDIFF(CURRENT_DATE, date_column) FROM table; |
FROM_DAYS() | SELECT FROM_DAYS(number_of_days); |
FROM_DAYS() is used to convert a given number of days to YYYY-MM-DD format. |
SELECT FROM_DAYS(DATEDIFF(CURRENT_DATE, date_column)) FROM table; |
DATE_ADD() | SELECT DATE_ADD(date, INTERVAL n type); |
DATE_ADD() is used to calculate the date after lapse of mentioned number of units of date type, i.e. if n=3 and type=DAY, the result is a date 3 days after what is mentioned in date column. The type valiable can also be months or years. |
SELECT DATE_ADD(date, INTERVAL 3 DAY);; |
DATE_SUB() | SELECT DATE_SUB(date, INTERVAL n type); |
DATE_SUB() is used to calculate the date prior to the record date by mentioned number of units of date type, i.e. if n=3 and type=DAY, the result is a date 3 days before what is mentioned in date column. The type valiable can also be months or years. |
SELECT DATE_SUB(date, INTERVAL 3 DAY);; |
Subquery | SELECT column_name [, column_name ] FROM table1 [, table2 ] WHERE column_name OPERATOR (SELECT column_name [, column_name ] FROM table1 [, table2 ] [WHERE]) |
A subquery is used to return data that will be used in the main query as a condition to further restrict the data to be retrieved. |
|
Implicit Inner Join | SELECT column_name(s) FROM table1, table2 WHERE table1.column_name = table2.column_name; |
Implicit Inner Join combines two or more records but displays only matching values in both tables. Inner join applies only the specified columns. |
SELECT * FROM employees, jobs where employees.job_id = jobs.job_ident; |
Implicit Cross Join | SELECT column_name(s) FROM table1, table2; |
Implicit Cross Join is defined as a Cartesian product where the number of rows in the first table is multiplied by the number of rows in the second table. |
SELECT * FROM employees, jobs; |