Functions


Edit


Coalesce

The COALESCE() function returns the first non-null value in a list.

Syntax: COALESCE(val1, val2, ...., val_n)

Technical Details

Works in: SQL Server (starting with 2008), Azure SQL Database, Azure SQL Data Warehouse, Parallel Data Warehouse
  • If we want to substitute a value from another column to use for the missing value in a different column, then we would use COALESCE.
  • Let’s say we want to look at product name, but if that data is missing we want it to substitute product code for the product name, and we’ll put the new data in a new column
SELECT
     COALESCE(product,product_code)AS product_info
FROM
    MyTable
  • If you remember the example in Merge: what we are asking is, if the value of warehouse is NULL then substitute it with ‘All warehouses’
  • As you see in the image that’s what happened in the total row
SELECT 
    COALESCE(warehouse, 'All warehouses') AS warehouse

  • Here is another example
SELECT  COALESCE(NULL, 'Pick Me', 'No Pick ME') AS answer

[OUTPUT] Pick Me

Reduce


Trim

The TRIM() function removes the space character OR other specified characters from the start or end of a string.

By default, the TRIM() function removes leading and trailing spaces from a string.

Syntax

TRIM([characters FROM ]string)

Parameter Values

Parameter Description
characters FROM Optional. Specific characters to remove
string Required. The string to remove spaces or characters from
SELECT   
        TRIM('   SQL   Tutorial ')  AS trimmed_string
FROM
WHERE 
        TRIM(order_state) = "PA"
  • Specify the characters to be trimmed: [#! ] being # and ! and < space >
SELECT TRIM('#! ' FROM '    #SQL Tutorial!    ') AS TrimmedString;

Ltrim & Rtrim

The LTRIM() function removes leading spaces from a string.

The RTRIM() function removes trailing spaces from a string.

Measure & Aggregate


Min & Max

What’s convenient with SQL is you can query the data using calculations and criteria, instead of extracting the data and then manipulating it, you can start data manipulation prior to connecting to the database.

  • Here for example we can extract the Min and Max values of a column and set them to an ALIAS right at the start
SELECT 
        MIN (used_date) AS min_date,
        MAX (used_date) AS max_date
FROM    
        TableName;

Avg

  • Calculates the AVG of the argument
SELECT    AVG (seasonal_sales) AS avg_sales_sea 

Avg/Per

  • Let’s say we want to calculate the average cost per paperback book sold
  • So we will need the total cost/ total units(books)
  • So it’s a combination of two functions
SELECT  AVG(cost/quantity)
FROM    Books_table
WHERE   book_type = "paperback"

Sum

  • Calculates the SUM of the argument
SELECT    SUM (seasonal_sales) AS sum_sales_sea 

# ---- OR with WHERE ----
WHERE   SUM(seasonal_sales) >= projections_sales

Count

  • Sums up the occurrence of the argument, or counts the number of rows that matches a specified criterion
#_____ gives us the total number of products in the table
SELECT   COUNT(*) AS   inventory_count       
FROM     ProductTable


SELECT   COUNT(order_states) AS num_states


SELECT   COUNT(games = "INDOOR") AS indoor_games

# ---- OR WITH WHERE -----
SELECT   COUNT(games) AS indoor_games
WHERE    games = "INDOOR"

# --- ANOTHER EXAMPLE WITH 2 CONDITIONALS
SELECT   COUNT(games = "INDOOR") > MIN(required_games) AS min_indoor_games

#---- OR with WHERE ------
SELECT   games AS min_indoor_games
WHERE    COUNT(games = "INDOOR") > MIN(required_games)

Count Distinct

The COUNT function returns the number of records that are returned by a query that summed the occurrence of a variable. COUNT DISTINCT performs the same function but removes both duplicate rows of the same data and null values from the result set.

#___ Here we don't want to count the number of orders but
#___ the distinct number of accounts that placed an order
SELECT
     COUNT(DISTINCT order_id) AS num_orders

Convert


Cast

  • Convert any data type to another
  • Syntax: CAST(expression AS datatype(length))
expression Required. The value to convert
datatype Required. The datatype to convert expression to. Can be one of the following: bigint, int, smallint, tinyint, bit, decimal, numeric, money, smallmoney, float, real, datetime, smalldatetime, char, varchar, text, nchar, nvarchar, ntext, binary, varbinary, or image
(length) Optional. The length of the resulting data type (for char, varchar, nchar, nvarchar, binary and varbinary)
Works in: SQL Server (starting with 2008), Azure SQL Database, Azure SQL Data Warehouse, Parallel Data Warehouse
#___ change price to a floating number, ride to a string of variable length
SELECT 
        CAST (purchase_price AS FLOAT64)
        CAST (ride AS VARCHAR)

#___ change order_date from datetime to date but filter out specific date range        
WHERE
        CAST (order_date AS DATE) BETWEEN '2020-12-01' AND '2020-12-31'
        
#___ change and sort price as a floating numeric
ORDER BY  
        CAST (purchase_price AS FLOAT64)  DESC

Cast - BigQuery Specific

In BigQuery SQL any alteration using ALTER TABLE is  stored into a temporary table, and if  you change the  destination location you are given an DLL error so the only other option is to use CAST, as seen below.

IMPORTANT: you have to select all columns in the table including the column you want  to change data type. If you just choose the one column and set the destination of the query to override the existing table it will create a new table with JUST ONE COLUMN the column you CAST upon.

So always SELECT * and then CAST certain column(s)

SELECT 
       *,
       CAST(ride_id AS VARCHAR)
FROM 
       TableName

Safe_cast

Using the CAST function in a query that fails returns an error in BigQuery. To avoid errors in the event of a failed query, use the SAFE_CAST function instead. The SAFE_CAST function returns a value of Null instead of an error when a query fails.

The syntax for SAFE_CAST is the same as for CAST. Simply substitute the function directly in your queries. The following SAFE_CAST statement returns a string from a date.

SELECT   SAFE_CAST(MyDate AS STRING)
FROM     MyTable

Scalar


Scalars perform operations on every input value

Round

ROUND(number, decimals, operation)

Parameter Description
number Required. The number to be rounded
decimals Required. The number of decimal places to round number to
operation Optional. If 0, it rounds the result to the number of decimal. If another value than 0, it truncates the result to the number of decimals. Default value is 0

Rounds the first argument to the # of decimals specified

SELECT  ROUND(235.415, 2) AS round_value
[OUTPUT]  235.420

#___ any number other than 0 will truncate the output
SELECT  ROUND(235.415, 2,1)  AS round_operate
[OUTPUT]  235.410

Round Col Values

  • Of course we can round values before we filter them
  • Let’s say we want to round the cost column in a table
SELECT  ROUND(cost)
FROM    Petrescue_table

Length

Returns the length of a string (in bytes).

  • Let’s say a field is designed to be of 5 character length, but it appears that some were more so we can search for the fields that exceeded expectations by checking the LENGTH of the field
  • They work on string type values like VAR and VARCHAR
SELECT  customer_name
FROM    Customers
WHERE   LENGTH (customer_name) > 5

Ucase & Lcase

  • We can convert strings to U or L case values
SELECT  UCASE(animal_type)
FROM    Petrescue_table
  • Scalar can be used in a WHERE clause
  • In the event that we don’t know if the values are upper, lower or a mixture of the two we should convert the entire column first then use the WHERE clause on it
SELECT  *
FROM    Petrescue_table
WHERE   LCASE(animal_type) = 'dog'

Chain Together

  • We can have a scalar inside of another function
  • If we want to know the unique type of animals in the dataset we can do this
SELECT  DISTINCT(UCASE(animal_type))
FROM    Petrescue_table

Summary


This is a summary table of 4 sections, so it’ll be at the end of each of these sections:

  • Functions
  • Subqueries
  • Implicit Joins
  • DateTime
Command Syntax (MySQL/DB2) Description Example (MySQL/DB2)
COUNT SELECT COUNT(column_name) FROM table_name WHERE condition; COUNT function returns the number of rows that match a specified criterion. SELECT COUNT(dep_id) FROM employees;
AVG SELECT AVG(column_name) FROM table_name WHERE condition; AVG function returns the average value of a numeric column. SELECT AVG(salary) FROM employees;
SUM SELECT SUM(column_name) FROM table_name WHERE condition; SUM function returns the total sum of a numeric column. SELECT SUM(salary) FROM employees;
MIN SELECT MIN(column_name) FROM table_name WHERE condition; MIN function returns the smallest value of the SELECTED column. SELECT MIN(salary) FROM employees;
MAX SELECT MAX(column_name) FROM table_name WHERE condition; MAX function returns the largest value of the SELECTED column. SELECT MAX(salary) FROM employees;
ROUND SELECT ROUND(2number, decimals, operation) AS RoundValue; ROUND function rounds a number to a specified number of decimal places. SELECT ROUND(salary) FROM employees;
LENGTH SELECT LENGTH(column_name) FROM table; LENGTH function returns the length of a string (in bytes). SELECT LENGTH(f_name) FROM employees;
UCASE SELECT UCASE(column_name) FROM table; UCASE function displays the column name in each table in uppercase. SELECT UCASE(f_name) FROM employees;
LCASE SELECT LCASE(column_name) FROM table; LCASE function displays the column name in each table in lowercase. SELECT LCASE(f_name) FROM employees;
DISTINCT SELECT DISTINCT column_name FROM table; DISTINCT function is used to display data without duplicates. SELECT DISTINCT UCASE(f_name) FROM employees;
DAY SELECT DAY(column_name) FROM table DAY function returns the day of the month for a given date. SELECT DAY(b_date) FROM employees where emp_id = 'E1002';
CURRENT_DATE SELECT CURRENT_DATE; CURRENT_DATE is used to display the current date. SELECT CURRENT_DATE;
DATEDIFF() SELECT DATEDIFF(date1, date2); DATEDIFF() is used to calculate the difference between two dates or time stamps. The default value generated is the difference in number of days. SELECT DATEDIFF(CURRENT_DATE, date_column) FROM table;
FROM_DAYS() SELECT FROM_DAYS(number_of_days); FROM_DAYS() is used to convert a given number of days to YYYY-MM-DD format. SELECT FROM_DAYS(DATEDIFF(CURRENT_DATE, date_column)) FROM table;
DATE_ADD() SELECT DATE_ADD(date, INTERVAL n type); DATE_ADD() is used to calculate the date after lapse of mentioned number of units of date type, i.e. if n=3 and type=DAY, the result is a date 3 days after what is mentioned in date column. The type valiable can also be months or years. SELECT DATE_ADD(date, INTERVAL 3 DAY);;
DATE_SUB() SELECT DATE_SUB(date, INTERVAL n type); DATE_SUB() is used to calculate the date prior to the record date by mentioned number of units of date type, i.e. if n=3 and type=DAY, the result is a date 3 days before what is mentioned in date column. The type valiable can also be months or years. SELECT DATE_SUB(date, INTERVAL 3 DAY);;
Subquery SELECT column_name [, column_name ] FROM table1 [, table2 ] WHERE column_name OPERATOR (SELECT column_name [, column_name ] FROM table1 [, table2 ] [WHERE])

Subquery is a query within another SQL query and embedded within the WHERE clause.

A subquery is used to return data that will be used in the main query as a condition to further restrict the data to be retrieved.

SELECT emp_id, f_name, l_name, salaryFROM employeeswhere salary< (SELECT AVG(salary)FROM employees);

SELECT * FROM ( SELECT emp_id, f_name, l_name, dep_id FROM employees) AS emp4all;


SELECT * FROM employees WHERE job_id IN (SELECT job_ident FROM jobs);

Implicit Inner Join SELECT column_name(s) FROM table1, table2 WHERE table1.column_name = table2.column_name; Implicit Inner Join combines two or more records but displays only matching values in both tables. Inner join applies only the specified columns. SELECT * FROM employees, jobs where employees.job_id = jobs.job_ident;
Implicit Cross Join SELECT column_name(s) FROM table1, table2; Implicit Cross Join is defined as a Cartesian product where the number of rows in the first table is multiplied by the number of rows in the second table. SELECT * FROM employees, jobs;