Summary: in this tutorial, you’ll learn how to use the PostgreSQL SUBSTRING function to extract the substring from a string.
Introduction to the PostgreSQL SUBSTRING function #
The PostgreSQL SUBSTRING() function allows you to extract a substring of a string starting at a position (start) and stopping after n characters.
Here’s the basic syntax of the substring function:
SUBSTRING (string [FROM start] [FOR count])Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)The SUBSTRING() function accepts three parameters:
string: The input string you want to extract a substring.start: The beginning position in the input string to start extracting. Thestartis optional and defaults to 1, the beginning of the input string.count: The number of characters to extract. Thecountis optional and defaults to the length of the input string.
The SUBSTRING() function returns the substring extracted from the input string.
Basic PostgreSQL SUBSTRING function examples #
The following query uses the SUBSTRING() function to extract the domain extension:
SELECT
SUBSTRING(
'pgtutorial.com'
FROM
12 FOR 3
) extension;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)Output:
extension
-----------
comCode language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)To make the query more generic, you can combine the SUBSTRING function with the POSITION function:
SELECT
SUBSTRING(
'pgtutorial.com'
FROM
POSITION('.' IN 'pgtutorial.com') + 1 FOR 3
) extension;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)Output:
extension
-----------
comCode language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)In this example:
- First, the
POSITIONfunction returns the character’s starting position (.) in the domain name. - Second, the
SUBSTRINGextracts three characters starting from the character’s position (.) plus one.
The following example extracts the extension of the domain pgtutorial.com without using the count argument:
SELECT
SUBSTRING(
'pgtutorial.com'
FROM
POSITION('.' IN 'pgtutorial.com') + 1
) extension;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)Output:
In this example, the SUBSTRING() function returns a substring starting from the character’s position (.) to the end of the string.
The following query uses the SUBSTRING() function to extract the domain name without extension:
SELECT
SUBSTRING(
'pgtutorial.com' FOR POSITION('.' IN 'pgtutorial.com') - 1
);Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)Output:
substring
------------
pgtutorialCode language: plaintext (plaintext)In this example:
- First, the
POSITIONreturns the index of the character (.) in the domain name. - Second, the
SUBSTRINGreturns a substring from index1(default) to the location before the character (.).
Creating a user-defined function that uses the SUBSTRING function #
The following statement creates a user-defined function called extract_domain that extracts domain and extension from a domain name using the SUBSTRING function:
CREATE OR REPLACE FUNCTION extract_domain (domain_name TEXT)
RETURNS TABLE (
domain TEXT,
extension TEXT
) AS
$$
DECLARE
dot_index INT;
BEGIN
dot_index := POSITION('.' IN domain_name);
RETURN QUERY
SELECT
SUBSTRING(domain_name FROM 1 FOR dot_index - 1),
SUBSTRING(domain_name FROM dot_index + 1);
END;
$$ LANGUAGE plpgsql;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)How it works:
First, find the index of the dot character (.) using the POSITION function:
dot_index := POSITION('.' IN domain_name);Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)Second, use the SUBSTRING() function to extract the domain name and extension from the domain name based on the dot index.
The following statement calls the extract_domain function to extract parts of the domain name pgtutorial.com:
SELECT
domain,
extension
FROM
extract_domain ('pgtutorial.com');Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)Output:
domain | extension
------------+-----------
pgtutorial | comCode language: plaintext (plaintext)Extracting a substring based on a regular expression #
Here’s another form of the SUBSTRING() function that extracts a substring from an input string based on a regular expression:
SUBSTRING( string FROM pattern )Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)This function returns the first substring in the string that matches the regular expression.
For example, the following query uses the SUBSTRING() function to pull the domain name from the email "[email protected]":
SELECT
SUBSTRING(
'[email protected]'
FROM
'@(.*)'
) domain_name;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)Output:
domain_name
----------------
pgtutorial.comCode language: plaintext (plaintext)Heres’ the breakdown of the pattern @(.*) :
@: A @ symbol.(.*): A capturing group that matches any number of characters(.*).
In short, the regular expression @(.*) matches everything after the character (@).
Extracting a substring similar to a regular expression #
The following SUBSTRING() function extracts a substring that matches a regular expression, which is similar to the LIKE operator but supports regular expression-like syntax:
SUBSTRING (string SIMILAR pattern ESCAPE escape_character)Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)In this syntax:
string: The input text.pattern: A pattern with regex-like syntax.escape_character: An optional escape character for the regular expression.
For example, the following query uses the SUBSTRING() function to retrieve the product with a name ending with two digits:
SELECT
SUBSTRING(product_name SIMILAR '%\d{2}' ESCAPE '\') AS name
FROM
products
WHERE
SUBSTRING(product_name SIMILAR '%\d{2}' ESCAPE '\') IS NOT NULL;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)Output:
name
--------------------
Samsung Galaxy S24
Apple iPhone 15
Huawei Mate 60
Xiaomi Mi 14
Dell XPS 15
HP Spectre x360
Dell Inspiron 27Code language: plaintext (plaintext)In this example, the pattern %\d{2} matches any string that ends with two digits.
Summary #
- Use PostgreSQL
SUBSTRINGfunctions to extract a substring from a string.