How to Use Data Manipulation Language (DML) in SQL

DML uses INSERT, UPDATE, DELETE and MERGE to add, update, and delete data in SQL.
Apr 3rd, 2024 11:18am by
Featured image by Rob Fuller on Unsplash.

SQL is generally seen as one of the best high-level programming languages for analyzing and manipulating data due to its easy-to-learn syntax. It’s a declarative language: users declare what results they want rather than how to get them, as they would in imperative languages such as C, Java and Python. It’s also easy to read, because its syntax is similar to the English language.

In the first part of this series, I broke down the syntax used for SQL queries. In this article, I’ll discuss the anatomy of SQL’s Data Manipulation Language (DML), which, as you’d expect, is used to manipulate data.

Defining DML Elements

The Data Manipulation Language is the set of SQL statements used to add, update and delete data. It consists of the INSERT, UPDATE, DELETE and MERGE statements.

  • INSERT: Adds one or more rows to a table.
  • UPDATE: Updates one or more rows in a table.
  • DELETE: Deletes one or more rows from a table.
  • MERGE: Can be used to add (insert) new rows, update existing rows or delete rows in a table, depending on whether the specified condition matches. It is a convenient way to execute one operation where you would otherwise have to execute multiple INSERT, UPDATE or DELETE statements.

Using DML

Now that you’re familiar with what the various DML statements mean, you can start using them. You can follow along with these exercises using the data model in my GitHub repository.

INSERT INTO

The INSERT INTO statement adds rows to a table. It can be used by either defining one or more rows using the VALUES clause or by inserting the result of a subquery. Take a look at the VALUES clause first:
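A minimal sketch of a single-row insert follows. It assumes a regions table with region_id and name columns; the column names and the values are illustrative assumptions, so check them against the data model before running it.

```sql
-- Assumed columns: regions(region_id, name). The values are made up.
INSERT INTO regions (region_id, name)
  VALUES ('XX', 'Example Region');
```

Listing the target columns explicitly keeps the statement working even if columns are later added to the table or reordered.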


The VALUES clause allows multiple rows to be defined by separating them with a comma (,):
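As a sketch, a two-row insert could look like this. The countries columns used here (country_id, name, population, region_id) and all values are assumptions for illustration; the real table may have more columns. Support for multi-row VALUES also varies by database and version.

```sql
-- Assumed columns: countries(country_id, name, population, region_id).
-- Each parenthesized group defines one row; groups are separated by commas.
INSERT INTO countries (country_id, name, population, region_id)
  VALUES ('AAA', 'Country A', 1000000, 'XX'),
         ('BBB', 'Country B', 2500000, 'XX');
```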


To use a SQL query as input for the INSERT statement, just replace VALUES with SELECT. The columns of your table and the SELECT list must match:
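The following sketch inserts the result of a query. The source table countries_staging is hypothetical, and the column names are assumptions rather than the actual data model.

```sql
-- countries_staging is a hypothetical table with the same four columns.
-- The SELECT list must line up, in order and type, with the column list.
INSERT INTO countries (country_id, name, population, region_id)
  SELECT country_id, name, population, region_id
    FROM countries_staging
   WHERE population IS NOT NULL;
```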

UPDATE

The UPDATE statement updates entries in a table. It has a SET clause that sets columns to a given value and a WHERE clause to specify which rows to update. You almost always want a WHERE clause for your UPDATE statement; otherwise, the UPDATE statement will update all rows in the table.
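A minimal sketch, assuming the countries table has country_id and population columns (the key value and the number are made up):

```sql
-- Updates only the row whose key matches the WHERE clause.
UPDATE countries
   SET population = 1000000
 WHERE country_id = 'AAA';

-- Without the WHERE clause, the same SET would apply to every row:
-- UPDATE countries SET population = 1000000;
```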


The UPDATE statement can also reference other tables, so that rows are updated based on a condition that lies outside the table being updated. For example, say you want to increase the population of all countries in South America by 10% (expressed as population*1.1). You can restrict the update to the countries whose region_id matches the South America entry in the regions table:
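One way to sketch this is with a subquery, which is portable across most databases; some databases also offer their own UPDATE ... FROM or UPDATE ... JOIN syntax. The name column on regions and its 'South America' value are assumptions about the data model.

```sql
-- Assumed columns: regions(region_id, name), countries(population, region_id).
UPDATE countries
   SET population = population * 1.1
 WHERE region_id IN (SELECT region_id
                       FROM regions
                      WHERE name = 'South America');
```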

DELETE

The DELETE statement deletes rows in a table and works very similarly to the UPDATE statement. As with UPDATE, with the DELETE statement you almost always want a WHERE clause; otherwise, you will delete all rows in a table.
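A minimal sketch, again assuming a country_id key column and a made-up value:

```sql
-- Deletes only the row whose key matches the WHERE clause.
DELETE FROM countries
 WHERE country_id = 'AAA';

-- Without the WHERE clause, every row in the table would be deleted:
-- DELETE FROM countries;
```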


As with the UPDATE statement, you can filter the rows to delete based on column values in other tables:
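As a sketch, the same subquery filter works for DELETE. The name column on regions and its 'South America' value are assumptions about the data model.

```sql
-- Deletes every country that belongs to the matching region.
DELETE FROM countries
 WHERE region_id IN (SELECT region_id
                       FROM regions
                      WHERE name = 'South America');
```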

MERGE

The MERGE statement is more sophisticated than the INSERT, UPDATE and DELETE statements. It allows you to conditionally insert or update (and even delete some) rows with one execution. This is most helpful when you want to load data into tables with existing rows and do not want to manually check whether a given row already exists. If it does, you would need to issue an UPDATE statement; otherwise, an INSERT statement. Instead, you can write one statement with a matching condition that will do the INSERT or UPDATE automatically for you.

Imagine every night you get a file with updated data from all the countries in the world. Some countries may have reported new population numbers, and very occasionally a new country is formed. Instead of running a bunch of UPDATE statements and rerunning the corresponding INSERT statement only when an UPDATE statement returns 0 rows updated, you can do both with one MERGE statement.

First, load all the data into an empty staging table (in this example, my_tab), and from there run the MERGE statement to merge the data into the target table (in this example, the countries table):
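A sketch of such a statement follows. It assumes both tables share the columns country_id, name, population and region_id; only country_id and population are named in the text, so the others are assumptions, and the real table may have more. MERGE syntax also differs slightly between databases, so treat this as a starting point rather than a portable statement.

```sql
-- Target: countries (alias c). Source: the staging table my_tab (alias m).
MERGE INTO countries c
USING my_tab m
   ON (c.country_id = m.country_id)
 WHEN MATCHED THEN
   -- The row already exists: refresh only the population.
   UPDATE SET population = m.population
 WHEN NOT MATCHED THEN
   -- No row with this country_id yet: insert it with all assumed columns.
   INSERT (country_id, name, population, region_id)
   VALUES (m.country_id, m.name, m.population, m.region_id);
```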


The statement above merges data into the countries table based on matching country_id (primary key) values. If the countries table includes a row with the same country_id value as the my_tab table, then the statement just updates the population column (as seen within the WHEN MATCHED THEN UPDATE clause). If the MERGE statement doesn’t find a corresponding row with the same country_id values in the countries table, then it inserts the row with all the fields into the countries table.

The MERGE statement also provides some flexibility. Say that you just want to update the countries table but never insert into it. You can just omit the WHEN NOT MATCHED THEN INSERT clause:
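Under the same assumptions about the countries and my_tab tables, an update-only variant could be sketched as:

```sql
-- Rows in my_tab without a matching country_id are simply ignored.
MERGE INTO countries c
USING my_tab m
   ON (c.country_id = m.country_id)
 WHEN MATCHED THEN
   UPDATE SET population = m.population;
```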

Conclusion

SQL is a powerful, widely adopted, declarative language for data processing and data manipulation. Understanding the core components of SQL and how it operates is the first step to unleashing its power on your data. You can find the data model used in this article and part one in my GitHub repository for this exercise.
