Difference Between GROUP BY and ORDER BY in Simple Words
Int his article, see the difference between GROUP BY and ORDER BY.
Join the DZone community and get the full member experience.
Join For FreeFor someone who is learning SQL, one of the most common places to get stuck is when learning the GROUP BY
command. GROUP BY
and ORDER BY
are two important keywords in SQL that we use to organize data. The difference between GROUP BY
and ORDER BY
is that ORDER BY
is more simple than GROUP BY
and is usually introduced early in an SQL course.
Sometimes, people are quite confused over these two concepts (SQL ORDER BY
vs. GROUP BY
), and the reason behind this confusion is a lack of understanding of both concepts. While both of them are used to arrange data based on their values, their use cases vastly differ from each other. If we learn both these concepts properly, the confusion will naturally go away.
In this article, I used an interesting database published here. The data set contains more than 2000 craft beers and 500 breweries used in the United States. The data is included in CSV files, which you can easily import to any of your databases using a GUI tool. One of my colleagues has shown how to do so in an example of Vertabelo Academy’s survey data.
You may also enjoy: How SQL DISTINCT and ORDER BY Are Related
There are two tables in this dataset: beers
and breweries
. Let’s see the sample dataset from the beers
table to get a better picture of it.
ORDER BY
Our first topic is ORDER BY
. Before trying to understand the difference between GROUP BY
and ORDER BY
, let’s see how the SELECT
command works.
Let's SELECT
the name, city, and state from every record in the breweries
table.
|
When you execute the above command, you will get something like the following:
You can see that these results are not ordered according to their names. The default order for a SELECT
statement is non-deterministic, which means the order of results can differ anytime you run your code.
However, a non-deterministic order is usually not what we want. For example, suppose you’d like to order results by ascending or descending order of their state
column; in that case, you would need the ORDER BY
command. All you have to do is use the ORDER BY
clause like this:
|
Similarly, you can get results in the descending order of the state as well.
|
You can see the records that are organized by descending order of the state. The default order of ORDER BY
command is ascending order. So, if you didn’t use ASC
or DESC
, results will come in ascending order.
You can use ORDER BY
in two or more columns. Check the following query:
|
This query first orders results in the ascending order of the state, followed by the descending order of the city.
Now, the interesting thing is that someone who doesn’t know how to use GROUP BY
much could use ORDER BY
to analyze data. For example, suppose that you want to find out how many breweries are there in a given state
. What you could do is get the results ordered by their states. This will give you a nice grouped result from which you can manually count the number of breweries from a given state
.
Therefore, someone who doesn’t have a strong grasp of GROUP BY
might think that ORDER BY
is an excellent way to group the same data. That idea is not wrong. If you want to display data in a nice grouped way, ORDER BY
is a good bet.
GROUP BY
is not a way to display data in groups, but it is more of a way to analyze data in groups. To further comprehend SQL ORDER BY
vs. GROUP BY
, let’s have a look at GROUP BY
in more detail.
GROUP BY
In most texts, GROUP BY
is defined as a way of aggregating records by the specified columns which allow you to perform aggregation functions on non-grouped columns (such as SUM
, COUNT
, AVG
, etc). In other words, the GROUP BY
clause’s purpose is to summarize unique combinations of columns values.
A few examples will further clarify:
Let’s group beers
table based on style
column.
|
The above SQL query will generate the following output
So, let's see what has happened here. This query has returned one result from each style
value. That means SQL first makes groups from the same style
values and returns one row representing the group.
And what is the exact use of GROUP BY
? Of course, you can use GROUP BY
to find distinct values. But SQL has a DISTINCT
keyword specifically for that. The real importance of GROUP BY
can be seen when you use it with aggregate functions like SUM()
, COUNT()
. To understand it better, execute the following SQL query:
|
It will generate the following result:
Here, SQL first groups results based on the style
column. Then, it checks how many names are there in each group and returns the style
values and the count of names for each style.
If you look closely, you may notice that the results are already ordered in ascending order. This situation occasionally tricks people into believing that GROUP BY
sorts results. In actuality, there is no guarantee that GROUP BY
will display results in ascending order. If you need results in a specific order, you have to do it yourself like below:
|
Therefore, GROUP BY
works nicely along with ORDER BY
.
So, now you know how to use GROUP BY
to count a set of values belongs to a certain group. I will give you two more real-world examples to understand the use of GROUP BY
.
Think of an employee table where you store information about employees, such as their name, job role, and salary. Each employee belongs to a different department, like finance, IT, or transport. Now, suppose you want to get the sum of salaries you paid for employees based on their department. You will need to execute an SQL command like below:
SELECT
department,
SUM
(salary)
FROM
employee
GROUP
BY
department
Suppose you have an eCommerce website that sells several product types. In your database, you have a table to store information about stocks. If you want to find the count of each product type, you can use the GROUP BY with COUNT aggregate function:
SELECT
product_type,
COUNT
(product_id)
FROM
stock
GROUP
BY
product_type
Let’s take this lesson a bit further to increase your knowledge of GROUP BY
. You can also group results using two columns. For example, let’s group the beers
table based on style
and brewery_id
.
|
This SQL query will divide style
groups further using brewery_id
. When you execute it, you will get the following result:
Conclusion
In this article, I explained the difference between ORDER BY
and GROUP BY
. ORDER BY
sorts data based on data of a column. Of course, as a result of sorting, data from the same values will be brought together as a group, which will make it easy for you to manually analyze later. But GROUP BY
is the SQL way to analyze similar data.
There is no real use of GROUP BY
without aggregation functions. Sometimes, GROUP BY
can return results in sorted order, but you shouldn’t rely on it. The order in which results are returned is non-deterministic and depends on how the db engine executes the query.
After reading this article, you should be able to use GROUP BY
and ORDER BY
with more confidence. If you want to refine your knowledge of SQL ORDER BY
vs. GROUP BY
, check this practice set — it contains a special section dedicated to this subject.
Published at DZone with permission of Shanika WIckramasinghe. See the original article here.
Opinions expressed by DZone contributors are their own.
Comments