Difference between putting criteria in “And” in a table join, and putting criteria in “Where”

 

There is no difference if an Inner Join is used. But there is in a Left Outer Join.

 

–Test data used:

Create Table Table1 (Course Nvarchar(50) Not Null, Column1 Int Not Null)

Insert Into Table1 (Course, Column1) Values(‘CourseX’, 3)

Insert Into Table1 (Course, Column1) Values(‘CourseY’, 4)

 

Create Table Table2 (Id Int Not Null, Name Nvarchar(50) Not Null)

Insert Into Table2 (Id, Name) Values(38, ‘CourseX’)

Insert Into Table2 (Id, Name) Values(39, ‘CourseY’)

 

Select * From Table1

Select * From Table2

 

— Query 1: Example using “And”:

Select

*

From

Table1 T1

Left Outer Join Table2 T2 On T1.Course = T2.Name And T2.Id = 39

— Output:

— 3

— 4

 

— Query 2: Example using “Where”:

Select

*

From

Table1 T1

Left Outer Join Table2 T2 On T1.Course = T2.Name

Where T2.Id = 39

— Output:

— 4

 

With the criteria in the “And” as part of the Join, you will get rows even if the condition fails. You will get values from the left table and nulls for the right table.

If you include the criteria in the “Where”, and that condition fails, the row will be filtered out.

 

 

 

+++++++++++++++++++++++++++++++++++++++++++

 

 

 

Whats the difference between..

 

Select DISTINCT field1

FROM table1 cd

JOIN table2

ON     cd.Company = table2.Name

and table2.Id IN (2728)

 

And…

 

SELECT DISTINCT field1

FROM table1 cd

JOIN table2

ON cd.Company = table2.Name

where table2.Id IN (2728)

 

Both return the same result and both have the same explain output

 

Firstly there is a semantic difference.

When you have a join, you are saying that the relationship between the two tables is defined by that condition.

So in your first example you are saying that the tables are related by cd.Company = table2.Name AND table2.Id IN (2728).

When you use the WHERE clause, you are saying that the relationship is defined by cd.Company = table2.Name and that you only want the rows where the condition table2.Id IN (2728) applies.

Even though these give the same answer, it means very different things to a programmer reading your code.

 

In this case, the WHERE clause is almost certainly what you mean so you should use it.

 

Secondly there is actually difference in the result in the case that you use a LEFT JOIN instead of an INNER JOIN.

If you include the second condition as part of the join, you will still get a result row if the condition fails – you will get values from the left table and nulls for the right table.

If you include the condition as part of the WHERE clause and that condition fails, you won’t get the row at all.

 

Here is an example to demonstrate this.

 

======

 

SQL comes from relational algebra.

https://en.wikipedia.org/wiki/Relational_algebra

 

One way to look at the difference is that JOINs are operations on sets that can produce more records or less records in the result than you had in the original tables. On the other side WHERE will always restrict the number of results.

 

The rest of the text is extra explanation.

 

 

For overview of join types see article again.

https://en.wikipedia.org/wiki/Relational_algebra#Joins_and_join-like_operators

 

When I said that the where condition will always restrict the results, you have to take into account that when we are talking about queries on two (or more) tables you have to somehow pair records from these tables even if there is no JOIN keyword.

 

So in SQL if the tables are simply separated by a comma, you are actually using a CROSS JOIN (cartesian product) which returns every row from one table for each row in the other.

 

And since this is a maximum number of combinations of rows from two tables then the results of any WHERE on cross joined tables can be expressed as a JOIN operation.

 

But hold, there are exceptions to this maximum when you introduce LEFT, RIGHT and FULL OUTER joins.

 

LEFT JOIN will join records from the left table on a given criteria with records from the right table, BUT if the join criteria, looking at a row from the left table is not satisfied for any records in the right table the LEFT JOIN will still return a record from the left table and in the columns that would come from the right table it will return NULLs (RIGHT JOIN works similarly but from the other side, FULL OUTER works like both at the same time).

 

Since the default cross join does NOT return those records you can not express these join criteria with WHERE condition and you are forced to use JOIN syntax (oracle was an exception to this with an extension to SQL standard and to = operator, but this was not accepted by other vendors nor the standard).

 

Also, joins usually, but not always, coincide with existing referential integrity and suggest relationships between entities, but I would not put as much weight into that since the where conditions can do the same (except in the before mentioned case) and to a good RDBMS it will not make a difference where you specify your criteria.