Chapter 3 Home | Prev: 3.5 Calculated Values and Single-Line Functions | Next: 3.6.1 Wildcards

3.6 Filtering Data with the WHERE Clause

Filtering Data (WHERE)

Although we are now able to extract data from a table in the database, it is rare that all the data will always be output.  If you run a store for 20 years, you would not want to output all the sales information if you just need to know how many sales were made today, or yesterday.  Therefore, the majority of SQL SELECT statements will include a WHERE component.

    SELECT firstname, birthDate
    FROM employees
    WHERE EXTRACT(month FROM birthDate) = 5;
    SELECT firstname, birthDate
    FROM employees
    WHERE Month(birthDate) = 5;

returns all employees who are born in the month of May, regardless of the year they were born.

Comparison and Logical Operators

Symbol / Text Meaning Example
= exactly equal to (strings are case sensitive) studID = 990123456
< less than (numerically, alphabetically, chronologically salary < 5000
<= less than or equal to numProducts <= 10
> greater than salary > 5000
>= great than or equal to numProducts >= 2
<> != not equal to name != 'Bob'
IN and NOT IN In a subset of values. productID IN (12, 45, 67, 65)
ANY used with another comparator and a subset of values productID > ANY(12,45,67,76)
ALL used with another comparator and a subset of values salary >= ALL(2000, 3000, 4000)
BETWEEN and NOT BETWEEN used to determine if a value is between 2 other values salary BETWEEN (2000 and 5000)
LIKE and NOT LIKE similar to =, but enables the use of wildcards name LIKE 'B%'
IS NULL and IS NOT NULL determines is the value is NULL or not middlename IS NULL
AND determines if two expressions are both true or not salary => 2000 AND salary <= 5000
OR determines if one or the other expression are true studID = 990123456 OR studID = 9906543321

String Comparisons

When performing string comparisons, it is extremely important to know if the content stored in the database is case sensitive or not. Just because SQL is not case sensitive, does not mean string comparisons are not. The important point here is that we can NEVER ASSUME that the user entered the data consistently with respect to capitalization or that the data is entered under normal conditions.

Examples:

SQL should ALWAYS be written to include case sensitivity in the code, even when we know the database and/or fields are not currently case sensitive. This will ensure that the code will always work in the future, even if the case sensitive properties of a field are changed removing the need to upgrade or repair SQL code in the future.

In Oracle, strings are case sensitive by default and therefore in order to include all possible answers, we must write case sensitive SQL.

In MS SQL, strings are not case sensitive by default and therefore sometimes it is hard to remember to include case sensitivity in your code. Often fields like passwords, logins, and other credential type of data are changed to case sensitive data.

Solution

In order to include case sensitivity in our code, and to ensure we get all the results in searches that we should get regardless of case, we can use the Upper() and/or Lower() functions.

SELECT * FROM customers
WHERE Upper(lastName) = 'MACDONALD';

-- OR

SELECT * FROM customers
WHERE Lower(firstName) = 'macdonald';
-- NOTE: we do not put Upper() or Lower() around the string literal as we can simply type it in using the matching case
            and save the processor processing an extra unneeded function.

Date Comparisons

Dates are one of the more difficult areas of SQL for new coders to understand, but most learners understand that software is world-wide and most companies client-base is multi-cultural. Therefore, it is crucial to understand date formats and how different users in different countries may enter dates, and furthermore, how the operating system of the database server, and development machines may impact dates.

As an example, in Canada, the default accepted date format is dd/mm/yy where in the USA the default date format is mm/dd/yy. So if you were to see the date '06/08/21', what is the actual date. The short answer is 'it depends', but what does it depend on. If your database server is in Canada then this is August 6, 2021 but if the server was installed using a US keyboard, then the data would be June 8, 2021. This level of inaccuracy or doubt is absolutely unacceptable

Some new learners think that entering the data like 'May 5, 2021' would solve the problem, but ultimately it may not if the server is setup as French Canadian, then 'May' is meaningless as it would be 'Mai' in French.

Dates are Not Actually Dates

In order to understand and manipulate dates easily in all database management systems, and ultimately software as well, it is very important to understand that dates are absolutely NOT stored using any kind of month, day or year notation. Dates and times are specifically stored as a decimal value only. The digits to the left of the decimal represent the number of full days that have past since January 1, 1900. The digits to the right of the decimal represent the fraction of a day past because of time as a percentage (i.e. 0.5 would represent 12 noon, 0.25 would be 6am etc.).

Example: May 16, 1972 4:35 pm is stored as 26435.690972
Example: Jan 1, 1900 12:00 pm is stored as 1.5

Because of the storage of dates and times as decimals, there can be absolutely no misinterpretation as to which date is which, like the previous example.

Solution in SQL

In order to solve this problem in SQL and to ensure our SQL will ALWAYS run, no matter where we are in the world or where the data comes from, we can use the To_Date() (Oracle) function or the Convert() (MS SQL) function to explicitly convert any human entered dates to a date format (i.e. a decimal value) that is stored in the database.

SELECT * FROM employees
WHERE hiredate BETWEEN To_Date('050110', 'mmddyy') AND To_Date('053110','mmddyy')
-- Note: We do NOT put To_Date() around the hireDate as it is already 
--       stored in the database and therefore the server already knows its exact date.  The same 
--       is true if the sysdate() function is used, it is already a datetime and should not be 
--       converted using To_Date()
SELECT * FROM employees
WHERE hiredate BETWEEN Convert(datetime,'05/01/10', 1) AND Convert(datetime, '05/31/10', 1)

-- Note 1: datetime format 1 is the default and it is the USA standard of mm/dd/yy
-- Note 2: We do NOT put Convert() around the hireDate as it is already 
--       stored in the database and therefore the server already knows its exact date.  The same 
--       is true if the getdate() function is used, it is already a datetime and should not be 
--       converted using Convert()