1 - Review of SQL from DBS610

Optional Addition Suggested Reading Materials


Welcome to Week 1

Welcome to week 1 of 710, Advanced Database Systems.  In this course students will be introduced to data management concepts, database systems and database application programming using Microsoft SQL Server.  Students expand upon their SQL knowledge from DBS610 by learning more about transactions, aggregation, set operators, sub-queries, sequences, and autonumbering. Furthermore, students dive into the realm of Procedural SQL learning about Anonymous Blocks, User-Defined Functions, Stored Procedures, and Triggers. Lastly the students learn about NoSQL Databases by using MongoDB to practice basic concepts of data manipulation through NoSQL.

Week 1 acts as a review of concepts learned in DBS610 in addition expanding upon this knowledge of key concepts used through the next few weeks of the course. After this week you should be able to:


SQL and SQL Standardization

What is SQL? SQL stands for Structured Query Language SQL lets you access and manipulate databases.

 SQL became a standard of the American National Standards Institute (ANSI) in 1986, and of the International Organization for Standardization (ISO) in 1987

What Can SQL do?   SQL can:

SQL is a Standard - BUT.... Although SQL is an ANSI/ISO standard, there are different versions of the SQL language. However, they all DBMSs support at least the major commands (such as SELECT, UPDATE, DELETE, INSERT, CREATE) in a similar manner.  It should also be noted that most of the DBMSs also have their own proprietary extensions in addition to the SQL standard!  This course uses Oracle as it's DBMS, but will try to remain agnostic with respect to the teaching and learning of the SQL language to maximize the learner's ability to apply their learned knowledge to a different DBMS in the future.

Sub-Languages of SQL

There are several sub-categories of SQL that play specific roles in the querying and manipulation of databases.

DML Data Manipulation Language This set of SQL statements are used to query or change the data stored in the database.  DML common statements include SELECT, INSERT, UPDATE, and DELETE.  These commands generally only impact rows of data in existing tables.  These commands are about "What" data is stored in the database.
DDL Data Definition Language This set of SQL statements are used to create or modify the structure of the database.  "How" is data stored in the database?  These commands generally only impact tables and columns and other stored objects in the database.  The main statements included in DDL are CREATE, CREATE OR REPLACE, and ALTER.
TCL Transaction Control Language TCL is a subset of statements that assist in controlling one or more SQL DML statements that need to either completely succeed or entirely fail.  A transaction must be complete or not happen at all.  Common commands include BEGIN, COMMIT, SAVEPOINT and ROLLBACK
PL/SQL Procedural Language extensions to SQL PL/SQL is an Oracle specific advanced compiled procedural language extension to SQL that allows for complex processes to be performed on data or databases.  PL/SQL contains many of the operations that we would find in most programming languages, such as loops, functions, variables, conditional statements, etc.  This topic is not covered until DBS501 or DBS511.

DML Basics Review

Basic SELECT coding

Recall the basic syntax of the SELECT statement.

            SELECT <comma separated fields list> 
            FROM <table name>
            WHERE <filter condition(s)>
            ORDER BY <comma separated fields list> 
        

The "*" symbology

Recall the use of the special character "*" that indicates that the statement should return all columns in the table, in the order they were created. Note that the asterisk symbol makes no suggestion to which rows to return and specifically does NOT state that all rows should be returned: that is reserved for the WHERE clause.

            SELECT * 
            FROM <table name>
            WHERE <filter condition(s)>
            ORDER BY <comma separated fields list> 
        

The asterisk is a symbol that is not typically used very often in finished database systems. This type of statement is typically only reserved for investigating the data and reviewing the table structure. It is best practice to specify the specific columns required, in the order they are required. Therefore, your output remains predictable for associated software, regardless of future addition of new fields, for example.

Use of Quotes in SQL

In SQL there are both single quotes ( ' ) and double quotes ( " ). Unlike other languages, such as JavaScript, the two different quote types actually define different things.

            SELECT
                first_name AS "FirstName",
                last_name AS "LastName"
            FROM employees
            WHERE first_name = 'John';
        

note: the double quotes are optional for field aliases if the alias name is in the singular format (i.e single word of multiple words with no spaces).

In MS SQL only, double quotes are often replaced by using square brackets, but this is specific only to MS SQL, so it is best practice to use double quotes which is more universally accepted.

            SELECT
                [first_name] AS [FirstName],
                [last_name] AS [LastName]
            FROM [employees]
            WHERE [first_name] = 'John';
            -- square brackets should be avoided to write more universally generic SQL
        

DISTINCT keyword

Often it is required to get a list of values from a column or row where repeating values are not required. The distinct keyword can be used to ensure that rows returned in a statement contain distinct values.

            SELECT DISTINCT city
            FROM customers
            ORDER BY city;
        

returns a list of cities from which customers live without repeating cities from which multiple customers are found.

Order of Execution

There is an order to the execution of the statement that must be understood, as it has consequences to how the statement can be used and as statements get much more complex, it is important to understand how they are executed.

For the four basic parts of the statement learned in the previous course, the order of execution is:

  1. FROM
  2. WHERE
  3. SELECT
  4. ORDER BY

In English, we might say "FROM this source, WHERE something is true, SELECT these fields and calculations, and ORDER BY the output by these fields."

We will refer back to the Order of Execution many times through this course when the lesson is impacted by the order of execution.

The last point to consider it that the SELECT statement executes as a loop for each row of data in the source data.

Order of Precedence

When preforming operations within SQL, especially in calculations, the traditional order of operations from mathematics, knows as BEDMAS, still applies. However, SQL includes addition operators and components that mean that the order of operations need to be expanded. This is called the Order of Precedence.

When a complex expression has multiple operators, operator precedence determines the sequence of operations. The order of execution can significantly affect the resulting value.

The following table provides a list of operators and the order in which they are processed.

When two operators are present that have the same order of precedence, they are evaluated in the order they occur, generally left to right.

Write enabled CRUD statements

A reminder that the term CRUD, refers to the 4 main DML statements.

So far we have covered the very basics of the SELECT statement, so let us review the other 3 statements.

Standard INSERT Form

The standard INSERT statement is formed as:

INSERT INTO <tablename> ( <fieldlist comma separated> )
    VALUES ( <Value list comma separated> );

-- example
INSERT INTO offices (officecode, phone, city, addressline1, state, country, postalcode, territory)
    VALUES (8, '+1 905 555 1212', 'Toronto', '200 Young St. N., 'ON', 'Canada', 'M4A3A1', 'NA);

Notes:

Short-Cut INSERT Statement

The short-cut form of the INSERT statement is:

INSERT INTO <tablename> ( <Value list comma separated> );

-- example
INSERT INTO offices VALUES (8, 'Toronto', '+1 905 555 1212', '200 Young St. N., NULL, 'ON', 'Canada', 'M4A3A1', 'NA);

Notes:

Inserting Multiple Rows in One Statement

It is possible to insert more than one row manually in a single statement.  It is important to note that this statement has a variety of forms in different DBMSs.

INSERT INTO <tablename> VALUES 
    ( <value list comma separated >),
    ( <value list comma separated >),
    ( <value list comma separated >);
-- a maximum of 1000 rows per statement
-- Example:
INSERT INTO offices VALUES 
    (8, 'Toronto', '+1 416 555 1111', '200 Young St. N., NULL, 'ON', 'Canada', 'M4A3A1', 'NA),
    (9, 'Oshawa', '+1 905 555 2222', '155 Simcoe. S., NULL, 'ON', 'Canada', 'N2L3G4', 'NA),
    (10, 'Montreal', '+1 268 555 3333', '1245 Rue Lavac, NULL, 'QC', 'Canada', 'K3S2H4', 'NA);

Inserting Multiple Rows From another Table

It is possible to insert data into a table using data from another table.  Use this feature with caution as it absolutely will be creating redundant and repetitive data if used incorrectly.  This is typically only used when creating tables through migration of an old database to a new database or importing data into live tables from temporary tables.

INSERT INTO <tablename> (<fieldlist>) 
    SELECT <fieldlist> 
    FROM <tablename>
    WHERE <condition>         -- optional
    ORDER BY <fieldlist>;     -- optional

Notes:

UPDATE - Updating Existing Data

Updating an existing row or rows is possible through the UPDATE statement. The basic form of UPDATE is:

UPDATE <tablename> 
    SET <field> = <value>,
        <field> = <value>,
        <field> = <value>,
WHERE <condition>;   -- optional,

It is almost ALWAYS necessary to include a WHERE clause to ensure only the rows that are to be updated are updated and no other rows accidentally. If a single row is to be updated, it is best practice to always use the primary key in the WHERE clause

The update statement can update a single field, multiple fields base on the SET part and can update several rows at the same time based on the WHERE part. The optional WHERE part is used in most cases as it is important to very cleary indicate which rows are to be updated. For intance:

UPDATE employees SET lastname = 'Smith';   -- BAD, updates ALL employees

would update every employee in the companies lastname to 'Smith' rather than the one woman who got married on the weekend. OOPS!

UPDATE employees SET lastname = 'Smith'
WHERE lastname = 'Jones';   -- BAD, updates all people with last name Jones to Smith

If the woman is changing her name from Jones to Smith, this statement may seem reasonable, but there is no guarantee that there is not more than one employee with the last name Jones. This statement would change all the Jone's to Smith's. OOPS!

Using the PRIMARY KEY is the best way to clearly indicate the exact row that needs to be updated as by definition it is unique.
UPDATE employees SET lastname = 'Smith'
WHERE employeeId = 343;   -- BEST, Primary Key is unique

DELETE - Deleting Existing Data

Although the most dangerous of the CRUD statements, the DELETE statement is also the simplist.  The basic form of the DELETE statement is:

DELETE FROM <tablename>
WHERE <condition> -- optional 

It is almost ALWAYS necessary to include a WHERE clause to ensure only the rows that are to be deleted are removed and no other rows accidentally. If a single row is to be deleted, it is best practice to always use the primary key in the WHERE clause

There is no fieldlist or complexities in the delete statement. It's intention is to delete entire ros and all values in the row(s). If a WHERE clause is not included, this statement will delete all rows from the table indicated; potential disaster.  Therefore, it is highly recommend that in most cases, the WHERE clause is indicated and uses a clear indicator field, such as the Primary Key.

Joins

As a review of join we must recall that joins are the process in which we are able to connect multiple tables together in order to retrieve data from those tables, or perform calculations or comparisons on data from multiple tables.

Types of Joins

There are many kinds of joins in SQL, however, we will cover the 4 most popular here and leave the more advanced joins for a later course. The 4 types we will review here are all inner or outer joins.

Let us use the following tables to understand basic joins.

SPORTS TEAMS TABLES
PLAYERS   TEAMS
PlayerID firstname lastname teamid teamID teamName shirtColor homeField
1 John Smith 10   10 Hornets Yellow Toronto
2 Bob Marley 10 11  Falcons Brown Barrie
3 Steven King 11 12 Bloopers Red Kitchener
4 Jim Parsons NULL 13 Kings Purple Oshawa

Using the above tables, we will investigate some JOINS but code example

Join Type Description Example Results
Inner Will return all rows where the equality, teamID in this case, in both tables EXACTLY match
SELECT firstname, lastname, teamName
FROM players INNER JOIN teams 
    USING (teamID)
ORDER BY teamName, lastname;
FirstName LastName TeamName
John Smith Hornets
Bob Marley Hornets
Steven King Falcons
Left Outer Will return all rows of the INNER join, but will ensure all rows from the table on the left of the word JOIN, players in this case, are included regardless of the equality.
SELECT firstname, lastname, teamName
FROM players LEFT OUTER JOIN teams
    USING (teamID)
ORDER BY teamName, lastname;
FirstName LastName TeamName
John Smith Hornets
Bob Marley Hornets
Steven King Falcons
Jim Parsons NULL
Right Outer Will return all rows of the INNER join, but will ensure all rows from the table on the right of the word JOIN, teams in this case, are included regardless of the equality.
SELECT firstname, lastname, teamName
FROM players RIGHT OUTER JOIN teams
    USING (teamID)
ORDER BY teamName, lastname;
FirstName LastName TeamName
John Smith Hornets
Bob Marley Hornets
Steven King Falcons
NULL NULL Bloopers
NULL NULL Kings
Full Outer Will return all rows of the INNER join, but will ensure all rows from both tables are included regardless of the equality.
SELECT firstname, lastname, teamName
FROM players FULL OUTER JOIN teams
    USING (teamID)
ORDER BY teamName, lastname;
FirstName LastName TeamName
John Smith Hornets
Bob Marley Hornets
Steven King Falcons
Jim Parsons NULL
NULL NULL Bloopers
NULL NULL Kings

Single Line Functions

CONVERT()

The convert function allows you to convert data between types. This is often used with manual date entry as well as with string concatenation in parameters. You can use the CONVERT function to explicitly cast as expression from one data type to another data type. You can also use the optional style parameter to choose a specific format for the returned value.

Syntax of CONVERT

CONVERT(data type, expression [,style])

An example SELECT statement using the CONVERT function

SELECT 
    CONVERT(varchar, hireDate) AS DateString,
    CONVERT(varchar, hireDate, 1) AS DateString_1,
    CONVERT(varchar, hireDate, 107) AS DateString_107,
    CONVERT(varchar, monthlySalary) AS NumberString,
    CONVERT(varchar, monthlySalary) AS NumberString2
FROM employees; 

Common style codes used for coding date/time data to string data

Code Output Style
0 or 100 Mon dd yyyy hh:miAM/PM
01 or 101 mm/dd/yy or mm/dd/yyyy
7 or 107 Mon dd yy or Mon dd, yyyy
8 or 108 hh:mi:ss
10 or 110 mm-dd-yy or mm-dd-yyyy
12 or 112 yymmdd or yyyymmdd
14 or 114 hh:mi:ss:mmm (24 hr clock)

TRY_CONVERT() function

The TRY_CONVERT() function behaves exactly the same as the CONVERT function except that if the conversion fails, the function returns a NULL value rather than throwing an error.

ISNULL() Function

The ISNULL() function allows for the replacement of null values to a value of choice. It is a common occurrence when values in a database are left as NULL. In cases where we would calculate values based on potential NULL values, calculations in general will fail. Therefore having a way to replace NULL values with an actual value is important.

SELECT ISNULL(commissionPercent,0) AS comPerc
FROM employees;

Other Single-Line Functions

The following links will provide an extensive library of functions that can be used in MS SQL.

String Functions - Microsoft Documents - String Functions (Transact-SQL)

Date and Time Functions - Microsoft Documents - Date and Time Functions (Transact-SQL)

Numeric Functions - Microsoft Documents - Decimal and Numeric Functions (Transact-SQL)