Database normalization

Database normalization 


Database normalization is a technique to organize the contents of databases. Without normalization, database systems can be inaccurate, slow, and inefficient. The community of database professionals developed a series of guidelines for the normalization of databases. Each 'level' of normalization is referred to as a form, and there are 5 forms, total. First normal form is the lowest level of normalization, up to fifth normal form, which is the highest level of normalization.
  • First Normal Form (1NF): The most basic level of data normalization, first normal form requires the elimination of all duplicate columns in a table, and also requires the creation of separate tables for related data, and identification of each table with a primary key attribute.

  • Second Normal Form (2NF): Meets all the requirements of first normal form, and creates relationships between tables using foreign keys.

  • Third Normal Form (3NF): Meets all the requirements of second and first normal forms, and removes all columns that are not dependent upon the primary key. Third normal form also removes all derived attributes, such as age.

  • Fourth Normal Form (4NF): Fourth normal form adds one additional requirement, which is the removal of any multi-valued dependencies in relationships.

  • Fifth Normal Form (5NF): Fifth normal form is a rarer form of normalization, in which case join dependencies are implied by candidate keys (possibly primary key values).

In the reality of database development, getting to 3NF is the most important jump. 4NF and 5NF are a bit more of a luxury (and sometimes a nuisance) in database development, and are rarely seen in practice. If you're struggling with the concepts, or remembering the first three forms, there is a simple relation. "The key, the whole key, and nothing but the key.", which relates to 1NF, 2NF, and 3NF.

The Benefits of Normalization


Now, without venturing too far into database theory, let's simply focus on the benefits of normalization. As the data progresses through the normalization forms, it becomes cleaner, better organized, and faster. Now, with a small database that has only 5 tables and 100 rows of data, this won't be readily apparent. However, as the database grows, the effects of normalization will become much more apparent with regards to speed and maintaining data integrity. However, there are some situations in which normalization doesn't make sense, such as when normalizing the data will create excessively complex queries required to return the data.

Understand Data Types IN SQL


Understand Data Types IN SQL
In SQL, typically each table column has an associated data type. Text, Integer, VarChar, Date, and more, are typically available types for developers to choose from.
  • MySQL Data Types
  • Oracle Data Types
  • SQL Server Data Types

When developing, make sure you choose the proper data type for the column. Dates should be DATE variables, numbers should be a numeric type, etc. This becomes especially important when we deal with a later topic: indexing; but I'll demonstrate an example of poor knowledge of data types below:
SELECT employeeID, employeeName
FROM employee WHERE employeeID = 5142875;

Looks fine based on what we currently know, correct? However, what if employeeID is actually a string. Now we've got a problem, because the DBMS might not find a match (because string data types and integers are different types).

Therefore, if you're using indexing, you'll probably be perplexed as to why your query is taking forever, when it should be a simple index scan. This is the reason that developers need to pay special attention to data types and their applications. Non-key attributes which are IDs are often string types, as opposed to integers, because of the increased flexibility that is granted. However, this is also a trouble area for junior developers, who assume that ID fields will be integers.

Understand Joins in SQL

Understand Joins in SQL

Single table SQL select statements are rather easy to write. However, business requirements often dictate that more complex queries must be written. For example, "find all orders for each customer, and display the products for each order". Now, in this particular situation, it would be likely that there is a customer table, an order table, and an order_line table (the last would be to resolve a possible many-to-many record relationship). For those who are slightly more familiar with SQL, it is readily apparent that a table join, actually, two table joins will be required for this query. Let's look at some sample code.





SELECT customer.customerID, order.order_id, order_line.order_item
FROM customer
    INNER JOIN order
        ON customer.customerID = order.customerID
    INNER JOIN order_line
        ON order.orderID = order_line.orderID;

Alright, simple enough. For those who don't know, the code above is an inner join. More specifically, the code above is an equi-join.
Let's define the various types of joins.
Inner Joins: The basic purpose of inner joins is to return matching records.
Outer Joins: Outer joins do not require each record to have a matching record.
  • Left outer join: A left outer join of tables A and B will return all matching records of A and B, as well as any non-matched records from the left table, in this case, A.
  • Right outer join: A right outer join of tables A and B will return all matching records of A and B, as well as any non-matched records from the right table, in this case, B.
  • Full outer join: A full outer join of tables A and B will return all matching records of A and B, as well as any non-matched records from both tables.

Self Joins

·         There is one last type of join that must be considered, which is a self join. A self join is merely a join from a table to itself.



EMPLOYEE TABLE
-EmployeeName
-SupervisorID

·         In this situation, in order to find which employees are supervised by a given employee, a self join would be required.

·         Hopefully this clarifies the basic tenets of joins, as they are one of the primary features of SQL that makes it such a powerful database language. Make sure you use the proper join for your given situation.

what is SQL UNION clause?


The SQL UNION clause
The SQL UNION clause is used to combine the results of two or more SELECT statements without returning any duplicate rows.
To use UNION, each SELECT must have the same number of columns selected, the same number of column expressions, the same data type, and have them in the same order, but they do not have to be the same length.

Syntax:


The basic syntax of UNION is as follows:
SELECT column1 [, column2 ]
FROM table1 [, table2 ]
[WHERE condition]

UNION

SELECT column1 [, column2 ]
FROM table1 [, table2 ]
[WHERE condition]

The UNION ALL Clause:

The UNION ALL operator is used to combine the results of two SELECT statements including duplicate rows.

There are two other clauses (i.e., operators), which are very similar to UNION clause:
·        SQL INTERSECT Clause: is used to combine two SELECT statements, but returns rows only from the first SELECT statement that are identical to a row in the second SELECT statement.
·        SQL EXCEPT Clause : combines two SELECT statements and returns rows from the first SELECT statement that are not returned by the second SELECT statement.


What is Self-Join in SQL


What is Self-Join in SQL?

A self-join is a join in which a table is joined with itself (which is also called Unary relationships), especially when the table has a FOREIGN KEY which references its own PRIMARY KEY. To join a table it-self means that each row of the table is combined with itself and with every other row of the table.
The self-join can be viewed as a join of two copies of the same table. The table is not actually copied, but SQL performs the command as though it were.
The syntax of the command for joining a table to itself is almost same as that for joining two different tables. To distinguish the column names from one another, aliases for the actual the table name are used, since both the tables have same name. Table name aliases are defined in the FROM clause of the SELECT statement. See the syntax:

SELECT a.column_name, b.column_name…..
FROM table1 a, table1 b
WHERE a.common_filed = b.common_field;

Example of SQL SELF JOIN

In the following example we will use the table EMPLOYEE twice and in order to do this we will use the alias of the table.
To get the list of employees and their Manager the following sql statement has used:

SELECT a.emp_id AS "Emp_ID",a.emp_name AS "EmployeeName", 
b.emp_id AS "Manager ID",b.emp_name AS "ManagerName" 
FROM employee a, employee b 
WHERE a.emp_supv = b.emp_id;

Why we can't execute a stored procedure from a User Defined function(UDF)

Functions cannot "touch" any database but read them only. Stored procedures can do anything and everything with databases. You ...