Author: Lewis Davison
Before I start the article let me first introduce me to you. I have been in IT field for more than 10 years. I have been working as a database designer for past 3 years. This article is the essence of the knowledge I have gained through my work.
‘To err is human' so it is not possible that you do not do mistakes. Even I do mistakes and if mistake is in database designing then you and even your client may suffer a lot.
We all are aware that database design act as the cornerstone of almost every data related project. Poor database design will contain redundant data and will be inefficient. Hence it will take time in returning results. While a good database design will result in an efficient database that is easy to process and contains no anomalies. Let's check out the list of mistakes we often do in database designing.
- Ignoring planning phase
- Poor database design
- Ignoring normalization
- Improper naming standards
- Lack of documentation
- One table to hold all domain values
- Using identity columns as the only key
- Data integrity without using SQL facilities
- Improper data access procedures
- Lack of testing
Now let's see what these mistakes can do to our database
1. Ignoring planning phase
Proper planning is the key to good database design. If you start implementing things without proper planning then things may go worst at the end. Rome was not built in a day. It takes time for things to happen and proper planning will help you develop a good database design. It is impossible to determine each and every requirement of your system but problem planning will help you identify most of the requirements and potential problems.
If you do not spend time in planning the needs of your system and how your database will fulfill them then there very much chances that your things are not done right. Either your system will be inefficient or you may lose some of the important functionalities.
2. Poor database design
A good database design does matter a lot. If your database design is good then you need not to worry whatever change you have to make. But if your database design is poor then even a minor change will take all your time. Any substantial change in your database design can have impact on the overall system. A poor database design might require you to modify a large portion of your database. If a database design is poor then to solve a minor problem you may need to start from the scratch and fix it.
3. Ignoring normalization
If your tables are not normalized then there will be redundancy problems. Normalization converts a poor database design into a better one. Normalized tables have strong design and less vulnerability to update anomalies.
One question that always comes in mind is that how much a data should be normalized. Studies have shown that in most cases normalization up to third normal form is enough but fourth and fifth normal forms are also useful.
Let us have a look at below table. This table is not normalized as you can see that it contains lots of redundant data.
studentID |
name |
dateOfBirth |
CourseID |
Course |
S1 |
John |
07-09-1982 |
C1 |
English |
S1 |
John |
07-09-1982 |
C2 |
Mathematics |
S1 |
John |
07-09-1982 |
C3 |
Computer Science |
S1 |
John |
07-09-1982 |
C4 |
Arts |
Table 3.1: Student Table
However if we normalize this table then redundancy is removed. The number of tables gets increased but it will save you from many potential problems. The normalized tables are
studentID |
name |
dateOfBirth |
S1 |
John |
07-09-1982 |
Table 3.2: Student Table
CourseID |
Course |
C1 |
English |
C2 |
Mathematics |
C3 |
Computer Science |
C4 |
Arts |
Table 3.3: Courses Table
studentID |
CourseID |
S1 |
C1 |
S1 |
C2 |
S1 |
C3 |
S1 |
C4 |
Table 3.4: Student_Courses Table
4. Improper naming standards
Naming conventions are one of the most important part in documentation. You should use such naming conventions that are not only familiar to you but if you handover your project to a client or your junior database administrator then he wont face problem in understanding things.
Some guidelines you must follow while naming tables and field names are as under:
- Do not use irrelevant names. Table or field names should be meaningful.
- Use such names that clearly and accurately identifies the table or field according to requirements.
- Do not use lengthy names. Try to use minimum words.
- Your table or field names should not contain unknown abbreviations or acronyms.
- Use the singular form of name.
- Table or field name should not identify more than one table or field characteristics.
- Avoid spaces and special characters in the names.
- Do not use reserved words in naming.
- Do not use metadata in names. It is better to use name STUDENT instead of tblSTUDENT.
- Avoid using quotation marks or identifiers in the names of tables. Instead of “STUDENT” or [STUDENT], it is better to use STUDENT.
- You should be consistent in the naming conventions. For example your student tables can be STUDENT_MARKS, STUDENT_COURSES and so on.
5. Lack of documentation
Lack of documentation will be problematic for not only you but also to the client for whom you are creating the database. It is really hard to come back again and again and ask you for even minor things in the database. All information must be properly documented. This will help others understand how the system is intended to be used. Also if there are some bugs in your work then they can be solved by others instead of bothering you again and again. Proper documentation will help you in reusability. All requirements, designing, testing, results must be documented with proper information.
6. One table to hold all domain values
It is not recommended to make a single table that contains all domain values. If you create a single table and place all domain keys into it and within each table you assign foreign keys from domain table then at the end of the day your queries will be very much complex.
7. Using identity columns as the only key
Tables in 1NF mean that each attribute represents an atomic value. In simple words all rows in a table must be uniquely identified by the primary key. An identity is the one in which a unique auto incremented integer value is added to each record. For example consider the below table
studentID |
regNo |
Name |
1 |
12111 |
John |
2 |
12112 |
Chris |
3 |
12113 |
Liana |
4 |
12114 |
Henry |
5 |
12114 |
Henry |
Table 7.1: Student Table
The above table has the unique key studentID that uniquely identifies each record. But wait! Do you see any problem? The data is repeated in last two rows. Think of the scenario when you are entering information through registration form and after pressing ‘REGISTER' button you refresh the page three times or you mistakenly add the same students data twice, or thrice or so on. The same record will be repeated in the table.
To avoid this problem always use a primary key that uniquely identifies each row. In above table you can use regNo as a primary key instead of using the identity column as primary key. Another option is to add unique constraint to the regNo column.
However identity columns are useful when you do not have an identifying column in the table and they are easy to handle in many ways.
8. Data integrity without using SQL facilities
The base rules like null values, length, uniqueness, and foreign keys should be defined in the database. In that case you can never by pass any rule as the database will not allow that.
Consider this example “A student will get 50% fee concession for his first semester”. In this case the concession amount and semester may change but the rule that student will get concession will remain the same. The business rule will be enforced by the database and design. There will be tables in database that stores concession amount, semester id and student id. Furthermore you can add constraint that concession amount should be greater than 0% and less than 100%. This will help you avoid if else checks in your code.
9. Improper data access procedures
Use stored procedures for accessing data from the database. They act as the best option for hiding database layer from users. Although they take time in coding but they act as good abstraction layer. You can give access to user to use only a particular stored procedure and not all.
Stored procedures are compiled only once and then their execution plan is reused. This provides tremendous performance boosts when called repeatedly. If stored procedures are properly made then they can be used in many programs and hence the development cycle can take less time. Consider below query
SELECT semesterID, grade
FROM STUDENT_GRADE
WHERE studentID='s00101'
The stored procedure for this query will be like this
CREATE PROCEDURE sp_studentGrade
@id VARCHAR(10)
AS
SELECT semesterID, grade
FROM STUDENT_GRADE
WHERE studentID=@id
Now you can get student data by issuing the command.
EXECUTE sp_studentGrade 's00101'
EXECUTE sp_studentGrade 's00102'
10. Lack of testing
Lack of testing may result in many problems. If you are not testing you system and at the end of the day when your database is complete and some of your functionalities are not working then you will have to search through all the pages and solve the problem.
Always do thorough testing of your database. This will reduce time of bug fixing when the user comes again and again and reports the problems to you. All queries and results must be properly tested. What happens when a particular button is pressed should be repeatedly tested. Additionally locking mechanism should be thoroughly tested. If your database design is good and thoroughly tested then queries will not take much time in execution and will be error-prone.
More Database Articles
Database Security: Step by step guideline
Is read-ahead caching right for your Oracle database?
Oracle Recovery Manager - RMAN
Great Tips on Oracle Real-Time SQL Monitoring!!
Performance Gains by Managing Space: Segments, Tablespace & Datafiles!!