This is a guest article by Rahul Varshneya, co-founder and President of Arkenea
A developer is faced with confusing tables, values, functions, triggers, and much more while developing a database design. During this process, the chances of making some common mistakes are inevitable.
Mistakes aren’t exhaustive, however, if one avoids these then the chances of developing a successful database increase.
The right database design will give less trouble during deployment, development, and performance. Hence, to get it right in one go, here is a list of nine mistakes to avert while designing a database.
1. Poor Design/Planning
The database is a vital aspect of every custom software, hence taking the time to map out the goals of database design ensures the success of any project. Consequences of lack of planning are seen further down the line and impacts projects in terms of time management.
Improper planning of the database leaves you with no time to go back and fix errors and leads to malicious cyber attacks. Therefore, consider sitting down with a paper and drawing a data model as per business requirements.
For example, coders use database schemas to incorporate database designs as they are the blueprints that help developers to visualize databases. Developers can avoid poor planning/design by checking off the following points.
- Main tables of your database model
- Names for tables
- Rules for naming tables
- Time span required for the project
These pointers help resolve essential issues within a project and skipping any of these will only delay your project.
2. Ignoring Normalization
Normalization or SQL (Structured Query Language) groups data under a single table and indirectly related data are put under separate tables. These tables are connected with a logical relationship between child and parent tables.
Lack of normalization reduces data consistency and leads to duplication of data because entire data isn’t stored in one place. Finding related data is strenuous due to lack of grouping and costs time for searching. Hence, consider implementing normalization rules during database design.
Despite following normalization rules, databases don't function as required. That’s because they need to be normalized to a third normal form, as this layout represents entities and is balanced with updating - inserting - deleting records. For example, don’t comply with 1NF, 2NF, or 3NF and redesign the entire table.
3. Redundant Records
Redundancy in a database is a condition in which the same set of data is stored at two different places. This means two different spots or two separate fields in software. This is a problem for developers because they have to keep several versions of data updated.
Redundant records lead to unnecessary growth in database size, which in turn decreases the efficiency of the database and causes data corruption. Hence, the best option is to avoid preparing redundant records, unless it's essential for backups.
Data redundancy is classified into two aspects - wasteful and excessive. Out of these, wasteful redundancy occurs when a set of data is repeated needlessly. Complicated data storing or inefficient coding results in wasteful redundancy.
For example, a table of students' attributes includes a student name, student ID, college name, and course applied. Out of these, the values of college name and course are repeated which can cause issues such as insertion anomaly, update anomaly, and deletion anomaly.
To avoid redundant records, always delete data that is no longer needed.
4. Poor Naming Standards
Naming is a personal choice, however, it is an important aspect of documentation. Poor naming standards result in messy and large data files, hence consider incorporating consistency.
The purpose of naming is to allow all future developers or programmers to easily understand the components of databases and what was their use. This saves time for developers and they need not go through documents to understand the meaning of a name.
There isn’t a universal guide to naming conventions, but it's best to avoid bad naming practices. Here are examples of unsuccessful naming conventions that one must avoid.
Underscore_for_Word_Separation
For certain display situations, an underscore is confused with a blank space in command and a developer may confuse it for a compound name with two objects. Underscore is a combination of keys, and on keyboards that are configured for international usage, the key combination is difficult to locate. Finally, an underscore makes a name longer, with a character that holds no meaning.
Meaningless or Generic Names
Names such as flags, scrap, data table, or config are ambiguous and misleading. Generic or meaningless names are a problem for new developers as they have to use and read data models all over again. Hence, for names use infinitive verbs that express actions and not static names, for example, calculate, summarize, or append.
ALL UPPER CASE
All-caps don’t allow you to use camel case and make words stick together or forces you to use underscore to separate words. This impacts your reading as well cause it becomes hard to distinguish hundreds of files in one go.
To avoid bad naming, pick one standard and stick to it; ensure that names make sense to other developers.
5. Lack of Documentation
As per a survey conducted, the second most challenging task faced by developers was poor technical documentation. Lack of documentation leads to the loss of vital information or a tedious handover process to a new programmer.
Consider documenting everything you know from day one because any documentation is better than none. Well-organized documentation throughout the project helps to wrap up everything smoothly and in turn, helps build robust software.
The goal of documentation is to provide information to the support programmer to detect bugs and fix them. Documentation starts with naming columns, objects, and tables in a database model. A well-documented data model consists of solid names, definitions on columns, tables, relationships, and check and default constraints.
It is recommended to use sample values and everything else that is to be known for a year-long project.
6. One Table to Hold All Domain Values
The next common pitfall encountered while designing a database is to prepare one table for all the same values. For example, you have a range of values for varied areas such as order status, account status, and payment status; each one of them with different values.
The first thought that comes to mind is to store all the values in one place because they are all status values. So, a table would look like -
1. Table or entity (order, account, and payment)
2. Key (1,2, or 3 for whole table/entity)
3. Value (pending, draft, paid, etc.)
This table looks simple, yet it comes with its own set of issues. This approach does not include referential integrity because there’s no simplified way to assure statuses that are applicable to a table are associated with that specific table.
For example, you can’t relate the primary key to the account table and make sure that only account statuses are chosen.
Hence, to avoid these hassles, consider creating different tables for all similar data values, for instance, order_status, account_ status, etc.
7. Ignoring Frequency or Purpose of the Data
By ignoring the fundamental purpose of data, a designer shifts away from the primary goal of storing and retrieving data efficiently when needed. Lack of purpose of data results in no knowledge of what the data represents, at what rate and how it is going to be acquired, what will be the operational volume, and how to use it.
For instance, a system where data is collected each day manually will not have the same data model where information is created in real-time. That’s because managing a few thousand of data monthly is different as compared to handling millions of them in the same time period.
Further, data volume is not the only facet to consider because the purpose of data impacts data structure, normalization, implementation, and record size of the entire system.
Clarity of purpose helps to develop database designs, record size, create entities, choose database engine management policies, and formats. By ignoring these, designs will be flawed fundamentally, even though they are mathematically and structurally right.
8. Insufficient Indexing
Insufficient indexing comes from a SQL configuration whose performance is affected due to improper, excessive, or missing indexes. In case indexes aren’t created properly, the SQL server goes through more records to retrieve the data that’s requested by the query.
A wrong index does not offer easy data manipulation and an index developed on multiple columns slows down queries instead of speeding them up. The lack of a clustered index in a table is a form of poor indexing. Execution of inserting, SELECT statement, deleting, and updating records is slower than on a clustered index.
Index efficiency is connected to the column type, for instance, indexes on INT column display the best performance, however, indexes on DATE, VARCHAR, or DECIMAL aren’t as efficient. This leads to redesigning tables with the best possible efficiency.
Overall, indexing is a complex decision because too much indexing is bad as little indexing, as it impacts the final outcome.
9. Lack of Testing
The lack of database testing fails to give information on whether the data values stored and received in the database are valid or not. Testing helps to save transaction data, avoids data loss, and prevents unauthorized access to information.
The database is essential for every type of software application, therefore testers need to know about SQL during testing. Consider testing for a banking application, and during tests a few things to note are:
1. No loss of information during the process.
2. Application stores transaction data correctly in the database and displays it accurately.
3. No aborted or partial operation data is saved by the application.
So, these were the nine common pitfalls to avoid during database design. For developers, creating a neat and tight database structure is essential for a seamless project flow. Hence, follow the above aspects for successful database creation.
Rahul Varshneya is the co-founder and President of Arkenea - a custom healthcare software development company. Rahul has been featured as a technology thought leader in numerous media channels such as Bloomberg TV, Forbes, HuffPost, Inc, among others.
Want to write an article for our blog? Read our requirements and guidelines to become a contributor.