Design
Should this be a table or not?
- can you imagine the current data being expanded in the future?
- ex. imagine we had a Book table, which included 3 properties: genre1, genre2, genre3. Now if we decided that each book should have 4 potential genres, we would have to change the structure. The solution would be to have a Books table and a Genres table
- Is the data being held in a table "genericable"? If you are tempted to combine different concepts into something that has a common thread, it might be a sign that they should be different tables. Framed another way, if you find yourself in a position where it would be difficult to extend a table, it should give us pause.
- ex. Imagine we have to keep track of
InvoiceStatus
, BackOrderStatus
, ShipViaCarrier
, CreditStatus
, and CustomerStatusType
. Each of these concepts has a common thread, which is that aside from their respective PKs, they only have a single text field. We might therefore think it appropriate to combine all 5 concepts into a single generic table. However, this becomes problematic when we start to query information. What results is a necessity to JOIN ON many different fields, and to use aliases to distinguish multiple "versions" of the same table:
- if we were to look at 5 rows of a single table and noticed that 3/5 of the rows have an identical value for one of the columns, that would be a sign that the column should be its own table
- ex. we have a table called
shoes
and there is a field to describe what type of shoe it is (business, casual etc). If we were to query some records, we would find that 'business' would show up a lot in the 'type' column. This is a sign that we should create a "shoe type" table.
SELECT *
FROM Customer
JOIN GenericDomain as CustomerType
ON Customer.CustomerTypeId = CustomerType.GenericDomainId
and CustomerType.RelatedToTable = "Customer"
and CustomerType.RelatedToColumn = "CustomerTypeId"
JOIN GenericDomain as CreditStatus
ON Customer.CreditStatusId = CreditStatus.GenericDomainId
and CreditStatus.RelatedToTable = "Customer"
and CreditStatus.RelatedToColumn = "CreditStatusId"
- on the other hand, if we separate them all out as seperate tables, then it greatly simplifies our query:
SELECT * FROM Customer
JOIN CustomerType
ON Customer.CustomerTypeId = CustomerType.CustomerTypeId
JOIN CreditStatus
ON Customer.CreditStatusId = CreditStatus.CreditStatusId