Name Value Pair Design (EAV)
Posted by decipherinfosys on January 29, 2007
This is one of the design anomalies in data modeling. Infact, this even has a name for it: EAV (Entity Attribute Value). Database Guru Joe Celko has written a lot about it in newsgroups as well as in his presentations. The reason why I am blogging about this today is because I have seen this approach time and again and it is either done by novice DBAs/database developers or by application programmers who try to use their programming context in the relational world. Here are the disadvantages of this approach :
1) It is next to impossible to enforce business rule constraints like uniqueness, enforcing check constraints per attribute, adding defaults per attribute, enforcing PK/FK relationships based on attributes etc. Since the attributes are being modeled as data elements, it is very error prone since we have now opened up our attribute design to the users. All data integrity is destroyed (violation of 1NF itself). Any typo becomes a new attribute.
2) Different data-type attributes will be modeled using the string data-type for the value column and this will result into performance issues while doing comparisons since the index sorts will be an issue and not only that, an effective index design startegy cannot be formulated for this. This also increases the index size un-necessarily. In addition, it will require un-necessary cast operations like for the date/timestamp data-types, the numeric data-types with precision and a non-zero scale etc.
3) It would lead to the usage of cursor based (and dynamic SQL) approach for reads since the number of attributes is virtually un-known. In the case of complex reports or other read operations, this will be a nightmare…un-necessary CASE statements, un-necessary sub-queries with self references and un-necessary pivoting/un-pivoting of data would ensure that the reports are very slow…as your data grows, performance will get exponentially worse.
If you do have a situation where you are dealing with an entity that has thousands of attributes (I have seen requirements like these) and all of them are static i.e. once they are done and have data populated in them, they never change. In those scenarios, in order to not exceed the rowlength (like SQL 2000 has a limitation on the rowsize being 8060), one can either chose to create a hard 1:1 relationship tables or chose this design to accomodate for such requirements. Since the list of attributes is static and the data in them is static as well, a simple pivot view can be used to flatten it out. But that is the only scenario where this approach can be used/considered.
If one really needs to model a system where the attributes can be generated by the end user (which itself is questionable but can happen in certain verticles – especially those where the end user needs to use the software for planning and forecasting), then a better approach will be to use a static list of attributes (for the different data-types) and have another table for their name look up so that storage is done in a relational manner but for the display piece, one can use the static list and can call an attribute whatever they want. This is the most reasonable compromise for data integrity, performance and sufficing the need of the application. We will be writing up a “How-To” article or a whitepaper on this taking an example from the real world and demostrating how you can solve this issue in a reasonable fashion using the above approach rather than using an EAV design.
4 Responses to “Name Value Pair Design (EAV)”
Sorry, the comment form is closed at this time.