Object-Oriented-Like Structures in Relational Databases

Object-oriented-like structures in relational databases

Here's a solution I came up with about 10 years ago. The system that uses this design is still running, so it worked well enough to survive longer than most of my code. ;) Today I may use one of the ORM packages that Scott mentions, but there's really no huge problems just using SQL directly.

  1. Model all of your inheritance relations as joins between tables. Each table in your system will hold the attributes of a specific class.

  2. Use a synthetic object id (oid) as your primary key for all objects. A sequence generator or autoincrement column is necessary to generate oid values.

  3. All inherited classes must use the same oid type as their parent. Define the oid as a foreign key with cascaded delete. The parent table gets the autoincrement oid column and the children get plain oid columns.

  4. Queries on final classes are made on the corresponding table. You can either join all the parent class tables into the query or just lazy load the attributes you need. If your inheritance hierarchy is deep and you have many classes, an ORM package can really simplify your code. My system had less than 50 classes with a maximum inheritance depth of 3.

  5. Queries across child classes (i.e. queries on a parent class) can either lazy load the child attributes on a per-instance basis, or you can repeat the query for each child class joined with base classes. Lazy loading child attributes based on a parent class query requires you know the type of the object. You may have enough information in the parent classes already, but if not you'll need to add type information. Again, this is where an ORM package can help.

Virtual classes without member attributes can be skipped in the table structure, but you won't be able to query based on those classes.

Here's what "show me all communications with just actors of type worker" looks like.

select * from comm c, worker w where c.actor=w.oid;

If you have sub-classes of communication, and you want to immediately load all the child class attributes (perhaps your system does not allow partial construction), the easiest solution is to eager join on all the possible classes.

select * from comm c, worker w, missive m where c.actor=w.oid and c.oid=m.oid;
select * from comm c, worker w, shoutout s where c.actor=w.oid and c.oid=s.oid;

One last thing. Make sure you have a good database and correct indexes. Performance can be a serious problem if you database can't optimize these joins.

Object oriented vs. object relational database difference

The two kinds are quite similar: Classes map to tables. Columns map to fields. Instances map to rows. Object references map to foreign keys. In fact, ORMs try to map the two models as closely as possible. They are quite successful with that.

The difference is mostly in the way the database is used. The difference is in "style" and in the patterns that are used.

Object databases are basically used like you would use an ORM on top of a relational database. You can directly store objects (or object graphs) into the database without the need to write an INSERT statement. The interface is often not SQL.

It is hard to differentiate the two because they are so similar. Please leave a comment if you are interested in certain (concrete) aspects.

Database Guy Asks: Object-Oriented Design Theory?

Be careful some of the design patterns literature.

There are are several broad species of class definitions. Classes for persistent objects (which are like rows in relational tables) and collections (which are like the tables themselves) are one thing.

Some of the "Gang of Four" design patterns are more applicable to active, application objects, and less applicable to persistent objects. While you wrestle through something like Abstract Factory, you'll be missing some key points of OO design as it applies to persistent objects.

The Object Mentor What is Object-Oriented Design? page has mich of you really need to know to transition from relational design to OO design.

Normalization, BTW, isn't a blanket design principle that always applies to relational databases. Normalization applies when you have update transactions, to prevent update anomalies. It's a hack because relational databases are passive things; you either have to add processing (like methods in a class) or you have to pass a bunch of rules (normalization). In the data warehouse world, where updates are rare (or non-existent) that standard normalization rules aren't as relevant.

Consequently, there's no "normalize like this" for object data models.

In OO Design, perhaps the most important rule for designing persistent objects is the Single Responsibility Principle.

If you design your classes to have good fidelity to real-world objects, and you allocate responsibilities to those classes in a very focused way, you'll be happy with your object model. You'll be able to map it to a relational database with relatively few complexities.

Turns out, that when you look at things from a responsibility point of view, you find that 2NF and 3NF rules fit with sound responsibility assignment. Unique keys still matter. And derived data becomes the responsibility of a method function, not a persistent attribute.

Object-oriented / Relational-Hybrid Database?

You may want to research NoSQL as an alternative to a typical relational store.

Is there a relationship between Database Tables and Object Oriented Classes?

It all depends on the type of database you're using. If you're using an object oriented database (OODB), then there is no relationship, as the objects and the persisted data are the same thing. For example, if you have a Customer class, and you save it in an OODB, then that instance of the customer is what is stored in the DB.

If you are using a relational database, then the class instances, and the persisted representation of them in the DB, can be the same thing, but many times they aren't. This is because most folks use normalization to represent their data in an efficient way (in a relational DB). This means, instead of having a table per class, you can have a class represented by more than one table. In the Customer example, the tables might now be Customer (with Name, date of birth, and other properties), and Order (with order pointing to products in yet another table). The reason for this has to do with cardinality, and the ability for Customers to have more than one order. When your business logic needs this information from the DB, the data access layer's job is to map the data (called ORM) from the DB into your classes.

If you are using yet another type of DB, then there will be a different relationship between the classes (domain model) and what's persisted in the DB.

But, as far as having a name for this relationship? No, there is no name.

database structure and object oriented inheritance and multiple interfaces

As far as domain entities are concerned, the benefit of having them implement interfaces is very debatable. Some even see it as an anti-pattern.

The objects in your example share a couple of notable characteristics that should make you think twice before adding interfaces on top of them :

  • They are anemic, i.e. don't have any behavior. Interfaces are used for defining contracts, messaging protocols between objects. If there's no message to send, i.e. no behavior to use, they lose a large part of their interest.

  • They aren't likely to be injected as dependencies into other objects. You won't need to mock up the object in unit tests and thus the common parent abstraction for mock and real class, which is typically embodied in an interface, isn't really needed.

Data Structures within Databases

Your question gets at the impedance mismatch between relational databases and object-oriented development environments like Java.

  • Database tables map relationships between each other using keys; tables have no notion or concept of object-oriented principles like inheritance.

  • Memory resident objects have references to map relationships and use composition and inheritance to build hierarchies and graphs of objects. Hierarchies and composition are not native concepts in relational databases.

A straightforward way to get at your problem is to use an ORM framework to take these impedance mismatch issues out of your problem domain.

  • Develop your domain object model as you would for any object-oriented application (i.e., using composition and inheritance)

  • When complete, map your objects with a JPA framework such as Hibernate, which will do the ORM for you

  • As long as your domain object model is well-formed and the JPA mapping is correct, then JPA will take care of the problem of mapping your objects into a database and vice-versa.

ORM isn't magic. It won't take a poorly conceived or suboptimally conceived domain object model and fix its problems in the database layer. It really does pay off to spend time getting the domain object model right, tight, concise, and efficient before you begin the work of using it with JPA.

What are the best scenarios for using object-oriented databases?

Read mostly (or single user); simple relations; constrained object graph depth; constraint management non-requirement; and relaxed object identity use cases favor the oo db.

Web Development - Object db vs Relational db

Relational database:

Pros:

  • Established technology - lots of
    tools, developers, resources
  • Wide range of Open Source and commercial
    products
  • Known to scale to very large
    sites, and very high throughput
  • Expresses many problem domains in a logical and "programmable" way
  • Fairly standard language (SQL)

Cons:

  • Impedance mismatch with OO concepts - modeling "inheritance" in a database is not natural
  • Hierarchical structures usually require vendor-specific extensions to the language
  • Non-relational data (e.g. documents) are not a natural fit
  • Changes in the business domain can be hard to implement once the schema has been defined

OOBDMS

Pros:

  • Closer fit for OO concepts
  • In theory, a developer only needs to work in one language - the persistence details are abstracted away. This should improve productivity

Cons:

  • Significantly fewer tools/resources/developers available.
  • No widely accepted standards
  • "black box" approach to persistence can make performance tuning difficult
  • persistence details often leak into the OO design (see Marcelo's examples)


Related Topics



Leave a reply



Submit