SQL Database Design Best Practice (Addresses)

You're on the right track by breaking address out into its own table. I'd add a couple of additional suggestions.

  1. Consider taking the Address FK columns out of the Customers/Orders tables and creating junction tables instead. In other words, treat Customers/Addresses and Orders/Addresses as many-to-many relationships in your design now so you can easily support multiple addresses in the future. Yes, this means introducing more tables and joins, but the flexibility you gain is well worth the effort.

  2. Consider creating lookup tables for city, state and country entities. The city/state/country columns of the address table then consist of FKs pointing to these lookup tables. This allows you to guarantee consistent spellings across all addresses and gives you a place to store additional metadata (e.g., city population) if needed in the future.
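As a rough illustration of both suggestions, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (CustomerAddresses, Cities, Street, and so on) and the sample rows are assumptions made for the example, not part of the original question; an OrderAddresses junction table would follow the same pattern as CustomerAddresses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
-- Lookup tables: one row per distinct spelling (hypothetical names)
CREATE TABLE Countries (CountryID INTEGER PRIMARY KEY, Name TEXT NOT NULL UNIQUE);
CREATE TABLE States    (StateID   INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Cities    (CityID    INTEGER PRIMARY KEY, Name TEXT NOT NULL);

-- Addresses point at the lookup tables instead of storing free text
CREATE TABLE Addresses (
    AddressID INTEGER PRIMARY KEY,
    Street    TEXT    NOT NULL,
    CityID    INTEGER NOT NULL REFERENCES Cities(CityID),
    StateID   INTEGER NOT NULL REFERENCES States(StateID),
    CountryID INTEGER NOT NULL REFERENCES Countries(CountryID)
);

CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Name TEXT NOT NULL);

-- Junction table: a customer can have many addresses and vice versa
CREATE TABLE CustomerAddresses (
    CustomerID INTEGER NOT NULL REFERENCES Customers(CustomerID),
    AddressID  INTEGER NOT NULL REFERENCES Addresses(AddressID),
    PRIMARY KEY (CustomerID, AddressID)
);

-- Illustrative sample data
INSERT INTO Countries VALUES (1, 'United States');
INSERT INTO States    VALUES (1, 'Illinois');
INSERT INTO Cities    VALUES (1, 'Springfield');
INSERT INTO Customers VALUES (1, 'Example Customer');
INSERT INTO Addresses VALUES (1, '1 Main St', 1, 1, 1), (2, 'PO Box 7', 1, 1, 1);
INSERT INTO CustomerAddresses VALUES (1, 1), (1, 2);
""")

# All addresses for one customer, resolved through the junction table
rows = conn.execute("""
    SELECT a.Street, ci.Name
    FROM CustomerAddresses ca
    JOIN Addresses a ON a.AddressID = ca.AddressID
    JOIN Cities ci   ON ci.CityID = a.CityID
    WHERE ca.CustomerID = ?
    ORDER BY a.AddressID
""", (1,)).fetchall()
print(rows)
```

The extra join is the cost mentioned above; the benefit is that adding a second address for a customer is just another row in CustomerAddresses.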

Is there common street addresses database design for all addresses of the world?

It is possible to represent addresses from lots of different countries in a standard set of fields. The basic idea of a named access route (thoroughfare) which the named or numbered buildings are located on is fairly standard, except in China sometimes. Other near universal concepts include: naming the settlement (city/town/village), which can be generically referred to as a locality; naming the region and assigning an alphanumeric postcode. Note that postcodes, also known as zip codes, are purely numeric only in some countries. You will need lots of fields if you really want to be generic.

The Universal Postal Union (UPU) provides address data for lots of countries in a standard format. Note that the UPU format holds all addresses (down to the available field precision) for a whole country, so it is relational. If you are storing customer addresses, where only a small fraction of all possible addresses will be stored, it's better to use a single table (or flat format) containing all fields, with one address per row.

A reasonable format for storing addresses would be as follows:

  • Address Lines 1-4
  • Locality
  • Region
  • Postcode (or zipcode)
  • Country

Address lines 1-4 can hold components such as:

  • Building
  • Sub-Building
  • Premise number (house number)
  • Premise Range
  • Thoroughfare
  • Sub-Thoroughfare
  • Double-Dependent Locality
  • Sub-Locality

Frequently only 3 address lines are used, but this is often insufficient. It is of course possible to require more lines to represent all addresses in the official format, but commas can always be used as line separators, meaning the information can still be captured.

Analysis of the data is usually performed by locality, region, postcode and country, and these elements are fairly easy for users to understand when entering data. This is why they should be stored as separate fields. However, don't force users to supply a postcode or region, because these may not be used locally.
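The flat format described above could be sketched as follows, again with Python's sqlite3 module. The table name CustomerAddress, the choice of which columns are NOT NULL, and the sample rows are assumptions for illustration. Region and Postcode are left nullable, and Postcode is TEXT because postcodes are not always numeric.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE CustomerAddress (
    AddressID    INTEGER PRIMARY KEY,
    AddressLine1 TEXT NOT NULL,
    AddressLine2 TEXT,
    AddressLine3 TEXT,
    AddressLine4 TEXT,
    Locality     TEXT NOT NULL,
    Region       TEXT,           -- optional: not used everywhere
    Postcode     TEXT,           -- optional, and alphanumeric in many countries
    Country      TEXT NOT NULL
)
""")

# One address with a postcode, one without region or postcode (made-up data)
conn.executemany(
    "INSERT INTO CustomerAddress "
    "(AddressLine1, AddressLine2, Locality, Region, Postcode, Country) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("Flat 2", "10 Example Street", "London", None, "SW1A 1AA", "United Kingdom"),
        ("Unit 5, 3/F", "1 Example Road", "Kowloon", None, None, "Hong Kong"),
    ],
)

# Analysis by a separately stored field is a plain GROUP BY
by_country = conn.execute(
    "SELECT Country, COUNT(*) FROM CustomerAddress GROUP BY Country ORDER BY Country"
).fetchall()
print(by_country)
```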

Locality can be unclear, particularly the distinction between the map locality and the postal locality. The postal locality is the one designated by the postal authority, which may sometimes be a nearby large town. However, the postcode will usually resolve any discrepancy, allowing correct delivery even if the official postal locality is not used.

How to best represent addresses in a database

A DBA where I used to work told me this gem, and it's worked great for us (the first two steps are the same as in your solution):

  1. Remove AddressLine1, AddressLine2, City, StateOrProvince, PostalCode from Customers, Contacts, Facilities and Clients.
  2. Create AddressTypes table with fields AddressTypeID, AddressTypeName, AddressTypeDescription, AddressTypeActive, LastUpdateUser, LastUpdateTime
  3. Create Addresses table with fields AddressID(PK), AddressTypeID(FK), AddressLine1, AddressLine2, City, StateOrProvince, PostalCode, LastUpdateUser, LastUpdateTime, CustomerID(FK), ClientID(FK), ContactID(FK), FacilityID(FK)
  4. On the addresses table, set up a constraint so that only one of the CustomerID, ClientID, ContactID, or FacilityID foreign keys may be non-NULL at a time.

This way you've got all your addresses in one table, they can reference any record you need, your referential integrity is intact, and you don't have an intermediate table that you have to traverse.

The downside is that if you want to add addresses to a new class of object (e.g. an Employee table), you have to add an EmployeeID column to the Addresses table, but that's pretty easy.
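The constraint in step 4 could look something like the sketch below, using Python's sqlite3 module. Two things are assumptions here: the parent tables are reduced to a bare key column and the AddressTypes and audit columns are omitted for brevity, and the check requires exactly one owner (use `<= 1` instead of `= 1` if unowned addresses should be allowed). The sum-of-booleans trick relies on SQLite evaluating `IS NOT NULL` to 0 or 1; other databases may need CASE expressions instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE Customers  (CustomerID INTEGER PRIMARY KEY);
CREATE TABLE Clients    (ClientID   INTEGER PRIMARY KEY);
CREATE TABLE Contacts   (ContactID  INTEGER PRIMARY KEY);
CREATE TABLE Facilities (FacilityID INTEGER PRIMARY KEY);

CREATE TABLE Addresses (
    AddressID    INTEGER PRIMARY KEY,
    AddressLine1 TEXT NOT NULL,
    City         TEXT NOT NULL,
    CustomerID   INTEGER REFERENCES Customers(CustomerID),
    ClientID     INTEGER REFERENCES Clients(ClientID),
    ContactID    INTEGER REFERENCES Contacts(ContactID),
    FacilityID   INTEGER REFERENCES Facilities(FacilityID),
    -- Exactly one owner column may be non-NULL
    CHECK (
        (CustomerID IS NOT NULL) + (ClientID IS NOT NULL)
        + (ContactID IS NOT NULL) + (FacilityID IS NOT NULL) = 1
    )
);

INSERT INTO Customers (CustomerID) VALUES (1);
INSERT INTO Clients   (ClientID)   VALUES (1);
""")

# An address owned by a single customer is accepted
conn.execute(
    "INSERT INTO Addresses (AddressLine1, City, CustomerID) VALUES (?, ?, ?)",
    ("1 Main St", "Springfield", 1),
)

# An address claiming two owners violates the CHECK constraint
try:
    conn.execute(
        "INSERT INTO Addresses (AddressLine1, City, CustomerID, ClientID) "
        "VALUES (?, ?, ?, ?)",
        ("2 Main St", "Springfield", 1, 1),
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

address_count = conn.execute("SELECT COUNT(*) FROM Addresses").fetchone()[0]
print(rejected, address_count)
```

Supporting a new owner type such as Employee would mean adding an EmployeeID column and one more term to the CHECK expression.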

Best Practice / Standard for storing an Address in a SQL Database

For international addresses, refer to the Universal Postal Union's Postal Addressing Systems database.

For U.S. addresses, refer to USPS Publication 28 "Postal Addressing Standards".

The USPS wants the following unpunctuated address components concatenated on a single line:

  • house number
  • predirectional (N, SE, etc.)
  • street
  • suffix (AVE, BLVD, etc.)
  • postdirectional (SW, E, etc.)
  • unit (APT, STE, etc.)
  • apartment/suite number

E.g. 102 N MAIN ST SE APT B
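If the components are stored separately, building that single line is a simple join in the order listed. The function below is a minimal sketch: the name `usps_delivery_line` and its parameters are hypothetical, and stripping only periods and commas is a simplifying assumption rather than the full Publication 28 rule set.

```python
def usps_delivery_line(house_number, street, predirectional="", suffix="",
                       postdirectional="", unit="", unit_number=""):
    """Join address components into one uppercase, unpunctuated line."""
    # Order follows the component list above
    ordered = [house_number, predirectional, street, suffix,
               postdirectional, unit, unit_number]
    cleaned = []
    for part in ordered:
        # Assumption: only periods and commas are stripped
        text = part.replace(".", "").replace(",", "")
        # Uppercase and collapse any runs of whitespace
        text = " ".join(text.upper().split())
        if text:  # skip components that were not supplied
            cleaned.append(text)
    return " ".join(cleaned)


line = usps_delivery_line("102", "Main", predirectional="N", suffix="St.",
                          postdirectional="SE", unit="Apt.", unit_number="B")
print(line)  # 102 N MAIN ST SE APT B
```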

If you keep the entire address line as a single field in your database, input and editing are easy, but searches can be more difficult (e.g., in SOUTH EAST LANE, is the street name EAST, as in S EAST LN, or is it LANE, as in SE LANE ST?).

If you keep the address parsed into separate fields, searches for components like street name or apartments become easier, but you have to append everything together for output, you need CASS software to parse correctly, and PO boxes, rural route addresses, and APO/FPO addresses have special parsings.

A physical location with multiple addresses is one of three things: a multiunit building, in which case the letters/numbers after units like APT and STE designate the address; a Commercial Mail Receiving Agency (e.g., a UPS store), in which case a maildrop/private mailbox number is appended (like 100 MAIN ST STE B PMB 102); or a business with one USPS delivery point where mail is routed internally after USPS delivery (which usually requires a separate mailstop field that the company might need but the USPS won't want on the address line).

A contact with more than one physical address is usually a business or person with a street address and a PO box. Note that it's common for each address to have a different ZIP code.

It's quite typical that one business transaction might have a shipping address and a billing address (again, with different ZIP codes). The information I keep for EACH address is:

  • name prefix (DR, MS, etc)
  • first name and initial
  • last name
  • name suffix (III, PHD, etc)
  • mail stop
  • company name
  • address (one line only per Pub 28 for USA)
  • city
  • state/province
  • ZIP/postal code
  • country

I typically print mail stops somewhere between the person's name and company because the country contains the state/ZIP which contains the city which contains the address which contains the company which contains the mail stop which contains the person. I use CASS software to validate and standardize addresses when entered or edited.
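The print order described above, from most specific to least specific, might be sketched like this. The function name `format_label`, the dictionary keys, and the sample record are all hypothetical, and placing city, state and ZIP together on one line is an assumption about the layout.

```python
def format_label(addr):
    """Return label lines ordered from most specific to least specific."""
    name_keys = ("name_prefix", "first_name", "last_name", "name_suffix")
    name = " ".join(addr[k] for k in name_keys if addr.get(k))
    city_keys = ("city", "state", "postal_code")
    city_line = " ".join(addr[k] for k in city_keys if addr.get(k))
    lines = [
        name,                       # person
        addr.get("mail_stop", ""),  # mail stop sits between person and company
        addr.get("company", ""),
        addr.get("address", ""),    # single delivery line
        city_line,
        addr.get("country", ""),
    ]
    return [line for line in lines if line]  # drop fields that are empty


# Made-up record for illustration
label = format_label({
    "name_prefix": "DR", "first_name": "JANE Q", "last_name": "DOE",
    "name_suffix": "PHD", "mail_stop": "MS 4B", "company": "EXAMPLE CORP",
    "address": "100 MAIN ST STE B", "city": "SPRINGFIELD", "state": "IL",
    "postal_code": "62701", "country": "USA",
})
print("\n".join(label))
```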


