Why Is It Best to Store a Telephone Number as a String VS. Integer

Why is it best to store a telephone number as a string vs. integer?

Telephone numbers are strings of digit characters, they are not integers.

Consider for example:

  • Expressing a telephone number in a different base would render it meaningless

  • Adding or multiplying two telephone numbers together, or any math operation on a phone number, is meaningless. The result is not another telephone number (except by conicidence)

  • Telephone numbers are intended to be entered "as-is" into a connected device.

  • Telephone numbers may have leading zeroes.

  • Manipulations of telephone numbers, such as adding an area code, are String operations.

Storing the string version of the telephone number makes this clear and unambiguous.


History: On old pulse-encoded dial systems, the code for each digit in a telephone number was sent as the same number of pulses as the digit (or 10 pulses for "0"). That may be why we still use digits to represent the parts of a phone number. See http://en.wikipedia.org/wiki/Pulse_dialing

Should I store telephone numbers as strings or integers?

For any situation like these, think of : will I have to calculate anything with that value? If that doesn't make any sense, you should use a string. In that case, there's no logical case where you'd use the telephone number as a number, so use a string.

phone number should be a string or some numeric type that have capacity to save phone number?

ITU-T recommendation E.164 says you need 3 digits for the country code and up to 15 digits for the directory number within the country dialing plan.

And, many people add some punctuation. For example:

+1.212.555.1212 is a North American number. It could also be rendered
(212) 555-1212 in a North American centric app.

32 characters of text should do the trick worldwide.

DO NOT use a number, or you'll be sorry. I was: two things.

  • Lost some European business for a company because we assumed all phone numbers were NANP-compliant ten-digit numbers.
  • A spreadsheet export rendered the numbers in scientific notation 2.12555E+09 That's almost as stupid as SIRI telling me you have call from two bllion, one hundred twenty five million....

Telephone directory numbers are not numeric data types. Take a look at this: Falsehoods Programmers Believe About Telephone Numbers.

What's the right way to represent phone numbers?

Use String. Aside from anything else, you won't be able to store leading zeroes if you use integers. You definitely shouldn't use int (too small) float or double (too much risk of data loss - see below); long or BigInteger could be appropriate (aside from the leading zeroes problem), but frankly I'd go with String. That way you can also store whatever dashes or spaces the user has entered to make it easier to remember the number, if you want to.

In terms of the "data loss" mentioned above for float and double - float definitely doesn't have enough precision; double could work if you're happy that you'll never need more than 16 digits (a couple fewer than you get with long) but you would need to be very, very careful that anywhere you convert the value back from double to string, you got the exact value. Many formatting conversions will give you an approximation which may be accurate to, say, 10 significant digits - but you'd want an exact integer. Basically, using floating point for phone numbers is a fundamentally bad idea. If you have to use a fixed-width numeric type, use a long, but ideally, avoid it entirely.

What datatype should be used for storing phone numbers in SQL Server 2005?

Does this include:

  • International numbers?
  • Extensions?
  • Other information besides the actual number (like "ask for bobby")?

If all of these are no, I would use a 10 char field and strip out all non-numeric data. If the first is a yes and the other two are no, I'd use two varchar(50) fields, one for the original input and one with all non-numeric data striped and used for indexing. If 2 or 3 are yes, I think I'd do two fields and some kind of crazy parser to determine what is extension or other data and deal with it appropriately. Of course you could avoid the 2nd column by doing something with the index where it strips out the extra characters when creating the index, but I'd just make a second column and probably do the stripping of characters with a trigger.

Update: to address the AJAX issue, it may not be as bad as you think. If this is realistically the main way anything is done to the table, store only the digits in a secondary column as I said, and then make the index for that column the clustered one.

Integer vs String in database

In my country, post-codes are also always 4 digits. But the first digit can be zero.

If you store "0700" as an integer, you can get a lot of problems:

  • It may be read as an octal value
  • If it is read correctly as a decimal value, it gets turned into "700"
  • When you get the value "700", you must remember to add the zero
  • I you don't add the zero, later on, how will you know if "700" is "0700", or someone mistyped "7100"?

Technically, our post codes is actually strings, even if it is always 4 digits.

You can store them as integers, to save space. But remember this is a simple DB-trick, and be careful about leading zeroes.

But what about for storing how many
files are in a torrent? Integer or
string?

That's clearly an integer.

Shouldn't a long be big enough to be able to store a phone number?

Telephone numbers aren't numbers. They're strings. For instance, in the UK and many other European countries, the first digit of a full phone number is a 0. But a simple numeric type like long has no way to indicate that a leading 0 is significant, the number 07859 664 443 would be the value 7859664443. The leading zero matters. Separately, it's not at all uncommon to see a number written like this: +44 (0)7859 664 443. Put that in your long and try to smoke it. :-)

Also, the formatting matters. 07859 664 443 is a lot easier to read and dial than 07859664443. In the U.S., they frequently put area codes in parentheses: (800) 123-4567, which again is easier for we poor humans to deal with than 8001234567.

In my experience, the best way to deal with phone numbers is to store them as strings, and largely to leave them the way they were entered, unless you want to limit the application to the phone numbers used in a very limited geographic area — and even then, things can change. Trying to build formatting rules into your application immediately introduces a maintenance item. When (when) the U.S. runs into the limits of its current (xxx) yyy-zzzz format, for example, a huge number of applications are going to need updates. Painful ones. Similarly, I used an application the other day that assumed all UK numbers were in the form (xxxxx) nnn nnn. This is no longer true, large metropolitan areas are now (xxx) nnnn nnnn whereas we more rural types are still on the old system. For a while at the beginning of the last decade, London's numbers were in the form (xxxx) nnn nnnn. You get the idea.

Drawbacks of storing an integer as a string in a database

Unless you really need the features of an integer (that is, the ability to do arithmetic), then it is probably better for you to store the product IDs as strings. You will never need to do anything like add two product IDs together, or compute the average of a group of product IDs, so there is no need for an actual numeric type.

It is unlikely that storing product IDs as strings will cause a measurable difference in performance. While there will be a slight increase in storage size, the size of a product ID string is likely to be much smaller than the data in the rest of your database row anyway.

Storing product IDs as strings today will save you much pain in the future if the data provider decides to start using alphabetic or symbol characters. There is no real downside.



Related Topics



Leave a reply



Submit