What's the Right Way to Represent Phone Numbers

What's the right way to represent phone numbers?

Use String. Aside from anything else, you won't be able to store leading zeroes if you use integers. You definitely shouldn't use int (too small) float or double (too much risk of data loss - see below); long or BigInteger could be appropriate (aside from the leading zeroes problem), but frankly I'd go with String. That way you can also store whatever dashes or spaces the user has entered to make it easier to remember the number, if you want to.

In terms of the "data loss" mentioned above for float and double - float definitely doesn't have enough precision; double could work if you're happy that you'll never need more than 16 digits (a couple fewer than you get with long) but you would need to be very, very careful that anywhere you convert the value back from double to string, you got the exact value. Many formatting conversions will give you an approximation which may be accurate to, say, 10 significant digits - but you'd want an exact integer. Basically, using floating point for phone numbers is a fundamentally bad idea. If you have to use a fixed-width numeric type, use a long, but ideally, avoid it entirely.

Should I store telephone numbers as strings or integers?

For any situation like these, think of : will I have to calculate anything with that value? If that doesn't make any sense, you should use a string. In that case, there's no logical case where you'd use the telephone number as a number, so use a string.

Why is it best to store a telephone number as a string vs. integer?

Telephone numbers are strings of digit characters, they are not integers.

Consider for example:

  • Expressing a telephone number in a different base would render it meaningless

  • Adding or multiplying two telephone numbers together, or any math operation on a phone number, is meaningless. The result is not another telephone number (except by conicidence)

  • Telephone numbers are intended to be entered "as-is" into a connected device.

  • Telephone numbers may have leading zeroes.

  • Manipulations of telephone numbers, such as adding an area code, are String operations.

Storing the string version of the telephone number makes this clear and unambiguous.


History: On old pulse-encoded dial systems, the code for each digit in a telephone number was sent as the same number of pulses as the digit (or 10 pulses for "0"). That may be why we still use digits to represent the parts of a phone number. See http://en.wikipedia.org/wiki/Pulse_dialing

Shouldn't a long be big enough to be able to store a phone number?

Telephone numbers aren't numbers. They're strings. For instance, in the UK and many other European countries, the first digit of a full phone number is a 0. But a simple numeric type like long has no way to indicate that a leading 0 is significant, the number 07859 664 443 would be the value 7859664443. The leading zero matters. Separately, it's not at all uncommon to see a number written like this: +44 (0)7859 664 443. Put that in your long and try to smoke it. :-)

Also, the formatting matters. 07859 664 443 is a lot easier to read and dial than 07859664443. In the U.S., they frequently put area codes in parentheses: (800) 123-4567, which again is easier for we poor humans to deal with than 8001234567.

In my experience, the best way to deal with phone numbers is to store them as strings, and largely to leave them the way they were entered, unless you want to limit the application to the phone numbers used in a very limited geographic area — and even then, things can change. Trying to build formatting rules into your application immediately introduces a maintenance item. When (when) the U.S. runs into the limits of its current (xxx) yyy-zzzz format, for example, a huge number of applications are going to need updates. Painful ones. Similarly, I used an application the other day that assumed all UK numbers were in the form (xxxxx) nnn nnn. This is no longer true, large metropolitan areas are now (xxx) nnnn nnnn whereas we more rural types are still on the old system. For a while at the beginning of the last decade, London's numbers were in the form (xxxx) nnn nnnn. You get the idea.

what primitive or what type should we use to represent a telephone number in java?

Using a string is a very good idea. Remember that the point of OOP is that different types of data have different usage patterns. So let's look at a phone number patterns.

  • Do we add phone numbers or perform other arithmetic on them?
  • Do we split it based on length, match against parts of it, replace parts of it?

The answer to the first question is no. So we don't manipulate them like numbers. The answer to the second question is yes, so we manipulate them like strings.

Now, many here are advocating making phones a class by itself. There is merit in this regard, but I'm addressing the more pressing concern of how do you store the phone number, which is something you need to do no matter if a phone is a class or not. A phone class which stored its data as a number would not be a good fit.

What datatype should be used for storing phone numbers in SQL Server 2005?

Does this include:

  • International numbers?
  • Extensions?
  • Other information besides the actual number (like "ask for bobby")?

If all of these are no, I would use a 10 char field and strip out all non-numeric data. If the first is a yes and the other two are no, I'd use two varchar(50) fields, one for the original input and one with all non-numeric data striped and used for indexing. If 2 or 3 are yes, I think I'd do two fields and some kind of crazy parser to determine what is extension or other data and deal with it appropriately. Of course you could avoid the 2nd column by doing something with the index where it strips out the extra characters when creating the index, but I'd just make a second column and probably do the stripping of characters with a trigger.

Update: to address the AJAX issue, it may not be as bad as you think. If this is realistically the main way anything is done to the table, store only the digits in a secondary column as I said, and then make the index for that column the clustered one.

How to store phone numbers on MySQL databases?

  • All as varchar (they aren't numbers but "collections of digits")
  • Country + area + number separately
  • Not all countries have area code (eg Malta where I am)
  • Some countries drop the leading zero from the area code when dialling internal (eg UK)
  • Format in the client code

Regular expression to match standard 10 digit phone number

^(\+\d{1,2}\s)?\(?\d{3}\)?[\s.-]\d{3}[\s.-]\d{4}$

Matches the following

123-456-7890
(123) 456-7890
123 456 7890
123.456.7890
+91 (123) 456-7890

If you do not want a match on non-US numbers use

^(\+0?1\s)?\(?\d{3}\)?[\s.-]\d{3}[\s.-]\d{4}$

Update :
As noticed by user Simon Weaver below, if you are also interested in matching on unformatted numbers just make the separator character class optional as [\s.-]?

^(\+\d{1,2}\s)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}$


Related Topics



Leave a reply



Submit