Best database field type for a URL
- Lowest common denominator max URL length among popular web browsers: 2,083 (Internet Explorer)
- http://dev.mysql.com/doc/refman/5.0/en/char.html
Values in VARCHAR columns are variable-length strings. The length can be specified as a value from 0 to 255 before MySQL 5.0.3, and 0 to 65,535 in 5.0.3 and later versions. The effective maximum length of a VARCHAR in MySQL 5.0.3 and later is subject to the maximum row size (65,535 bytes, which is shared among all columns) and the character set used.
- So ...
< MySQL 5.0.3 use TEXT
or
>= MySQL 5.0.3 use VARCHAR(2083)
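As a rough check on the numbers above, the following sketch estimates how many characters a single VARCHAR column could hold under the 65,535-byte row limit quoted from the manual. The bytes-per-character figures (3 for utf8, 4 for utf8mb4) and the 2-byte length prefix are assumptions about the column's configuration, and the calculation ignores any other columns in the row.

```python
ROW_SIZE_LIMIT = 65_535  # bytes, shared among all columns (from the manual excerpt)


def max_varchar_chars(bytes_per_char: int, length_prefix: int = 2) -> int:
    """Upper bound on VARCHAR length if this were the only column in the row.

    Assumes a 2-byte length prefix, which applies when values can exceed 255 bytes.
    """
    return (ROW_SIZE_LIMIT - length_prefix) // bytes_per_char


def fits_in_row(n_chars: int, bytes_per_char: int) -> bool:
    """True if a VARCHAR(n_chars) column would fit under the row size limit."""
    return n_chars <= max_varchar_chars(bytes_per_char)
```

Under these assumptions VARCHAR(2083) fits comfortably even with a 4-byte character set, so the browser limit, not the row size, is the tighter constraint.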
What is the best datatype for storing URLs in a MySQL database?
If by "links" you mean links to web pages, I'm guessing you want to store URLs.
Since URLs are variable-length strings, the VARCHAR data type would seem the obvious choice.
What data type does a URL correspond to in MySQL?
Simply put, the data type should be VARCHAR.
URLs can contain any number of characters and can be any length (within reason on the smaller end). A CHAR field always holds exactly the number of characters set in the table definition, whereas a variable-character (VARCHAR) field can hold a variable number of characters up to its declared maximum. Since not all URLs are of equal length, you need the variability. You could make an argument for a TEXT field if you needed to store really long URLs; however, for most use cases VARCHAR will suffice.
MySQL datatype for URL's
I would use a generic VARCHAR(255)
http://dev.mysql.com/doc/refman/5.5/en/char.html
How to store URLs in MySQL
According to the DNS spec, the maximum length of a domain name is:
The DNS itself places only one restriction on the particular labels
that can be used to identify resource records. That one restriction
relates to the length of the label and the full name. The length of
any one label is limited to between 1 and 63 octets. A full domain
name is limited to 255 octets (including the separators).
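At up to 3 bytes per character (MySQL's utf8), a 255-character domain name needs 255 * 3 = 765 bytes, which is < 767, InnoDB's index key prefix limit in bytes (just barely :-) )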
However notice that each component can only be 63 characters long.
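The two length limits in the quoted DNS text can be checked before a host name is stored. This is a minimal sketch that applies the limits exactly as the quote states them (1 to 63 octets per label, 255 octets for the full name including separators); the function name is hypothetical, and it does not validate which characters are allowed.

```python
MAX_LABEL_OCTETS = 63   # per-label limit from the quoted DNS text
MAX_NAME_OCTETS = 255   # full-name limit, separators included


def is_plausible_domain(name: str) -> bool:
    """Check only the length rules; character rules are out of scope here."""
    if name.endswith("."):
        name = name[:-1]  # tolerate a single trailing root dot
    try:
        octets = name.encode("ascii")
    except UnicodeEncodeError:
        return False  # non-ASCII names would need IDNA encoding first
    if not octets or len(octets) > MAX_NAME_OCTETS:
        return False
    return all(1 <= len(label) <= MAX_LABEL_OCTETS for label in octets.split(b"."))
```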
So I would suggest chopping the url into the component bits.
Using http://foo.example.com/a/really/long/path?with=lots&of=query&parameters=that&goes=on&forever&and=ever as an example, this would probably be adequate:
- protocol flag ["http" -> 0 ] ( store "http" as 0, "https" as 1, etc. )
- subdomain ["foo" ] ( 255 - 63 = 192 characters : I could subtract 2 more because min tld is 2 characters )
- domain ["example"], ( 63 characters )
- tld ["com"] ( 4 characters to handle "info" tld; longer TLDs such as "museum" exist, so allowing up to 63 characters is safer )
- path [ "a/really/long/path" ] ( as long as you want -store in a separate table)
- queryparameters ["with=lots&of=query&parameters=that&goes=on&forever&and=ever" ] ( store in a separate key/value table )
- portnumber / authentication stuff that is rarely used can be in a separate keyed table if actually needed.
This gives you some nice advantages:
- The index is only on the parts of the url that you need to search on (smaller index! )
- queries can be limited to the various url parts ( find every url in the facebook domain for example )
- any URL with too long a subdomain/domain is bogus
- easy to discard query parameters.
- easy to do case insensitive domain name/tld searching
- discard the syntax sugar ( "://" after protocol, "." between subdomain/domain, domain/tld, "/" between tld and path, "?" before query, "&" "=" in the query)
- Avoids the major sparse table problem. Most urls will not have query parameters, nor long paths. If these fields are in a separate table then your main table will not take the size hit. When doing queries more records will fit into memory, therefore faster query performance.
- (more advantages here).
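The component split described above can be sketched with Python's standard urllib.parse module. The field names, the PROTOCOL_FLAGS mapping, and the rule "last label is the TLD, the one before it is the domain" are illustrative assumptions; that naive rule would mis-split multi-part suffixes such as "co.uk".

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical flag values, following the "http" -> 0, "https" -> 1 idea.
PROTOCOL_FLAGS = {"http": 0, "https": 1}


def split_url(url: str) -> dict:
    """Break a URL into the pieces that would go into separate columns/tables."""
    parts = urlsplit(url)
    if parts.scheme not in PROTOCOL_FLAGS or not parts.hostname:
        raise ValueError(f"unsupported or malformed URL: {url!r}")
    labels = parts.hostname.split(".")
    if len(labels) < 2:
        raise ValueError(f"host has no domain/tld pair: {parts.hostname!r}")
    # Naive split: last label is the TLD, the one before it is the domain.
    *subdomain_labels, domain, tld = labels
    return {
        "protocol": PROTOCOL_FLAGS[parts.scheme],
        "subdomain": ".".join(subdomain_labels),
        "domain": domain,
        "tld": tld,
        "port": parts.port,  # None when the URL has no explicit port
        "path": parts.path.lstrip("/"),
        # Key/value pairs for the separate query-parameter table.
        "query": parse_qsl(parts.query, keep_blank_values=True),
    }
```

Because the host name comes back lowercased from urlsplit, case-insensitive domain searches reduce to plain equality on the stored columns.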
What is the best column type for URL?
If you are prepared to always URL encode your URLs before you store them (an example turned up by Google was 中.doc URL encoding to %E4%B8%AD.doc) then you are safe sticking with varchar. If you want the non-ASCII characters in your URLs to remain readable in the database then I'd recommend nvarchar. If you don't want to be caught out, then go for nvarchar.
Since IE (the most restrictive of the mainstream browsers) doesn't support URLs longer than 2083 characters, then (apart from any considerations you might have on indexing or row length), you can cover most useful scenarios with nvarchar(2083).
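To illustrate the encode-before-storing option, here is a small sketch using urllib.parse.quote from Python's standard library. The function name is hypothetical, it is meant for a path segment rather than a whole URL (quote would also escape the ":" after the scheme), and the 2083 limit is the IE figure mentioned above.

```python
from urllib.parse import quote, unquote

MAX_URL_LENGTH = 2083  # IE limit cited in the answer above


def encode_for_storage(path_segment: str) -> str:
    """Percent-encode a path segment so it is pure ASCII and fits a plain varchar."""
    encoded = quote(path_segment)  # UTF-8 bytes, then %XX escapes
    if len(encoded) > MAX_URL_LENGTH:
        raise ValueError("encoded value exceeds the assumed column length")
    return encoded


# The example from the answer: a non-ASCII file name.
stored = encode_for_storage("中.doc")
restored = unquote(stored)
```

The trade-off is readability: the stored value is safe in an ASCII column but has to be decoded before a person can read it.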
What's the proper column type to save urls in MySQL?
Use TEXT, it's enough for every URL.
Note that with long URLs, you won't be able to create an index that covers the whole URL. If you need a UNIQUE index, you should calculate the URL hash, store the hash separately and index the hash instead.
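The hash-and-index idea might look like the sketch below. SHA-256 is an assumed choice (the answer does not name a hash function), the table and column names are hypothetical, and an in-memory SQLite database stands in for MySQL only so the example can run on its own.

```python
import hashlib
import sqlite3


def url_hash(url: str) -> str:
    """Fixed-length (64 hex characters) digest that can be indexed instead of the URL."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()


def make_store() -> sqlite3.Connection:
    """Create a stand-in table: the UNIQUE constraint sits on the hash, not the URL."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE urls (url_hash CHAR(64) NOT NULL UNIQUE, url TEXT NOT NULL)"
    )
    return conn


def add_url(conn: sqlite3.Connection, url: str) -> bool:
    """Insert a URL; return False if the same URL (same hash) is already stored."""
    try:
        conn.execute(
            "INSERT INTO urls (url_hash, url) VALUES (?, ?)", (url_hash(url), url)
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

The hash column has a fixed, short length regardless of how long the URL is, which is what makes the unique index practical.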