How to implement a short URL like the URLs in Twitter?
The easiest way is to:
- keep a database of all URLs
- when you insert a new URL into the database, find out the id of the auto-incrementing integer primary key.
- encode that integer into base 36 or 62 (digits + lowercase alpha or digits + mixed-case alpha). Voila! You have a short url!
Encoding to base 36/decoding from base 36 is simple in Ruby:
12341235.to_s(36)
#=> "7cik3"
"7cik3".to_i(36)
#=> 12341235
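Putting the three steps together with the base 36 calls above, a minimal sketch might look like the following. The class name ShortUrlStore and its methods are hypothetical, and a plain Hash with a counter stands in for the database table and its auto-incrementing key:

```ruby
# Minimal in-memory stand-in for the database table described above.
# ShortUrlStore and its method names are hypothetical, for illustration only.
class ShortUrlStore
  def initialize
    @urls = {}     # id => long URL, playing the role of the table
    @next_id = 1   # mimics an auto-incrementing integer primary key
  end

  # Stores the URL and returns its short code (the id written in base 36).
  def shorten(long_url)
    id = @next_id
    @next_id += 1
    @urls[id] = long_url
    id.to_s(36)
  end

  # Decodes the short code back to the id and looks up the original URL.
  # Returns nil when no URL is stored under that id.
  def expand(code)
    @urls[code.to_i(36)]
  end
end

store = ShortUrlStore.new
code = store.shorten("http://example.com/some/very/long/path")
puts code
puts store.expand(code)
```

In a real application the Hash would be replaced by the database insert and lookup, but the encode/decode round trip stays the same.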
Encoding to base 62 is a bit trickier. Here's one way to do it:
module AnyBase
  # Caches, per alphabet, the lookup from digit value to character.
  ENCODER = Hash.new do |h,k|
    h[k] = Hash[ k.chars.map.with_index.to_a.map(&:reverse) ]
  end
  # Caches, per alphabet, the lookup from character to digit value.
  DECODER = Hash.new do |h,k|
    h[k] = Hash[ k.chars.map.with_index.to_a ]
  end
  def self.encode( value, keys )
    return keys[0] if value == 0 # otherwise zero would encode to ""
    ring = ENCODER[keys]
    base = keys.length
    result = []
    until value == 0
      result << ring[ value % base ]
      value /= base
    end
    result.reverse.join
  end
  def self.decode( string, keys )
    ring = DECODER[keys]
    base = keys.length
    string.reverse.chars.with_index.inject(0) do |sum,(char,i)|
      sum + ring[char] * base**i
    end
  end
end
...and here it is in action:
base36 = "0123456789abcdefghijklmnopqrstuvwxyz"
db_id = 12341235
p AnyBase.encode( db_id, base36 )
#=> "7cik3"
p AnyBase.decode( "7cik3", base36 )
#=> 12341235
base62 = [ *0..9, *'a'..'z', *'A'..'Z' ].join
p AnyBase.encode( db_id, base62 )
#=> "PMwb"
p AnyBase.decode( "PMwb", base62 )
#=> 12341235
Edit
If you want to avoid URLs that happen to be English words (for example, four-letter swear words) you can use a set of characters that does not include vowels:
base31 = ([*0..9,*'a'..'z'] - %w[a e i o u]).join
base52 = ([*0..9,*'a'..'z',*'A'..'Z'] - %w[a e i o u A E I O U]).join
However, with this you still have problems like AnyBase.encode(328059,base31) or AnyBase.encode(345055,base31) or AnyBase.encode(450324,base31). You may thus want to avoid vowel-like numbers as well:
base28 = ([*'0'..'9',*'a'..'z'] - %w[a e i o u 0 1 3]).join
base49 = ([*'0'..'9',*'a'..'z',*'A'..'Z'] - %w[a e i o u A E I O U 0 1 3]).join
This will also avoid the problem of "Is that a 0 or an O?" and "Is that a 1 or an I?".
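One trade-off worth checking is that a smaller alphabet needs more characters for the same id. The sketch below counts the code length for each of the alphabet sizes mentioned above; the helper name code_length and the sample id of one billion are assumptions chosen for illustration:

```ruby
# Number of characters needed to write `id` using an alphabet of `base` symbols.
# code_length is a hypothetical helper name.
def code_length(id, base)
  id.digits(base).length   # Integer#digits requires Ruby 2.4 or newer
end

max_id = 1_000_000_000     # arbitrary sample id, not a figure from the text
[28, 36, 49, 62].each do |base|
  puts "base #{base}: #{code_length(max_id, base)} characters"
end
```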
Implement short urls (tinyurls) for twitter in c#?
I just published an article about doing this with bit.ly in a C# application.
Note that bit.ly requires a free login key for the code to work.
Is it possible to shorten url from Twitter API?
It isn't possible to shorten links using t.co through any means other than sending status updates or direct messages via Twitter. From the Twitter support site:
The link service at http://t.co is only used on links posted on Twitter and is not available as a general shortening service.
So, yes, you'll need to use some other shortening service.
How to crawl shortened urls and get the actual domain in python?
To extract the domain name from a URL, besides urlparse, you can use the tldextract module:
>>> import tldextract
>>> urls = ['http://news.example.com',
...         'http://blog.example.com/eeaWdada5das',
...         'http://example.com/ewdaD585Jz']
>>> for url in urls:
...     data = tldextract.extract(url)
...     print('{0}.{1}'.format(data.domain, data.suffix))
...
example.com
example.com
example.com
UPD (example for com.mx):
>>> data = tldextract.extract('http://example.com.mx')
>>> print('{0}.{1}'.format(data.domain, data.suffix))
example.com.mx
Twitter API 1.1 - render twitter's t.co links
Thank you for your answers.
After analyzing the JSON in the suggested link (https://dev.twitter.com/docs/tweet-entities), I wrote a solution to the problem described:
// ...
$twitter_data = json_decode($json); // last line of the code in: http://stackoverflow.com/questions/12916539
// print the tweets, with the full URLs:
foreach ($twitter_data as $item) {
    $text = $item->text;
    foreach ($item->entities->urls as $url) {
        $text = str_replace($url->url, $url->expanded_url, $text);
    }
    echo $text . '<br /><br />';
    // optionally, here, the code from: http://stackoverflow.com/questions/15610968/
    // can be added, too.
}
Long URLs from Twitter feeds without making additional API calls/HTTP requests
No, Twitter does not offer a urls entity in its RSS responses, nor does the include_entities option appear to work. You'll have to use a different response format, e.g. JSON (with which you can use the include_entities option, which includes an entities['urls'][n]['expanded_url'] object), or "unshorten" the URLs yourself after the fact.
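As a rough illustration of the JSON route, the sketch below substitutes each expanded_url back into the tweet text. The payload is a made-up, trimmed-down example that contains only the text, url and expanded_url fields, and expand_links is a hypothetical helper name:

```ruby
require "json"

# Hypothetical, trimmed-down payload; a real response has many more fields.
json = <<~JSON
  [
    {
      "text": "Reading this: http://t.co/abc123",
      "entities": {
        "urls": [
          { "url": "http://t.co/abc123", "expanded_url": "http://example.com/article" }
        ]
      }
    }
  ]
JSON

# Replaces each shortened link in the tweet text with its expanded form.
# Tweets without a urls entity are returned unchanged.
def expand_links(tweet)
  urls = tweet.dig("entities", "urls") || []
  urls.reduce(tweet["text"]) do |text, entity|
    # Block form keeps the replacement literal (no backslash escapes).
    text.gsub(entity["url"]) { entity["expanded_url"] }
  end
end

JSON.parse(json).each { |tweet| puts expand_links(tweet) }
```

Because the expanded URL arrives in the same response, no extra HTTP request per link is needed.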
Twitter auto shorten URL not working
It won't be visibly shortened in the compose window, but the compose window does detect URLs and adjusts the character count accordingly. Try pasting a very long URL: it will only use up 22 characters in the count.
Do note that Twitter shortens all URLs, even when "shortening" actually makes them longer. For example, "http://bit.ly" will use up 22 characters, not its actual 13.
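The counting rule can be approximated as follows. This is only a sketch: the 22 comes from the figure quoted above and may change on Twitter's side, the URL pattern is deliberately simplistic, and counted_length is a hypothetical helper name:

```ruby
# Length every URL is counted as, taken from the text above; the real value
# is set by Twitter and may differ.
TCO_LENGTH = 22

# Deliberately simple URL matcher (an assumption for this sketch).
URL_PATTERN = %r{https?://\S+}

# Counts characters the way the text describes: each URL costs TCO_LENGTH,
# regardless of its real length.
def counted_length(status)
  url_count = status.scan(URL_PATTERN).length
  status.gsub(URL_PATTERN, "").length + url_count * TCO_LENGTH
end

puts counted_length("Check this out: http://bit.ly")
```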