Postgresql Prefix Wildcard for Full Text

Full text search is good for finding words, not substrings.

For substring searches, you are better off using LIKE '%don%' together with the pg_trgm extension, which has supported indexing LIKE patterns since PostgreSQL 9.1. Create the index with USING gin (column_name gin_trgm_ops) or USING gist (column_name gist_trgm_ops). Be aware that the index can be very large (even several times bigger than the table) and that write performance will suffer.

There's a very good example of using pg_trgm for substring search on the "select * from depesz;" blog.
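
As a minimal sketch of the trigram setup described above: the table people, its name column, and the index name are made up for illustration, and only the extension and operator class names come from pg_trgm itself.

```sql
-- Enable the extension once per database (typically needs elevated privileges).
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Hypothetical table used only for this example.
CREATE TABLE people (
    id   serial PRIMARY KEY,
    name text NOT NULL
);

-- Trigram GIN index; swap in "USING gist (name gist_trgm_ops)" for GiST.
CREATE INDEX people_name_trgm_idx ON people USING gin (name gin_trgm_ops);

-- Substring search that the trigram index can serve.
SELECT id, name FROM people WHERE name LIKE '%don%';
```

Running EXPLAIN on the last query against realistic data shows whether the planner actually picks the index; on very small tables it will usually prefer a sequential scan.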

How to append prefix match to tsquery in PostgreSQL

You can make the last lexeme in a tsquery a prefix match by casting it to a string, appending ':*', then casting it back to a tsquery:

=> SELECT ((to_tsquery('foo <-> bar')::text || ':*')::tsquery);
      tsquery      
-------------------
 'foo' <-> 'bar':*

For your use case, you'll want to use <-> instead of & to require the words to be next to each other. Here's a demonstration of how they differ:

=> SELECT 'foo bar baz' @@ tsquery('foo & baz');
 ?column? 
----------
 t
(1 row)

=> SELECT 'foo bar baz' @@ tsquery('foo <-> baz');
 ?column? 
----------
 f
(1 row)

phraseto_tsquery makes it easy to specify several words that must be next to each other:

=> SELECT phraseto_tsquery('foo baz');
 phraseto_tsquery 
------------------
 'foo' <-> 'baz'

Putting it all together:

=> SELECT (phraseto_tsquery('The fat ra')::text || ':*')::tsquery;
     tsquery      
------------------
 'fat' <-> 'ra':*

Depending on your needs, a simpler way might be to write the tsquery directly as a string and cast it:

=> SELECT $$'fat' <-> 'ra':*$$::tsquery;
     tsquery      
------------------
 'fat' <-> 'ra':*
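
To show the combined query in use, here is a minimal sketch that matches it against a document and guards against an empty query. The sample sentence, the explicit 'english' configuration, and the use of numnode() as the emptiness check are my assumptions, not part of the recipe above. The guard is there because phraseto_tsquery returns an empty query when the input contains only stop words, and as far as I know a bare ':*' does not cast to a valid tsquery.

```sql
-- Prefix phrase match against a sample document (assumed text).
SELECT to_tsvector('english', 'The fat rats ate the cheese')
       @@ (phraseto_tsquery('english', 'The fat ra')::text || ':*')::tsquery
       AS matches;

-- Guard: only append ':*' when the query has at least one node.
SELECT CASE
         WHEN numnode(q) = 0 THEN NULL
         ELSE (q::text || ':*')::tsquery
       END AS prefix_query
FROM (SELECT phraseto_tsquery('english', 'The fat ra') AS q) AS s;
```

In application code the search string would normally be passed as a bound parameter in place of the literal 'The fat ra'.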

Postgresql full text search part of words

Sounds like you simply want wildcard matching.

One option, as previously mentioned, is trigrams. In my (very) limited experience, they were too slow on massive tables for my liking (in some cases slower than a plain LIKE), though I may simply have been using them wrong.
A second option is the wildspeed module: http://www.sai.msu.su/~megera/wiki/wildspeed
(you'll have to build and install it yourself, though).

The second option also works for suffix and middle matching, which may or may not be more than you're looking for.

There are a couple of caveats (such as the size of the index), so read through that page thoroughly.

PostgreSQL: Full Text Search - How to search partial words?

Even with LIKE you will not get 'squirrel' from squire%, because 'squirrel' continues with a second 'r' where the pattern expects an 'e'. To get both Squire and Squirrel you could run the following query:

SELECT title FROM movies WHERE vectors @@ to_tsquery('squire|squirrel');
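
Since the question is about partial words, a prefix match may also be worth trying here. This is only a sketch, and it assumes the vectors column was built with the 'english' configuration, so that both words end up as lexemes starting with 'squir'. Check what to_tsvector produces for your own data before relying on it.

```sql
-- Inspect how the configuration stems the two words (assumes 'english').
SELECT to_tsvector('english', 'squire squirrel');

-- Prefix match: any lexeme that starts with 'squir'.
SELECT title
FROM movies
WHERE vectors @@ to_tsquery('english', 'squir:*');
```

A prefix this short will also pick up any other word that stems to something beginning with 'squir', so it is broader than listing the two words explicitly.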

To differentiate between movies and TV shows you should add a column to your table. However, there are many ways to skin this cat. You could use a subquery so that Postgres first finds the movies matching 'squire' or 'squirrel' and then searches that subset for titles that begin with a '"'. It is also possible to create indexes that LIKE '"%...' searches can use.

Without exploring other indexing possibilities, you could also run these two queries and see which is faster:

SELECT title 
FROM (
   SELECT * 
   FROM movies 
   WHERE vectors @@ to_tsquery('squire|squirrel')
) t
WHERE title ILIKE '"%';

SELECT title 
FROM movies 
WHERE vectors @@ to_tsquery('squire|squirrel') 
  AND title ILIKE '"%';
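
As for the indexes mentioned earlier, the following is a sketch rather than a tuned setup. The index names are made up, and I am assuming vectors is a tsvector column and title is a text column. A GIN index serves the @@ condition, and a B-tree index with text_pattern_ops can serve left-anchored LIKE patterns. According to the PostgreSQL documentation, ILIKE can use such a B-tree index only when the pattern starts with a non-alphabetic character, which '"%' does.

```sql
-- Full text index for the @@ condition (assumes vectors is a tsvector).
CREATE INDEX movies_vectors_idx ON movies USING gin (vectors);

-- B-tree index usable by left-anchored patterns such as '"%'.
CREATE INDEX movies_title_pattern_idx ON movies (title text_pattern_ops);

-- Check which index, if any, the planner picks.
EXPLAIN
SELECT title
FROM movies
WHERE vectors @@ to_tsquery('squire|squirrel')
  AND title ILIKE '"%';
```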
