Full Text Search Example in Android

Full text search example in Android

Most Basic Answer

I'm using the plain sql below so that everything is as clear and readable as possible. In your project you can use the Android convenience methods. The db object used below is an instance of SQLiteDatabase.

Create FTS Table

db.execSQL("CREATE VIRTUAL TABLE fts_table USING fts3 ( col_1, col_2, text_column )");

This could go in the onCreate() method of your extended SQLiteOpenHelper class.

Populate FTS Table

db.execSQL("INSERT INTO fts_table VALUES ('3', 'apple', 'Hello. How are you?')");
db.execSQL("INSERT INTO fts_table VALUES ('24', 'car', 'Fine. Thank you.')");
db.execSQL("INSERT INTO fts_table VALUES ('13', 'book', 'This is an example.')");

It would be better to use SQLiteDatabase#insert or prepared statements than execSQL.

Query FTS Table

String[] selectionArgs = { searchString };
Cursor cursor = db.rawQuery("SELECT * FROM fts_table WHERE fts_table MATCH ?", selectionArgs);

You could also use the SQLiteDatabase#query method. Note the MATCH keyword.

Fuller Answer

The virtual FTS table above has a problem with it. Every column is indexed, but this is a waste of space and resources if some columns don't need to be indexed. The only column that needs an FTS index is probably the text_column.

To solve this problem we will use a combination of a regular table and a virtual FTS table. The FTS table will contain the index but none of the actual data from the regular table. Instead it will have a link to the content of the regular table. This is called an external content table.

Sample Image

Create the Tables

db.execSQL("CREATE TABLE example_table (_id INTEGER PRIMARY KEY, col_1 INTEGER, col_2 TEXT, text_column TEXT)");
db.execSQL("CREATE VIRTUAL TABLE fts_example_table USING fts4 (content='example_table', text_column)");

Notice that we have to use FTS4 to do this rather than FTS3. FTS4 is not supported in Android before API version 11. You could either (1) only provide search functionality for API >= 11, or (2) use an FTS3 table (but this means the database will be larger because the full text column exists in both databases).

Populate the Tables

db.execSQL("INSERT INTO example_table (col_1, col_2, text_column) VALUES ('3', 'apple', 'Hello. How are you?')");
db.execSQL("INSERT INTO example_table (col_1, col_2, text_column) VALUES ('24', 'car', 'Fine. Thank you.')");
db.execSQL("INSERT INTO example_table (col_1, col_2, text_column) VALUES ('13', 'book', 'This is an example.')");

(Again, there are better ways in do inserts than with execSQL. I am just using it for its readability.)

If you tried to do an FTS query now on fts_example_table you would get no results. The reason is that changing one table does not automatically change the other table. You have to manually update the FTS table:

db.execSQL("INSERT INTO fts_example_table (docid, text_column) SELECT _id, text_column FROM example_table");

(The docid is like the rowid for a regular table.) You have to make sure to update the FTS table (so that it can update the index) every time you make a change (INSERT, DELETE, UPDATE) to the external content table. This can get cumbersome. If you are only making a prepopulated database, you can do

db.execSQL("INSERT INTO fts_example_table(fts_example_table) VALUES('rebuild')");

which will rebuild the whole table. This can be slow, though, so it is not something you want to do after every little change. You would do it after finishing all the inserts on the external content table. If you do need to keep the databases in sync automatically, you can use triggers. Go here and scroll down a little to find directions.

Query the Databases

String[] selectionArgs = { searchString };
Cursor cursor = db.rawQuery("SELECT * FROM fts_example_table WHERE fts_example_table MATCH ?", selectionArgs);

This is the same as before, except this time you only have access to text_column (and docid). What if you need to get data from other columns in the external content table? Since the docid of the FTS table matches the rowid (and in this case _id) of the external content table, you can use a join. (Thanks to this answer for help with that.)

String sql = "SELECT * FROM example_table WHERE _id IN " +
"(SELECT docid FROM fts_example_table WHERE fts_example_table MATCH ?)";
String[] selectionArgs = { searchString };
Cursor cursor = db.rawQuery(sql, selectionArgs);

Further Reading

Go through these documents carefully to see other ways of using FTS virtual tables:

  • SQLite FTS3 and FTS4 Extensions (SQLite docs)
  • Storing and Searching for Data (Android docs)

Additional Notes

  • Set operators (AND, OR, NOT) in SQLite FTS queries have Standard Query Syntax and Enhanced Query Syntax. Unfortunately, Android apparently does not support the Enhanced Query Syntax (see here, here, here, and here). That means mixing AND and OR becomes difficult (requiring the use of UNION or checking PRAGMA compile_options it seems). Very unfortunate. Please add a comment if there is an update in this area.

Android Full Text Search

The easiest way to provide Full Text Search is to use FTS3 in SQLite

Is there anyway to implement Full Text Search (FTS) in SQlite from Android platform?

Full text search in SQLite is supported in Android. You can see an example of it being used in my application here:

http://github.com/bpellin/keepassdroid/blob/master/src/com/keepassdroid/search/SearchDbHelper.java

what is the best way to full text search in a large Array of String in android?

I don't think you should hold 30.000 string of +/- 10 lines in memory. As CommonsWare told before this would put the heap under a lot of (unnecessary) preassure.

A few possabilities come to mind to store the strings and make them accessable for string comparison. You could store the string in a database and use full text search options. But you could also save tham in a file (for example a JSON file) and use a buffered stream in order to compare the strings.

Android - Full text search using existing database

The CREATE VIRTUAL TABLE and INSERTS can be applied to the pre-loaded database the same as a normal table. The database will then need to be re-packaged.

Full text search on a mobile device?

Just recently I had the same issue. Here is what I did:

I created a class to hold just an id and the text for each object (in my case I called it a sku (item number) and a description). This creates a smaller object that uses less memory since it is only used for searching. I'll still grab the full-blown objects from the database after I find matches.

public class SmallItem
{
private int _sku;
public int Sku
{
get { return _sku; }
set { _sku = value; }
}

// Size of max description size + 1 for null terminator.
private char[] _description = new char[36];
public char[] Description
{
get { return _description; }
set { _description = value; }
}

public SmallItem()
{
}
}

After this class is created, you can then create an array (I actually used a List in my case) of these objects and use it for searching throughout your application. The initialization of this list takes a bit of time, but you only need to worry about this at start up. Basically just run a query on your database and grab the data you need to create this list.

Once you have a list, you can quickly go through it searching for any words you want. Since it's a contains, it must also find words within words (e.g. drill would return drill, drillbit, drills etc.). To do this, we wrote a home-grown, unmanaged c# contains function. It takes in a string array of words (so you can search for more than one word... we use it for "AND" searches... the description must contain all words passed in... "OR" is not currently supported in this example). As it searches through the list of words it builds a list of IDs, which are then passed back to the calling function. Once you have a list of IDs, you can easily run a fast query in your database to return the full-blown objects based on a fast indexed ID number. I should mention that we also limit the maximum number of results returned. This could be taken out. It's just handy if someone types in something like "e" as their search term. That's going to return a lot of results.

Here's the example of custom Contains function:

public static int[] Contains(string[] descriptionTerms, int maxResults, List<SmallItem> itemList)
{
// Don't allow more than the maximum allowable results constant.
int[] matchingSkus = new int[maxResults];

// Indexes and counters.
int matchNumber = 0;
int currentWord = 0;
int totalWords = descriptionTerms.Count() - 1; // - 1 because it will be used with 0 based array indexes

bool matchedWord;

try
{
/* Character array of character arrays. Each array is a word we want to match.
* We need the + 1 because totalWords had - 1 (We are setting a size/length here,
* so it is not 0 based... we used - 1 on totalWords because it is used for 0
* based index referencing.)
* */
char[][] allWordsToMatch = new char[totalWords + 1][];

// Character array to hold the current word to match.
char[] wordToMatch = new char[36]; // Max allowable word size + null terminator... I just picked 36 to be consistent with max description size.

// Loop through the original string array or words to match and create the character arrays.
for (currentWord = 0; currentWord <= totalWords; currentWord++)
{
char[] desc = new char[descriptionTerms[currentWord].Length + 1];
Array.Copy(descriptionTerms[currentWord].ToUpper().ToCharArray(), desc, descriptionTerms[currentWord].Length);
allWordsToMatch[currentWord] = desc;
}

// Offsets for description and filter(word to match) pointers.
int descriptionOffset = 0, filterOffset = 0;

// Loop through the list of items trying to find matching words.
foreach (SmallItem i in itemList)
{
// If we have reached our maximum allowable matches, we should stop searching and just return the results.
if (matchNumber == maxResults)
break;

// Loop through the "words to match" filter list.
for (currentWord = 0; currentWord <= totalWords; currentWord++)
{
// Reset our match flag and current word to match.
matchedWord = false;
wordToMatch = allWordsToMatch[currentWord];

// Delving into unmanaged code for SCREAMING performance ;)
unsafe
{
// Pointer to the description of the current item on the list (starting at first char).
fixed (char* pdesc = &i.Description[0])
{
// Pointer to the current word we are trying to match (starting at first char).
fixed (char* pfilter = &wordToMatch[0])
{
// Reset the description offset.
descriptionOffset = 0;

// Continue our search on the current word until we hit a null terminator for the char array.
while (*(pdesc + descriptionOffset) != '\0')
{
// We've matched the first character of the word we're trying to match.
if (*(pdesc + descriptionOffset) == *pfilter)
{
// Reset the filter offset.
filterOffset = 0;

/* Keep moving the offsets together while we have consecutive character matches. Once we hit a non-match
* or a null terminator, we need to jump out of this loop.
* */
while (*(pfilter + filterOffset) != '\0' && *(pfilter + filterOffset) == *(pdesc + descriptionOffset))
{
// Increase the offsets together to the next character.
++filterOffset;
++descriptionOffset;
}

// We hit matches all the way to the null terminator. The entire word was a match.
if (*(pfilter + filterOffset) == '\0')
{
// If our current word matched is the last word on the match list, we have matched all words.
if (currentWord == totalWords)
{
// Add the sku as a match.
matchingSkus[matchNumber] = i.Sku.ToString();
matchNumber++;

/* Break out of this item description. We have matched all needed words and can move to
* the next item.
* */
break;
}

/* We've matched a word, but still have more words left in our list of words to match.
* Set our match flag to true, which will mean we continue continue to search for the
* next word on the list.
* */
matchedWord = true;
}
}

// No match on the current character. Move to next one.
descriptionOffset++;
}

/* The current word had no match, so no sense in looking for the rest of the words. Break to the
* next item description.
* */
if (!matchedWord)
break;
}
}
}
}
};

// We have our list of matching skus. We'll resize the array and pass it back.
Array.Resize(ref matchingSkus, matchNumber);
return matchingSkus;
}
catch (Exception ex)
{
// Handle the exception
}
}

Once you have the list of matching skus, you can iterate through the array and build a query command that only returns the matching skus.

For an idea of performance, here's what we have found (doing the following steps):

  • Search ~171,000 items
  • Create list of all matching items
  • Query the database, returning only the matching items
  • Build full-blown items (similar to SmallItem class, but a lot more fields)
  • Populate a datagrid with the full-blow item objects.

On our mobile units, the entire process takes 2-4 seconds (takes 2 if we hit our match limit before we have searched all items... takes 4 seconds if we have to scan every item).

I've also tried doing this without unmanaged code and using String.IndexOf (and tried String.Contains... had same performance as IndexOf as it should). That way was much slower... about 25 seconds.

I've also tried using a StreamReader and a file containing lines of [Sku Number]|[Description]. The code was similar to the unmanaged code example. This way took about 15 seconds for an entire scan. Not too bad for speed, but not great. The file and StreamReader method has one advantage over the way I showed you though. The file can be created ahead of time. The way I showed you requires the memory and the initial time to load the List when the application starts up. For our 171,000 items, this takes about 2 minutes. If you can afford to wait for that initial load each time the app starts up (which can be done on a separate thread of course), then searching this way is the fastest way (that I've found at least).

Hope that helps.

PS - Thanks to Dolch for helping with some of the unmanaged code.

full text search android

Just need to add "%" before name .

Cursor cursor = DataBaseUtil.dataBaseAccess().query(
_PROJECT_TABLE,
_FIELD_NAME,
"(" + _NAME + ")" + " LIKE " + "(" + "\"" + "%"
+ name + "%" + "\"" + ")", null, null, null, _MODIFIED_DATE);

Or

Try this.

return db.rawQuery("select * from your_table_name where column_name like ?", new String[] {"%"+name+"%"});

It return Cursor.



Related Topics



Leave a reply



Submit