Full Text Search (Part 4)

I want to return to yesterday’s topic, and look at further optimizing the Full-Text Search (FTS) query. Although we’ve addressed the gross inefficiencies caused by suboptimal query plans, there is still some fat to trim; it turns out that the sorting of results can be a significant factor in query performance.

NSSortDescriptor

All the queries we saw yesterday included the clause:

ORDER BY t0.ZTEXT COLLATE NSCollateLocaleSensitive

Two things to note here:

  • The “NSCollateLocaleSensitive” string indicates a Core Data defined collation function; this isn’t built into SQLite
  • The queries are being sorted on the text field of Text objects; these fields are large (between 727 and 4154 bytes, average of 2222) and must be pulled into memory to be sorted and compared

It seems unhelpful to sort such large objects, particularly with a custom sort routine. Suppose we added a sortPrefix to Text objects? Then we could change our NSSortDescriptor creation code from:

[[NSSortDescriptor alloc] initWithKey:@"text" ascending:YES selector:@selector(localizedCompare:)]

to:

[[NSSortDescriptor alloc] initWithKey:@"sortPrefix" ascending:YES selector:@selector(localizedCompare:)]

Timing

To maximize the visibility of this change, we’ll have to pick a keyword that returns a lot of hits. (If sorting is a time sink, it will show up most dramatically when there are a lot of rows to sort.) For my test corpus, the word “human” returns 64 blocks of text (as opposed to “destiny”, which returns only 3). So, let’s get timing.

Over 4 queries, here are the average total execution fetch times (on iPhone 3G) for queries that sort on text and queries that sort on sortPrefix:

Sort on text Sort on sortPrefix
ID retrieval 0.6513s 0.2858s
Data retrieval 0.1043s 0.0720s
Total 0.7556s 0.3578s

As you can see, changing the sort clause cuts the average execution time by about 53%, which is more than worthwhile.

Sort Prefix

It only remains to discuss the calculation of the sort prefix itself. My assumption is that the sort order of the leading 20 characters of a text block should be dispositive of the sort order of the blocks as a whole. Therefore, I overrode the Text class’ setText method s.t. it would automatically create a proper sortPrefix:

- (void)setText:(NSString*)newText
{
	[self willChangeValueForKey:@"text"];
	[self setPrimitiveText:newText];
	[self didChangeValueForKey:@"text"];
	
	NSUInteger l = [newText length];
	self.sortPrefix = [newText substringToIndex:MIN(l, 20)];
}
Share and Enjoy:
  • Twitter
  • Facebook
  • Digg
  • Reddit
  • HackerNews
  • del.icio.us
  • Google Bookmarks
  • Slashdot
This entry was posted in iPhone. Bookmark the permalink.

Comments are closed.