Note: If your don’t need to store keyword frequency, then go with Marmik Bhatt’s LIKE suggestion.
If you have large amount of data and you want to do a keyword search only (i.e. you are not going to be searching for phrases or use concepts like “near”), then you can simply create a keyword table:
CREATE TABLE address
id INT(10) PRIMARY KEY,
/* ... */
CREATE TABLE keyword
PRIMARY KEY(word, article_id)
You then scan through the text that you are “indexing” and count each word that you find there.
If you want to do several keywords:
SELECT address.*, SUM(frequency) frequency_sum
INNER JOIN keyword ON keyword.address_id = address.id
WHERE keyword.word IN ('keyword1', 'keyword2', /*...*/)
GROUP BY address.id;
Here i’ve done a frequency sum, which can be a dirty way to compare the usefulness of the result, when many are given.
Things to think about:
- Do you want to insert all keywords into the database, or only those, that have a frequency higher than a specific value? If you insert all your table may become huge, if you insert only higher frequency ones, then you will not find the only article that mentions a specific word, but does so only once.
- Do you want to insert all the available keywords for the specific article or only “top ones”? In this case the danger is that frequent words that add nothing to the meaning will begin pushing others out. Consider the word “However”, it may be in your article many more times than “mysql”, buy it is the latter that defines the article, not the former.
- Do you want to exclude words shorter then a specific length of characters?
- Do you want to exclude known “meaningless” words?