So I have a database of products. These often may contain things people are looking for, such as: $100 (price) 3.5" (3.5" hard drive) 6' (6 foot cable) And it goes on with many other potential cases. My current implementation is a fulltext which is very problematic since all 3 examples contain stopwords breaking up your search. If I did a LIKE-type search, it would be more inefficient and difficult to sort the results. For the most parts, I am dealing with these types of searches rather than english language text searches. Any ideas how to better approach this problem with searching the data?
Its difficult especially when you start getting a lot of rows... this solution isnt suitable for everyone but... I created my own utility to split the strings into seperate words on the spaces (I also remove any non-AZ# characters during the process) - and then insert these into their own table of keywords. this then syncronises every hour with my products table where the data has been updated or added.. My table structure is something like this CREATE TABLE searchkeywords (productid int, keyword varchar(50)) Code (markup): When i loop aorund all my rows to populate the table I join the fields together that I want to be searchable. When someone searches in my app - I do the same split (on the spaces), removing the non A-Z chars, then match each word to my new table of keywords. This solution is pretty fast with correct indexes until you get upwards of 250 million rows. Unfortunately this is not a native mysql function but its the only way I have found to provision accurate searching on stop words or symbols.. Taking your example of 3.5" hard drive this becomes insert into searchkeywords (productid ,keyword) values (1,'35') insert into searchkeywords (productid ,keyword) values (1,'hard') insert into searchkeywords (productid ,keyword) values (1,'drive') Code (markup): then to find it: when someone searchs 3.5" hard drive you turn this search string into '35','hard','drive' select p.productid from products p inner join searchkeywords k1 on k1.productid = p.productid AND k1.keyword = '35' inner join searchkeywords k2 on k2.productid = p.productid AND k2.keyword = 'hard' inner join searchkeywords k3 on k3.productid = p.productid AND k3.keyword = 'drive' group by p.productid Code (markup): The downsides are its slow when you get lots of rows, and does not automatically syncronise with your data...
The Google search engine define some search rules to make search any content easier to the user if you add some symbol on your search basically Google ignore these symbols if it is necessary to find these symbols in your search result for this you can use "content ".