The ranking may be obtained a ranking value based on the algorithm by combining the "hit tags, scores, and weights of the tags". selectĪfter which, let's look at Refined ranking. Then, when it comes to Full-text search, I recommend that you reference my article Using PostgreSQL to Create an Efficient Search Engine. Now, you'll want to create a tsvector expression index create index idx_tbl_1 on tbl using gin (text_to_tsvector(tags)) īelow is the UDF for fetching the weights of the hit tags postgres=# select substring('bc:1.1,abc:100,c:293','?abc:(+)') Select array_to_tsvector(array_agg(substring(id,'(.+):'))) from unnest(regexp_split_to_array($1, ',')) as t(id) Next, there's a UDF to convert the text to the tag array and the tsvector: create or replace function text_to_tsvector(text) returns tsvector as Now consider the tag weight table: create table tbl_weight (Ĭreate index idx_tbl_weight_1 on tbl_weight (tagname) Consider the following example: national_foot baths:0.99,national_dining:0.1,entertaining_KTV:0.45 If the tsvector is used, it will be our best choice then to use the PostgreSQL full-text search syntax, which, of course, contains things like Contain, Notcontain, and Distance, as we discussed in the previous article. For our user-defined function, an array to store tags or a tsvector index can be used. That is, we'll make our own custom function. Tags text - Multi-value type, lavel 1: score 1, label 2: score 2, …įor the tags field, we will be using a user-defined function (UDF) index. For this example, first look at the shop tag table: create table tbl ( In this article, we will explore how you can refine the ranking of shops from the e-commerce example above through two different methods, using tsvector and using a multi-dimensional array, to customize rankings. In this case, the way to refine the ranking is shown. During the search, some of the shop's tags are hit, and millions of shops may be searched, but finally, the way to rank them should be computed based on the weight because 10,000 of them can be obtained. Coefficients can be dynamically adjusted. Rather, a system based on these tags and the corresponding coefficients is more suitable. It is reasonable to say that the four levels system is insufficient here. In the e-commerce industry, you can store tags for each shop, and each tag can have a corresponding coefficient. So the question remains, is there a way for a more-refined ranking algorithm that can implement more levels in PostgreSQL? Is it possible to customize ranking, that is?Īlso, consider the following real-world example. Four levels is far from meeting all business needs. And, you can also set the correction mask of ranking. For these four levels, you can set the corresponding weight for ranking computing. Also, in that same previous article, we also discussed how tsvector divides the document into four levels: title, author, abstract, and content. We will be discussing these algorithms and other related topics more in this article. In my previous article in this series, Using PostgreSQL to Create an Efficient Search Engine, specifically in the section "Sorting Algorithms", I previously discussed the PostgreSQL ranking algorithm.
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |