Mroonga Overview



Once Mroonga has been installed (see About Mroonga), its basic usage is similar to that of a regular fulltext index.



For example:


CREATE TABLE ft_mroonga(copy TEXT,FULLTEXT(copy)) ENGINE=Mroonga;

INSERT INTO ft_mroonga(copy) VALUES ('Once upon a time'),
    ('There was a wicked witch'), ('Who ate everybody up');

SELECT * FROM ft_mroonga WHERE MATCH(copy) AGAINST('wicked');
+--------------------------+
| copy                     |
+--------------------------+
| There was a wicked witch |
+--------------------------+
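
If the engine is not loaded, some server configurations silently create the table with the default storage engine instead of raising an error. The following is a minimal sketch for checking that Mroonga is available and that the table actually uses it; it assumes the table lives in the currently selected database and uses only the standard information_schema views.

```sql
-- Mroonga should be listed with SUPPORT = 'YES' once the plugin is loaded;
-- an empty result means the engine is not installed.
SELECT ENGINE, SUPPORT
FROM information_schema.ENGINES
WHERE ENGINE = 'Mroonga';

-- Confirm which engine the example table was really created with.
SELECT TABLE_NAME, ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'ft_mroonga';
```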

Score

Mroonga can also order results by weighting (score). For example, first add another record:

INSERT INTO ft_mroonga(copy) VALUES ('She met a wicked, wicked witch');

Records can then be returned in order of weighting. The newly added record contains two occurrences of the word 'wicked', so it receives a higher score:

SELECT *, MATCH(copy) AGAINST('wicked') AS score FROM ft_mroonga 
   WHERE MATCH(copy) AGAINST('wicked') ORDER BY score DESC;
+--------------------------------+--------+
| copy                           | score  |
+--------------------------------+--------+
| She met a wicked, wicked witch | 299594 |
| There was a wicked witch       | 149797 |
+--------------------------------+--------+
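
Because the score is an ordinary column alias, it can be combined with the usual clauses. As a small sketch (not part of the original example), the query below keeps only the best-scoring match by adding LIMIT:

```sql
-- Return only the single highest-scoring row for the term 'wicked'.
SELECT copy, MATCH(copy) AGAINST('wicked') AS score
FROM ft_mroonga
WHERE MATCH(copy) AGAINST('wicked')
ORDER BY score DESC
LIMIT 1;
```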

Parser

Mroonga lets you choose a different parser for a fulltext index, either by naming the parser in a COMMENT on the index in the CREATE TABLE statement or, in older versions, by changing the value of the mroonga_default_parser system variable.

For example:

CREATE TABLE ft_mroonga(copy TEXT,FULLTEXT(copy) COMMENT 'parser "TokenDelimitNull"') 
  ENGINE=Mroonga;

or

SET GLOBAL mroonga_default_parser = 'TokenBigramSplitSymbol';
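
To see which parser is in effect, the settings can be inspected after the fact. This is a hedged sketch: the LIKE pattern is deliberately broad because the exact variable name may differ between Mroonga versions, and the index comment appears in the output only if one was given when the table was created.

```sql
-- Show the server-wide default parser setting (name may vary by version).
SHOW GLOBAL VARIABLES LIKE 'mroonga_default%';

-- Show the table definition, including any parser COMMENT on the index.
SHOW CREATE TABLE ft_mroonga;
```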

The following parser settings are available:

| Setting | Description |
|---|---|
| off | No tokenizing is performed. |
| TokenBigram | Default value. Tokenises in bigrams, but continuous alphabetical characters, numbers or symbols are treated as a token. |
| TokenBigramIgnoreBlank | Same as TokenBigram except that white spaces are ignored. |
| TokenBigramIgnoreBlankSplitSymbol | Same as TokenBigramSplitSymbol except that white spaces are ignored. |
| TokenBigramIgnoreBlankSplitSymbolAlpha | Same as TokenBigramSplitSymbolAlpha except that white spaces are ignored. |
| TokenBigramIgnoreBlankSplitSymbolAlphaDigit | Same as TokenBigramSplitSymbolAlphaDigit except that white spaces are ignored. |
| TokenBigramSplitSymbol | Same as TokenBigram except that continuous symbols are not treated as a token, but tokenised in bigrams. |
| TokenBigramSplitSymbolAlpha | Same as TokenBigram except that continuous alphabetical characters are not treated as a token, but tokenised in bigrams. |
| TokenDelimit | Tokenises by splitting on white spaces. |
| TokenDelimitNull | Tokenises by splitting on null characters (\0). |
| TokenMecab | Tokenises using MeCab. Requires Groonga to be built with MeCab support. |
| TokenTrigram | Tokenises in trigrams, but continuous alphabetical characters, numbers or symbols are treated as a token. |
| TokenUnigram | Tokenises in unigrams, but continuous alphabetical characters, numbers or symbols are treated as a token. |
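
As an illustration of a non-bigram setting, TokenDelimit can suit columns that hold space-separated keywords. The sketch below is an assumption-laden example rather than part of the original page: the table ft_tags, its column, and its rows are hypothetical, and no output is shown because it has not been verified against a server.

```sql
-- Hypothetical table: each row stores a space-separated list of tags.
CREATE TABLE ft_tags(tags TEXT, FULLTEXT(tags) COMMENT 'parser "TokenDelimit"')
  ENGINE=Mroonga;

INSERT INTO ft_tags(tags) VALUES ('mariadb groonga'),
    ('mysql fulltext search');

-- With TokenDelimit, each whitespace-separated word is indexed as one token,
-- so the search term is compared against whole tags.
SELECT * FROM ft_tags WHERE MATCH(tags) AGAINST('groonga');
```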

Examples

TokenBigram vs TokenBigramSplitSymbol

TokenBigram fails to match part of a run of symbols, whereas TokenBigramSplitSymbol matches it, because TokenBigramSplitSymbol does not treat continuous symbols as a single token.

DROP TABLE ft_mroonga;
CREATE TABLE ft_mroonga(copy TEXT,FULLTEXT(copy) COMMENT 'parser "TokenBigram"') 
  ENGINE=Mroonga;
INSERT INTO ft_mroonga(copy) VALUES ('Once upon a time'),   
  ('There was a wicked witch'), 
  ('Who ate everybody up'), 
  ('She met a wicked, wicked witch'), 
  ('A really wicked, wicked witch!!?!');
SELECT * FROM ft_mroonga WHERE MATCH(copy) AGAINST('!?');
Empty set (0.00 sec)

DROP TABLE ft_mroonga;
CREATE TABLE ft_mroonga(copy TEXT,FULLTEXT(copy) COMMENT 'parser "TokenBigramSplitSymbol"') 
  ENGINE=Mroonga;
INSERT INTO ft_mroonga(copy) VALUES ('Once upon a time'),   
  ('There was a wicked witch'), 
  ('Who ate everybody up'), 
  ('She met a wicked, wicked witch'), 
  ('A really wicked, wicked witch!!?!');
SELECT * FROM ft_mroonga WHERE MATCH(copy) AGAINST('!?');
+-----------------------------------+
| copy                              |
+-----------------------------------+
| A really wicked, wicked witch!!?! |
+-----------------------------------+

TokenBigram vs TokenBigramSplitSymbolAlpha

TokenBigram treats continuous alphabetical characters as a single token, so it fails to match part of a word such as 'ick'. TokenBigramSplitSymbolAlpha tokenises alphabetical characters in bigrams, so the partial word matches.

DROP TABLE ft_mroonga;
CREATE TABLE ft_mroonga(copy TEXT,FULLTEXT(copy) COMMENT 'parser "TokenBigram"') 
  ENGINE=Mroonga;
INSERT INTO ft_mroonga(copy) VALUES ('Once upon a time'),   
  ('There was a wicked witch'), 
  ('Who ate everybody up'), 
  ('She met a wicked, wicked witch'), 
  ('A really wicked, wicked witch!!?!');
SELECT * FROM ft_mroonga WHERE MATCH(copy) AGAINST('ick');
Empty set (0.00 sec)

DROP TABLE ft_mroonga;
CREATE TABLE ft_mroonga(copy TEXT,FULLTEXT(copy) COMMENT 'parser "TokenBigramSplitSymbolAlpha"') 
  ENGINE=Mroonga;
INSERT INTO ft_mroonga(copy) VALUES ('Once upon a time'),   
  ('There was a wicked witch'), 
  ('Who ate everybody up'), 
  ('She met a wicked, wicked witch'), 
  ('A really wicked, wicked witch!!?!');
SELECT * FROM ft_mroonga WHERE MATCH(copy) AGAINST('ick');
+-----------------------------------+
| copy                              |
+-----------------------------------+
| There was a wicked witch          |
| She met a wicked, wicked witch    |
| A really wicked, wicked witch!!?! |
+-----------------------------------+

© 2019 MariaDB
Licensed under the Creative Commons Attribution 3.0 Unported License and the GNU Free Documentation License.
https://mariadb.com/kb/en/mroonga-overview/