Whitespace Analyzer

The whitespace analyzer divides text into searchable terms (tokens) wherever it finds a whitespace character. It leaves all text in its original letter case.

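If you select Refine Your Index, the Atlas UI displays a section titled View text analysis of your selected index configuration within the Index Configurations section. If you expand this section, the Atlas UI displays the index and search tokens that the whitespace analyzer generates for each sample string. When you create or edit an index in the Atlas UI Visual Editor, you can see the tokens that the whitespace analyzer creates for a built-in sample document and query string.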

Important

MongoDB Search doesn't index string fields where analyzer tokens exceed 32766 bytes in size. If you use the keyword analyzer, string fields that exceed 32766 bytes are not indexed.
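As a rough illustration of the limit above, the following sketch flags whitespace-delimited tokens that would be too large. It is not part of any MongoDB API: the helper name `oversizedTokens` is hypothetical, the split on `/\s+/` only approximates the analyzer's behavior, and measuring size as UTF-8 bytes is an assumption.

```javascript
// Hypothetical pre-check, not a MongoDB API: find whitespace-delimited tokens
// whose encoded size exceeds the 32766-byte limit quoted above.
// Assumption: token size is measured in UTF-8 bytes.
const MAX_TOKEN_BYTES = 32766;

function oversizedTokens(text) {
  const encoder = new TextEncoder();
  return text
    .split(/\s+/) // approximate the whitespace analyzer's split
    .filter((token) => token.length > 0)
    .filter((token) => encoder.encode(token).length > MAX_TOKEN_BYTES);
}

// A normal title produces no oversized tokens.
console.log(oversizedTokens("The Lion's Mouth Opens").length);

// A long run of text with no whitespace becomes a single oversized token.
const longRun = "a".repeat(40000);
console.log(oversizedTokens(`prefix ${longRun} suffix`).length);
```

A check like this could run before inserting documents, so that strings without any whitespace (for example, encoded blobs) are caught before they silently fail to be indexed.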

Example

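The following example index definition specifies an index on the title field in the sample_mflix.movies collection using the whitespace analyzer. To follow along with this example, load the sample data on your cluster, and then either use mongosh or navigate to the Create a Search Index page in the Atlas UI by following the steps in the Create a MongoDB Search Index tutorial.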

Then, using the movies collection as your data source, follow the example procedure to create an index from mongosh or from the Atlas UI Visual Editor or JSON Editor.

➤ Use the Select your language drop-down menu to set the interface for the examples on this page.

Click Refine Your Index to configure your index.
In the Index Configurations section, toggle Dynamic Mapping to off.
In the Field Mappings section, click Add Field to open the Add Field Mapping window.
Select title from the Field Name dropdown.
Click Customized Configuration.
Select String from the Data Type dropdown if it isn't already selected.

Expand String Properties and make the following changes:

Index Analyzer	Select `lucene.whitespace` from the dropdown.
Search Analyzer	Select `lucene.whitespace` from the dropdown.
Index Options	Use the default `offsets`.
Store	Use the default `true`.
Ignore Above	Keep the default setting.
Norms	Use the default `include`.

Click Add.
Click Save Changes.
Click Create Search Index.

Replace the default index definition with the following index definition:

{
  "mappings": {
    "fields": {
      "title": {
        "type": "string",
        "analyzer": "lucene.whitespace",
        "searchAnalyzer": "lucene.whitespace"
      }
    }
  }
}

Click Next.
Click Create Search Index.

db.movies.createSearchIndex(
  "default",
  {
    "mappings": {
      "fields": {
        "title": {
          "type": "string",
          "analyzer": "lucene.whitespace",
          "searchAnalyzer": "lucene.whitespace"
        }
      }
    }
  }
)

The following query searches the title field for the term Lion's.

Click the Query button for your index.
Click Edit Query to edit the query.
Click the query bar and select the database and collection.

Replace the default query with the following and click Find:

[
  {
    "$search": {
      "text": {
        "query": "Lion's",
        "path": "title"
      }
    }
  }
]

SCORE: 3.7370920181274414  _id:  "573a13ebf29313caabdcfc8d"
   awards: Object
   cast: Array (4)
   countries: Array (1)
   directors: Array (1)
   fullplot: "A documentary on young actress, Marianna Palka, as she confronts her r…"
   genres: Array (3)
   imdb: Object
   languages: Array (1)
   lastupdated: "2015-09-03 00:37:45.227000000"
   num_mflix_comments: 0
   plot: "A documentary on young actress, Marianna Palka, as she confronts her r…"
   poster: "https://m.media-amazon.com/images/M/MV5BMTgzMTc2OTg2N15BMl5BanBnXkFtZT…"
   released: 2014-01-18T00:00:00.000+00:00
   runtime: 15
   title: "The Lion's Mouth Opens"
   type: "movie"
   writers: Array (1)
   year: 2014

db.movies.aggregate([
  {
    "$search": {
      "text": {
         "query": "Lion's",
         "path": "title"
      }
    }
  },
  {
    "$project": {
      "_id": 0,
      "title": 1
    }
  }
])

[ { title: "The Lion's Mouth Opens" } ]

MongoDB Search returns this document by performing the following operations on the text in the title field using the lucene.whitespace analyzer:

Preserves the original letter case of the text.
Divides the text into tokens wherever it finds a whitespace character.

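The following table shows the tokens (searchable terms) that MongoDB Search creates for the document in the results using the whitespace analyzer and, for comparison, the simple analyzer and the keyword analyzer.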

Title	Whitespace Analyzer Tokens	Simple Analyzer Tokens	Keyword Analyzer Tokens
`The Lion's Mouth Opens`	`The`, `Lion's`, `Mouth`, `Opens`	`the`, `lion`, `s`, `mouth`, `opens`	`The Lion's Mouth Opens`
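The differences in the table can be reproduced with a small sketch. The three functions below are simplified stand-ins written for illustration, not the Lucene implementations: the function names are hypothetical, and the regular expressions only approximate how the real analyzers treat whitespace and non-letter characters.

```javascript
// Simplified stand-ins for three analyzers (assumptions, not Lucene code).

function whitespaceTokens(text) {
  // Split on runs of whitespace and keep the original letter case.
  return text.split(/\s+/).filter((t) => t.length > 0);
}

function simpleTokens(text) {
  // Split on any run of non-letter characters, then lowercase each token.
  return text
    .split(/[^\p{L}]+/u)
    .filter((t) => t.length > 0)
    .map((t) => t.toLowerCase());
}

function keywordTokens(text) {
  // Treat the whole string as a single token.
  return [text];
}

const title = "The Lion's Mouth Opens";
console.log(whitespaceTokens(title).join(" | ")); // The | Lion's | Mouth | Opens
console.log(simpleTokens(title).join(" | "));     // the | lion | s | mouth | opens
console.log(keywordTokens(title).join(" | "));    // The Lion's Mouth Opens
```

Note how the apostrophe survives only in the whitespace and keyword output, because the simple stand-in treats it as a token boundary.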

An index that uses the whitespace analyzer is case-sensitive. Therefore, MongoDB Search can match the query term Lion's to the token Lion's created by the whitespace analyzer.
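To make the case sensitivity concrete, the sketch below models matching as exact token equality after a whitespace split on both the query and the field value. This is a deliberately simplified model: the helper names are hypothetical, scoring is ignored, and the assumption that a match requires at least one identical token is an approximation of how a text query behaves when the same analyzer is used for indexing and searching.

```javascript
// Simplified matching model (an assumption, not MongoDB Search internals).

function splitOnWhitespace(text) {
  // Keep letter case; split on runs of whitespace.
  return text.split(/\s+/).filter((t) => t.length > 0);
}

function matchesAnyToken(query, fieldValue) {
  // A match requires at least one query token identical to an indexed token.
  const indexed = new Set(splitOnWhitespace(fieldValue));
  return splitOnWhitespace(query).some((t) => indexed.has(t));
}

const movieTitle = "The Lion's Mouth Opens";
console.log(matchesAnyToken("Lion's", movieTitle)); // true: identical token
console.log(matchesAnyToken("lion's", movieTitle)); // false: case differs
console.log(matchesAnyToken("Lion", movieTitle));   // false: "Lion's" is one token
```

Under this model, a lowercase query such as lion's would not be expected to match the document, which is why queries against a whitespace-analyzed field need to use the same letter case as the stored text.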
