Use MongoDB Search Instead of Regex Queries

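If your queries rely on regex matching, you can improve their performance and efficiency by creating a MongoDB Search index and running the queries with the $search aggregation pipeline stage. $regex is inefficient because it can't always use an index, whereas a MongoDB Search index significantly improves query performance and offers more options for customizing query parameters.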

This page describes some common MongoDB Search index and query configurations for $regex use cases.

Examples

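The examples use the sample_mflix.movies namespace. To run the sample queries, add this collection to your cluster or use the pre-configured snapshot in the MongoDB Search Playground. The sample queries show how to use $search instead of $regex for the following use cases: prefix queries, substring "contains" queries, and suffix queries.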

Prefix Queries

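If your application frequently queries string values that begin with a certain character or set of prefixes, you can use $regex with the ^ option, which anchors the match to the start of the string value, and the i option, which makes the match case-insensitive.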
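The effect of the ^ anchor and the i option can be previewed outside the database. The following JavaScript is a minimal sketch that uses the built-in RegExp type as a stand-in for the server's regex engine. The sampleTitles array and the filterTitles helper are hypothetical and exist only for illustration.

```javascript
// Hypothetical sample titles; "backlight" is lowercase on purpose
// so that the effect of the i option is visible.
const sampleTitles = ["Back to the Future", "Backdraft", "The Way Back", "backlight"];

// Returns the titles that the pattern matches.
function filterTitles(titles, pattern) {
  return titles.filter((title) => pattern.test(title));
}

// Anchored and case-insensitive: only titles that start with "back" in any case.
console.log(filterTitles(sampleTitles, /^back/i).join(" | "));
// Back to the Future | Backdraft | backlight

// Anchored and case-sensitive: the capitalized titles no longer match.
console.log(filterTitles(sampleTitles, /^back/).join(" | "));
// backlight

// Unanchored: "back" can appear anywhere in the title.
console.log(filterTitles(sampleTitles, /back/i).join(" | "));
// Back to the Future | Backdraft | The Way Back | backlight
```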

Instead, we recommend a MongoDB Search query that uses the $search aggregation pipeline stage. The following query searches for movie titles that begin with the prefix back.

➤ Try it in the MongoDB Search Playground.

$regex Query

db.movies.find( { title: { $regex: /^back/i } }, { title: 1, _id: 0 } )  // Query 1
db.movies.find( { title: { $regex: "^back", $options: "i" } }, { title: 1, _id: 0 } )  // Query 2

[
  { title: 'Back to the Future' },
  { title: 'Back to School' },
  { title: 'Back to the Future Part II' },
  { title: 'Back to the USSR - takaisin Ryssiin' },
  { title: 'Back to the Future Part III' },
  { title: 'Backdraft' },
  { title: 'Backbeat' },
  { title: 'Backstage' },
  { title: 'Backdoor' },
  { title: 'Backstage' },
  { title: 'Back Soon' },
  { title: 'Backlight' },
  { title: 'Back to Stay' },
  { title: 'Back Issues: The Hustler Magazine Story' }
]

$search Query

db.movies.aggregate([
  {
    "$search": {
      "index": "default",
      "text": {
        "query": "back",
        "path": "title",
        "matchCriteria": "all"
      }
    }
  },
  {
    "$project": {
      "_id": 0,
      "title": 1,
      "score": { $meta: "searchScore" }
    }
  }
])

[
  { title: 'Backdraft', score: 3.8287878036499023 },
  { title: 'Backbeat', score: 3.8287878036499023 },
  { title: 'Backstage', score: 3.8287878036499023 },
  { title: 'Backdoor', score: 3.8287878036499023 },
  { title: 'Backstage', score: 3.8287878036499023 },
  { title: 'The Backwoods', score: 3.8287878036499023 },
  { title: 'The Backwoods', score: 3.8287878036499023 },
  { title: 'The Way Back', score: 3.8287878036499023 },
  { title: '3 Backyards', score: 3.8287878036499023 },
  { title: 'Backlight', score: 3.8287878036499023 },
  { title: 'The Way Way Back', score: 3.8287878036499023 },
  { title: 'Back to the Future', score: 3.455096483230591 },
  { title: 'Back to School', score: 3.455096483230591 },
  { title: 'The Cat Came Back', score: 3.455096483230591 },
  { title: "Jack's Back", score: 3.455096483230591 },
  { title: 'The Dark Backward', score: 3.455096483230591 },
  { title: 'T-Rex: Back to the Cretaceous', score: 3.455096483230591 },
  { title: 'The Dark Backward', score: 3.455096483230591 },
  { title: 'No Turning Back', score: 3.455096483230591 },
  { title: "The Devil's Backbone", score: 3.455096483230591 }
]
Type "it" for more

To run this $search query, create a MongoDB Search index similar to the following.

{
  "mappings": {
    "dynamic": false,
    "fields": {
      "title": [
        {
          "type": "string",
          "analyzer": "autocomplete-search",
          "searchAnalyzer": "lucene.standard"
        }
      ]
    }
  },
  "analyzers": [
    {
      "name": "autocomplete-search",
      "tokenizer": {
        "type": "standard"
      },
      "tokenFilters": [
        {
          "type": "lowercase"
        },
        {
          "type": "edgeGram",
          "minGram": 4,
          "maxGram": 10
        }
      ]
    }
  ]
}

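This index definition indexes the title field in the movies collection as the string type. It sets the custom analyzer autocomplete-search as the analyzer for indexed values and lucene.standard as the searchAnalyzer for queries. Because the custom analyzer splits each title into words, the query matches any word that begins with back, not only titles that begin with it. The custom analyzer applies the following:

standard tokenizer to split text into words.
lowercase token filter to convert all characters to lowercase, which supports case-insensitive queries.
edgeGram token filter to generate tokens 4 to 10 characters long.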

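Note

This custom analyzer supports only words of up to 10 characters. If your words and queries must exceed 10 characters, increase the maxGram value. We recommend that you don't set maxGram higher than 15, because larger values increase the index size and can affect performance and availability.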
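To see why this index supports prefix matching, it can help to model the analyzer chain outside the database. The following JavaScript is a rough sketch, not MongoDB's implementation. The tokenize, edgeGrams, indexTokens, and matchesPrefixQuery helpers are hypothetical names, the standard tokenizer is only approximated by splitting on non-alphanumeric characters, and words shorter than minGram are assumed to produce no tokens.

```javascript
// Toy model of the "autocomplete-search" custom analyzer (illustration only).

// Assumption: approximates the standard tokenizer by splitting on
// anything that is not a letter or a digit.
function tokenize(text) {
  return text.split(/[^\p{L}\p{N}]+/u).filter((word) => word.length > 0);
}

// Emits the prefixes of `token` that are minGram to maxGram characters long.
// Assumption: tokens shorter than minGram produce no grams.
function edgeGrams(token, minGram, maxGram) {
  const grams = [];
  const longest = Math.min(maxGram, token.length);
  for (let length = minGram; length <= longest; length++) {
    grams.push(token.slice(0, length));
  }
  return grams;
}

// Index-time chain: tokenize, lowercase, then edgeGram with 4 and 10.
function indexTokens(title) {
  return tokenize(title)
    .map((word) => word.toLowerCase())
    .flatMap((word) => edgeGrams(word, 4, 10));
}

// Query-time model: the lowercased query must equal one of the stored grams.
function matchesPrefixQuery(title, query) {
  return indexTokens(title).includes(query.toLowerCase());
}

console.log(indexTokens("Backdraft").join(", "));
// back, backd, backdr, backdra, backdraf, backdraft
console.log(matchesPrefixQuery("The Way Back", "back"));
// true
console.log(matchesPrefixQuery("Backstabbing", "backstabbing"));
// false
```

Under these assumptions, a query for back matches any title that contains a word starting with back, and the 12-character query backstabbing finds nothing because the longest stored gram is 10 characters.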

Substring "Contains" Queries

If your application frequently queries for a string that appears anywhere in a field, you can run a $regex query, which checks every document and returns all matches in no particular order.

Instead, we recommend a MongoDB Search query that uses the $search aggregation pipeline stage. The following query searches for movie titles that contain the term park in the title field.

➤ Try it in the MongoDB Search Playground.

$regex Query

db.movies.find({ title: { $regex: "park", $options: "i" } }, { title: 1, _id: 0 })

[
  { title: 'Barefoot in the Park' },
  { title: 'The Panic in Needle Park' },
  { title: 'Gorky Park' },
  { title: 'The Park Is Mine' },
  { title: 'Jurassic Park' },
  { title: 'Mrs. Parker and the Vicious Circle' },
  { title: 'The Lost World: Jurassic Park' },
  { title: 'Dog Park' },
  { title: 'South Park: Bigger Longer & Uncut' },
  { title: 'Jurassic Park III' },
  { title: 'Mansfield Park' },
  { title: 'Jurassic Park III' },
  { title: 'Gosford Park' },
  { title: 'The Rosa Parks Story' },
  { title: 'The Delicate Art of Parking' },
  { title: 'Wicker Park' },
  { title: 'Chestnut: Hero of Central Park' },
  { title: 'Trailer Park Boys: The Movie' },
  { title: 'Ellie Parker' },
  { title: 'Paranoid Park' }
]

$search Query

db.movies.aggregate([
  {
    "$search": {
      "index": "default",
      "wildcard": {
        "query": "park*",
        "path": "title",
        "allowAnalyzedField": true
      }
    }
  },
  {
    "$project": {
      "_id": 0,
      "title": 1,
      "score": { "$meta": "searchScore" }
    }
  }
])

[
  { title: 'Barefoot in the Park', score: 1 },
  { title: 'The Panic in Needle Park', score: 1 },
  { title: 'Gorky Park', score: 1 },
  { title: 'The Park Is Mine', score: 1 },
  { title: 'Jurassic Park', score: 1 },
  { title: 'Mrs. Parker and the Vicious Circle', score: 1 },
  { title: 'The Lost World: Jurassic Park', score: 1 },
  { title: 'Dog Park', score: 1 },
  { title: 'South Park: Bigger Longer & Uncut', score: 1 },
  { title: 'Jurassic Park III', score: 1 },
  { title: 'Mansfield Park', score: 1 },
  { title: 'Jurassic Park III', score: 1 },
  { title: 'Gosford Park', score: 1 },
  { title: 'The Rosa Parks Story', score: 1 },
  { title: 'Wicker Park', score: 1 },
  { title: 'The Delicate Art of Parking', score: 1 },
  { title: 'Chestnut: Hero of Central Park', score: 1 },
  { title: 'Trailer Park Boys: The Movie', score: 1 },
  { title: 'Ellie Parker', score: 1 },
  { title: 'Paranoid Park', score: 1 }
]
Type "it" for more

To run this $search query, create a MongoDB Search index with the following definition.

{
  "mappings": {
    "dynamic": false,
    "fields": {
      "title": {
        "type": "string",
        "analyzer": "contains",
        "searchAnalyzer": "lucene.standard"
      }
    }
  },
  "analyzers": [
    {
      "name": "contains",
      "tokenizer": {
        "type": "standard"
      },
      "tokenFilters": [
        {
          "type": "lowercase"
        },
        {
          "type": "reverse"
        },
        {
          "type": "edgeGram",
          "minGram": 4,
          "maxGram": 15
        },
        {
          "type": "reverse"
        }
      ]
    }
  ]
}

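This index definition indexes the title field in the movies collection as the string type, using a custom analyzer named contains that applies the following:

standard tokenizer to split text into words at whitespace and punctuation.
lowercase token filter to convert characters to lowercase, which supports case-insensitive queries.
reverse token filter (twice) to reverse words, which supports efficient unanchored queries.
edgeGram token filter to generate tokens 4 to 15 characters long.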

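Note

This custom analyzer supports only words of up to 15 characters. If you have words longer than 15 characters, you can increase the maxGram value. However, we recommend that you don't set maxGram higher than 15, because larger values increase the index size and can affect performance and availability.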
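The reverse, edgeGram, reverse sequence is easier to follow with a concrete word. The following JavaScript is a rough sketch of the contains analyzer, not MongoDB's implementation. The helper names are hypothetical, the standard tokenizer is approximated by splitting on non-alphanumeric characters, words shorter than minGram are assumed to produce no tokens, and the wildcard query park* is modeled as a simple starts-with test on the stored tokens.

```javascript
// Toy model of the "contains" custom analyzer (illustration only).

// Assumption: approximates the standard tokenizer.
function tokenize(text) {
  return text.split(/[^\p{L}\p{N}]+/u).filter((word) => word.length > 0);
}

function reverseString(text) {
  return Array.from(text).reverse().join("");
}

// Emits the prefixes of `token` that are minGram to maxGram characters long.
// Assumption: tokens shorter than minGram produce no grams.
function edgeGrams(token, minGram, maxGram) {
  const grams = [];
  const longest = Math.min(maxGram, token.length);
  for (let length = minGram; length <= longest; length++) {
    grams.push(token.slice(0, length));
  }
  return grams;
}

// lowercase -> reverse -> edgeGram(4, 15) -> reverse.
// Prefixes of the reversed word become suffixes of the original word.
function containsTokens(title) {
  return tokenize(title)
    .map((word) => word.toLowerCase())
    .map(reverseString)
    .flatMap((word) => edgeGrams(word, 4, 15))
    .map(reverseString);
}

// Models the wildcard query "<prefix>*" as a starts-with test on each token.
function matchesWildcard(title, prefix) {
  return containsTokens(title).some((token) => token.startsWith(prefix));
}

console.log(containsTokens("Sparks").join(", "));
// arks, parks, sparks
console.log(matchesWildcard("Sparks", "park"));
// true
console.log(matchesWildcard("Backdraft", "park"));
// false
```

In this model the index stores the suffixes of each word, so a wildcard query that matches the beginning of any suffix finds the term anywhere inside the word. Here park* matches the stored suffix parks of the hypothetical title Sparks.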

Suffix Queries

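If your application frequently queries string field values that end with a certain character or set of suffixes, you can run a regex query that uses $regex with the $ option and, for case-insensitive matching, the i option.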

Instead, we recommend a MongoDB Search query that uses the $search aggregation pipeline stage. The following query searches for movie titles that end with the term ring.

➤ Try it in the MongoDB Search Playground.

$regex Query

db.movies.find( { title: { $regex: "ring$" } }, { title: 1, _id: 0 } ) // Case-sensitive Query 1
db.movies.find( { title: { $regex: /ring$/ } }, { title: 1, _id: 0 } ) // Case-sensitive Query 2
db.movies.find( { title: { $regex: /ring$/i } }, { title: 1, _id: 0 } ) // Case-insensitive Query 1
db.movies.find( { title: { $regex: "ring$", $options: "i" } }, { title: 1, _id: 0 } ) // Case-insensitive Query 2

[
  { title: 'It Happens Every Spring' },
  { title: 'Larks on a String' },
  { title: 'Release the Prisoners to Spring' },
  { title: 'Manon of the Spring' },
  { title: 'Floundering' },
  { title: 'Autumn Spring' },
  { title: 'The Gathering' },
  { title: 'Blue Spring' },
  { title: 'Blue Spring' },
  { title: 'Girl with a Pearl Earring' },
  { title: 'Spring, Summer, Fall, Winter... and Spring' },
  { title: 'Breaking and Entering' },
  { title: 'Hunting and Gathering' },
  { title: 'Blood Tea and Red String' },
  { title: 'Warm Spring' },
  { title: 'The Conjuring' },
  { title: 'Thanks for Sharing' },
  { title: 'Leaving on the 15th Spring' }
]

$search Query

db.movies.aggregate([
  {
    "$search": {
      "index": "default",
      "autocomplete": {
        "query": "ring",
        "path": "title",
      }
    }
  },
  {
    "$project": {
      "_id": 0,
      "title": 1,
      "score": { $meta: "searchScore" }
    }
  }
])

[
  { title: 'It Happens Every Spring', score: 4.683838844299316 },
  { title: 'Larks on a String', score: 4.683838844299316 },
  {
    title: 'Release the Prisoners to Spring',
    score: 4.683838844299316
  },
  { title: 'Manon of the Spring', score: 4.683838844299316 },
  { title: 'Floundering', score: 4.683838844299316 },
  {
    title: 'The Lord of the Rings: The Fellowship of the Ring',
    score: 4.683838844299316
  },
  { title: 'Autumn Spring', score: 4.683838844299316 },
  { title: 'The Gathering', score: 4.683838844299316 },
  { title: 'The Ring', score: 4.683838844299316 },
  { title: 'Tom and Jerry: The Magic Ring', score: 4.683838844299316 },
  { title: 'Blue Spring', score: 4.683838844299316 },
  { title: 'Blue Spring', score: 4.683838844299316 },
  { title: 'Girl with a Pearl Earring', score: 4.683838844299316 },
  {
    title: 'Spring, Summer, Fall, Winter... and Spring',
    score: 4.683838844299316
  },
  { title: 'Curse of the Ring', score: 4.683838844299316 },
  { title: 'Breaking and Entering', score: 4.683838844299316 },
  { title: 'Closing the Ring', score: 4.683838844299316 },
  { title: 'Hunting and Gathering', score: 4.683838844299316 },
  { title: 'Blood Tea and Red String', score: 4.683838844299316 },
  { title: 'Warm Spring', score: 4.683838844299316 }
]
Type "it" for more

To run this $search query, create a MongoDB Search index similar to the following.

{
  "mappings": {
    "dynamic": false,
    "fields": {
      "title": [
        {
          "type": "autocomplete",
          "minGrams": 4,
          "maxGrams": 10,
          "analyzer": "lucene.keyword",
          "tokenization": "rightEdgeGram"
        }
      ]
    }
  }
}

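This index definition indexes the title field using the following:

autocomplete type with the rightEdgeGram tokenization strategy, which splits text into substrings, or "grams", between 4 (minimum) and 10 (maximum) characters long and supports partial searches that start from the end of the string.
lucene.keyword analyzer, which ensures that matches occur only at the end of the text and not at the ends of words in the middle of the text. To match suffixes of words in the middle of the text, use lucene.standard instead.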
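The combination of rightEdgeGram tokenization and the keyword analyzer can be modeled as follows. This JavaScript is a rough sketch, not MongoDB's implementation. The rightEdgeGrams, suffixTokens, and matchesSuffixQuery helpers are hypothetical names, and the lowercasing step is an assumption made only to mirror the case-insensitive results shown above.

```javascript
// Toy model of an autocomplete field with rightEdgeGram tokenization
// and the keyword analyzer (illustration only).

// Emits the suffixes of `text` that are minGrams to maxGrams characters long.
function rightEdgeGrams(text, minGrams, maxGrams) {
  const grams = [];
  const longest = Math.min(maxGrams, text.length);
  for (let length = minGrams; length <= longest; length++) {
    grams.push(text.slice(text.length - length));
  }
  return grams;
}

// Keyword analysis keeps the whole title as a single token, so grams are
// taken from the end of the full title rather than from each word.
// Assumption: lowercasing stands in for case-insensitive matching.
function suffixTokens(title) {
  return rightEdgeGrams(title.toLowerCase(), 4, 10);
}

// Query-time model: the lowercased query must equal one of the stored grams.
function matchesSuffixQuery(title, query) {
  return suffixTokens(title).includes(query.toLowerCase());
}

console.log(JSON.stringify(suffixTokens("The Ring")));
// ["ring"," ring","e ring","he ring","the ring"]
console.log(matchesSuffixQuery("Floundering", "ring"));
// true
console.log(matchesSuffixQuery("The Lord of the Rings", "ring"));
// false
```

In this model a title matches only when the query equals one of the grams taken from the very end of the title, which is why a title that merely contains ring in the middle does not match.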

Learn More

To learn more about MongoDB Search queries, see Queries and Indexes.
To learn more about regex queries in MongoDB, see $regex.
MongoDB University offers a free course on optimizing MongoDB performance. To learn more, see Monitoring and Insights.