Multilingual analyzer with stemming for Elasticsearch

I have an index with the following mapping:

    {
        "my_index": {
            "mappings": {
                "properties": {
                    "teacher_id": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword"
                            }
                        }
                    },
                    "school_id": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword"
                            }
                        }
                    },
                    "name": {
                        "type": "text"
                    },
                    "height": {
                        "type": "text"
                    },
                    "family_name": {
                        "type": "text"
                    },
                    "studentID": {
                        "type": "keyword"
                    },
                    "performance_text": {
                        "type": "text",
                        "fields": {
                            "highlight": {
                                "type": "text",
                                "store": true,
                                "term_vector": "with_positions_offsets"
                            },
                            "stemmed": {
                                "type": "text",
                                "analyzer": "snowball",
                                "fielddata": true
                            }
                        }
                    },
                    "_class": {
                        "type": "keyword",
                        "index": false,
                        "doc_values": false
                    }
                }
            }
        }
    }

The performance_text field contains long texts, and searches always run against this field. I want Elasticsearch to first detect the language of the text and then apply stemming appropriate to that language, so I need a multilingual analyzer with stemming. The current snowball analyzer does not stem non-English words. How can I update my index mapping to achieve this?
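
To make the goal concrete, here is a rough sketch of the kind of mapping I have in mind, not a tested solution. It assumes the texts are in English, German, and French (an arbitrary choice for illustration) and adds one subfield per language using Elasticsearch's built-in `english`, `german`, and `french` language analyzers. The subfield names `en`, `de`, and `fr` are hypothetical. Note that this applies every stemmer to every document rather than detecting the language first, so detection would still need a separate step.

    {
        "mappings": {
            "properties": {
                "performance_text": {
                    "type": "text",
                    "fields": {
                        "en": { "type": "text", "analyzer": "english" },
                        "de": { "type": "text", "analyzer": "german" },
                        "fr": { "type": "text", "analyzer": "french" }
                    }
                }
            }
        }
    }

With a layout like this, a `multi_match` query over `performance_text.en`, `performance_text.de`, and `performance_text.fr` could search all stemmed variants at once. Documents already in the index would presumably have to be reindexed before the new subfields contain anything.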
