Haystack extension indexing (Elasticsearch)

Keywords: Python Big Data ElasticSearch search engine

Tips:
Elasticsearch is built on top of the open source library Lucene. However, Lucene cannot be used directly; you must write your own code to call its interfaces.
Question:
How do we connect to the Elasticsearch server?
Solution:
Haystack

1. Haystack introduction and installation configuration

1. Introduction to Haystack

Haystack is a framework for connecting Django to search engines; it acts as a bridge between users and the search engine.
In Django, we can call the Elasticsearch search engine through Haystack.
Haystack can use different search back ends (such as Elasticsearch, Whoosh, Solr, etc.) without modifying application code.
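
To illustrate the point about swapping back ends, here is a minimal sketch in which only the connection settings differ between engines. The helper build_connections and the example URL and path are hypothetical; the engine class paths are the ones I understand django-haystack to ship, so check them against your installed version.

```python
# Hypothetical helper: only the settings differ between back ends;
# index classes and views stay the same.
ENGINES = {
    "elasticsearch": "haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine",
    "whoosh": "haystack.backends.whoosh_backend.WhooshEngine",
    "solr": "haystack.backends.solr_backend.SolrEngine",
}


def build_connections(backend, **options):
    """Return a HAYSTACK_CONNECTIONS-style dict for the chosen back end."""
    if backend not in ENGINES:
        raise ValueError(f"unknown back end: {backend}")
    return {"default": {"ENGINE": ENGINES[backend], **options}}


# Example values below are placeholders, not real servers or paths.
es_settings = build_connections(
    "elasticsearch", URL="http://127.0.0.1:9200/", INDEX_NAME="meiduo"
)
whoosh_settings = build_connections("whoosh", PATH="/tmp/whoosh_index")
```

Assigning either result to HAYSTACK_CONNECTIONS in the settings file would switch the engine while the rest of the project stays untouched.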

2. Haystack installation

pip install django-haystack
pip install elasticsearch==2.4.1

3. Haystack application registration and routing

INSTALLED_APPS = [
    'haystack',  # full-text search
]

4. Haystack configuration

Configure Haystack as the search engine back end in the settings file:

# Haystack
HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine',
        'URL': 'http://192.168.217.128:9200/',  # Elasticsearch server IP address; 9200 is the default port
        'INDEX_NAME': 'meiduo', # The name of the index library created by Elasticsearch
    },
}

# Automatically update the index when data is added, modified or deleted
HAYSTACK_SIGNAL_PROCESSOR = 'haystack.signals.RealtimeSignalProcessor'

Important:

The HAYSTACK_SIGNAL_PROCESSOR setting ensures that when new data is created after Django starts running, Haystack still has Elasticsearch index that data in real time.
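
The idea behind real-time indexing can be sketched with a toy in-memory index that is updated on every save and delete. This is a conceptual illustration only, not Haystack's implementation; ToyIndex and ToyRealtimeProcessor are hypothetical names, and the records are made-up sample data.

```python
# Conceptual sketch only: a toy in-memory index kept in sync on every
# save and delete, mimicking what a real-time signal processor does.
class ToyIndex:
    def __init__(self):
        self.documents = {}

    def update(self, pk, text):
        self.documents[pk] = text

    def remove(self, pk):
        self.documents.pop(pk, None)


class ToyRealtimeProcessor:
    """Hypothetical stand-in for a processor wired to save/delete signals."""

    def __init__(self, index):
        self.index = index

    def handle_save(self, pk, text):
        # Called right after a record is created or modified.
        self.index.update(pk, text)

    def handle_delete(self, pk):
        # Called right after a record is deleted.
        self.index.remove(pk)


index = ToyIndex()
processor = ToyRealtimeProcessor(index)
processor.handle_save(1, "example phone")
processor.handle_save(1, "example phone 64G")  # a modification re-indexes
processor.handle_save(2, "example laptop")
processor.handle_delete(2)  # a deletion removes the entry
```

After these calls the toy index holds only record 1 with its latest text, which is the behavior the setting is meant to guarantee for the real index.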

5. Change DIRS of TEMPLATES in settings.py as follows:

'DIRS': [os.path.join(BASE_DIR,'templates')],

2. Building the data index with Haystack

1. Create index class

By creating an index class, you can specify which fields the search engine indexes, that is, which fields' keywords can be used to retrieve data.

In this project, SKU information is searched in full text, so a new search_indexes.py file is created in the goods application to store the index classes.

from haystack import indexes
from .models import SKU

class SKUIndex(indexes.SearchIndex, indexes.Indexable):
    """SKU Index data model class"""
    text = indexes.CharField(document=True, use_template=True)

    def get_model(self):
        """Returns the indexed model class"""
        return SKU

    def index_queryset(self, using=None):
        """Returns the data query set to be indexed"""
        return self.get_model().objects.filter(is_launched=True)

Description of the index class SKUIndex:

The fields defined in SKUIndex can be queried by the Elasticsearch search engine through Haystack.

The text field is declared with document=True, which marks it as the main field used for keyword queries.

The index value of the text field can be composed of several model class fields. use_template=True indicates that the fields to combine are specified later in a template.

2. Create a text field index value template file

Create a template file for the text field in the templates directory.

Specifically, define it in the file templates/search/indexes/goods/sku_text.txt:

{{ object.id }}
{{ object.name }}
{{ object.caption }}

Template file description: when a keyword is passed through the text parameter name, this template indicates that the id, name and caption of the SKU are used as the index value of the text field for keyword queries.
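
As a rough sketch of what the template produces, the snippet below joins the same three fields into one document string and runs a naive keyword check against it. The functions and the sample SKU are hypothetical, and a real engine tokenizes and ranks the text rather than doing a substring match.

```python
from types import SimpleNamespace


def render_text_document(obj):
    """Join the fields listed in the template, one per line."""
    return "\n".join(str(value) for value in (obj.id, obj.name, obj.caption))


def matches(document, keyword):
    """Naive substring match; a real engine tokenizes and ranks instead."""
    return keyword.lower() in document.lower()


# Hypothetical SKU record used only for illustration.
sku = SimpleNamespace(id=3, name="Example Phone 64G", caption="Free shipping")
document = render_text_document(sku)
```

Here matches(document, "phone") is true because the name is part of the indexed text, while a word that appears in none of the three fields would not match.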

3. Manually generate the initial index

python manage.py rebuild_index

4. Add back-end logic

1. Add the following code to the goods/views.py file:

from django.http import JsonResponse
from haystack.views import SearchView

class MySearchView(SearchView):
    """Override SearchView to return the results as JSON"""
    def create_response(self):
        # Get search results
        context = self.get_context()
        data_list = []
        for sku in context['page'].object_list:
            data_list.append({
                'id':sku.object.id,
                'name':sku.object.name,
                'price':sku.object.price,
                'default_image_url':sku.object.default_image.url,
                'searchkey':context.get('query'),
                'page_size':context['page'].paginator.num_pages,
                'count':context['page'].paginator.count
            })
        # Return the assembled result list as JSON
        return JsonResponse(data_list, safe=False)

2. Add the sub-route in goods/urls.py

Search route. Note: do not call as_view(); the route is given a MySearchView() instance directly.

path('search/', views.MySearchView()),
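
A minimal sketch of how a client might build a request URL for this route follows. It assumes the route is reachable at /search/ on a placeholder host, and that the keyword travels in the q parameter and the page number in page, which is what Haystack's default search form and SearchView read as far as I know; build_search_url is a hypothetical helper.

```python
from urllib.parse import urlencode


def build_search_url(base_url, keyword, page=1):
    """Build the query URL: 'q' carries the keyword, 'page' the page number."""
    return f"{base_url}?{urlencode({'q': keyword, 'page': page})}"


# The host below is a placeholder for wherever the project is served.
url = build_search_url("http://127.0.0.1:8000/search/", "phone case", page=2)
print(url)
```

urlencode takes care of escaping, so keywords containing spaces or non-ASCII characters are safe to pass in.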

3. Set the number of returned data per page

The HAYSTACK_SEARCH_RESULTS_PER_PAGE setting controls the number of results displayed per page.

To display five results per page: HAYSTACK_SEARCH_RESULTS_PER_PAGE = 5
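
The effect of the per-page setting can be sketched with plain arithmetic. This is not Django's Paginator, only an illustration of how the page count and page contents follow from the result count; the function names and the result ids are hypothetical. Note that the page_size key returned by the view earlier carries paginator.num_pages, that is, the number of pages.

```python
import math


def page_count(total_results, per_page=5):
    """Number of pages needed to show all results."""
    return math.ceil(total_results / per_page)


def page_slice(results, page, per_page=5):
    """Return the results shown on a 1-based page number."""
    start = (page - 1) * per_page
    return results[start:start + per_page]


results = list(range(1, 13))  # 12 hypothetical result ids
pages = page_count(len(results))
last_page = page_slice(results, pages)
```

With 12 results and five per page, three pages are needed and the last page holds the remaining two results.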

Posted by rtadams89 on Tue, 26 Oct 2021 02:51:47 -0700