Global Site-Wide Search

Denigma needs an omnipotent search function that can search all fields of all tables, or only a specified subset, similar to the search implemented in Denigma's Wiki.

There are numerous ways to implement searching. One option is Haystack/Whoosh and another is Xapian/Djapian [http://www.vlent.nl/weblog/2010/10/14/searching-django-site-part-1-what-and-why/]. For Denigma the former was chosen because both are clean, pure-Python implementations.

Haystack and Whoosh need to be added to the requirements:

...
whoosh
-e git://github.com/toastdriven/django-haystack.git@master#egg=django-haystack
...

Haystack has to be added to the INSTALLED_APPS within the settings.py:

...
INSTALLED_APPS = [
    ...
    'haystack',
    ...
]

Specify the Haystack connections, e.g. for Whoosh (set the PATH to place the Whoosh index on the filesystem):

HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.whoosh_backend.WhooshEngine',
        'PATH': os.path.join(os.path.dirname(__file__), 'whoosh_index'), # Use `PROJECT_ROOT` instead of `__file__`.
    },
}
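
The comment in the snippet above suggests anchoring the index path at a project root rather than at `__file__`. A minimal sketch of that variant follows. `PROJECT_ROOT` is an assumed name that settings.py would have to define itself, and it is derived from the current directory here purely for illustration:

```python
import os

# Assumption: PROJECT_ROOT is defined once in settings.py. In a real project
# it would typically be derived from the location of the settings module;
# the current directory is used here only to keep the sketch self-contained.
PROJECT_ROOT = os.path.abspath(os.curdir)

HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.whoosh_backend.WhooshEngine',
        # The Whoosh index directory lives directly under the project root.
        'PATH': os.path.join(PROJECT_ROOT, 'whoosh_index'),
    },
}
```

Keeping the index path relative to one root variable means the index location does not change if the settings module is later moved into a subpackage.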

Create search_indexes.py in the corresponding app folder and define a field with document=True there to mark it as the primary field to search within. The convention is to name this field text, since the name has to be consistent across all SearchIndex classes:

import datetime

from haystack import indexes

from .models import Post


class PostIndex(indexes.SearchIndex, indexes.Indexable):
    created = indexes.DateTimeField(model_attr='created')
    updated = indexes.DateTimeField(model_attr='updated')

    text = indexes.CharField(document=True, use_template=True)
    tags = indexes.MultiValueField()

    def get_model(self):
        return Post

    def index_queryset(self, using=None):
        """Used when the entire index for model is updated."""
        return self.get_model().objects.filter(created__lte=datetime.datetime.now())

Adding the optional custom index_queryset method makes it possible to index only a specified selection of objects, for instance so that users can add content dated in the future without it appearing in search results ahead of time.
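
To make the effect of the created__lte filter concrete, here is a plain-Python stand-in that applies the same comparison to an in-memory list. It uses neither Haystack nor the Django ORM; `FakePost` and `indexable` are hypothetical names introduced only for this illustration:

```python
import datetime
from dataclasses import dataclass


@dataclass
class FakePost:
    """Hypothetical stand-in for the Post model (not a Django model)."""
    title: str
    created: datetime.datetime


def indexable(posts, now=None):
    """Keep only posts created at or before `now`, mirroring created__lte."""
    if now is None:
        now = datetime.datetime.now()
    return [post for post in posts if post.created <= now]


now = datetime.datetime(2013, 1, 1, 12, 0)
posts = [
    FakePost("Already published", now - datetime.timedelta(days=1)),
    FakePost("Scheduled for later", now + datetime.timedelta(days=1)),
]
titles = [post.title for post in indexable(posts, now=now)]
```

With the sample data above, only the post dated in the past ends up in `titles`; the future-dated post is skipped until a later index update picks it up.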

If use_template=True was provided for the main search field, an additional template file needs to be created in the template directory. Haystack looks it up under search/indexes/myapp/ using the lowercased model name followed by _text.txt (for the Post model above: search/indexes/myapp/post_text.txt). Place the following in it:

{{ object.title }}
{{ object.creator.get_full_name }}
{{ object.text }}
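
Conceptually, the rendered template becomes the single block of text stored in the document field, so a query can match the title, the author's name, or the body. The following plain-Python stand-in illustrates the idea. It does not use Django's template engine or a real search backend; `render_document`, `matches`, and the sample values are hypothetical:

```python
def render_document(title, creator_name, text):
    """Join the same three values the template lists, one per line."""
    return "\n".join([title, creator_name, text])


def matches(query, doc):
    """Naive case-insensitive containment check standing in for the engine."""
    return query.lower() in doc.lower()


document = render_document(
    title="Global Site-Wide Search",
    creator_name="Jane Doe",  # hypothetical author name
    text="Haystack and Whoosh power the search.",
)
```

Any field left out of the template is simply not part of the searchable document, which is why everything that should be findable has to be listed there.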

Add the search view to the URLconf:

...
(r'^search/', include('haystack.urls')),
...

Lastly, build the index by running the following command:

$ ./manage.py rebuild_index

Ensure that whoosh_index is writeable:

$ chmod 777 whoosh_index

An update can be performed by running:

$ ./manage.py update_index

The number of search results displayed per page can be set in the global configuration:

# settings.py
...
HAYSTACK_SEARCH_RESULTS_PER_PAGE = 25
...

The richard project [https://github.com/willkg/richard], which is used by pyvideo.org [http://pyvideo.org/search/?models=videos.video&q=django+customizing], is an excellent example of how these libraries can be put to use.

The search template should be redesigned and should perform a default search. The results need to be better annotated (e.g. from which model the information stems and in which context the term was found). Spell correction and auto-completion should be included. The global search field should be shown in gray when not selected and placed closer to the center of the upper navigation panel.

Index rebuilds (rebuild_index) need to be automated and performed regularly. Alternatively, a real-time search function can be implemented.
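
One possible way to automate regular index updates is a small wrapper script that a scheduler such as cron calls. The sketch below is only an outline under assumptions: the function names and the default `manage.py` path are hypothetical, and the `--age` option (number of hours to look back) is taken from Haystack's update_index command and should be checked against the installed version:

```python
import subprocess
import sys


def update_index_command(manage_py='manage.py', age_hours=None):
    """Build the argument list for an incremental Haystack index update."""
    cmd = [sys.executable, manage_py, 'update_index']
    if age_hours is not None:
        # Only reindex objects changed within the last `age_hours` hours.
        cmd.append('--age=%d' % age_hours)
    return cmd


def run_update(manage_py='manage.py', age_hours=None):
    """Run the update and return the exit code of the management command."""
    return subprocess.call(update_index_command(manage_py, age_hours))


# Example (not executed here): run_update('/path/to/manage.py', age_hours=24)
```

For the real-time alternative, Haystack 2.x provides a signal processor that updates the index whenever a model is saved or deleted; it is enabled with HAYSTACK_SIGNAL_PROCESSOR = 'haystack.signals.RealtimeSignalProcessor' in settings.py, at the cost of extra work on every write.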

Optionally other search engines can be utilized.

See the Haystack documentation for more details on how to implement the enhancements proposed above and for other useful functionality [http://django-haystack.readthedocs.org/en/latest/].

global-search.jpg

Tags: searching, indexing, programming, rest, django