linuxlewis / djorm-ext-pgfulltext
PostgreSQL full-text search integration with django orm.
License: Other
In particular it can help to recognize misspelled input words that will not be matched directly by the full text search mechanism.
pg_trgm Text Search Integration from PostgreSQL documentation
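To make the quoted sentence concrete, here is a rough pure-Python sketch of the trigram idea behind pg_trgm. The helper names are made up for illustration, and this is a simplification that handles a single word only (the real extension works on whole strings inside PostgreSQL). It assumes the padding rule from the pg_trgm documentation: two spaces in front of a word and one behind it.

```python
def trigrams(word):
    """Return the set of trigrams for a single word.

    Simplified model of pg_trgm: the word is lowercased and padded
    with two leading spaces and one trailing space.
    """
    padded = "  " + word.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a, b):
    """Share of trigrams the two words have in common (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


# A misspelled word still shares most of its trigrams with the original,
# which is why trigram matching can catch input that full-text search misses.
score = similarity("postgres", "postgers")
```

A full-text query for "postgers" would not match a document containing "postgres", but the two words still share several trigrams, so a similarity threshold can pick up the intended word.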
I am trying to use this framework for an autocomplete. I would like to start searching at 3 characters... however the framework does not return any results.
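A possible explanation is that a plain full-text query matches whole lexemes, so three typed characters rarely match anything. Other posts in this thread use a prefix query such as "ce:*" together with raw=True. Below is a minimal sketch of building such a query from user input; the function name is hypothetical and not part of the package.

```python
import re


def prefix_tsquery(user_input):
    """Turn free text into a tsquery string where every word is a prefix match.

    Only word characters are kept, so tsquery operators typed by the user
    (such as & | ! : or parentheses) cannot break the query syntax.
    """
    words = re.findall(r"\w+", user_input)
    return " & ".join(word + ":*" for word in words)


# Intended use (an assumption based on the raw=True calls quoted in this thread):
#   Model.objects.search(prefix_tsquery(typed_text), raw=True)
query = prefix_tsquery("red ora")
```

An empty result from prefix_tsquery should be treated as "no query" rather than passed to the database.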
With the latest code in the master branch I get an error because connection.connection is None when it is passed to prepare():
Traceback:
File "/python3.4/site-packages/django/core/handlers/base.py" in get_response
149. response = self.process_exception_by_middleware(e, request)
File "/python3.4/site-packages/django/core/handlers/base.py" in get_response
147. response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "/python3.4/site-packages/django/views/generic/base.py" in view
68. return self.dispatch(request, *args, **kwargs)
File "/python3.4/site-packages/django_core/views/mixins/paging.py" in dispatch
27. return super(PagingViewMixin, self).dispatch(*args, **kwargs)
File "/python3.4/site-packages/django/views/generic/base.py" in dispatch
88. return handler(request, *args, **kwargs)
File "/webapps/mm/app/search/views.py" in get
27. return self.form_valid(form=self.form)
File "/webapps/mm/app/search/views.py" in form_valid
40. return ListView.get(self, request=self.request)
File "/python3.4/site-packages/django/views/generic/list.py" in get
159. self.object_list = self.get_queryset()
File "/webapps/mm/app/search/views.py" in get_queryset
87. **self.get_search_kwargs()
File "/webapps/mm/app/search/managers.py" in search
38. headline_document=headline_document
File "/python3.4/site-packages/djorm_pgfulltext/models.py" in search
129. return self.get_queryset().search(*args, **kwargs)
File "/python3.4/site-packages/djorm_pgfulltext/models.py" in search
315. "%s('%s', %s)" % (function, config, adapt(query))
File "/python3.4/site-packages/djorm_pgfulltext/utils.py" in adapt
9. a.prepare(connection.connection)
Exception Type: TypeError at /search
Exception Value: must be psycopg2.extensions.connection, not None
Thanks for this package, it's very helpful.
I have some models that have very few actual fields (that store raw data) and a bunch of properties that parse the raw field into more useful output.
I'd like to be able to add the value of these properties to the search index, but from reading the code, that doesn't seem possible? Is there any reason for this?
I'm willing to contribute a pull request if you think it could be done.
The docs say this should work up to Django 1.6. Does this work with Django 1.7 or higher?
Django 1.8.3
config = ('pg_catalog.english')
works OK, but
config = ('pg_catalog.english', 'pg_catalog.simple'),
raises:
File "D:\project\src\testproj\apps\item\views\baseviews.py", line 170, in get
qs = qs.search(str(request.query_params['q']))
File "D:\project\src\testproj\apps\djorm_pgfulltext\models.py", line 315, in search
"%s('%s', %s)" % (function, config, adapt(query))
File "D:\project\src\testproj\apps\djorm_pgfulltext\utils.py", line 9, in adapt
a.prepare(connection.connection)
TypeError: must be psycopg2._psycopg.connection, not None
I tried building a mixed manager but it still does not work. I thought I would pass it over to you and see what can be done.
This is the super simple manager I am using for 'objects':
class SearchGeoManager(SearchManagerMixIn, models.GeoManager):
    pass
When I do a search I get:
AttributeError: 'GeoQuerySet' object has no attribute 'search'
Perhaps some clues can be found in: https://docs.djangoproject.com/en/dev/topics/db/managers/
The following code raises IndexError: tuple index out of range:
qs = Person2.objects.search(query="Pepa%")
Hi, in my old app I integrated the full-text search index, but it only indexes or finds the most recently saved objects. This is my code:
search_index = VectorField()
objects = models.Manager()
search_manager = SearchManager(
fields=('publicacion', 'resumen', 'claves'),
config='pg_catalog.english',
search_field='search_index',
auto_update_search_field=True
)
Then I applied the South migration (I am using Django 1.6.5) and everything went OK:
tipo_publicacion_id | integer |
search_index | tsvector |
Indexes:
"publicaciones_publicacion_pkey" PRIMARY KEY, btree (id)
"publicaciones_publicacion_search_index" btree (search_index)
Foreign-key constraints:
...
Referenced by:
....
When I search for any data in publicacion, resumen and claves, nothing is found except the last records I saved. I thought maybe I needed to create an index, so I ran this in the psql console:
create index publicaciones_publicacion_search_index on publicaciones_publicacion using gin(search_index);
ERROR: relation "publicaciones_publicacion_search_index" already exists
What is the best way to search my table? Any tips or tricks for finding all my data? I only have 8,000 records and the search only finds the last one :(
Thanks for any direction or link I should read.
Right now the only way to update a VectorField is to write out a literal tsvector, e.g. 'a:1 fat:2 cat:3 sat:4 on:5 a:6 mat:7 and:8 ate:9 a:10 fat:11 rat:12' or 'a:1A fat:2B,4C cat:5D'. While this is obviously very flexible, most of the time I just want to call to_tsvector.
Here's a quick overview of the problem:
# search/models.py
from django.db import models
from djorm_pgfulltext.fields import VectorField
from djorm_pgfulltext.models import SearchManager
class SearchTest(models.Model):
    search_index = VectorField()
    objects = SearchManager()
In [1]: from search.models import SearchTest
In [2]: search_test = SearchTest()
In [3]: search_test.search_index = 'swim swimming swam'
In [4]: search_test.save()
In [5]: search_test = SearchTest.objects.get(id=search_test.id) # Reload model instance
In [6]: search_test.search_index
Out[6]: "'swam' 'swim' 'swimming'"
# The string was literally inserted as a ts vector
# I would rather it be converted and stemmed: "'swam':3 'swim':1,2"
I think that inserting a string as a literal tsvector is a fine default, but I still needed to be able to call to_tsvector. I got around that by creating a special Python object and registering an adapter with psycopg2.
# search/tsvector.py
from psycopg2.extensions import adapt, AsIs
class TsVector(object):
    """ Represents a call to to_tsvector at the database level.
    Use:
        TsVector('swim swimming swam'),
        TsVector('simple', 'swim swimming swam')
        TsVector('english', 'swim swimming swam')
    """
    def __init__(self, *args):
        assert len(args) in (1, 2), "Arguments should be TsVector([ config regconfig, ] document text)"
        if len(args) == 1:
            self.config = None
            self.document = args[0]
        else:
            self.config = args[0]
            self.document = args[1]

def adapt_tsvector(tsvector):
    """ Adapts TsVector object for use in DB.
    """
    if tsvector.config is None:
        return AsIs("to_tsvector(%s)" % adapt(tsvector.document))
    else:
        return AsIs("to_tsvector(%s, %s)" % (adapt(tsvector.config), adapt(tsvector.document)))
Put this somewhere that it'll only get executed once. With Django 1.7, an AppConfig is a pretty natural place to put it.
# search/apps.py
from django.apps import AppConfig
from psycopg2.extensions import register_adapter
from search.tsvector import TsVector, adapt_tsvector
class SearchConfig(AppConfig):
    name = 'search'
    verbose_name = "Search"
    def ready(self):
        # Register the TsVector class
        register_adapter(TsVector, adapt_tsvector)
Example use:
In [1]: from search.models import SearchTest
In [2]: from search.tsvector import TsVector
In [3]: search_test = SearchTest()
In [4]: search_test.search_index = TsVector('english', 'swim swimming swam')
In [5]: search_test.save()
In [6]: search_test = SearchTest.objects.get(id=search_test.id)
In [7]: search_test.search_index
Out[7]: "'swam':3 'swim':1,2"
Thoughts?
Hi, I need your advice.
I have the following model:
class Event(models.Model):
    events = SearchManager(
        fields=('name', 'description'),
        config='pg_catalog.english',
        search_field='search_index',
        auto_update_search_field=True
    )
    name = models.CharField('event title', max_length=250)
    description = RichTextField(blank=True)
    search_index = VectorField()
When I use a long description (for example, all the text copied from the page http://www.paulgraham.com/avg.html), I get an "index row requires ..." error.
How can I prevent this error?
So far the only solution I have found is to limit the length of the description field, but maybe you can suggest a better one.
Is it possible to do a greedy match like GNU grep does? For instance, if I searched using the token "ran" it would find "orange"?
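As far as I know, a tsquery can match whole lexemes or prefixes (with :*), but not text in the middle of a word, so "ran" finding "orange" needs a substring match such as LIKE/ILIKE instead of full-text search. The sketch below only shows how to build a safe LIKE pattern by hand; the function name is hypothetical, and Django's own contains/icontains lookups already do this escaping for you.

```python
def like_substring_pattern(term, escape="\\"):
    """Build a LIKE pattern that matches `term` anywhere in a value.

    The LIKE wildcards (% and _) and the escape character itself are
    escaped so the user's text is matched literally.
    """
    escaped = (
        term.replace(escape, escape + escape)
        .replace("%", escape + "%")
        .replace("_", escape + "_")
    )
    return "%" + escaped + "%"


# "ran" becomes "%ran%", which matches "orange" when used with LIKE/ILIKE.
pattern = like_substring_pattern("ran")
```

The pattern should still be passed to the database as a query parameter, not concatenated into the SQL text.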
Can this be used to index data from objects related to the model being indexed? For example, we have a tags field on a parent model like so:
tags = models.ManyToManyField(Tag, blank=True, null=True, related_name='tags')
The Tag model looks like:
class Tag(models.Model):
    name = models.CharField(max_length=100, unique=True)
Is it possible to get the tag names into the index of our parent model, and have it update when tags are added or removed? The example you provide seems to only use flat fields. Otherwise I don't see any reason to use this over an external search solution, since that would be much faster.
Django 1.7.7 gives the following warning:
RemovedInDjango18Warning: SearchManagerMixIn.get_query_set method should be renamed get_queryset.
In order for djorm-ext-pgfulltext to work in combination with Django 1.8, this method will need to be renamed. I don't know if there is an easy workaround for older Django versions.
More information can be found here:
https://code.djangoproject.com/ticket/15363
Would be great to have more visibility to test output as well as transparency to new PRs ensuring tests all still pass. The task here would be to setup tests to run in travis-ci:
@linuxlewis, you have to be the one to add the travis-ci service since you're the repo owner
This will also help test in other environments (such as django 1.8).
This app does not use database routers even if they are defined in settings. Please update :)
Just wondering if there is a reason for not allowing primary keys in models._parse_fields (line 188):
field_names = set(field.name for field in self.model._meta.fields if not field.primary_key)
...... What if the primary key is a custom CharField that you want to include in the FTS?
There are no management commands in the version on PyPI: https://pypi.python.org/pypi/djorm-ext-pgfulltext
Should they be there?
It would be nice if it were possible to allow arbitrary queries to be used for building the search index on each field. For example, when indexing people's names, I'd like to be able to use a query like this:
objects = pg.SearchManager(
fields={
"name": """
setweight(to_tsvector(name), 'A') ||
setweight(to_tsvector(case
when unaccent(name) = name then ''
else unaccent(name)
end), 'B')
""",
}
)
It looks like this would be straightforward to implement.
Would you consider such a patch?
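To illustrate what such a patch might do, here is a small sketch that combines per-field SQL expressions into one search vector. Everything in it is hypothetical (the function name, the dict-based fields argument, the default weight); the fallback expression simply mirrors the shape of the UPDATE statement quoted elsewhere in this thread.

```python
def build_search_vector(fields, config="pg_catalog.english", default_weight="D"):
    """Join per-field tsvector expressions with the || operator.

    `fields` maps a column name to a custom SQL expression, or to None
    to fall back to a plain weighted to_tsvector() call on that column.
    The custom expressions are raw SQL, so they must come from trusted
    code and never from user input.
    """
    parts = []
    for name, expression in fields.items():
        if expression is None:
            expression = "setweight(to_tsvector('%s', coalesce(\"%s\", '')), '%s')" % (
                config, name, default_weight)
        # Parenthesise each part so || binds the way the caller expects.
        parts.append("(%s)" % expression.strip())
    return " || ".join(parts)


vector_sql = build_search_vector({
    "name": "setweight(to_tsvector(name), 'A')",
    "bio": None,
})
```

Keeping the default expression as a fallback would let existing tuple-style field lists keep working unchanged.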
I have this model:
class Crag(models.Model):
    name = models.CharField('name', max_length=64, default='', db_index=True)
    description = models.TextField('description', default='', blank=True)
    search_index = VectorField()
    objects = SearchManager(
        fields = ('name', 'description'),
        config = 'pg_catalog.english', # this is default
        search_field = 'search_index', # this is default
        auto_update_search_field = True
    )
My db had already been populated when I started using this extension, so I initialized the search index from the Python shell with Crag.objects.update_search_field().
The problem is that searching for accented names with unaccented terms does not work. For instance, I have a crag named Céüse and it is not returned when using Crag.objects.search("ce:*", raw=True).
The description field is empty for Céüse (it is empty for all my crags at the moment, by the way) and the resulting value in the search_index column is 'céüse':1.
I have noticed that if I remove description from fields in the SearchManager definition and keep only name (and reset the index), the value in search_index is 'ceus':2 'céüse':1 and unaccented search works.
I might be doing something wrong though.
Since Django 1.6, the ORM runs in autocommit mode by default, so every SQL query is sent to the database and committed as soon as it is executed.
I didn't have any problems when working with the model that has the search field outside of the admin. In the admin, however, Django wraps validation and saving in an atomic block, and saving the model fails with a TransactionManagementError:
Traceback (most recent call last):
File "C:\Development\fanwaze\ArenaWaze\venv\lib\site-packages\django\contrib\staticfiles\handlers.py", line 67, in __call__
return self.application(environ, start_response)
File "C:\Development\fanwaze\ArenaWaze\venv\lib\site-packages\django\core\handlers\wsgi.py", line 206, in __call__
response = self.get_response(request)
File "C:\Development\fanwaze\ArenaWaze\venv\lib\site-packages\django\core\handlers\base.py", line 196, in get_response
response = self.handle_uncaught_exception(request, resolver, sys.exc_info())
File "C:\Development\fanwaze\ArenaWaze\venv\lib\site-packages\django\core\handlers\base.py", line 231, in handle_uncaught_exception
return debug.technical_500_response(request, *exc_info)
File "C:\Development\fanwaze\ArenaWaze\venv\lib\site-packages\django_extensions\management\technical_response.py", line 5, in null_technical_500_response
six.reraise(exc_type, exc_value, tb)
File "C:\Development\fanwaze\ArenaWaze\venv\lib\site-packages\django\core\handlers\base.py", line 114, in get_response
response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "C:\Development\fanwaze\ArenaWaze\venv\lib\site-packages\django\contrib\admin\options.py", line 430, in wrapper
return self.admin_site.admin_view(view)(*args, **kwargs)
File "C:\Development\fanwaze\ArenaWaze\venv\lib\site-packages\django\utils\decorators.py", line 99, in _wrapped_view
response = view_func(request, *args, **kwargs)
File "C:\Development\fanwaze\ArenaWaze\venv\lib\site-packages\django\views\decorators\cache.py", line 52, in _wrapped_view_func
response = view_func(request, *args, **kwargs)
File "C:\Development\fanwaze\ArenaWaze\venv\lib\site-packages\django\contrib\admin\sites.py", line 198, in inner
return view(request, *args, **kwargs)
File "C:\Development\fanwaze\ArenaWaze\venv\lib\site-packages\django\utils\decorators.py", line 29, in _wrapper
return bound_func(*args, **kwargs)
File "C:\Development\fanwaze\ArenaWaze\venv\lib\site-packages\django\utils\decorators.py", line 99, in _wrapped_view
response = view_func(request, *args, **kwargs)
File "C:\Development\fanwaze\ArenaWaze\venv\lib\site-packages\django\utils\decorators.py", line 25, in bound_func
return func(self, *args2, **kwargs2)
File "C:\Development\fanwaze\ArenaWaze\venv\lib\site-packages\django\db\transaction.py", line 339, in inner
return func(*args, **kwargs)
File "C:\Development\fanwaze\ArenaWaze\venv\lib\site-packages\django\contrib\admin\options.py", line 1228, in change_view
self.save_model(request, new_object, form, True)
File "C:\Development\fanwaze\ArenaWaze\venv\lib\site-packages\django\contrib\admin\options.py", line 858, in save_model
obj.save()
File "C:\Development\fanwaze\ArenaWaze\src\arena\fansite\models.py", line 72, in save
super(FanSite, self).save(*args, **kwargs)
File "C:\Development\fanwaze\ArenaWaze\venv\lib\site-packages\django\db\models\base.py", line 545, in save
force_update=force_update, update_fields=update_fields)
File "C:\Development\fanwaze\ArenaWaze\venv\lib\site-packages\django\db\models\base.py", line 582, in save_base
update_fields=update_fields, raw=raw, using=using)
File "C:\Development\fanwaze\ArenaWaze\venv\lib\site-packages\django\dispatch\dispatcher.py", line 185, in send
response = receiver(signal=self, sender=sender, **named)
File "C:\Development\fanwaze\ArenaWaze\venv\lib\site-packages\djorm_pgfulltext\models.py", line 14, in auto_update_search_field_handler
instance.update_search_field()
File "C:\Development\fanwaze\ArenaWaze\venv\lib\site-packages\djorm_pgfulltext\models.py", line 75, in update_search_field
self._fts_manager.update_search_field(pk=self.pk, using=using, config=config)
File "C:\Development\fanwaze\ArenaWaze\venv\lib\site-packages\djorm_pgfulltext\models.py", line 196, in update_search_field
transaction.enter_transaction_management(using=using)
File "C:\Development\fanwaze\ArenaWaze\venv\lib\site-packages\django\db\transaction.py", line 70, in enter_transaction_management
get_connection(using).enter_transaction_management(managed, forced)
File "C:\Development\fanwaze\ArenaWaze\venv\lib\site-packages\django\db\backends\__init__.py", line 280, in enter_transaction_management
self.validate_no_atomic_block()
File "C:\Development\fanwaze\ArenaWaze\venv\lib\site-packages\django\db\backends\__init__.py", line 360, in validate_no_atomic_block
"This is forbidden when an 'atomic' block is active.")
TransactionManagementError: This is forbidden when an 'atomic' block is active.
This error led me to think that something was trying to manage the transaction and/or open a new transaction while the current atomic block was active.
I took the liberty of diving into the code, and changing update_search_field() in SearchManagerMixIn fixed this issue. These are the changes I introduced:
class SearchManagerMixIn(object):
    def update_search_field(self, pk=None, config=None, using=None):
        '''
        Update the search_field of one instance, or a list of instances, or
        all instances in the table (pk is one key, a list of keys or none).
        If there is no search_field, this function does nothing.
        '''
        if not self.search_field:
            return
        if not config:
            config = self.config
        if using is None:
            using = self.db
        connection = connections[using]
        qn = connection.ops.quote_name
        where_sql = ''
        params = []
        if pk is not None:
            if isinstance(pk, (list, tuple)):
                params = pk
            else:
                params = [pk]
            where_sql = "WHERE %s IN (%s)" % (
                qn(self.model._meta.pk.column),
                ','.join(repeat("%s", len(params)))
            )
        search_vector = self._get_search_vector(config, using)
        sql = "UPDATE %s SET %s = %s %s;" % (
            qn(self.model._meta.db_table),
            qn(self.search_field),
            search_vector,
            where_sql
        )
        # if not transaction.is_managed(using=using):
        #     transaction.enter_transaction_management(using=using)
        #     forced_managed = True
        # else:
        #     forced_managed = False
        cursor = connection.cursor()
        cursor.execute(sql, params)
        # try:
        #     if forced_managed:
        #         transaction.commit(using=using)
        #     else:
        #         transaction.commit_unless_managed(using=using)
        # finally:
        #     if forced_managed:
        #         transaction.leave_transaction_management(using=using)
I am not an expert on transaction management, but doing this made it all work in the Django admin. I also checked, and the search_field is indeed being updated as it should.
I didn't send you a pull request with this, as I am not 100% sure this is the proper way to solve it.
What do you think?
I have a Product model with a foreign key to a User model. I use the SearchManager in Product, but I need to change the pg_catalog configuration based on the User's language.
Do you have any tips about the best way to do that?
Thank you.
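One way to approach this is to keep a small mapping from language codes to text search configurations and look it up per user. This is only a sketch: the mapping, the function name and the fallback to pg_catalog.simple are assumptions, and the configurations listed must exist in your PostgreSQL installation.

```python
# Hypothetical mapping from language codes to PostgreSQL text search configs.
LANGUAGE_CONFIGS = {
    "en": "pg_catalog.english",
    "fr": "pg_catalog.french",
    "es": "pg_catalog.spanish",
}


def config_for_language(code, default="pg_catalog.simple"):
    """Return the text search config for a language code such as 'fr' or 'fr-CA'."""
    # Keep only the base language: 'fr-CA' and 'fr_CA' both become 'fr'.
    base = (code or "").replace("_", "-").split("-")[0].lower()
    return LANGUAGE_CONFIGS.get(base, default)


config = config_for_language("fr-CA")
```

The manager code quoted elsewhere in this thread shows update_search_field(pk=None, config=None, using=None), so the result could be passed there as config when saving. Whether search() accepts the same argument in your installed version is worth checking; the vector and the query should be built with the same configuration.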
Hello,
I just upgraded djorm-ext-pgfulltext to version 0.9.3 and now get errors when I use Russian characters in my search query.
In version 0.9.2 it was okay.
Raising error:
Traceback (most recent call last):
File "/site-packages/django/core/handlers/base.py", line 111, in get_response
response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "./apps/utils/views.py", line 98, in autocomplete
'type': 'Hotels'} for o in hqs])
File "//local/lib/python2.7/site-packages/django/db/models/query.py", line 141, in __iter__
self._fetch_all()
File "/home/bla/local/lib/python2.7/site-packages/django/db/models/query.py", line 966, in _fetch_all
self._result_cache = list(self.iterator())
File "/home/bla/local/lib/python2.7/site-packages/django/db/models/query.py", line 265, in iterator
for row in compiler.results_iter():
File "/home/bla/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 700, in results_iter
for rows in self.execute_sql(MULTI):
File "/home/bla/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 775, in execute_sql
sql, params = self.as_sql()
File "/home/bla/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 109, in as_sql
where, w_params = self.compile(self.query.where)
File "/home/bla/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 80, in compile
return node.as_sql(self, self.connection)
File "/home/bla/local/lib/python2.7/site-packages/django/db/models/sql/where.py", line 106, in as_sql
sql, params = qn.compile(child)
File "/home/bla/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 80, in compile
return node.as_sql(self, self.connection)
File "/home/bla/local/lib/python2.7/site-packages/djorm_pgfulltext/fields.py", line 87, in as_sql
rest = (" & ".join(self.transform.__call__(rhs_params)),)
File "/home/bla/local/lib/python2.7/site-packages/djorm_pgfulltext/fields.py", line 139, in transform
return startswith(*args)
File "/home/bla/local/lib/python2.7/site-packages/djorm_pgfulltext/fields.py", line 61, in startswith
return [x + ":*" for x in quotes(wordlist)]
File "/home/bla/local/lib/python2.7/site-packages/djorm_pgfulltext/fields.py", line 58, in quotes
return ["%s" % adapt(x.replace("\\", "")) for x in wordlist]
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 0-7: ordinal not in range(256)
If your model has:
objects = SearchManager(fields=None, search_field=None)
so you can do searches with
table.objects.search('text to search', fields=('field1', 'field2'))
then the djorm_pgfulltext/models.py will throw an exception from within SearchQuerySet's search() method at line 285:
full_search_field = "%s.%s" % (
    qn(self.model._meta.db_table),
    qn(self.manager.search_field)
)
because self.manager.search_field is None, so qn throws the exception.
The fix for this is simple: move that block of code down a few lines to where it is actually used, after the check that search_field is set:
# if fields is passed, obtain a vector expression with
# these fields. In other case, intent use of search_field if
# exists.
if fields:
    search_vector = self.manager._get_search_vector(config, using, fields=fields)
else:
    if not self.manager.search_field:
        raise ValueError("search_field is not specified")
    full_search_field = "%s.%s" % (
        qn(self.model._meta.db_table),
        qn(self.manager.search_field)
    )
    search_vector = full_search_field
We will probably fork the code locally for this fix, but will undo our fork once it's fixed in this or some other fashion.
Thank you for writing this package, we really appreciate it,
-herb
The following code throws an error:
TypeError: <lambda>() got an unexpected keyword argument 'config'
@receiver(post_save, sender=Ad)
def update_search(sender, instance, *args, **kwargs):
    mapping = {
        'en': 'english',
        'fr': 'french',
    }
    config = mapping[instance.language]
    instance.update_search_field(config=config)
Rightly so, because the definition in contribute_to_class looks like:
# Add 'update_search_field' instance method, that calls manager's update_search_field.
if not getattr(cls, 'update_search_field', None):
    _update_search_field = lambda x: x._fts_manager.update_search_field(pk=x.pk)
    setattr(cls, 'update_search_field', _update_search_field)
The lambda defined in contribute_to_class should accept more parameters so that the config can be passed in.
I am going to try to access the manager directly, but it would be nice to be able to wire up my own post-save handler that will change the dictionary.
Thanks!
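A minimal sketch of the kind of change being asked for: replace the one-argument lambda with a function that forwards extra keyword arguments to the manager. FakeManager and the Ad class below are stand-ins written for this example so it runs without Django; only the forwarding function is the point. Tracebacks elsewhere in this thread show a version of the package calling the manager with pk, using and config, which suggests a similar change was made later.

```python
class FakeManager(object):
    """Stand-in for the search manager; it only records how it was called."""

    def __init__(self):
        self.calls = []

    def update_search_field(self, pk=None, config=None, using=None):
        self.calls.append({"pk": pk, "config": config, "using": using})


def _update_search_field(instance, **kwargs):
    # Forward config/using (or any future keyword) instead of dropping them.
    return instance._fts_manager.update_search_field(pk=instance.pk, **kwargs)


class Ad(object):
    """Stand-in model: the function above is attached as an instance method."""
    _fts_manager = FakeManager()
    update_search_field = _update_search_field

    def __init__(self, pk):
        self.pk = pk


ad = Ad(pk=7)
ad.update_search_field(config="french")
```

Calling ad.update_search_field() with no arguments still works, so existing callers would not need to change.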
It is said "The fields parameter is optional. If a list of tuples, you can specify the ranking of each field, if it is None, it gets 'A' as the default."
I've tried:
objects = SearchManager(
    ...
    fields = (('title', 'A'), ('description', 'B'), ('content', 'C')),
    ...
)
It fails with a ValueError, because _parse_fields builds the set of model fields as (name, None) pairs, so the difference between my fields and the model fields is always equal to fields.
How do I specify rankings?
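The comparison problem described above goes away if field names and weights are separated before validating. The sketch below is not the package's _parse_fields; the function names are hypothetical, and the default weight 'A' is taken from the documentation sentence quoted in this post.

```python
VALID_WEIGHTS = ("A", "B", "C", "D")


def normalize_fields(fields, default_weight="A"):
    """Accept 'name' strings and ('name', weight) pairs; return (name, weight) pairs."""
    normalized = []
    for item in fields:
        if isinstance(item, str):
            name, weight = item, default_weight
        else:
            name, weight = item
            weight = weight or default_weight
        if weight not in VALID_WEIGHTS:
            raise ValueError("Invalid weight %r for field %r" % (weight, name))
        normalized.append((name, weight))
    return normalized


def validate_fields(fields, model_field_names):
    """Compare field names only, so weights never affect the membership check."""
    normalized = normalize_fields(fields)
    unknown = {name for name, _ in normalized} - set(model_field_names)
    if unknown:
        raise ValueError("Unknown fields: %s" % ", ".join(sorted(unknown)))
    return normalized


parsed = validate_fields(
    (("title", "A"), ("description", "B"), "content"),
    model_field_names=["id", "title", "description", "content"],
)
```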
This is how the interesting part of traceback looks like:
File "C:\Users\dotz\Desktop\django-bpp\bpp\models\__init__.py", line 5, in <module>
from bpp.models.struktura import *
File "C:\Users\dotz\Desktop\django-bpp\bpp\models\struktura.py", line 12, in <module>
from djorm_pgfulltext.models import SearchManager
File "C:\Python27\lib\site-packages\djorm_pgfulltext\models.py", line 6, in <module>
from django.contrib.gis.db.models import GeoManager
File "C:\Python27\lib\site-packages\django\contrib\gis\db\models\__init__.py", line 8, in <module>
from django.contrib.gis.db.models.manager import GeoManager
File "C:\Python27\lib\site-packages\django\contrib\gis\db\models\manager.py", line 2, in <module>
from django.contrib.gis.db.models.query import GeoQuerySet
File "C:\Python27\lib\site-packages\django\contrib\gis\db\models\query.py", line 6, in <module>
from django.contrib.gis.db.models.fields import get_srid_info, PointField, LineStringField
File "C:\Python27\lib\site-packages\django\contrib\gis\db\models\fields.py", line 4, in <module>
from django.contrib.gis import forms
File "C:\Python27\lib\site-packages\django\contrib\gis\forms\__init__.py", line 2, in <module>
from .fields import (GeometryField, GeometryCollectionField, PointField,
File "C:\Python27\lib\site-packages\django\contrib\gis\forms\fields.py", line 11, in <module>
from django.contrib.gis.geos import GEOSException, GEOSGeometry, fromstr
ImportError: cannot import name GEOSException
I suggest the import is moved somewhere else so that projects that don't use GIS (like my own) won't pull this.
At least on Postgres 9.2, the VectorField is created with a btree index, which can't be (as far as I can tell) used for optimizing full text search queries.
It doesn't look like this can be addressed at the Django level (their function for creating field indexes looks pretty hard coded), so maybe it would be good to have a method like SearchManager.create_search_field_index()? And bonus points for something sensible vis-à-vis South integration?
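As a sketch of what such a method could generate, the helper below builds the CREATE INDEX statement for a GIN (or GiST) index on the vector column; these are the index types PostgreSQL recommends for tsvector columns. The function names and the index naming scheme are hypothetical. The suffix in the index name is there so it does not collide with the btree index Django already created under the default name.

```python
def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def search_index_sql(table, column="search_index", method="gin"):
    """Return a CREATE INDEX statement for a full-text index on `column`."""
    if method not in ("gin", "gist"):
        raise ValueError("method must be 'gin' or 'gist'")
    index_name = "%s_%s_%s" % (table, column, method)
    return "CREATE INDEX %s ON %s USING %s (%s);" % (
        quote_ident(index_name),
        quote_ident(table),
        method,
        quote_ident(column),
    )


statement = search_index_sql("publicaciones_publicacion")
```

The statement could be run once from a migration or from psql; the unused btree index can be dropped separately.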
It would be very nice to be able to still use a "fake" textsearch as fallback for cases where your database is not postgresql. This would be useful during tests.
I have a package that runs some unittests on a Django model with a VectorField. Since Django runs its unittests with an in-memory Sqlite3 database that doesn't support VectorField, it throws the error:
File "/usr/local/myproject/.env/local/lib/python2.7/site-packages/django/db/models/base.py", line 546, in save
force_update=force_update, update_fields=update_fields)
File "/usr/local/myproject/.env/local/lib/python2.7/site-packages/django/db/models/base.py", line 664, in save_base
update_fields=update_fields, raw=raw, using=using)
File "/usr/local/myproject/.env/local/lib/python2.7/site-packages/django/dispatch/dispatcher.py", line 170, in send
response = receiver(signal=self, sender=sender, **named)
File "/usr/local/myproject/.env/local/lib/python2.7/site-packages/djorm_pgfulltext/models.py", line 47, in auto_update_search_field_handler
instance.update_search_field()
File "/usr/local/myproject/.env/local/lib/python2.7/site-packages/djorm_pgfulltext/models.py", line 111, in update_search_field
self._fts_manager.update_search_field(pk=self.pk, using=using, config=config)
File "/usr/local/myproject/.env/local/lib/python2.7/site-packages/djorm_pgfulltext/models.py", line 169, in update_search_field
cursor.execute(sql, params)
File "/usr/local/myproject/.env/local/lib/python2.7/site-packages/django/db/backends/sqlite3/base.py", line 366, in execute
six.reraise(utils.DatabaseError, utils.DatabaseError(*tuple(e.args)), sys.exc_info()[2])
File "/usr/local/myproject/.env/local/lib/python2.7/site-packages/django/db/backends/sqlite3/base.py", line 362, in execute
return Database.Cursor.execute(self, query, params)
DatabaseError: no such function: to_tsvector
How would I work around this, just so my unittests will run? I'm not testing the full text search. I just need it to not break everything in SQLite. Is there some way to make the field revert to a simple CharField just for SQLite?
Is it possible to use the full-text index with the Django double-underscore syntax?
If you have a ParentModel which has a FK to the Page model, and you want to search for all ParentModels which have a Page that contains a word, it would be great if you could use this syntax:
ParentModel.objects.filter(page__search_index="foo")
I guess other people have this question, too.
Please update the docs.
I installed 0.9.3 following the instructions in the README in an app using Django 1.5 with a PostgreSQL 9.3 backend.
However, when I ran ./manage update_search_field myapp mymodel it ran without error, but nothing was populated into the new search_index column on mymodel.
One possible problem may be that mymodel exists on a non-default database, and the command update_search_field doesn't support a --database option like many commands do. However, even when I patch the command to use the correct database connection, still nothing populates.
This is the SQL being run by the command:
UPDATE "myapp_mymodel" SET "search_index" = setweight(to_tsvector('pg_catalog.english', coalesce("myapp_mymodel"."account_number", '')), 'D');
If I run:
SELECT account_number FROM myapp_mymodel;
I see a non-blank value for every record, but when I run:
SELECT search_index FROM myapp_mymodel;
it returns nothing but blank values.
If I manually run that UPDATE statement, it populates correctly, so I'm assuming I didn't properly patch the code to use the correct connection.
What am I doing wrong?
Please help me use it for the Vietnamese language.
Thank you very much!
The reason I was getting zero results was because existing database records have the field search_index empty. I still don't know how to index old existing records.
Page.objects.search("page", raw=True)
[]
Page.objects.search("")
[<Page: Page: Home page>, <Page: Page: About>, <Page: Page: Navigation>]
ie.:
class Product(models.Model):
    name = models.CharField(u'Name', max_length=100, null=False)
    slug = models.SlugField(u'Slug', max_length=100, null=False, blank=True)
    description = models.TextField(u'Description', null=True)
    search_index = VectorField()
    objects = SearchManager(
        fields=('name', 'description')
    )
class StoreProduct(models.Model):
    price = models.DecimalField(u'Price', max_digits=10, decimal_places=2)
    office = models.ForeignKey('stores.Office')
    product = models.ForeignKey(Product)
    in_stock = models.BooleanField(u'in stock', default=True)
One office can have a lot of products, and in this case I have to find products in a store. In an ideal world it would be:
StoreProduct.objects.search('test', fields=['product__name', 'product__description']).filter(in_stock=True)
Is there any way to do that?
I am trying to use a foreign key in the 'fields=()' list, but the search does not find results through it as it should. Any suggestions? The field of the model is called irc_name, and I have tried 'irc_name' and 'irc_name__irc_name', which is also the name of the field on the referenced FK object.
https://github.com/niwibe/djorm-ext-pgfulltext/blob/master/djorm_pgfulltext/models.py#L14
maybe, there should be
instance.update_search_field(pk=instance.pk)
?
Hi,
could you please inform where exactly in the code unaccent is being used?
I've installed the unaccent extension in my DB, but it seems search queries are still considering accents.
This warning results when using djorm-ext-pgfulltext version 0.10 with Django version 1.7:
[my-virtualenv-location]/lib/python2.7/site-packages/djorm_pgfulltext/models.py:323:
RemovedInDjango18Warning:
`SearchManagerMixIn.get_query_set` method should be renamed `get_queryset`.
which basically simply requires renaming that function.
P.S. The warning actually consists of just one line but I split it up for readability.
Is it possible to have multiple search managers with a model?
By this example it does not seem to work.
search_manger_en = SearchManager(
fields=('title', 'body'),
config='pg_catalog.english',
search_field='search_index_en',
auto_update_search_field=True
)
search_manger_fr = SearchManager(
fields=('title_fr', 'body_fr'),
config='pg_catalog.english',
search_field='search_index_fr',
auto_update_search_field=True
)
This tool appears to be intended only for the back end. If so, is it fully compatible with a different full text search for end users?
Default search_index created using btree:
CREATE INDEX tablename_a71a185f
ON "tablename"
USING btree
(search_index);
Shouldn't it use gin or gist instead? Is it possible to configure the vector field to use them, or will only a later migration that alters the index work?
P.S. Maybe the issue tracker is not the perfect place for such discussions, but it might be useful for the community if I create a pull request extending the readme with the answer.
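For reference, PostgreSQL's documentation recommends GIN or GiST indexes for tsvector columns. The following is a minimal sketch, assuming the index is created by hand with raw SQL rather than by the field itself. The helper names `quote_ident` and `gin_index_sql` and the `_gin` suffix are illustrative only and are not part of djorm-ext-pgfulltext.

```python
def quote_ident(name):
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def gin_index_sql(table, column, index_name=None):
    """Build a CREATE INDEX statement that uses GIN for a tsvector column.

    The default index name is a hypothetical convention, not one the
    library or Django generates.
    """
    index_name = index_name or "%s_%s_gin" % (table, column)
    return "CREATE INDEX %s ON %s USING gin (%s);" % (
        quote_ident(index_name),
        quote_ident(table),
        quote_ident(column),
    )


# Table and column names taken from the btree example above.
print(gin_index_sql("tablename", "search_index"))
```

On Django 1.7 or later, a statement like this could be run from a migration through `migrations.RunSQL`. Whether the existing btree index should be dropped first depends on the project.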
Why are 0.9.2 and 0.9.3 later than 0.10 and incorporate more stuff?
Any chance latest can be released as a version on PyPI?
Hi. The in-source comments say:
"You can also give a 'search_field', a VectorField into where the values of the searched
fields are copied and normalized. If you give it, the searches will be made on this
field; if not, they will be made directly in the searched fields."
If I understand the concept correctly, search_field is an auxiliary vector column with all searchable values copied in and kept up to date with update_search_field.
I cannot keep them up to date this way because someone else is writing to the db. My Django is read-only.
But the SearchQuerySet does simply this:
full_search_field = "%s.%s" % (
qn(self.model._meta.db_table),
qn(self.manager.search_field)
)
without ensuring search_field exists, and I am seeing a traceback.
Can you confirm my diagnosis is correct? I may be able to contribute, as I need this kind of function.
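The fallback described in the quoted comment could look roughly like the sketch below: use the stored vector column when a search_field is configured, otherwise compute a tsvector from the searched columns at query time. This is an illustration only and not the library's code. The function name `search_vector_expression` and its parameters are hypothetical, and the config value is assumed to come from trusted settings since it is interpolated directly.

```python
def search_vector_expression(table, fields, search_field=None,
                             config="pg_catalog.english"):
    """Return the SQL expression a full-text query would match against.

    Hypothetical helper: with a search_field, the stored tsvector column is
    used; without one, to_tsvector() is applied to the text columns directly.
    """
    if search_field:
        return '"%s"."%s"' % (table, search_field)
    # coalesce() keeps a NULL column from turning the whole document NULL.
    parts = ["coalesce(\"%s\".\"%s\", '')" % (table, name) for name in fields]
    return "to_tsvector('%s', %s)" % (config, " || ' ' || ".join(parts))


print(search_vector_expression("app_doc", ["title", "body"]))
print(search_vector_expression("app_doc", ["title", "body"],
                               search_field="search_index"))
```

Computing the vector at query time avoids any writes, which suits a read-only setup, but it cannot use an index on a stored column and may be slow on large tables.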
I believe I may have found an issue when using multiple databases. When calling search() I was getting: ProgrammingError: column does not exist. I printed the query from the queryset and ran the SQL directly using psql, where it worked as expected.
I tried using a django cursor in shell and forgot to select the right cursor for my database and got the same error so I went poking around the module to see why it wouldn't be using the proper database for the model.
After changing:
def get_query_set(self):
return SearchQuerySet(model=self.model, using=self._db)
to
def get_query_set(self):
return SearchQuerySet(model=self.model, using=self.db)
it seems to be working now, though I'm not sure this is the correct solution. Are there any side effects to this change I should be worrying about?
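As background, on a Django manager `_db` only holds a database alias that was set explicitly, while the `db` property falls back to the database router when `_db` is empty. The sketch below mimics that behaviour with stand-in classes. `FakeRouter` and `SketchManager` are hypothetical names and this is a simplification, not Django's actual implementation.

```python
class FakeRouter(object):
    """Stand-in for a database router that sends reads to one alias."""

    def db_for_read(self, model, **hints):
        return "search_db"


class SketchManager(object):
    """Simplified model of how a manager resolves its database alias."""

    def __init__(self, model, router, using=None):
        self.model = model
        self._router = router
        self._db = using  # only set when an alias was chosen explicitly

    @property
    def db(self):
        # An explicit alias wins; otherwise the router decides.
        return self._db or self._router.db_for_read(self.model)


manager = SketchManager("Document", FakeRouter())
print(manager._db)  # None: nothing was chosen explicitly
print(manager.db)   # search_db: resolved through the router
```

Under this model, passing `self._db` hands the queryset `None` whenever routing is left to a router, which would be consistent with the query reaching the wrong database in the report above.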
Hi, I am a Postgres newbie. Is there a mailing list or IRC channel to discuss this extension?
Does this extension add triggers to Postgres to handle updates, or is that handled elsewhere within Python code?
I have a model with "title", "post", and "tags_array", and I created the ts_vector column using South, but I messed up my migrations. I was wondering if it is possible to know which fields went into a ts_vector, i.e. is there a way using the psql command line to tell whether the ts_vector has only "title" and "post" or whether it also includes "tags_array"?
Sorry to send these questions in as issues and please let me know where I can ask them instead.
Error when running against django 1.9:
python manage.py update_search_field APP_NAME
Error
Traceback (most recent call last):
File "manage.py", line 59, in <module>
execute_from_command_line(sys.argv)
File "/python3.4/site-packages/django/core/management/__init__.py", line 353, in execute_from_command_line
utility.execute()
File "/python3.4/site-packages/django/core/management/__init__.py", line 345, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/python3.4/site-packages/django/core/management/base.py", line 348, in run_from_argv
self.execute(*args, **cmd_options)
File "/python3.4/site-packages/django/core/management/base.py", line 399, in execute
output = self.handle(*args, **options)
File "/python3.4/site-packages/djorm_pgfulltext/management/commands/update_search_field.py", line 21, in handle
app_module = models.get_app(app)
AttributeError: 'module' object has no attribute 'get_app'
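For context, `models.get_app` was deprecated in Django 1.7 and removed in 1.9, and the app registry in `django.apps` is its replacement. The sketch below shows one way a command might look up an app's models through that registry. The helper name `models_for_app` and its `registry` parameter are hypothetical and are not the fix applied in the library.

```python
def models_for_app(app_label, registry=None):
    """Return the model classes of an installed app via the app registry.

    `registry` is injectable so the lookup can be exercised without a
    configured Django project; by default Django's own registry is used.
    """
    if registry is None:
        # django.apps.apps is available from Django 1.7 onward.
        from django.apps import apps as registry
    return list(registry.get_app_config(app_label).get_models())
```

Inside the management command, the result of `models_for_app(app)` could then be iterated in place of the models previously obtained from `models.get_app(app)`.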
@linuxlewis, are you still actively maintaining this repo? If not, I would recommend moving it to Jazzband so the community can continue to help support the project.
This is definitely related to #18
In testing with a bunch of sample documents and converting their contents with to_tsvector() into a VectorField, I saw the following error:
OperationalError: index row size 2872 exceeds maximum 2712 for index "blabla_tsv"
HINT: Values larger than 1/3 of a buffer page cannot be indexed.
Consider a function index of an MD5 hash of the value, or use full text indexing.
db_index=True in the VectorField class results in Django trying to create a "normal" index on the tsvector, instead of GIN or GIST.
Suggestions for work-arounds are most welcome!