daevaorn / djapian
High level Xapian integration for Django
License: Other
What steps will reproduce the problem?
1. create a new application
2. Add index to a model
3. in a shell enter ModelName.indexer.update()
What is the expected output? What do you see instead?
It should run the index.
What version of the product are you using? On what operating system?
Django SVN 1.0 on Ubuntu hardy.
Djapian SVN.
Please provide any additional information below.
{{{
In [1]: from saddle.models import Saddle
In [2]: Saddle.indexer
Out[2]: <djapian.indexer.Indexer object at 0x10c7cd0>
In [3]: Saddle.indexer.update()
---------------------------------------------------------------------------
<type 'exceptions.AttributeError'> Traceback (most recent call last)
/home/cwdusedsaddle/cwdusedsaddle/<ipython console> in <module>()
/usr/lib/python2.5/site-packages/djapian/indexer.py in update(self, documents)
222 # Open Xapian Database
223 database = xapian.WritableDatabase(
--> 224 self.get_full_database_path(),
225 xapian.DB_CREATE_OR_OPEN,
226 )
/usr/lib/python2.5/site-packages/djapian/indexer.py in get_full_database_path(self)
408
409 def get_full_database_path(self):
--> 410 path = os.path.join(settings.DJAPIAN_DATABASE_PATH, self.path)
411 try:
412 os.makedirs(path)
/usr/lib/python2.5/site-packages/django/conf/__init__.py in __getattr__(self, name)
30 # Used to implement dir(obj), for example.
31 return self._target.get_all_members()
---> 32 return getattr(self._target, name)
33
34 def __setattr__(self, name, value):
<type 'exceptions.AttributeError'>: 'Settings' object has no attribute 'DJAPIAN_DATABASE_PATH'
}}}
Original issue reported on code.google.com by [email protected]
on 11 Nov 2008 at 5:59
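The traceback above fails on the lookup of `settings.DJAPIAN_DATABASE_PATH`, so defining that name in the project's settings module should avoid this particular AttributeError. A minimal sketch of such a fragment follows; the directory name is only a placeholder chosen for illustration.

```python
# settings.py (fragment)
import os

# Directory where Djapian keeps its Xapian databases.
# "djapian_spaces" is a placeholder name; any directory the
# indexing process can write to should do.
DJAPIAN_DATABASE_PATH = os.path.abspath("djapian_spaces")
```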
I get this error when getting an item from a XapianResultSet:
NameError: global name 'hit' is not defined
On line 243 of djapian/backend/xap.py,
return XapianHit(self._hits[pos], self._indexer, djapian_import(hit['model']))
should be:
return XapianHit(self._hits[pos], self._indexer,
djapian_import(self._hits[pos]['model']))
Original issue reported on code.google.com by [email protected]
on 18 Feb 2008 at 6:32
Allow the user to write only:
{{{
Indexer(model=MyModel)
}}}
to get Djapian working.
Original issue reported on code.google.com by daevaorn
on 12 Nov 2008 at 10:10
What steps will reproduce the problem?
1. Create a new Postgres database and configure settings.py in the test dir
2. Configure PYTHONPATH and DJANGO_SETTINGS_MODULE
3. $ run_djapian.py
4. $ python manage.py shell
5. In the Python shell, type:
In [1]: from test_djapian.test.models import Test
In [2]: for x in xrange(10):
   ...:     Test(title='Title test number %d'%x, content='content of test number %d'%x).save()
6. run_djapian crashes with this exception:
AttributeError: 'WritableDatabase' object has no attribute 'add_spelling'
Traceback (most recent call last):
  File "/home/alep/pythonenv/python/bin/run_djapian.py", line 91, in <module>
    main()
  File "/home/alep/pythonenv/python/bin/run_djapian.py", line 88, in main
    update_changes(options.verbose, options.timeout)
  File "/home/alep/pythonenv/python/bin/run_djapian.py", line 38, in update_changes
    index.update([src_obj])
  File "/home/alep/pythonenv/python/lib/python2.5/site-packages/djapian/backend/xap.py", line 44, in update
    self._process_attr_fields(row, doc)
  File "/home/alep/pythonenv/python/lib/python2.5/site-packages/djapian/backend/xap.py", line 150, in _process_attr_fields
    self.get_weight('.'.join((self.model._meta.object_name,name)), True) # Weight
TypeError: in method 'Document_add_posting', argument 2 of type 'std::string const &'
for x in xrange(10):
    Test(title='Title test number %d'%x, content='content of test number %d'%x).save()
What is the expected output? What do you see instead?
No output at all; the object should be saved in the database.
What version of the product are you using? On what operating system?
postgres 8.2
djapian 1.0.1
django 0.96.1
latest xapian
debian linux
Please provide any additional information below.
I was running virtualenv for python, but that should not create any
problems at all; there is no problem when running the django server.
Original issue reported on code.google.com by [email protected]
on 3 Mar 2008 at 5:19
The current Djapian was built on top of r5820, but the latest revision is
7k+, so we need to update.
We also need to keep another version compatible with at least Django
0.96.1.
Original issue reported on code.google.com by [email protected]
on 16 Feb 2008 at 6:54
Inspired by
http://www.djangosnippets.org/snippets/231/
Original issue reported on code.google.com by daevaorn
on 12 Nov 2008 at 9:03
Allowing multiple prefixes is good for internationalized sites, where
an American user searches for "title:stiod" and a Brazilian user searches
for "titulo".
We have 2 options:
* Allow i18n in the prefix (search-time only)
* Allow the prefix to be set as a tuple of prefixes, indexing terms only
under the first one and mapping the remaining ones to it
Original issue reported on code.google.com by [email protected]
on 12 Sep 2007 at 6:52
At the moment, with djapian (as well as postgres full text search), all I
can do is search for objects containing the search string.
What I am looking to achieve is to get a list of the 10 most "similar"
items based on a given piece of content.
With djapian, two posts would match only if they were exactly the same as
each other.
Hint: for WordPress there is http://rmarsh.com/plugins/similar-posts/,
which does exactly what I want to achieve.
Original issue reported on code.google.com by [email protected]
on 21 Dec 2008 at 4:06
Allow run_djapian to receive another parameter that sets
DJANGO_SETTINGS_MODULE automatically.
Original issue reported on code.google.com by [email protected]
on 29 Jan 2008 at 8:19
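A rough sketch of the parameter requested above, assuming a command-line flag that exports `DJANGO_SETTINGS_MODULE` before Django is imported. The flag name `--settings` and the function name are hypothetical and are not taken from run_djapian itself.

```python
import argparse
import os


def configure_settings(argv):
    """Parse the command line and export the settings module, if given."""
    parser = argparse.ArgumentParser(prog="run_djapian")
    # "--settings" is a hypothetical flag name, not an existing run_djapian option.
    parser.add_argument("--settings", default=None,
                        help="dotted path of the Django settings module")
    args = parser.parse_args(argv)
    if args.settings:
        # Must happen before anything imports django.conf.settings.
        os.environ["DJANGO_SETTINGS_MODULE"] = args.settings
    return args


args = configure_settings(["--settings", "mysite.settings"])
```

When the flag is omitted, the environment is left untouched, so an already exported `DJANGO_SETTINGS_MODULE` keeps working.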
What steps will reproduce the problem?
1. Removed the 1.0 code
2. Updated to the new version
3. Ran: python manage.py index --rebuild
{{{
Traceback (most recent call last):
  File "manage.py", line 11, in <module>
    execute_manager(settings)
  File "/home/aaloy/.virtualenvs/trunk/lib/python2.5/site-packages/django/core/management/__init__.py", line 350, in execute_manager
    utility.execute()
  File "/home/aaloy/.virtualenvs/trunk/lib/python2.5/site-packages/django/core/management/__init__.py", line 295, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/home/aaloy/.virtualenvs/trunk/lib/python2.5/site-packages/django/core/management/base.py", line 195, in run_from_argv
    self.execute(*args, **options.__dict__)
  File "/home/aaloy/.virtualenvs/trunk/lib/python2.5/site-packages/django/core/management/base.py", line 222, in execute
    output = self.handle(*args, **options)
  File "/home/aaloy/workspace/trespams/djapian/management/commands/index.py", line 94, in handle
    rebuild(verbose)
  File "/home/aaloy/workspace/trespams/djapian/management/commands/index.py", line 66, in rebuild
    indexer.update(after_index=after_index)
  File "/home/aaloy/workspace/trespams/djapian/indexer.py", line 207, in update
    generator.index_text(smart_unicode(value), field.weight, prefix)
TypeError: in method 'TermGenerator_index_text', argument 3 of type 'Xapian::termcount'
}}}
Please provide any additional information below.
The indexshell stats command also fails.
Original issue reported on code.google.com by [email protected]
on 8 Mar 2009 at 1:00
Attached is a patch that attempts to unite them all using PEP8
(http://www.python.org/dev/peps/pep-0008/) as a template.
Original issue reported on code.google.com by [email protected]
on 19 Sep 2008 at 11:51
Attachments:
There is a deprecated maxlength argument in Change model definition.
{{{
Index: models.py
===================================================================
--- models.py (revision 80)
+++ models.py (working copy)
@@ -3,7 +3,7 @@
class Change(models.Model):
# Model to be used, e.g. myproject.myapp.models.Model1
- model = models.CharField(maxlength=100, db_index=True)
+ model = models.CharField(max_length=100, db_index=True)
# The id of register, as in myproject.myapp.models.Model1
did = models.PositiveIntegerField()
# Define if this object was deleted of database
}}}
Original issue reported on code.google.com by [email protected]
on 15 Jul 2008 at 9:49
What steps will reproduce the problem?
1. launch python manage.py runserver in a djapian enabled project
What is the expected output? What do you see instead?
The output is:
"optparse.OptionConflictError: option --verbosity: conflicting option
string(s): --verbosity"
It seems the "verbosity" is already taken...
What version of the product are you using? On what operating system?
SVN revision 106
Original issue reported on code.google.com by [email protected]
on 20 Oct 2008 at 12:55
Would be nice to expose the location of the match (field + index in string) for
building result snippets
Original issue reported on code.google.com by [email protected]
on 19 Feb 2008 at 12:02
= Problem =
Currently you must run run_djapian under nohup to put it in the background
= Solution =
Add 2 options:
--fork or -f to fork at the beginning and run in the background
--run-once or -o to run only once and exit
Note that the run-once option must not allow 2 instances at the same time,
because a database can be opened for writing by only one process at a time.
Original issue reported on code.google.com by [email protected]
on 29 Jan 2008 at 6:32
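One way to keep two run-once instances from overlapping, as the note above requires, is an exclusive lock file. The sketch below is an illustration only: the helper name and the lock path are hypothetical, and a lock file left behind by a crashed process would have to be removed by hand.

```python
import contextlib
import os


@contextlib.contextmanager
def single_instance(lock_path):
    """Hold an exclusive lock file for the duration of the block."""
    # O_EXCL makes the call fail with FileExistsError if the file already exists,
    # which is how a second instance notices that one is already running.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        os.write(fd, str(os.getpid()).encode("ascii"))
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)
```

A run-once entry point could wrap its indexing pass in `with single_instance(path):` and exit quietly when `FileExistsError` is raised.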
Make ResultSet retrieve results from Xapian lazily.
Original issue reported on code.google.com by daevaorn
on 12 Nov 2008 at 9:02
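A minimal sketch of what lazy retrieval could look like: the search runs only when the results are first needed and is cached afterwards. `LazyResultSet` and its `fetch` callable are hypothetical names used for illustration; this is not Djapian's ResultSet.

```python
class LazyResultSet:
    """Runs the search on first access, then reuses the cached hits."""

    def __init__(self, fetch):
        self._fetch = fetch    # hypothetical callable returning an iterable of hits
        self._cache = None

    def _results(self):
        if self._cache is None:
            self._cache = list(self._fetch())
        return self._cache

    def __len__(self):
        return len(self._results())

    def __iter__(self):
        return iter(self._results())

    def __getitem__(self, index):
        return self._results()[index]
```

Constructing the object costs nothing; only `len()`, iteration or indexing triggers the backend call.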
[deleted issue]
Fields like IntegerField, Date[Time]Field and BooleanField need care
with the data type, e.g.:
IntegerFields need to be indexed as:
'00000000000001'
'00000000000002'
..
'00000010023245'
DateTimeFields need to be indexed with a default prefix ("D" is Omega
compatible).
Boolean fields are commonly indexed to filter results.
Original issue reported on code.google.com by [email protected]
on 30 Aug 2007 at 2:36
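The zero-padded form shown above for IntegerFields makes string order agree with numeric order. A small sketch, assuming non-negative integers and the 14-character width used in the examples; the function name is made up for illustration.

```python
def pad_integer(value, width=14):
    """Return a fixed-width, zero-padded string for a non-negative integer."""
    if value < 0:
        raise ValueError("negative values need a different encoding")
    text = str(value)
    if len(text) > width:
        raise ValueError("value does not fit in %d digits" % width)
    return text.zfill(width)


# Padded strings sort in the same order as the numbers they encode.
numbers = [2, 10023245, 1]
padded = sorted(pad_integer(n) for n in numbers)
```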
currently it is only possible to index direct fields of a model ..
it would be nice if i could pass a function reference to XapianIndexer
instead of just field names
i have attached a small example 'patch' on what i mean .. it would allow to
use the indexer like:
def get_category_name(post):
    return post.category.name
get_category_name.name = 'category'
post_index = XapianIndexer(
    ....
    { 'subject': ('Post.subject', 20),
      'category': get_category_name,
    })
any thoughts ? - i would like to index ForeignKeys, or prepare the values
without creating an additional model (as described in IndexingManyModelsAtOnce)
Original issue reported on code.google.com by [email protected]
on 26 Apr 2008 at 11:05
Attachments:
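The idea in the request above, accepting either a field path or a function, could be resolved along these lines. This is a sketch under assumptions: `resolve_field` is a hypothetical helper, and it assumes dotted paths start with the model name, as in 'Post.subject'.

```python
from types import SimpleNamespace


def resolve_field(instance, spec):
    """Return the value to index for one field specification."""
    if callable(spec):
        # A function reference receives the instance and returns the value.
        return spec(instance)
    names = spec.split(".")
    if len(names) > 1:
        names = names[1:]          # drop the leading model name ("Post")
    value = instance
    for name in names:
        value = getattr(value, name)
        if callable(value):        # allow method paths such as "Post.get_title"
            value = value()
    return value


def get_category_name(post):
    return post.category.name


# Stand-in object; a real caller would pass a model instance.
post = SimpleNamespace(subject="Hello", category=SimpleNamespace(name="news"))
subject = resolve_field(post, "Post.subject")
category = resolve_field(post, get_category_name)
```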
What steps will reproduce the problem?
1. svn checkout http://djapian.googlecode.com/svn/trunk/ djapian-read-only
2. cd djapian-read-only/build/lib/djapian/management/
3. command subdirectory and files are missing.
What version of the product are you using? On what operating system?
Djapian revision 110
Original issue reported on code.google.com by [email protected]
on 8 Nov 2008 at 8:04
Add xapian transactions for write operations
Original issue reported on code.google.com by daevaorn
on 23 Feb 2009 at 11:52
It would be far more 'djangoish' than it is now. There is a fork which
implements such an approach to indexing. I'll try to backport it to Djapian.
Fork's page: http://webnewage.org/projects/p/django-xapian/
Original issue reported on code.google.com by [email protected]
on 15 Jul 2008 at 8:56
Method called 'add_weigth' instead of 'add_weight'
Original issue reported on code.google.com by [email protected]
on 15 Jul 2008 at 8:45
What steps will reproduce the problem?
1. python manage.py index --rebuild
What is the expected output? What do you see instead?
Traceback (most recent call last):
  File "manage.py", line 11, in <module>
    execute_manager(settings)
  File "/usr/lib/python2.5/site-packages/django/core/management/__init__.py", line 340, in execute_manager
    utility.execute()
  File "/usr/lib/python2.5/site-packages/django/core/management/__init__.py", line 295, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/lib/python2.5/site-packages/django/core/management/base.py", line 192, in run_from_argv
    self.execute(*args, **options.__dict__)
  File "/usr/lib/python2.5/site-packages/django/core/management/base.py", line 219, in execute
    output = self.handle(*args, **options)
  File "/usr/lib/python2.5/site-packages/djapian/management/commands/index.py", line 113, in handle
    rebuild(verbosity)
  File "/usr/lib/python2.5/site-packages/djapian/management/commands/index.py", line 78, in rebuild
    utils.process_instance("add", obj)
TypeError: process_instance() takes exactly 3 arguments (2 given)
What version of the product are you using? On what operating system?
Linux Ubuntu 8.10, Djapian r113
Original issue reported on code.google.com by [email protected]
on 12 Nov 2008 at 12:33
Simply delete an entry that is indexed by Djapian, then apply the change
(from a 'delete' line in djapian_change) ... this raises the error:
  File "manage.py", line 11, in <module>
    execute_manager(settings)
  File "/usr/lib/python2.5/site-packages/django/core/management/__init__.py", line 340, in execute_manager
    utility.execute()
  File "/usr/lib/python2.5/site-packages/django/core/management/__init__.py", line 295, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/lib/python2.5/site-packages/django/core/management/base.py", line 195, in run_from_argv
    self.execute(*args, **options.__dict__)
  File "/usr/lib/python2.5/site-packages/django/core/management/base.py", line 222, in execute
    output = self.handle(*args, **options)
  File "[...]/djapian/management/commands/index.py", line 106, in handle
    update_changes(verbosity, timeout, not daemonize)
  File "[...]/djapian/management/commands/index.py", line 41, in update_changes
    hash = change.process()
  File "[...]/djapian/models.py", line 62, in process
    return utils.process_instance(self.action, self.object)
  File "[...]/djapian/utils.py", line 118, in process_instance
    hash = "%s:%s" % (ContentType.objects.get_for_model(model),
  File "/usr/lib/python2.5/site-packages/django/contrib/contenttypes/models.py", line 17, in get_for_model
    opts = model._meta
AttributeError: type object 'NoneType' has no attribute '_meta'
It seems that the instance is None ... and thus the command
ContentType.objects.get_for_model(instance.__class__) fails ...
Original issue reported on code.google.com by [email protected]
on 22 Oct 2008 at 4:09
How can Django results and Djapian results be combined?
For example:
results1 = Article.objects.filter(price__lt=10)
results2 = Article.indexer.search('red trousers')
What is the right way to get "the red trousers with price < 10", keeping the
Djapian result order?
Original issue reported on code.google.com by [email protected]
on 14 Mar 2009 at 9:04
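One possible answer to the question above is to collect the primary keys that pass the database filter and then walk the search hits in their own order, keeping only those whose key is in that set. The sketch uses a stand-in `Hit` tuple; the attribute names are assumptions, not Djapian's API.

```python
from collections import namedtuple

# Stand-in for a search hit; real hits would come from the indexer.
Hit = namedtuple("Hit", ["pk", "percent"])


def filter_hits(hits, allowed_pks):
    """Keep hits whose pk is allowed, preserving the search ranking."""
    allowed = set(allowed_pks)
    return [hit for hit in hits if hit.pk in allowed]


hits = [Hit(7, 95), Hit(3, 80), Hit(9, 60)]   # already ordered by relevance
# In a Django project the keys could come from something like
# Article.objects.filter(price__lt=10).values_list('pk', flat=True)
cheap_pks = [9, 7]
cheap_hits = filter_hits(hits, cheap_pks)
```

Only the search order matters here; the order of `cheap_pks` is ignored because it is turned into a set.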
After a search you need to call `get_object()` to access data from the
model, but it would be nicer to write `hit.object.title` instead of
`hit.get_object().title`.
To do this, just create an `object` property.
Original issue reported on code.google.com by [email protected]
on 11 Sep 2007 at 1:49
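The property suggested above can be sketched as follows. `SearchHit` and its `loader` callable are hypothetical stand-ins for the real hit class and its database lookup.

```python
class SearchHit:
    """Sketch of a hit that exposes its model instance as `hit.object`."""

    def __init__(self, loader):
        self._loader = loader      # hypothetical callable that fetches the instance
        self._instance = None
        self._loaded = False

    def get_object(self):
        # Fetch once, then reuse the cached instance.
        if not self._loaded:
            self._instance = self._loader()
            self._loaded = True
        return self._instance

    # Read-only attribute backed by get_object(), so both spellings keep working.
    object = property(get_object)
```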
This patch adds the ability to search ranges in Djapian.
Xapian can search ranges. Here's an example query:
"date:(01/05/04..01/06/07)". It returns the documents whose "date" field
is between the specified dates.
To enable range searches on a field, it must be passed to the constructor
of XapianIndexer.
Example: enable range searches on the "birthyear" field.
profile_index = XapianIndexer(
    '/tmp/db_profile/',
    Profile,
    [ 'Profile.get_username' ],
    {
        'birthyear' : 'Profile.get_birthyear',
    },
    range_attributes={ 'birthyear' : int })
Drawbacks: right now, only numerical ranges are supported.
Original issue reported on code.google.com by [email protected]
on 17 Aug 2008 at 8:28
Attachments:
What steps will reproduce the problem?
1. python manage.py test
What is the expected output?
Ran 28 tests in 24.773s
OK
What do you see instead?
FAIL: test_result_count (djapian.tests.query._QueryTest_active:True)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/jnt/django/molnet/thirdparty/djapian/tests/query.py", line 12, in test_result_count
self.assertEqual(len(self.result), count)
AssertionError: 0 != 3
----------------------------------------------------------------------
Ran 28 tests in 24.044s
FAILED (failures=1)
Original issue reported on code.google.com by [email protected]
on 9 Apr 2009 at 5:56
Make same changes and refactor Text class
Original issue reported on code.google.com by daevaorn
on 12 Nov 2008 at 9:09
Proposed API:
{{{
>>> query foobar :10
}}}
Original issue reported on code.google.com by daevaorn
on 22 Feb 2009 at 11:25
When there is no Xapian database inside DJAPIAN_DATABASE_PATH, I get an
error from Xapian saying the format of the database could not be found.
In indexer.py I had to change the line:
database = xapian.Database(self.get_full_database_path())
to
database = xapian.WritableDatabase(self.get_full_database_path(), xapian.DB_CREATE_OR_OPEN)
to auto-create the database. I'm not sure it's a bug (maybe I missed
something), but at least I'm able to use djapian now ;-)
Original issue reported on code.google.com by [email protected]
on 20 Oct 2008 at 4:59
Add a generic view that handles search form submission and returns results
Original issue reported on code.google.com by daevaorn
on 22 Feb 2009 at 10:23
What steps will reproduce the problem?
1. Start the daemon with --verbosity flag
2. Verify that it writes out that it found 0 changes
3. Change a model that is set up for indexing
4. Check mysql table and verify that there is now a row in djapian_change
5. Observe that the daemon is not picking it up. Changes.objects.all() still
returns an empty result set.
What version of the product are you using? On what operating system?
Django 1.0 on Ubuntu 7.10 with MySQL using InnoDB tables
Please provide any additional information below.
I'm aware that it might not be a problem with Djapian but a problem with my
MySQL setup, but I've really been trying to figure out what it could be and
haven't gotten anywhere, which is why I'm reporting it here.
Original issue reported on code.google.com by [email protected]
on 8 Nov 2008 at 1:44
Allow specifying the stemming language for each query, falling back to a default
Original issue reported on code.google.com by daevaorn
on 18 Feb 2009 at 6:55
launch python manage.py index
I get an error because of the following code (in index.py)
def handle(self, verbosity=False, daemonize=False, timeout=10,
           rebuild=False, *args, **options):
    if daemonize:             <<<<<<<<<
        daemonize()           <<<<<<<<<
    if rebuild:               <<<<<<<<<
        rebuild(verbosity)    <<<<<<<<<
    else:
        update_changes(verbosity, timeout, not daemonize)
As marked with <<<<<<<<<, the function's parameters have the same names as
the functions defined above, so inside handle() the names refer to the
parameter values rather than to those functions.
Original issue reported on code.google.com by [email protected]
on 20 Oct 2008 at 1:00
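The name clash reported above goes away if the helper functions and the option parameters no longer share names. The sketch below keeps the option names and renames the helpers; all function bodies are placeholders that only record the call, and the helper names are made up for illustration.

```python
calls = []


def run_daemonize():
    calls.append("daemonize")


def run_rebuild(verbosity):
    calls.append("rebuild")


def update_changes(verbosity, timeout, once):
    calls.append("update")


def handle(verbosity=False, daemonize=False, timeout=10, rebuild=False):
    # The parameters are plain flags; the helpers have distinct names,
    # so nothing is shadowed inside this function.
    if daemonize:
        run_daemonize()
    if rebuild:
        run_rebuild(verbosity)
    else:
        update_changes(verbosity, timeout, not daemonize)


handle(rebuild=True)
```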
A word like "Bleach" will be indexed as "leach" with a "B" prefix.
Words must be lowercased before indexing.
Original issue reported on code.google.com by [email protected]
on 28 Aug 2007 at 4:21
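The effect described above can be modelled in a few lines. `split_prefix` is a deliberately simplified stand-in for the convention that leading capital letters of a term are read as a prefix; it is not Xapian's actual parsing code.

```python
def split_prefix(term):
    """Split a term into (leading capitals, remainder); a simplified model."""
    i = 0
    while i < len(term) and term[i].isupper():
        i += 1
    return term[:i], term[i:]


def normalise(word):
    """Lowercase a word so that none of its letters is mistaken for a prefix."""
    return word.lower()


raw = split_prefix("Bleach")                 # capital "B" is split off
fixed = split_prefix(normalise("Bleach"))    # nothing is split off
```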
Django 1.1 already has a "verbosity" option, and a conflict exception is
raised when the "index" command is used.
Original issue reported on code.google.com by [email protected]
on 29 Jan 2009 at 12:42
Revision 8223 came with lots of changes in the signals system, and to
adapt to them we must declare a "**kwargs" argument in signal functions.
Original issue reported on code.google.com by [email protected]
on 6 Aug 2008 at 5:36
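A handler written in the style described above accepts `**kwargs`, so it keeps working when the dispatcher passes arguments it does not use. The `send` function below is a tiny stand-in for a signal dispatcher, written only to make the example runnable without Django.

```python
received = []


def on_post_save(sender, instance, **kwargs):
    # **kwargs absorbs arguments this handler ignores (e.g. "created", "signal").
    received.append((sender, instance))


def send(handler, sender, **named):
    """Stand-in dispatcher: every argument reaches the handler by keyword."""
    handler(sender=sender, **named)


send(on_post_save, sender="Entry", instance="entry #1",
     created=True, signal="post_save")
```

Without `**kwargs` in the signature, the call above would fail with a TypeError about unexpected keyword arguments.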
Implement stemming into Djapian. Make the stemming language configurable.
Original issue reported on code.google.com by [email protected]
on 6 Mar 2008 at 8:22
Associating the docid with the ID of an object is not a good idea, because
when you use tools like "copydatabase" those IDs are lost, leaving you with
"corrupted data"
Original issue reported on code.google.com by [email protected]
on 31 Oct 2007 at 6:16
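A common alternative to relying on docids is to give each document a unique identifying term and to address it by that term. The sketch below only builds such a term; the "Q" prefix follows the Omega convention for unique IDs, while the rest of the layout is an assumption made for illustration.

```python
def unique_term(app_label, model_name, pk):
    """Build a term that identifies one model instance independently of docids."""
    # "Q" is the conventional prefix for a unique ID term; the
    # "app.model:pk" layout after it is an illustrative choice.
    return "Q%s.%s:%s" % (app_label, model_name, pk)


term = unique_term("blog", "entry", 42)
```

With the Xapian bindings, such a term can be passed to `WritableDatabase.replace_document()` or `delete_document()` in place of a docid, so the mapping survives tools that renumber documents.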
Make it possible to run `xapian-compact` through the djapian index command
with a --compact switch
Original issue reported on code.google.com by daevaorn
on 12 Nov 2008 at 9:30
What steps will reproduce the problem?
1. #models.py:
class Product(models.Model):
    name = models.CharField(max_length=20)
    active = models.BooleanField(default=True)
In the Indexer instance, set the trigger on the "active" boolean (e.g.,
trigger=lambda obj: obj.active)
2. Run "python manage.py index --rebuild" to build db
3. As expected, those products marked active=0 are not indexed.
4. Also as expected, change one of those inactive products to active and it
will be indexed.
5. UNEXPECTED: Change the same product back to active=0 and it still shows
up in results.
Even "--rebuild" doesn't change anything (separate issue?). Only deleting
the contents of the db directory and running --rebuild returns expected
results.
What version of the product are you using? On what operating system?
r133
OS X 10.4.11
Original issue reported on code.google.com by google%[email protected]
on 9 Dec 2008 at 2:13
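The behaviour expected in step 5 above amounts to removing the document whenever the trigger turns false, instead of skipping the update. The sketch models the index as a plain dict; `sync` and the object attributes are hypothetical names and do not reflect Djapian's internals.

```python
from types import SimpleNamespace


def sync(index, obj, trigger):
    """Add or refresh the entry when the trigger passes, otherwise remove it."""
    if trigger(obj):
        index[obj.pk] = obj.name
    else:
        # A failing trigger must also drop an entry indexed earlier.
        index.pop(obj.pk, None)


index = {}
is_active = lambda obj: obj.active
product = SimpleNamespace(pk=1, name="red trousers", active=True)

sync(index, product, is_active)      # indexed while active
product.active = False
sync(index, product, is_active)      # removed once inactive
```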
Proposed API:
{{{
indexer = djapian.merge_indexers(Entry.indexer, Comment.indexer)
for row in indexer.query("foobar"):
    if row.model == Entry:
        print "Found entry: %s" % row.instance
    else:
        print "Found comment: %s" % row.instance
}}}
Original issue reported on code.google.com by daevaorn
on 18 Feb 2009 at 6:59
I think it can be eliminated because it is not needed.
Original issue reported on code.google.com by daevaorn
on 21 Nov 2008 at 4:22
If you want to index something from contrib, you have to modify the model to
add an 'index_model' attribute, which isn't very flexible. I've re-worked my
copy of djapian to use a dict in settings which provides the same
functionality, but you are no longer tied to adding 'index_model' to each
model you wish to index. The setting looks something like this:
INDEXER_INDEXES = {
    'test.test': 'test_djapian.test.models.TestIndex',
    'comments.comment': 'test_djapian.indexes.CommentIndex',
}
Then a quick change to signals to create `index_model` from
`sender._meta.app_label` and `sender._meta.module_name` was all that was
needed.
Original issue reported on code.google.com by [email protected]
on 19 Feb 2008 at 7:46
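The lookup described above, from app label and module name to an index class, might look roughly like this. `find_index_class` is a hypothetical helper, and the mapping in the usage lines points at a standard-library class purely so the example runs without a Django project.

```python
import importlib


def find_index_class(indexes, app_label, module_name):
    """Resolve the class configured for 'app_label.module_name', or None."""
    dotted = indexes.get("%s.%s" % (app_label, module_name))
    if dotted is None:
        return None
    module_path, _, class_name = dotted.rpartition(".")
    return getattr(importlib.import_module(module_path), class_name)


# Stand-in mapping; a real project would list its own index classes here.
INDEXER_INDEXES = {
    "collections.ordereddict": "collections.OrderedDict",
}
found = find_index_class(INDEXER_INDEXES, "collections", "ordereddict")
missing = find_index_class(INDEXER_INDEXES, "comments", "comment")
```

A signal handler could call this with `sender._meta.app_label` and `sender._meta.module_name` and skip senders for which it returns None.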
To prevent last-minute surprises for those looking for a Django indexing
solution, I think it would be a good idea to mention that it's generally not
possible to use Xapian together with mod_python. Mod_wsgi works, but only
after adding WSGIApplicationGroup %{GLOBAL} to one's Apache site
configuration file.
See http://trac.xapian.org/ticket/185 for more information.
Original issue reported on code.google.com by [email protected]
on 12 Nov 2008 at 12:42
Currently it is not possible to set a weight on fields that have higher
priority (like titles in blogs and wikis).
To provide this, the `Fields` and `Attributes` given to
`djapian.backend.base.Indexer` may accept lists of 2-tuples as parameters,
where the first item is the field and the second is the weight, e.g.:
{{{
Test_index = XapianIndexer(
    ...
    # Fields to index without a prefix
    [('Test.title', 2), ('Test.content', 1)],
    # Attributes with prefix, to search like Google "title:Foobar"
    {
        'title':('Test.title', 2),
    }
)
}}}
When the Xapian backend sees that the field is a tuple, it will take the
weight and set it as the `wdf` of the posting/term.
For backward compatibility, when the field is not a tuple, `wdf` is set to "1".
Original issue reported on code.google.com by [email protected]
on 5 Sep 2007 at 3:45
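The tuple-or-plain-field rule proposed above can be sketched as a small normalisation step. `normalize_field` is a hypothetical helper name; the default weight of 1 comes from the backward-compatibility note in the proposal.

```python
def normalize_field(spec, default_weight=1):
    """Return (field_path, weight) for a plain path or a (path, weight) tuple."""
    if isinstance(spec, tuple):
        path, weight = spec
        return path, weight
    # Plain strings keep the old behaviour: weight (wdf) of 1.
    return spec, default_weight


fields = [("Test.title", 2), "Test.content"]
normalized = [normalize_field(spec) for spec in fields]
```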
Allow filtering results by specific values in a QuerySet-like style.
Original issue reported on code.google.com by daevaorn
on 23 Feb 2009 at 11:54
Stemmed prefixed terms should be generated as:
ZTAGfoo
not:
TAGZfoo
as is currently the case.
Patch attached
Original issue reported on code.google.com by [email protected]
on 8 Feb 2009 at 1:26
Attachments:
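The ordering requested above puts the stem marker "Z" in front of the field prefix. A one-function sketch of that term layout, with a made-up helper name:

```python
def stemmed_term(prefix, stem):
    """Build a stemmed, prefixed term: 'Z', then the field prefix, then the stem."""
    return "Z" + prefix + stem


term = stemmed_term("TAG", "foo")
```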