rustemt / bobo-browse
Automatically exported from code.google.com/p/bobo-browse
What steps will reproduce the problem?
1. I added some filters to the query
What is the expected output? What do you see instead?
I am not sure whether DocIdSet.iterator may return null, but that is what happened.
What version of the product are you using? On what operating system?
latest
Please provide any additional information below.
I added a null check.
Original issue reported on code.google.com by [email protected]
on 26 Dec 2011 at 8:49
Add FacetHandler for filtering based on Latitude/Longitude + radius.
It might be useful to depend on Issue 4 to have an indexing helper mechanism.
Original issue reported on code.google.com by [email protected]
on 29 Mar 2009 at 6:12
FacetHandler.buildRandomAccessOrFilter breaks out of the loop when
buildRandomAccessFilter, called inside the loop, returns a null filter.
This creates an incorrect OR filter, since it misses the rest of the values.
It should simply skip that value and continue. There is no need to add an
EmptyFilter either.
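A minimal sketch of the skip-instead-of-break behavior suggested above. It does not use bobo's classes: `buildFilter` is a hypothetical stand-in for buildRandomAccessFilter that returns null for unknown values, and filters are modeled as plain `IntPredicate`s over doc ids.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical stand-in for the OR-filter construction; not bobo's actual code.
class OrFilterSketch {
    // Stand-in for buildRandomAccessFilter: returns null when the value is unknown.
    static IntPredicate buildFilter(String value) {
        switch (value) {
            case "red":  return doc -> doc % 2 == 0;
            case "blue": return doc -> doc % 3 == 0;
            default:     return null;
        }
    }

    static IntPredicate buildOrFilter(String[] values) {
        List<IntPredicate> filters = new ArrayList<>();
        for (String value : values) {
            IntPredicate f = buildFilter(value);
            if (f == null) {
                continue; // skip this value; a 'break' here would drop the remaining values
            }
            filters.add(f);
        }
        // An empty list simply matches nothing, so no explicit empty filter is needed.
        return doc -> {
            for (IntPredicate f : filters) {
                if (f.test(doc)) {
                    return true;
                }
            }
            return false;
        };
    }

    public static void main(String[] args) {
        IntPredicate or = buildOrFilter(new String[] {"red", "unknown", "blue"});
        System.out.println(or.test(3)); // matched only through "blue", which follows the null
    }
}
```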
Original issue reported on code.google.com by [email protected]
on 19 Jun 2009 at 10:02
This is causing problems when there is a merge between partitions.
Original issue reported on code.google.com by [email protected]
on 6 Aug 2009 at 3:43
When adding a BrowseSelection to a BrowseRequest, it would be convenient to
issue a warning if no FacetHandler was set for the field the BrowseSelection is
being defined on.
i.e., issue a warning here:
http://code.google.com/p/bobo-browse/source/browse/trunk/bobo-browse/src/com/browseengine/bobo/api/BrowseSelection.java#185
Original issue reported on code.google.com by [email protected]
on 12 Nov 2010 at 9:25
It would be very useful to have a web-based Luke-like tool for a browse index.
Original issue reported on code.google.com by [email protected]
on 29 Mar 2009 at 6:13
The cardemo contains a lot of hacks.
It would be good to revisit it and build it on top of DWR.
Original issue reported on code.google.com by [email protected]
on 29 Mar 2009 at 6:30
What steps will reproduce the problem?
1. maxID is an exact multiple of MAX_SLOTS
(in my case maxID=32768, MAX_SLOTS=1024).
In that case the size of _list is 32,
but lines 627 and 654 try to access _list[32].
What is the expected output? What do you see instead?
Working facets :)
What version of the product are you using? On what operating system?
latest, windows
Please provide any additional information below.
In the initialize function, _list should be sized as
[(size + MAX_SLOTS) / MAX_SLOTS] instead of
[(size + MAX_SLOTS - 1) / MAX_SLOTS].
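The arithmetic can be checked with a small sketch. It assumes, as the report implies, that an id is mapped to its slot by integer division (`id / MAX_SLOTS`); the class and method names are illustrative only. The example uses a maxID of 32768, which is exactly 32 × 1024.

```java
// Worked check of the two sizing formulas; names are hypothetical.
class SlotSizing {
    // Current sizing: ceiling division.
    static int slotsCeil(int size, int maxSlots) {
        return (size + maxSlots - 1) / maxSlots;
    }

    // Proposed sizing: always leaves room for the slot that holds 'size' itself.
    static int slotsProposed(int size, int maxSlots) {
        return (size + maxSlots) / maxSlots;
    }

    // Assumed mapping from an id to its slot.
    static int slotIndex(int id, int maxSlots) {
        return id / maxSlots;
    }

    public static void main(String[] args) {
        int maxId = 32768;   // 32 * 1024, an exact multiple of MAX_SLOTS
        int maxSlots = 1024;
        System.out.println("ceil sizing:     " + slotsCeil(maxId, maxSlots));
        System.out.println("proposed sizing: " + slotsProposed(maxId, maxSlots));
        System.out.println("slot of maxId:   " + slotIndex(maxId, maxSlots));
    }
}
```

With ceiling division the array has 32 entries (valid indices 0 to 31) while maxID lands in slot 32; the proposed formula yields 33 entries.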
Original issue reported on code.google.com by [email protected]
on 26 Dec 2011 at 8:46
What steps will reproduce the problem?
1. Create a Bobo BrowseRequest with a reverse SortField on a field that is not faceted
(a default Lucene field).
What is the expected output? What do you see instead?
Expected: the BrowseHits are sorted in reverse order.
Actual: the hits are sorted in ascending order (on the SortField).
What version of the product are you using? On what operating system?
This is OS independent. The issue was verified on 2.5.0-rc1,
but the code path seems to be the same in the trunk version as well.
Please provide any additional information below.
Potential Bug fix listed below: (SortCollector.java)
private static DocComparatorSource getComparatorSource(Browsable browser, SortField sf){
  DocComparatorSource compSource = null;
  if (SortField.FIELD_DOC.equals(sf)){
    compSource = new DocIdDocComparatorSource();
  }
  else if (SortField.FIELD_SCORE.equals(sf)){
    // we want to do reverse sorting regardless for relevance
    compSource = new ReverseDocComparatorSource(new RelevanceDocComparatorSource());
  }
  else if (sf instanceof BoboCustomSortField){
    BoboCustomSortField custField = (BoboCustomSortField)sf;
    DocComparatorSource src = custField.getCustomComparatorSource();
    assert src!=null;
    compSource = src;
  }
  else{
    Set<String> facetNames = browser.getFacetNames();
    String sortName = sf.getField();
    if (facetNames.contains(sortName)){
      FacetHandler<?> handler = browser.getFacetHandler(sortName);
      assert handler!=null;
      // BUG BUG BUG
      //return handler.getDocComparatorSource();
      // FIX FIX FIX
      compSource = handler.getDocComparatorSource();
    }
    else{ // default lucene field
      // BUG BUG BUG
      // return getNonFacetComparatorSource(sf);
      // FIX FIX FIX
      compSource = getNonFacetComparatorSource(sf);
    }
  }
  boolean reverse = sf.getReverse();
  if (reverse){
    compSource = new ReverseDocComparatorSource(compSource);
  }
  compSource.setReverse(reverse);
  return compSource;
}
Original issue reported on code.google.com by [email protected]
on 13 Oct 2010 at 9:47
int totalHits = result.getNumHits();
BrowseHit bh[] = result.getHits();
System.out.println(totalHits);
System.out.println(bh.length);
Output:
2
0
version:
lucene:2.4.1
bobo:2.5.0
Original issue reported on code.google.com by [email protected]
on 20 Apr 2010 at 3:12
Usually the number of terms is low, so representing them with an int is wasteful.
The problem is that when constructing the FacetDataCache we iterate the terms
table, and at that point we do not yet know the number of terms.
Original issue reported on code.google.com by [email protected]
on 17 Sep 2009 at 10:15
Issue placeholder for Yasuhiro's idea on improving performance of
CompactMultiValueFacetHandler
Original issue reported on code.google.com by [email protected]
on 28 Mar 2009 at 3:49
This is an umbrella ticket for tracking of lucene 2.9 upgrade.
Things to keep in mind:
Segment readers are now exposed, and by default Lucene loads the FieldCache
and searches per segment. Having BoboIndexReader be a FilterIndexReader may
no longer be correct; perhaps a BoboSegmentReader instead?
Original issue reported on code.google.com by [email protected]
on 27 Sep 2009 at 6:36
I am using Solr 4.3.0 and trying to install the Bobo-Browse plugin in Solr, but the integration fails with the following error:
msg=SolrCore 'collection1' is not available due to init failure: org/apache/lucene/index/FilterIndexReader,trace=org.apache.solr.common.SolrException: SolrCore 'collection1' is not available due to init failure: org/apache/lucene/index/FilterIndexReader
I placed the jars at the following locations:
kamikaze-1.0.7.jar lib
spring-2.5.5.jar lib
xstream-1.2.jar lib
fastutil-5.1.5.jar lib
bobo-browse-2.0.7.jar dist
bobo-solr-3.0.0.jar dist
FilterIndexReader.class, which bobo-browse requires, does not exist in the
lucene-core 4.3.0 jar. How can I integrate bobo-browse with Solr 4.3.0?
Original issue reported on code.google.com by [email protected]
on 7 Jan 2014 at 7:00
Create a multi reader for the bobo index reader.
Original issue reported on code.google.com by [email protected]
on 21 May 2009 at 10:42
What steps will reproduce the problem?
uses SegmentInfos.get(i) instead of SegmentInfos.info(i)
What version of the product are you using? On what operating system?
latest
Please provide any additional information below.
I changed it to info(i).
Original issue reported on code.google.com by [email protected]
on 26 Dec 2011 at 8:51
Make facet handlers able to build Query objects.
Original issue reported on code.google.com by [email protected]
on 17 Jul 2009 at 4:00
What steps will reproduce the problem?
IndexReader reader = IndexReader.open(MyIndexer.getInstance().getDirectory(), true);
BoboIndexReader boboReader = BoboIndexReader.getInstance(reader, facetHandlers);
// creating a browse request
BrowseRequest browserRequest = new BrowseRequest();
browserRequest.setCount(quant);
browserRequest.setOffset(start);
facetSpecs.setMaxCount(quant);
browserRequest.setFacetSpec(field, facetSpecs);
if (query!=null){
browserRequest.setQuery(query);
}
// perform browse
Browsable browser = new BoboBrowser(boboReader);
BrowseResult result = browser.browse(browserRequest);
What is the expected output? What do you see instead?
Expected the setCount and setOffset methods to take effect; they do not.
What version of the product are you using? On what operating system?
2.5.0
Win 7 JVM1.6
Lucene 3.1
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 11 Apr 2011 at 4:59
What steps will reproduce the problem?
1. please see the Java program I attached.
What is the expected output? What do you see instead?
expected output: content of each hits
instead: see nothing
What version of the product are you using? On what operating system?
1. bobo browse: bobo-browse-2.5.0-rc1.jar
2. lucene: lucene-core-3.0.3
3. OS:Windows 7 Professional 64bit
4. JDK: jdk-6u23-64bit
5. Eclipse: Version: Helios Service Release 1 Build id: 20100917-0705
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 20 Jul 2011 at 7:30
I have been struggling with some really weird issues with Bobo suddenly not
finding facet results based on SimpleFacetHandler.
Finally I realized that a regression bug was introduced recently in the
constructor:
public SimpleFacetHandler(String name,String indexFieldName,TermListFactory termListFactory)
{
this(name,name,termListFactory,null);
}
should be
public SimpleFacetHandler(String name,String indexFieldName,TermListFactory termListFactory)
{
this(name,indexFieldName,termListFactory,null);
}
Original issue reported on code.google.com by [email protected]
on 21 Sep 2010 at 12:29
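Hello!
First of all, thank you very much for sharing your work.
I am developing a Lucene-based search engine in which grouped counting (faceting) is still missing, and I came across your open-source bobo-browse project. My project uses the latest Lucene (3.0.2) and the latest synced version of paoding (the Paoding Chinese analyzer). I ran the following code: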
public class TestBoBo {
  public static void main(String[] args) throws BrowseException, IOException, ParseException {
    // opening a lucene index
    Directory idx = FSDirectory.open(new File("D:/DATAMANAGER/SEARCH/INDIVIDUAL"));
    IndexReader reader = IndexReader.open(idx);
    // decorate it with a bobo index reader
    BoboIndexReader boboReader = BoboIndexReader.getInstance(reader);
    // creating a browse request
    BrowseRequest br = new BrowseRequest();
    br.setCount(10);
    br.setOffset(0);
    // parse a query
    QueryParser qp = new QueryParser(Version.LUCENE_CURRENT, "content", new PaodingAnalyzer());
    Query q = qp.parse("广");
    br.setQuery(q);
    // add the facet output specs
    FacetSpec colorSpec = new FacetSpec();
    colorSpec.setMaxCount(10);
    colorSpec.setOrderBy(FacetSortSpec.OrderHitsDesc);
    br.setFacetSpec("catCode", colorSpec);
    // perform browse
    Browsable browser = new BoboBrowser(boboReader);
    BrowseResult result = browser.browse(br);
    int totalHits = result.getNumHits();
    BrowseHit[] hits = result.getHits();
    Map<String,FacetAccessible> facetMap = result.getFacetMap();
    FacetAccessible colorFacets = facetMap.get("catCode");
    List facetVals = colorFacets.getFacets();
    for (int i = 0; i < facetVals.size(); i++) {
      BrowseFacet o = (BrowseFacet) facetVals.get(i);
      System.out.println(o.getHitCount() + "-----" + o.getValue());
    }
  }
}
The bobo.spring configuration file is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans.xsd">
<bean id="catCode" class="com.browseengine.bobo.facets.impl.PathFacetHandler">
<constructor-arg value="catCode" />
<property name="separator" value="/" />
</bean>
<bean id="handlers" class="java.util.ArrayList">
<constructor-arg>
<list>
<ref bean="catCode" />
</list>
</constructor-arg>
</bean>
After importing the jar into my project and running it, I get the following result:
1058-----100003330001000100050001
1056-----100003330001000100050002
698-----10000333000100020004
385-----100003330001000200050001
217-----100003330001000300050002
157-----10000333000100030004
131-----100003330001000400050001
92-----10000333000100040004
78-----10000333000100010004
75-----100003330001000200050002
However, when I download the source code javasoze-bobo-03219a1.zip provided on your site and run the test code against it on its own, the output is:
4041-----
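I found that the class files in the downloaded jar are not identical to the class files compiled from the downloaded source. Is javasoze-bobo-03219a1.zip not the source code of bobo-browse-2.5.0-rc1.jar?
I would also like to ask:
1. Must the bobo.spring file be placed in the directory that holds the index files? Can it be configured on the classpath instead? My index files are stored across multiple directories, and I also need to (re)build indexes.
2. Suppose my category code is 100003330001000200050002, where every 4 digits form one category level: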
1000
10000333
100003330001
1000033300010002
10000333000100020005
100003330001000200050002
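Besides counting path facets with a separator such as "/", is it possible to split on a fixed number of digits instead? Also, our categories can be nested without limit, so we do not need counts for every category; we may only need counts for the first three or four levels. Is there an interface that can be customized for such needs?
Does the bobo project provide an API reference or any other documentation?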
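One possible approach to the fixed-width category codes described above, offered only as a sketch: convert each code into a separator-delimited path at indexing time, truncated to the desired depth, so that a path facet configured with separator "/" can count it. `FixedWidthPath` and `toPath` are hypothetical helpers, not part of bobo, and whether bobo offers a built-in hook for this is not stated here.

```java
// Hypothetical indexing-time helper; not a bobo API.
class FixedWidthPath {
    // Splits 'code' into segments of 'width' characters, keeps at most 'maxDepth'
    // segments, and joins them with 'separator'. Trailing characters that do not
    // fill a whole segment are ignored.
    static String toPath(String code, int width, int maxDepth, String separator) {
        StringBuilder sb = new StringBuilder();
        int levels = Math.min(maxDepth, code.length() / width);
        for (int i = 0; i < levels; i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(code, i * width, (i + 1) * width);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Keep only the first three 4-digit levels of the category code.
        System.out.println(toPath("100003330001000200050002", 4, 3, "/"));
    }
}
```

Truncating before indexing keeps the facet from tracking levels that are never displayed, at the cost of re-indexing if deeper levels are needed later.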
Original issue reported on code.google.com by [email protected]
on 16 Jul 2010 at 12:27
Extend BrowseRequest.addSelection(BrowseSelection sel) api to allow OR'd
selections:
e.g.
BrowseRequest.addSelection(BrowseSelection sel, Occur.SHOULD);
BrowseRequest.addSelection(BrowseSelection sel, Occur.MUST);
Internal logic may need to change due to this.
Original issue reported on code.google.com by [email protected]
on 28 Mar 2009 at 3:48
I am using bobo-browse to implement faceted search for a standalone
application that has an index containing 25 million documents. The fields
in each document contain multiple values. Some example fields are
"author", "institution", "country", etc.
Is there a way to do Multi Value Hierarchical Faceting? Currently, I am
retrieving the top 100 authors, and then I have to do 100 more searches to
retrieve the top institutions for each author.
It would be extremely useful to have a MultiValueHierarchicalFacetHandler
class, where I can specify the hierarchy of facets.
Original issue reported on code.google.com by [email protected]
on 6 Oct 2009 at 9:06
These facet handlers do not support facet merging across different shards.
Original issue reported on code.google.com by [email protected]
on 29 Mar 2009 at 6:37
The current integration with Solr has Bobo specific inputs and
outputs (i.e. the HTTP parameters and the XML output are
incompatible with Solr's).
See: http://wiki.apache.org/solr/SimpleFacetParameters
Original issue reported on code.google.com by [email protected]
on 15 Jul 2009 at 8:33
Hello, the Lucene in my program is version 2.3, and both bobo-brower2.0.6.jar and bobo-brower2.0.2.jar produce the following error:
Exception in thread "main" java.lang.VerifyError: class com.browseengine.bobo.api.BoboIndexReader overrides final method deleteDocument.(I)V
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(Unknown Source)
at java.security.SecureClassLoader.defineClass(Unknown Source)
at java.net.URLClassLoader.defineClass(Unknown Source)
at java.net.URLClassLoader.access$000(Unknown Source)
at java.net.URLClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClassInternal(Unknown Source)
at test.lucene.testBobo.main(testBobo.java:35)
Is my Lucene version too old?
The line where the error occurs:
// creating a browse request
BoboIndexReader boboReader = BoboIndexReader.getInstance(reader);
Original issue reported on code.google.com by [email protected]
on 30 Sep 2009 at 3:40
This happens when documents are missing a value for a given field.
Original issue reported on code.google.com by [email protected]
on 21 May 2009 at 10:40
* DocIdSetIterator.advance (replaces next and skipTo)
* Instead of using MultiSearcher, utilize Collector.setNextReader to access
FacetHandler caches. Otherwise near realtime search is inefficient because
MultiSearcher searches on small segments individually. See LUCENE-1483.
Original issue reported on code.google.com by [email protected]
on 16 Jul 2009 at 6:11
The total number of paths returned is currently set by FacetSpec.maxCount.
It should be applied per depth.
Original issue reported on code.google.com by [email protected]
on 10 Nov 2009 at 4:49
Fairly trivial, but if one uses the RangeFacetHandler constructor that takes an
autoRange parameter, the constructor always sets the field to true and ignores
the parameter value passed in:
public RangeFacetHandler(String name,String indexFieldName,TermListFactory termListFactory,boolean autoRange)
{
super(name);
_dataCache = null;
_indexFieldName = indexFieldName;
_termListFactory = termListFactory;
_predefinedRanges = null;
_autoRange = true;
}
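A minimal sketch of the likely one-line fix, assuming the intent is simply to honor the argument. `RangeHandlerSketch` is a hypothetical stand-in with only the relevant field; it is not the real RangeFacetHandler.

```java
// Hypothetical stand-in showing the parameter being stored instead of a constant.
class RangeHandlerSketch {
    private final String name;
    private final boolean autoRange;

    RangeHandlerSketch(String name, boolean autoRange) {
        this.name = name;
        this.autoRange = autoRange; // the reported code assigned the constant 'true' here
    }

    String getName() {
        return name;
    }

    boolean isAutoRange() {
        return autoRange;
    }

    public static void main(String[] args) {
        System.out.println(new RangeHandlerSketch("price", false).isAutoRange());
    }
}
```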
Original issue reported on code.google.com by [email protected]
on 19 Oct 2009 at 5:24
What steps will reproduce the problem?
1. Create an org.apache.lucene.index.ParallelReader with readers added as
appropriate.
2. Call BoboIndexReader.getInstance(...) with the newly created ParallelReader
and facetHandlers as appropriate.
What is the expected output? Lack of exception.
What do you see instead? 'java.lang.UnsupportedOperationException: This reader
does not support this method.'
What version of the product are you using? 2.5.0
On what operating system? OS X and Ubuntu
Please provide any additional information below.
This bug appears to be a result of the fix for bobo-31.
Original issue reported on code.google.com by [email protected]
on 30 Sep 2010 at 2:00
A parser for a subset of SQL SELECT queries and MDX OLAP queries.
Original issue reported on code.google.com by [email protected]
on 29 Mar 2009 at 3:40
Build a performance benchmark suite.
And add it to a wiki.
Original issue reported on code.google.com by [email protected]
on 29 Mar 2009 at 3:44
issue used to track perf enhancement work
Original issue reported on code.google.com by [email protected]
on 25 Jun 2009 at 1:54
What steps will reproduce the problem?
1. Create a BrowseRequest with a MatchAllDocsQuery type query
2. convert the BrowseRequest into a BrowseRequestBPO.Request
3. convert the resulting BrowseRequestBPO.Request back to a BrowseRequest
We should expect to get back a BrowseRequest with a MatchAllDocsQuery, but we
get back a BrowseRequest with a TermQuery.
Original issue reported on code.google.com by [email protected]
on 16 Aug 2009 at 12:47
It might be interesting to have FacetHandlers to build lucene Field
objects. This would make indexing simple and consistent with search.
Original issue reported on code.google.com by [email protected]
on 29 Mar 2009 at 3:42
What steps will reproduce the problem?
1. index for example 3 docs(uid=1 field1=..., uid=2 field1=..., uid=3
field1=...);
2. flush to disk index;
3. delete them;
4. flush to disk index;
5. insert them again
6. flush to disk index;
7. search with MatchAllDocsQuery;
What is the expected output? What do you see instead?
* expected: all 3 docs will be returned;
* what i see: none of them are returned;
What version of the product are you using? On what operating system?
Version: trunk
OS: Debian
Please provide any additional information below.
This is caused by a bug in FastMatchAllDocsQuery.
After the steps above, there are 3 index files on disk: _0.cfs, _0_1.del and
_1.cfs.
When the FastMatchAllDocsQuery instance is created by
BoboIndexReader.getFastMatchAllDocsQuery,
it gets deletedDocs=[0, 1, 2] and maxDoc=6.
FastMatchAllDocsQuery.FastMatchAllDocsWeight.scorer is then called twice,
once for _0.cfs and once for _1.cfs.
The first call is fine, but in the second call
_deletedDocs = [0, 1, 2] and _deletedIndex = 0 again,
whereas _deletedDocs should be null or _deletedIndex should be 3,
because none of the 3 docs in _1.cfs is deleted.
The attached patch is my workaround.
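The attached patch itself is not reproduced here. As an illustration only, the sketch below shows one way a per-segment starting index into a sorted, global deletedDocs array could be derived. It assumes each segment knows its doc base (the global id of its first document); `SegmentDeletes`, `firstDeletedIndex` and `countLive` are hypothetical names.

```java
import java.util.Arrays;

// Illustrative only: per-segment view of a sorted array of global deleted doc ids.
class SegmentDeletes {
    // Index of the first entry in 'deletedDocs' that is >= docBase.
    static int firstDeletedIndex(int[] deletedDocs, int docBase) {
        int pos = Arrays.binarySearch(deletedDocs, docBase);
        return pos >= 0 ? pos : -(pos + 1);
    }

    // Counts the documents of one segment that are not deleted.
    static int countLive(int[] deletedDocs, int docBase, int segmentSize) {
        int idx = firstDeletedIndex(deletedDocs, docBase);
        int live = 0;
        for (int doc = docBase; doc < docBase + segmentSize; doc++) {
            if (idx < deletedDocs.length && deletedDocs[idx] == doc) {
                idx++; // this doc is deleted; move on to the next deletion
            } else {
                live++;
            }
        }
        return live;
    }

    public static void main(String[] args) {
        int[] deletedDocs = {0, 1, 2};                        // global ids, as in the report
        System.out.println(countLive(deletedDocs, 0, 3));     // segment _0: docs 0..2
        System.out.println(countLive(deletedDocs, 3, 3));     // segment _1: docs 3..5
    }
}
```

For the second segment the starting index is 3, which matches the value the report says _deletedIndex should have.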
Original issue reported on code.google.com by [email protected]
on 18 Sep 2009 at 7:26
For each request, if a FacetSpec is requested, the FacetHandlers return a
FacetCountCollector. It contains a count array (int[] with length maxdoc).
This can cause significant GC issues.
We should think about pooling these objects while being careful with the
realtime case.
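A minimal sketch of what pooling the count arrays could look like. `CountArrayPool` is hypothetical and not part of bobo; it assumes all arrays in one pool share a fixed length, so a reader whose size changes (the realtime case) would need a new pool.

```java
import java.util.Arrays;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical fixed-length int[] pool; not bobo's implementation.
class CountArrayPool {
    private final int length;
    private final ConcurrentLinkedQueue<int[]> free = new ConcurrentLinkedQueue<>();

    CountArrayPool(int length) {
        this.length = length;
    }

    // Returns a zeroed array, reusing a released one when available.
    int[] acquire() {
        int[] counts = free.poll();
        if (counts == null) {
            return new int[length];
        }
        Arrays.fill(counts, 0);
        return counts;
    }

    // Returns an array to the pool; arrays of a different length are dropped.
    void release(int[] counts) {
        if (counts.length == length) {
            free.offer(counts);
        }
    }

    public static void main(String[] args) {
        CountArrayPool pool = new CountArrayPool(8);
        int[] first = pool.acquire();
        first[0] = 5;
        pool.release(first);
        int[] second = pool.acquire();
        System.out.println(second == first); // the released array is reused
        System.out.println(second[0]);       // and has been reset
    }
}
```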
Original issue reported on code.google.com by [email protected]
on 16 Sep 2009 at 7:41
When placing a throughput load of approximately 15 queries per second we can
intermittently see a 'java.lang.IndexOutOfBoundsException'.
Upon further investigation it was found that the BoboSearcher2 class is using
an instance variable '_facetCollectors' which is not thread safe. This variable
is set by a call from the BoboSubBrowser.browse method which could be executed
via multiple threads simultaneously.
Changing this instance variable to utilise a ThreadLocal instead addresses this
problem.
Please find a patch file attached.
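The attached patch is not shown here. The sketch below only illustrates the ThreadLocal idea in isolation: `SearcherSketch` is a hypothetical stand-in for the searcher, and the collectors are modeled as plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in: per-thread state instead of a shared instance field.
class SearcherSketch {
    private final ThreadLocal<List<String>> facetCollectors =
        ThreadLocal.withInitial(ArrayList::new);

    void setFacetCollectors(List<String> collectors) {
        facetCollectors.set(new ArrayList<>(collectors));
    }

    List<String> getFacetCollectors() {
        return facetCollectors.get();
    }

    // Returns true if a write from another thread leaves this thread's value untouched.
    static boolean isolated() {
        SearcherSketch searcher = new SearcherSketch();
        searcher.setFacetCollectors(Arrays.asList("a"));
        Thread other = new Thread(() -> searcher.setFacetCollectors(Arrays.asList("b", "c")));
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return searcher.getFacetCollectors().equals(Arrays.asList("a"));
    }

    public static void main(String[] args) {
        System.out.println(isolated());
    }
}
```

With a plain instance field, the second thread's call would overwrite the list the first thread is still using, which is the kind of interference the report describes.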
Original issue reported on code.google.com by [email protected]
on 1 Oct 2010 at 1:16
A simple anonymous checkout of the trunk code is not working:
$ svn checkout http://bobo-browse.googlecode.com/svn/trunk/bobo-browse/
svn: Server sent unexpected return value (403 Forbidden) in response to
PROPFIND request for '/svn/trunk/bobo-browse'
Accessing the repository with a browser works fine, and accessing other SVN
repositories such as SourceForge also works fine through our proxy.
What is the problem?
Original issue reported on code.google.com by [email protected]
on 26 Jan 2010 at 5:15
I am using GeoFacetHandler with the following distances: 1km, 2km, 5km, 10km
and 25km.
The problem is that FacetSortSpec.OrderValueAsc doesn't work, because the
facets will be sorted like this:
1km
10km
2km
25km
5km
Left-padding the distances with zero is a workaround but in this case you have
to remove the leading zeros on rendering.
Using FacetSortSpec.OrderHitsDesc solves the problem only if the facet numbers
are not equal:
All 16
25 km 16
10 km 3
1 km 2
2 km 2
5 km 2
A custom facet sorter could help, but IMHO it cannot have been the intention of
the inventor that using the GeoFacetHandler requires a custom sorter.
What do you think?
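For illustration, the ordering a custom sorter would need can be sketched as a plain string comparison that looks at the leading number first. This is not a bobo API; `DistanceLabelSort` is a hypothetical helper, and how such a comparison would be plugged into a facet sort is not shown.

```java
import java.util.Arrays;

// Hypothetical numeric-aware ordering for labels such as "1km" or "25 km".
class DistanceLabelSort {
    // Parses the leading digits of a label; labels without digits sort last.
    static int leadingNumber(String label) {
        int end = 0;
        while (end < label.length() && Character.isDigit(label.charAt(end))) {
            end++;
        }
        return end == 0 ? Integer.MAX_VALUE : Integer.parseInt(label.substring(0, end));
    }

    // Compares by leading number, then lexicographically as a tie-breaker.
    static int compare(String a, String b) {
        int byNumber = Integer.compare(leadingNumber(a), leadingNumber(b));
        return byNumber != 0 ? byNumber : a.compareTo(b);
    }

    public static void main(String[] args) {
        String[] labels = {"1km", "10km", "2km", "25km", "5km"};
        Arrays.sort(labels, DistanceLabelSort::compare);
        System.out.println(Arrays.toString(labels));
    }
}
```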
Original issue reported on code.google.com by [email protected]
on 15 Jun 2010 at 2:36
Bug taken from bobo-browse group:
http://groups.google.com/group/bobo-browse/browse_thread/thread/b269f29fd9901209
What steps will reproduce the problem?
let's assume I have three offers in total:
offer1 with categories: 01000/01001 (culture/city tours)
offer2 with categories: 02000/02002 (Personal Services/Baby equipment)
offer3 with categories: 01000/01003 01000/01004 (culture/boat tours
and culture/historical tours)
What is the expected output? What do you see instead?
How many offers are in category culture? Of course 2.
How often does a culture category occur? 3 times.
We want the '2' but we're getting '3' ;-)
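The difference between the two numbers can be reproduced with a small sketch that is independent of bobo. It models each offer as an array of category paths and compares counting every occurrence with counting each offer at most once; all names are illustrative.

```java
// Illustrative only: occurrence count vs. distinct-document count for a top-level category.
class CategoryCounts {
    static String topLevel(String path) {
        int slash = path.indexOf('/');
        return slash < 0 ? path : path.substring(0, slash);
    }

    // Counts every path whose top level matches, so a document can count several times.
    static int occurrenceCount(String[][] docs, String category) {
        int count = 0;
        for (String[] paths : docs) {
            for (String path : paths) {
                if (topLevel(path).equals(category)) {
                    count++;
                }
            }
        }
        return count;
    }

    // Counts each document at most once, no matter how many of its paths match.
    static int documentCount(String[][] docs, String category) {
        int count = 0;
        for (String[] paths : docs) {
            for (String path : paths) {
                if (topLevel(path).equals(category)) {
                    count++;
                    break; // stop after the first match within this document
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String[][] offers = {
            {"01000/01001"},                 // offer1
            {"02000/02002"},                 // offer2
            {"01000/01003", "01000/01004"}   // offer3
        };
        System.out.println(occurrenceCount(offers, "01000"));
        System.out.println(documentCount(offers, "01000"));
    }
}
```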
What version of the product are you using? On what operating system?
Bobo-browse from 2010-11-10
Lucene 3.0.2
Original issue reported on code.google.com by [email protected]
on 16 Nov 2010 at 4:16
For each request, if a FacetSpec is requested, the FacetHandlers return a
FacetCountCollector. It contains a count array (int[] with length maxdoc).
This can cause significant GC issues.
We should think about pooling these objects while being careful with the
realtime case.
Original issue reported on code.google.com by [email protected]
on 16 Sep 2009 at 7:43
Maintenance of minID/maxID in the facet loading code should be taken out of the
inner loop that reads index data. It is not necessary to check min/max for
every docid because docids always come in ascending order.
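A small sketch of the idea, assuming the docids arrive as an ascending sequence: the first id seen is the minimum and the last id seen is the maximum, so no per-docid comparison is needed. `MinMaxSketch` is hypothetical and stands in for the loading loop.

```java
import java.util.PrimitiveIterator;
import java.util.stream.IntStream;

// Hypothetical stand-in for the loading loop; relies on ascending docids.
class MinMaxSketch {
    // Returns {minId, maxId}, or {-1, -1} when there are no docids.
    static int[] minMax(PrimitiveIterator.OfInt ascendingDocIds) {
        if (!ascendingDocIds.hasNext()) {
            return new int[] {-1, -1};
        }
        int first = ascendingDocIds.nextInt();
        int last = first;
        while (ascendingDocIds.hasNext()) {
            last = ascendingDocIds.nextInt(); // index data would be read here; no min/max check
        }
        return new int[] {first, last};
    }

    public static void main(String[] args) {
        int[] range = minMax(IntStream.of(3, 7, 42).iterator());
        System.out.println(range[0] + ".." + range[1]);
    }
}
```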
Original issue reported on code.google.com by [email protected]
on 22 Apr 2009 at 12:45
Currently these facet handlers throw an exception when getScoreDocComparator is
called. We already have a good data structure in memory to support this.
Original issue reported on code.google.com by [email protected]
on 19 May 2009 at 1:08