msaari / relevanssi
Relevanssi, a WordPress plugin to improve the search
License: GNU General Public License v3.0
WP JV Post Reading Groups is not compatible with Relevanssi.
This should solve the issue:
add_filter( 'relevanssi_post_ok', 'rlv_jv_post_ok', 11, 2 );
function rlv_jv_post_ok( $post_ok, $post_id ) {
    if ( function_exists( 'wp_jv_prg_user_can_see_a_post' ) ) {
        $post_ok = wp_jv_prg_user_can_see_a_post( get_current_user_id(), $post_id );
    }
    return $post_ok;
}
Assume the relevanssi table has millions of records.
To retrieve the IDs of unindexed posts, the query below has to be executed on every loop iteration, and it gets slower and slower as the table grows.
My solution is to create another table that records only the indexed post IDs (this requires modifying the code), with a primary key on that column.
After that, change the LEFT JOIN on the relevanssi table to use the table I just created.
The theory is quite simple: instead of joining a big table, join a small one.
$q = "SELECT post.ID
FROM $wpdb->posts post
LEFT JOIN $wpdb->posts parent ON (post.post_parent=parent.ID)
LEFT JOIN $relevanssi_table r ON (post.ID=r.doc)
Excerpts are not generated for AJAX searches.
Hi. Doing an options table performance analysis with https://kinsta.com/knowledgebase/wp-options-autoloaded-data/ I've discovered that relevanssi_words is now the largest (120 kB) option.
Is it required to autoload on every request, or could Relevanssi load it only within its usage context?
It seems that if you have an inline background-image rendered to your template, (with no image present), then searches get logged twice.
Steps to reproduce:
In template-parts/post/content-excerpt.php, add the line <div style="background-image:url();"></div> so it will appear on the rendered page.
Fun Fact:
<div style="background-image:url(http:image.com);"></div>
This seems like it might even be a WordPress problem, but I wanted to make a mention of it here first as it is counting the logs in a Relevanssi created table.
Since the primary key already covers the doc column, there is no need to create a separate index for the doc column.
For example:
explain select * from wp_relevanssi where doc=12344
Result:
1 | SIMPLE | wp_relevanssi | ref | PRIMARY,docs | PRIMARY | 4 | const | 1 |
MySQL uses the PRIMARY key, not the docs key.
Hello @msaari,
I am unable to "Index Unindexed Posts" in Relevanssi version 4.0.7, WordPress version 4.9.4.
When I visit the Indexing admin tab, and click the "Index Unindexed Posts" button, with Chrome's Inspector open, I can see the initial request sent to admin-ajax as "action=relevanssi_index_posts&completed=0&total=1686&offset=0&limit=10&extend=true". The response from admin-ajax is then a "403 Unauthorized" response.
I tried doing a hard reload (multiple times), to ensure that I do not have a cached version of any admin JS scripts, and yet the issue persists.
In PhpStorm, I did a little tracing, and it appears that a check_ajax_referer check fails in lib/admin-ajax.php, in the relevanssi_index_posts_ajax_wrapper() function, around line 35: check_ajax_referer( 'relevanssi_indexing_nonce', 'security' );
I inspected the $_REQUEST object from inside relevanssi_index_posts_ajax_wrapper() and it was shaped as follows:
$_REQUEST = Array ( [action] => relevanssi_index_posts [completed] => 0 [total] => 1686 [offset] => 0 [limit] => 10 [extend] => true )
I do not see the "security" key (the second argument to check_ajax_referer) in the $_REQUEST object, which could explain why check_ajax_referer fails.
If I comment out the nonce check, then "Index Unindexed Posts" appears to behave as expected. Obviously commenting out the nonce check is not a good solution, so I dug a little deeper into where the AJAX request is made.
I ran console.log from multiple places in lib/admin_scripts.js, and I do not see the "security" key as anything but undefined.
When args are constructed for the first call to process_indexing_step at lib/admin_scripts.js, line 163, there is no indication that the nonce is being added with the "security" key to the request payload.
If I add 'security' : nonce.indexing_nonce to the args array, at around line 170, then "Index Unindexed Posts" appears to behave as expected.
I am not entirely certain if this is directly related to other recent threads that similarly describe indexing failures on the WPORG forums, but it may be possible. I am also not certain if there are other AJAX calls in the plugin that may be affected by this. It does appear that this solves the issue for me.
With PHP8.1 I get an error in relevanssi_meta_query_from_query_vars()
On line 1226 in lib/search.php you set $meta_query to boolean false, instead of an empty array.
I have queries that pass through without populating $meta_query at all, thus generating the following error:
[12-Sep-2023 12:58:11 UTC] PHP Fatal error: Uncaught ErrorException: Automatic conversion of false to array is deprecated in /<path occluded>/plugins/relevanssi-premium/lib/search.php:1280
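The underlying deprecation can be illustrated outside WordPress. This is a minimal sketch, assuming a hypothetical helper collect_meta_clauses() (not a Relevanssi function) that appends clauses in the way the report describes. Starting from an empty array rather than false avoids the automatic false-to-array conversion that PHP 8.1 reports as deprecated:

```php
<?php
/**
 * Hypothetical helper for illustration only, not part of Relevanssi.
 * Collects meta query clauses from a set of query variables.
 */
function collect_meta_clauses( array $query_vars ): array {
    // Start from an empty array. Starting from false and then appending
    // with $meta_query[] = ... triggers the PHP 8.1 deprecation notice.
    $meta_query = array();

    if ( ! empty( $query_vars['meta_key'] ) ) {
        $meta_query[] = array(
            'key'   => $query_vars['meta_key'],
            'value' => $query_vars['meta_value'] ?? '',
        );
    }

    // Callers always receive an array, even when nothing was populated.
    return $meta_query;
}
```

With this shape, a query that never populates the variable still yields an empty array, which can be appended to safely later on.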
I have two million records in the wp_relevanssi_log table, and I found that the relevanssi_total_queries function performs very badly on each query.
My solution is to add an index on the time column. With that index, the queries could change from
[SELECT COUNT(id) FROM $log_table WHERE TIMESTAMPDIFF(DAY, time, NOW()) <= 1;]
to
[SELECT COUNT(id) FROM $log_table WHERE TIMESTAMPDIFF(DAY, time, NOW()) <= 1 and time >= date_sub(now() , interval 2 day );]
When the relevanssi table has millions of records, ANALYZE TABLE becomes slower and slower, especially when rebuilding the index for all posts.
Maybe just move the command to right before indexing completes?
// To prevent empty indices.
$wpdb->query( "ANALYZE TABLE $relevanssi_table" ); // phpcs:ignore WordPress.DB.PreparedSQL.NotPrepared,WordPress.DB.PreparedSQL.InterpolatedNotPrepared
$complete = false;
$size = $indexing_query_args['size'];
if ( ( 0 === $size ) || ( count( $content ) < $size ) ) {
$complete = true;
$wpdb->query( "ANALYZE TABLE $relevanssi_table" ); // <-- move to here
update_option( 'relevanssi_indexed', 'done', false );
// Update the document count variable.
relevanssi_async_update_doc_count();
}
If Relevanssi is enabled in the admin searches, searching for media in the Media Library grid view is broken. While I'm figuring out what's up with this, there are two solutions to this problem:
If custom fields are set to index "visible" or "all", this will include ct_builder_shortcodes_revisions, which will mean old content will be indexed and used in excerpts. Relevanssi needs to make sure this custom field is always excluded.
Hi there,
We are currently testing your Plugin's free version and I have to say - We are impressed!
We consider buying the premium lifetime license but there is one requirement missing in the current state of the plugin - compatibility with the Elementor Pro Posts Widget.
More specifically, the custom excerpt feature is not compatible with the excerpt displayed in the Posts widget. If I understand correctly, you are storing the excerpt in the main query's WP_Query object in the excerpt / post_excerpt attribute, which should be output upon invoking the_excerpt. Elementor does it the recommended way, yet it still does not work.
Maybe I am missing something and I am unsure if this is the right place for such an issue as it may possibly be caused by Elementor (I have opened an issue on the Elementor Github as well https://github.com/orgs/elementor/discussions/27316).
If any further information is required on the topic I am happy to oblige so feel free to discuss this matter at any time.
Thanks for your time.
We had a few incidents over the last week where empty search queries started tying up the database, with the Relevanssi search query the only one visible in the MariaDB 10.3 process list.
I am also able to reproduce a very inefficient DB query manually: a query like https://<site>/?s with no parameter value takes 5 seconds to complete.
There seems to be a "Redirect" feature in recent Relevanssi (Premium?), for "empty search terms", but based on the time it takes to redirect, it also performs this slow query first, then redirects.
PS we only display 10 results at a time, with numbered paging. Should I configure the Relevanssi search throttle to 10? It seems to make no sense to return 500 rows, only to discard 490 every time 🤔
Hi there - first of all, thanks for an immensely helpful plug-in. Love it.
That said, I have a small bug to report. Not sure if this really is Relevanssi's doing, but I think so.
Problem
Whenever "highlight in documents" is enabled, any code blocks on the page break (
is not parsed properly).
With this configuration:
![image](https://user-images.githubusercontent.com/4086860/131980196-d0642778-008f-4fb2-90fc-f4d5e9c3e510.png)
A page with "highlight" in URL will look like this:
![image](https://user-images.githubusercontent.com/4086860/131980243-0e3a91b1-733c-4659-bda1-7bc19680c934.png)
Whenever either highlighting is disabled or the URL parameter removed, it works properly:
![image](https://user-images.githubusercontent.com/4086860/131980292-ec2d1592-a553-4124-bdd1-fee7fa2a1465.png)
Any thoughts?
I came across this bug today.
If you use a WP_Query in combination with the relevanssi_do_query function, the resulting items will be objects of the type stdClass instead of the type WP_Post.
This bug only happens in the premium version, the free version behaves like expected.
Premium version: 2.23.0
Free version: 4.20.0
PHP: 8.0
Example code
$args = [
    's' => $query,
    'post_type' => isset($_GET['view']) && $_GET['view'] === 'product' ? 'product' : 'any',
    'post_status' => 'publish',
    'posts_per_page' => -1,
    'orderby' => 'date',
    'order' => 'DESC',
];
$query = new \WP_Query();
$query->parse_query($args);
$items = relevanssi_do_query($query);
Expected: $items should be an array of WP_Post objects (if there are results)
Result: $items is an array of stdClass
Hello,
I'm trying to exclude some Gutenberg blocks from Relevanssi indexing by adding a filter with the following code :
add_filter('relevanssi_render_block', 'exclude_block_from_search_render');
function exclude_block_from_search_render($block) {
error_log('TEST RENDER_BLOCK');
}
The problem is that this function is never called during searches. I've also tried the "relevanssi_hits_filter" filter, which is called, but that's not the one I need.
Is this a premium feature? If not, how can I correct this error?
Thanks for your help
First of all, thanks for the great plugin!
We found a bug that causes the HTML to be invalid when highlighting: data-attributes are caught incorrectly by the regular expression.
We've got these "Search hit highlighting" settings:
In our theme we tend to use data-attributes without values to create bindings for JavaScript, like so:
<a
data-tile
data-tile-type="file"
>
<p data-tile-file-heading>...</p>
<p data-tile-file-filesize-and-ext>pdf</p>
</a>
Given the settings, if you go to the page with that HTML and add the highlight URL param at the end (?highlight=string), this HTML will be converted to this form:
<a
data-tile
data-tile-type="file"
>
<p data-tile-file-heading>…</p>
<p data-tile-file-filesize-and-ext>pdf</p>
</a>
Which is of course invalid.
This is a problem with the not-so-perfect regular expression found here:
relevanssi/lib/excerpts-highlights.php
Line 1569 in c72c448
This regular expression does a better job:
if ( preg_match_all( '/data-[\w-]+?="([^"]*?)"/sm', $content, $matches ) ) {}
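To see how the proposed pattern treats valueless data-attributes, here is a small standalone check. It only exercises the regular expression suggested above against the sample markup from this report. It does not reproduce Relevanssi's highlighting code, and the variable names are illustrative.

```php
<?php
// Sample markup from the report: two valueless data-attributes on the
// paragraphs, one valueless and one valued data-attribute on the link.
$content = <<<HTML
<a
data-tile
data-tile-type="file"
>
<p data-tile-file-heading>...</p>
<p data-tile-file-filesize-and-ext>pdf</p>
</a>
HTML;

// The pattern proposed in the report: an attribute only matches when it
// is followed by ="...", so valueless attributes are skipped.
$count = preg_match_all( '/data-[\w-]+?="([^"]*?)"/sm', $content, $matches );

// $matches[0] holds the full attribute strings, $matches[1] the values.
print_r( $matches[0] );
```

On this input only data-tile-type="file" is matched, so data-tile and the two valueless attributes on the paragraphs are left alone.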
Hi,
I've noticed a minor bug when search results return more than one page. On the bottom of the page there is only a "previous" link. When clicking the link it takes me to page 2. When on page 2 I can click on "next" which takes me back to page 1.
Also btw. is there an infinite scrolling option? Could not find one in settings.
@msaari can you tag the v3.6 release? I think it caused an issue and I would like to compare to see what happened.
When a synonym is found it is included in the search log.
reproduce:
I have found where to fix this, but I can see several approaches and there are probably more.
a) Don't log the synonym word
b) Do log the synonym but mark it in some way: round (synonym:circle)
c) Add a new column to the log table and log it separately.
d) ...?
I'm willing to work on this, but I'd like to discuss the best way to do this before starting.
Hello,
I ran into the following issue: when there is no search term, the tokenizer kills the search even when a tax_query is provided.
Use case: search for specific terms from additional search fields to filter results based on taxonomy.
It seems that line 387 of /lib/search.php should test $search_ok before killing the search. Then on line 831, if ( $exact_match_bonus ) should change to if ( $exact_match_bonus && !empty($q) ), because $q is empty.
Thanks!
@msaari I thought it would be good to mention this here in case people were looking for the issue and were able to fix it by disabling relevanssi.
It is a conflict with the Yoast Seo Plugin v5.0.2, a working temporary fix would be to downgrade to v5.0.1. The issue is already open @ Yoast/wordpress-seo#7480
Hey there!
I belong to an open source security research community, and a member (@geeknik) has found an issue, but doesn’t know the best way to disclose it.
If not a hassle, might you kindly add a SECURITY.md file with an email, or another contact method? GitHub recommends this best practice to ensure security issues are responsibly disclosed, and it would serve as a simple instruction for security researchers in the future.
Thank you for your consideration, and I look forward to hearing from you!
(cc @huntr-helper)
If I type in the homepage address plus “/?posts_per_page=83” I am getting results and it is saving it in google search. If I turn off “Relevanssi” it does not display these results. I want it to NOT display results and also ignore any kind of indexing by google if I type in “myhomepageurl/?posts_per_page=83”. How do I disable this feature but keep the plugin at the same time?
Thanks
Notice: Undefined variable: content in /wp-content/plugins/relevanssi/lib/privacy.php on line 46
appears on admin pages if logging is disabled. There's a variable definition missing if logging is disabled.
It seems there may be a slight bug in Relevanssi since the incorporation of Oxygen Builder. When using ACF and Oxygen Builder, the custom fields for ACF are not indexed. You are also unable to change the custom fields drop-down in Relevanssi to anything other than "some" (which looks to pre-populate the field with "ct_builder_shortcodes").
When you disable Oxygen Builder, you're able to change the fields and Relevanssi will index ACF fields. Once you turn Oxygen Builder back on, the same thing happens again (but if you don't rebuild your index, it works).
Any chance to rectify this bug so that ACF and Oxygen builder can function and be indexed when both are in use?
With Relevanssi enabled, the YITH Badge Management plugin does not return badge results. The YITH support team advised me that with Relevanssi disabled everything works fine and that I should contact you to resolve the problem (I'm uncertain whether the problem is in the YITH plugin or in Relevanssi).
It seems like relevanssi_new_blog() hasn't been added to either the wp_insert_site or the wpmu_new_blog actions and thus never runs on new blogs.
The Relevanssi (Premium) plugin includes inline scripts/styles that can cause issues with setting a Content Security Policy (CSP).
This issue proposes: wp_print_inline_script_tag() or wp_add_inline_style()
Examples:
inline <script> tag
Inline style tag in /premium/templates/relevanssi-related.php
I probably found a bug in search.php:1980.
It seems that the relevanssi_post_date_throttle_join function should add something to the join statement but instead replaces the join statement.
If I replace the assignment with a concatenating assignment operator, our error is gone:
function relevanssi_post_date_throttle_join( $query_join ) {
    if ( 'post_date' === get_option( 'relevanssi_default_orderby' ) &&
        'on' === get_option( 'relevanssi_throttle', 'on' ) ) {
        global $wpdb;
        $query_join .= ', ' . $wpdb->posts . ' AS p';
    }
    return $query_join;
}
Recent changes in commit 76b87b7 potentially generate faulty SQL queries. Check the changes in lines 324 to 327.
I've discovered that if a previously deleted term contains a number in its slug, then the validation in line 321 still passes and the string slug is used as a numeric id, leading to a SQL statement where the term is parsed as a non-existing column.
What if is_numeric is to be replaced with ctype_digit instead? This should correctly determine if the slug only contains numeric characters.
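The difference between the two checks can be seen in isolation. The sketch below uses a hypothetical helper looks_like_term_id() (not a Relevanssi function) and made-up sample values. It only shows how is_numeric() and ctype_digit() classify strings, not how the plugin resolves terms.

```php
<?php
/**
 * Hypothetical helper for illustration: treats a value as a term ID only
 * if it consists solely of the digits 0-9.
 */
function looks_like_term_id( $value ): bool {
    // Cast first: ctype_digit() is meant to be called with a string.
    return ctype_digit( (string) $value );
}

// Made-up sample values: a plain ID and several slug-like strings.
$samples = array( '36', '1e5', '12.5', '-3', '2023-news' );

foreach ( $samples as $sample ) {
    printf(
        "%-10s is_numeric=%s ctype_digit=%s\n",
        $sample,
        var_export( is_numeric( $sample ), true ),
        var_export( looks_like_term_id( $sample ), true )
    );
}
```

Strings such as 1e5, 12.5 and -3 pass is_numeric() but fail ctype_digit(). Note that a slug made up entirely of digits passes both checks, so ctype_digit() narrows the problem rather than removing it.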
Hi,
I've recently had an issue with the exact match boost setting not working properly in some cases, and after digging in the code I believe I found the issue.
Basically, to apply the exact match boost the search query is matched against the post title using stristr():
Lines 1433 to 1438 in 58391b5
However, when using capital accented letters, they are not properly converted into lowercase by that function. For example, this will return false: stristr( 'CAFÉ', 'café' ).
I believe using mb_stristr() instead of stristr() would fix this.
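The behaviour is easy to check in isolation. This is a minimal sketch, assuming the mbstring extension is available, the default "C" locale is in effect, and the file is saved as UTF-8. The variable names are illustrative and not taken from Relevanssi.

```php
<?php
$title = 'CAFÉ';
$query = 'café';

// stristr() only folds case for ASCII bytes here, so the multibyte É and é
// never compare as equal and the search fails.
$byte_match = stristr( $title, $query );

// mb_stristr() folds case per character in the given encoding.
$mb_match = mb_stristr( $title, $query, false, 'UTF-8' );

var_dump( $byte_match ); // false
var_dump( $mb_match );   // the matched portion of the title
```

Passing the encoding explicitly keeps the result independent of the configured internal encoding.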
Hi, I have installed Relevanssi on a site being built here: https://www.applifting.ml
But it doesn't bring up the required portfolio post types.
It might be a bug with the theme, as the other search also only brings up normal post types.
I think it is still using the default search, not Relevanssi.
Is there a shortcode I can insert where the default search form currently is?
Thanks
I tried to wrap the cron indexing function in a small WP-CLI function and am running into a PHP notice.
It looks like $progress is not defined outside the Premium version but is still checked in the free version?
Hello!
Before I say anything else, I want to thank you for publishing such a useful and well-documented plugin. I use Relevanssi free and premium for a number of clients. Thank you.
I've read your comments in a number of places stating that at a certain point Relevanssi should not handle very large data sets. I absolutely agree, but I am always interested in finding ways to increase that limit, if possible.
One of the biggest bottlenecks I've found is the database query: SELECT COUNT(DISTINCT(doc)). (code reference)
I understand this is important to calculate weights and relevance, but it seems like this can be moderately accurate and still achieve similar results. Is that true?
If so, could this count be deferred to a scheduled task that updates the option? This would increase performance pretty significantly for all uses, but especially for sites with large indexes.
This kind of tax_query does not work:
$args = array(
    'tax_query' => array(
        'relation' => 'OR',
        array(
            'taxonomy' => 'category',
            'field' => 'term_id',
            'terms' => array( 3, 36 ),
            'operator' => 'AND',
        ),
        array(
            'taxonomy' => 'category',
            'field' => 'term_id',
            'terms' => array( 30, 36 ),
            'operator' => 'AND',
        ),
    ),
    's' => 'terms',
);
I.e. two AND queries joined together with an OR. The OR is ignored, and this becomes a query for posts that have taxonomies 3, 36, 30 and 36. The relevanssi_process_term_tax_ids() function doesn't handle this correctly, even though it gets reasonable data.
Raised at https://wordpress.org/support/topic/tax_query-relation-not-work/
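The intended semantics of that tax_query can be spelled out in plain PHP. This is a hypothetical illustration only: post_matches_any_clause() is not a Relevanssi or WordPress function, and it works on bare arrays of term IDs instead of a real query.

```php
<?php
/**
 * Hypothetical illustration, not Relevanssi code.
 * Each clause lists term IDs that must ALL be on the post (operator AND);
 * the clauses themselves are combined with OR.
 */
function post_matches_any_clause( array $post_term_ids, array $clauses ): bool {
    foreach ( $clauses as $required_ids ) {
        // array_diff() is empty when every required ID is on the post.
        if ( array() === array_diff( $required_ids, $post_term_ids ) ) {
            return true;
        }
    }
    return false;
}

// The two AND clauses from the tax_query above.
$clauses = array( array( 3, 36 ), array( 30, 36 ) );

var_dump( post_matches_any_clause( array( 3, 36 ), $clauses ) ); // bool(true)
var_dump( post_matches_any_clause( array( 3, 30 ), $clauses ) ); // bool(false)
```

Under the behaviour described in the report, a post with only terms 3 and 36 would be rejected because all of 3, 36 and 30 are required, whereas the OR relation should accept it.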
The use of stripslashes causes NULL bytes to appear in text. When this text is fed into preg_match it causes warnings. Example:
Mar 14 16:59:10 ip-10-36-94-105 apache2[6306]: PHP Warning: preg_match(): Null byte in regex in /.../wp-content/plugins/relevanssi-premium/lib/excerpts-highlights.php on line 525
Mar 14 16:59:10 ip-10-36-94-105 apache2[6306]: PHP Warning: preg_match(): Null byte in regex in /.../wp-content/plugins/relevanssi-premium/lib/excerpts-highlights.php on line 529
I wasn't able to reproduce the warnings, but I was able to show that null bytes get added (in both relevanssi and relevanssi-premium). Steps to reproduce:
Install relevanssi (or relevanssi-premium).
Edit lib/common.php: add echo json_encode($string); after this line: function relevanssi_tokenize( $string, $remove_stops = true, $min_word_length = -1 ) {
Search for hello\\0world, i.e. http://localhost/?s=hello%5C%5C0world
"hello\\0world""hello\u0000world" will appear on the page. \u0000 is the JSON-encoded NULL byte. In some circumstances this string is passed to preg_match, which causes warnings.
This happens because the stripslashes function not only strips slashes, it also replaces \0 with a NULL byte. This feature is undocumented; you can see it here in PHP's source: https://github.com/php/php-src/blob/php-7.2.3/ext/standard/string.c#L3616-L3651
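The stripslashes behaviour can be confirmed with a few lines of standalone PHP. The last step, which drops NULL bytes before the string is used in a pattern, is only one possible mitigation, shown here as an assumption and not as the fix adopted by the plugin.

```php
<?php
// Single quotes: this is the 12-character string hello\0world,
// i.e. a literal backslash followed by the digit 0.
$input = 'hello\\0world';

// stripslashes() turns the two characters \0 into a single NULL byte.
$stripped = stripslashes( $input );

echo json_encode( $input ), "\n";    // "hello\\0world"
echo json_encode( $stripped ), "\n"; // "hello\u0000world"

// Possible mitigation (assumption, not the plugin's code): remove NULL
// bytes before the string reaches a regular expression.
$safe = str_replace( "\0", '', $stripped );
```

The two json_encode() lines reproduce the output quoted in the steps above.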