algolia / docsearch-configs



DocSearch - Configurations

Home Page: https://docsearch.algolia.com/

License: MIT License

JavaScript 100.00%

algolia documentation search docsearch

docsearch-configs's Introduction

DocSearch configurations

DEPRECATED

This repository is not maintained anymore in favor of our new infrastructure.

All of the configs can now be edited directly from our web interface, which also offers you a way to start new crawls.

If you have not joined your new application yet, please check your emails! :D

Summary

If you're looking for a way to add DocSearch to your site, the easiest solution is to apply to DocSearch. If you want to have a look at configurations to run your own scraper, you're in the right place.

Options

Please check the dedicated documentation to have the list of all available options along with examples.

Useful links


docsearch-configs's Issues

Issue with hierarchy config?

Go to https://www.lightningdesignsystem.com
Search for "tiles"

Expected: first results include a reference to https://www.lightningdesignsystem.com/components/tiles/

Actual

Expected

Do you know why the actual "Tiles" component doesn't rank higher than other components, considering the h1 of the component page says "Tiles"?

Display search results content in the correct format

Do you want to request a feature or report a bug?

Feature request
index name: gitlab.json

What is the current behaviour?

The output of {{{_highlightResult.content.value}}} shows only text, as if the results were just a paragraph:

What is the expected behaviour?

Show the headers:

Display code in code blocks, like Algolia does for their docs:

What have you tried to solve it?

I don't know how to solve this and couldn't find an answer anywhere; I'm probably searching with the wrong query.

@s-pace thoughts?

Why delete babeljs_cn?

Hey, guys! I'm the maintainer of babeljs.cn. I found that the DocSearch config for babeljs_cn was deleted, but I don't know why. Please tell me why and how to solve it.
commit: e0cc0b8

Feather.js

http://docs.feathersjs.com/

Update Scalingo config to match new website

As reported in a tweet, the configuration for Scalingo needs an update.

Free code camp

https://www.freecodecamp.com/wiki/en/

Composer

There is a lot of nice content, but everything is hard to find.

https://getcomposer.org/

GoLang

https://golang.org/

"Examples of great documentation"

https://dev.to/ben/what-are-some-examples-of-great-documentation/comments

Interesting list for potential DocSearch candidates.

Reindexing after domain changed

Hi,

Here is my config file: https://github.com/algolia/docsearch-configs/blob/master/configs/uniwebview.json

I changed my domain from "unidocs.onevcat.com" to "docs.uniwebview.com" several days ago. However, the search results still link to the old domain.

I guess the old index is still in use and the new one is not valid yet.

I'd like to confirm: is there anything like an expiry period for an index? What is the reindexing policy, and is it possible to request an immediate reindex?

Thanks!

Add documentation on `page_rank`

GitKraken

https://support.gitkraken.com/

Suggested by @DirtyF

Update drone_io documentation

hey all, I was hoping for some help updating the indexing for the drone_io documentation. We have moved the documentation:

{
  "index_name": "drone_io",
  "start_urls": [
-   "http://readme.drone.io/"
+   "http://docs.drone.io/sitemap/"
  ],

We have also adjusted the structure. I was hoping that perhaps Algolia could crawl the documentation using the sitemap only (at http://docs.drone.io/sitemap/). This would allow us to make structural changes to the main documentation, without having to re-configure the crawling, since the sitemap structure would never change.

Do you think this would be possible?

I would offer to submit a pull request but was unsure if the below notation was correct, and I was having trouble setting up an environment to test myself (I will keep trying, though)

-    "lvl0": "header nav a.selected",
-    "lvl1": "main h1",
-    "lvl2": "main h2",
+    "lvl0": "body > ul > li > span",
+    "lvl1": "body > ul > li > ul > li > span",
+    "lvl2": "body > ul > li > ul > li > ul > li > a",
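To make the proposed selectors easier to reason about, here is a minimal sketch of how the three child chains would map onto a nested-list sitemap page. The markup, section names, and URLs below are hypothetical stand-ins (the real sitemap page may differ), and the standard-library `xml.etree.ElementTree` is used only for illustration; it is not the scraper's own selector engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the sitemap page; real markup may differ.
SITEMAP_HTML = """
<html><body>
  <ul>
    <li><span>Usage</span>
      <ul>
        <li><span>Secrets</span>
          <ul>
            <li><a href="/usage/secrets/encrypted/">Encrypted secrets</a></li>
            <li><a href="/usage/secrets/plugins/">Secret plugins</a></li>
          </ul>
        </li>
      </ul>
    </li>
  </ul>
</body></html>
"""

# The CSS child chains from the proposal, rewritten as ElementTree paths
# ("body > ul > li > span" becomes "body/ul/li/span").
LEVEL_PATHS = {
    "lvl0": "body/ul/li/span",
    "lvl1": "body/ul/li/ul/li/span",
    "lvl2": "body/ul/li/ul/li/ul/li/a",
}


def extract_levels(html_text):
    """Return the text matched by each hierarchy level, in document order."""
    root = ET.fromstring(html_text)
    return {
        level: [el.text.strip() for el in root.findall(path)]
        for level, path in LEVEL_PATHS.items()
    }


levels = extract_levels(SITEMAP_HTML)
for level, texts in levels.items():
    print(level, texts)
```

Because every step in the chain is a direct-child step, each level only matches at one exact nesting depth, so a deeper or shallower list on the sitemap page would silently drop out of the hierarchy.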

Silex

http://silex.sensiolabs.org/doc/master/

ExpressJS

http://expressjs.com/

stripe.com - relevance issue

Moved to private repo

public "metadata" in the docsearch configs?

It would be useful for tracking and reporting purposes to have some fields in the configs that aren't necessarily used by the scraper. Immediately I'd like to add:

name: Canonical, human-readable label (such as company or project name) for a given config.
human_url: A URL that a human could go to and get the documentation site.
I'm not married to those key names - suggestions welcome. 😄
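As a rough illustration of the idea, the sketch below keeps the two proposed keys alongside the scraper settings in one file and separates them when the config is loaded. The helper `split_config` and the example values are hypothetical; only the key names `name` and `human_url` come from the proposal above.

```python
import json

# Keys proposed in this issue; the names are not final.
METADATA_KEYS = {"name", "human_url"}


def split_config(config):
    """Return (scraper_settings, metadata) from one config dict."""
    metadata = {k: v for k, v in config.items() if k in METADATA_KEYS}
    scraper = {k: v for k, v in config.items() if k not in METADATA_KEYS}
    return scraper, metadata


# Example config with made-up values.
config = json.loads("""
{
  "index_name": "example",
  "start_urls": ["https://docs.example.com/"],
  "name": "Example Project",
  "human_url": "https://docs.example.com/"
}
""")

scraper_settings, metadata = split_config(config)
print(metadata)
```

Keeping the metadata in the same file means reporting tools can read it without a second lookup table, while the scraper only ever sees the keys it understands.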

Atom

https://atom.io/docs

Puppet (open source version)

The base reference manual: https://docs.puppet.com/puppet/latest

Note that Puppet Server, Hiera, and PuppetDB are split out into their own components (see https://docs.puppet.com/puppet/).

Segment

https://segment.com/docs

Select2

https://select2.github.io/

URLs not on sitemap are indexed

Do you want to request a feature or report a bug?

If it is a DocSearch index issue, what is the related `index_name` ?

index_name= pkgdown

What is the current behaviour?

Files that are not in the sitemap.xml are included in the index.

What is the expected behaviour?

Files that are not in the sitemap.xml should not be included in the index.

Summary

The pkgdown index includes the "Contributor Code of Conduct" page, which is not in the sitemap.xml.

To reproduce, go to the pkgdown website and search for "Contributor"; it's the first result.

The pkgdown config lists the sitemap.xml. Why is this page (and presumably other unwanted pages) included in the index?

From r-lib/pkgdown#626 (comment)
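One way to narrow this down is to compare the URLs found in the index against the URLs listed in the sitemap. The sketch below assumes the indexed URLs have already been exported into a plain list; the sample sitemap and URLs are made up, and `urls_outside_sitemap` is a hypothetical helper, not part of DocSearch.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap.xml files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Made-up sitemap for illustration.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.org/index.html</loc></url>
  <url><loc>https://example.org/reference/index.html</loc></url>
</urlset>"""


def sitemap_urls(sitemap_xml):
    """Return the set of page URLs listed in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return {loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)}


def urls_outside_sitemap(indexed_urls, sitemap_xml):
    """Return indexed page URLs that the sitemap does not list."""
    allowed = sitemap_urls(sitemap_xml)
    # Drop any "#anchor" part so records for headings map back to their page.
    pages = {url.split("#", 1)[0] for url in indexed_urls}
    return sorted(pages - allowed)


# Made-up export of URLs found in the index.
indexed = [
    "https://example.org/index.html#installation",
    "https://example.org/reference/index.html",
    "https://example.org/CODE_OF_CONDUCT.html",
]
extras = urls_outside_sitemap(indexed, SAMPLE_SITEMAP)
print(extras)
```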

How to configure for <code> block

I am having a hard time figuring out what I need to configure so the scraper can index the <code> blocks on my site.

This site uses the same template as mine. How would I index the code listed on it: https://myclabs.github.io/jquery.confirm/

Lodash

https://lodash.com/docs

Please crawl more pages for nim-lang

I love you for supporting nim-lang, but after trying the search I have to say: it's still not something that we can put on our website since the search results are too poor. For example, searching for 'split' doesn't return a result like http://nim-lang.org/docs/strutils.html#split,string,set[char],int

So lib.html is crawled, but none of Nim's actual stdlib is. I would create a PR if only I knew how to do it.

Electron

http://electron.atom.io/docs/

Parse same website with 2 designs

Hi Algolia team !

I have a question about this configuration
https://github.com/algolia/docsearch-configs/blob/master/configs/akeneo.json

We will introduce a new design (same as https://api.akeneo.com/ ) for the v2.0 (the old design is still here : https://docs.akeneo.com/2.0/index.html)

So the parsing configuration between paths https://docs.akeneo.com/1.x/ and https://docs.akeneo.com/2.x/ will not be the same.

My question is: is it possible to have 2 configurations (one for 1.x and one for 2.x) ? If not, don't worry, we will bring back the new design for the previous 1.x paths.

Regards

Pierre
Akeneo
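The question boils down to choosing a selector set per URL. A minimal sketch of that idea, assuming one selector set per version prefix, could look like the following. The CSS selectors are placeholders rather than Akeneo's real markup, and `selectors_for` is a hypothetical helper, not the scraper's API.

```python
# Hypothetical selector sets keyed by URL prefix; the selectors are placeholders.
SELECTOR_SETS = {
    "https://docs.akeneo.com/1.": {"lvl0": "h1", "lvl1": "h2", "text": "p"},
    "https://docs.akeneo.com/2.": {"lvl0": "main h1", "lvl1": "main h2", "text": "main p"},
}


def selectors_for(url, selector_sets=SELECTOR_SETS, default=None):
    """Pick the selector set whose URL prefix matches, preferring the longest prefix."""
    matches = [prefix for prefix in selector_sets if url.startswith(prefix)]
    if not matches:
        return default
    return selector_sets[max(matches, key=len)]


print(selectors_for("https://docs.akeneo.com/2.0/index.html"))
print(selectors_for("https://docs.akeneo.com/1.7/index.html"))
```

Preferring the longest matching prefix leaves room for a more specific set later (for example a single section with its own layout) without reordering the existing entries.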

Webpack

http://webpack.github.io/docs/

VideoJS

http://docs.videojs.com/

BrunchJS

http://brunch.io/

Add `nb_hits_max` to documentation

It looks like a new parameter nb_hits_max has been introduced to the scraper. It would be great if information regarding this parameter is included in the documentation here 😄

Search not detecting lvl0 selector on some pages

For https://thumbprint.thumbtack.com our lvl0 selector looks like:

"lvl0": {
  "selector": "//*[@data-id='header__links']//a[@data-active='true']",
  "type": "xpath",
  "default_value": "Documentation"
},

https://github.com/algolia/docsearch-configs/blob/master/configs/thumbprint.json

When I search for the page title "Using Thumbprint in Sass" — https://thumbprint.thumbtack.com/guide/creating-pages/ — the search result correctly categorizes it under "Guide".

But if I search for the page titles of the following pages:

It categorizes them under "Documentation" instead of "Guide".

In this screenshot, "https://thumbprint.thumbtack.com/guide/utility-classes/" is among the results; note that it's categorized under "Documentation".

I've confirmed the lvl0 xpath works on those pages so am not sure what would have caused it to fail. Maybe your crawler searched cached pages that didn't have this selector available?
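The behaviour described above is consistent with the selector matching nothing in the HTML the crawler received, so that `default_value` is used instead. The sketch below illustrates that fallback under this assumption; it is inferred from the option name, not taken from the scraper's source. The two sample pages are hypothetical, and the standard-library `xml.etree.ElementTree` stands in for the scraper's XPath engine.

```python
import xml.etree.ElementTree as ET

# The lvl0 entry from the config above.
LVL0 = {
    "selector": "//*[@data-id='header__links']//a[@data-active='true']",
    "type": "xpath",
    "default_value": "Documentation",
}

# Hypothetical page where the active header link is marked in the HTML.
ACTIVE_PAGE = (
    "<html><body><nav data-id='header__links'>"
    "<a href='/guide/' data-active='true'>Guide</a>"
    "<a href='/components/' data-active='false'>Components</a>"
    "</nav></body></html>"
)

# Hypothetical page where no link carries data-active='true'.
INACTIVE_PAGE = (
    "<html><body><nav data-id='header__links'>"
    "<a href='/guide/' data-active='false'>Guide</a>"
    "<a href='/components/' data-active='false'>Components</a>"
    "</nav></body></html>"
)


def resolve_level(page_xml, level_cfg):
    """Return the text of the first match, or default_value if nothing matches."""
    root = ET.fromstring(page_xml)
    # ElementTree only accepts relative paths, hence the leading ".".
    match = root.find("." + level_cfg["selector"])
    if match is not None and match.text and match.text.strip():
        return match.text.strip()
    return level_cfg["default_value"]


print(resolve_level(ACTIVE_PAGE, LVL0))
print(resolve_level(INACTIVE_PAGE, LVL0))
```

If the `data-active` attribute is only set by client-side JavaScript, a crawler that reads the raw HTML would see something like the second page, which would explain the "Documentation" label even though the XPath works in a browser.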

Vorlon.js

http://vorlonjs.com/documentation/

Leafletjs

http://leafletjs.com/reference.html

Heroku Dev Center

https://devcenter.heroku.com/

CI that runs on PRs of configs

If it's an existing config, run that one and suggest to update the nbHits in a comment. You could use Danger for this, since it's an easy way to run some things and comment on GitHub.

Another thing that could be checked is whether it's valid JSON, and if it's a valid docsearch config (by checking if necessary keys are present, and if they have the right values)
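A minimal sketch of the second check, which a CI job could run on each changed config file, might look like this. The list of required keys and their types is an assumption for illustration; the scraper's real requirements may differ, and `validate_config` is a hypothetical helper.

```python
import json

# Assumed minimum for a config; the scraper's real requirements may differ.
REQUIRED_KEYS = {"index_name": str, "start_urls": list, "selectors": dict}


def validate_config(text):
    """Return a list of problems found in one config file's text (empty if none)."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg} (line {exc.lineno})"]
    if not isinstance(config, dict):
        return ["top level must be a JSON object"]
    problems = []
    for key, expected in REQUIRED_KEYS.items():
        if key not in config:
            problems.append(f"missing key: {key}")
        elif not isinstance(config[key], expected):
            problems.append(f"{key} must be a {expected.__name__}")
    return problems


# Made-up configs for illustration.
good = '{"index_name": "example", "start_urls": ["https://example.com/"], "selectors": {"lvl0": "h1"}}'
incomplete = '{"index_name": "example", "start_urls": "https://example.com/"}'
broken = '{"index_name": "example",'

for sample in (good, incomplete, broken):
    print(validate_config(sample))
```

Returning a list of problems rather than raising on the first one lets the CI comment report everything wrong with a config in a single pass.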

JQuery

http://api.jquery.com/

Ruby docs

Search seems currently broken on https://ruby-doc.org/core/

If only they could add DocSearch to their build pipeline 🙏

NodeJS

https://nodejs.org/en/docs/

ThreeJS

http://threejs.org/docs/index.html

MomentJS

http://momentjs.com/docs/

Indexing ellipsis

Do you want to request a feature or report a bug?

More feature than bug

If it is a DocSearch index issue, what is the related `index_name` ?

index_name= pkgdown

What is the current behaviour?

Many R functions use an ellipsis (...) to catch arguments in function calls. In a pkgdown documentation website these appear in the argument list (see third argument on this page).

We capture most arguments and their values with the current DocSearch pkgdown config, but ellipses are not indexed because they are considered punctuation.

What is the expected behaviour?

I'd like ... to be included in search results as an argument value.

What have you tried to solve it?

I have tried including a period (.) in separatorsToIndex, but ellipses are still not indexed.

Smooch.io

http://docs.smooch.io/ios/
