
grimoirelab-elk's People

Contributors

alangsto, albertinisg, alpgarcia, animeshk08, aswanipranjal, canasdiaz, dependabot[bot], dicortazar, dpose, evamillan, georglink, imnitishng, inishchith, jgbarah, jjmerchante, kshitij3199, lukaszgryglicki, mafesan, obaroikoh, propol, rafaeltheraven, rashmi-k-a, sduenas, shanchenqi, snack0verflow, valeriocos, vchrombie, vsevagen, willemjiang, zhquan


grimoirelab-elk's Issues

NoneType error analyzing Maniphest

How to reproduce:

  • version: elasticgirl.23
  • sources: Bitergia's phabricator

Traceback:

2017-11-28 16:46:43,365 - grimoire_elk.arthur - ERROR - Traceback (most recent call last):
  File "./grimoire_elk/arthur.py", line 489, in enrich_backend
    enrich_count = enrich_items(ocean_backend, enrich_backend)
  File "./grimoire_elk/arthur.py", line 304, in enrich_items
    total= enrich_backend.enrich_items(ocean_backend)
  File "./grimoire_elk/elk/enrich.py", line 376, in enrich_items
    rich_item = self.get_rich_item(item)
  File "./grimoire_elk/elk/phabricator.py", line 254, in get_rich_item
    self.__fill_phab_ids(item['data'])
  File "./grimoire_elk/elk/phabricator.py", line 235, in __fill_phab_ids
    self.phab_ids_names[p['phid']] = p['name']
TypeError: 'NoneType' object is not subscriptable
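A minimal guard would skip null entries before indexing them; the sketch below is illustrative only (the helper name and data shape mirror `__fill_phab_ids`, but this is not the merged fix):

```python
def fill_phab_ids(phab_ids_names, projects):
    # Maniphest can return null entries in the projects list; skip them
    # instead of subscripting None.
    for p in projects:
        if p is None:
            continue
        phab_ids_names[p['phid']] = p['name']
    return phab_ids_names

# Example: one valid project plus a null entry that used to crash
ids = fill_phab_ids({}, [{'phid': 'PHID-PROJ-1', 'name': 'core'}, None])
print(ids)
```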

Perceval in GrimoireELK

I followed the instructions here to set up a virtual environment for GrimoireELK: https://grimoirelab.gitbooks.io/training/content/grimoireelk/installation.html

When running perceval, I get this message:

Traceback (most recent call last):
  File "/home/assadm/venvs/grimoireelk/bin/perceval", line 82, in <module>
    """%(prog)s """  + perceval.__version__
AttributeError: module 'perceval' has no attribute '__version__'

I followed the previous tutorial, https://grimoirelab.gitbooks.io/training/content/perceval/first_steps.html, and there perceval works fine.
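A defensive lookup avoids the crash shown above; this is a hedged sketch (`safe_version` is a hypothetical helper, not part of perceval), and the likely root cause is an older or shadowed perceval module that lacks `__version__`:

```python
import types

def safe_version(module, default='unknown'):
    # Fall back instead of raising AttributeError when __version__ is absent.
    return getattr(module, '__version__', default)

fake = types.ModuleType('perceval')   # stand-in module without __version__
print(safe_version(fake))             # falls back to 'unknown'
```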

Issue loading data into ES using p2o.py

p2o.py --enrich --index git_raw --index-enrich git -e http://localhost:9200 --no_inc --debug git /tmp/git-data.json
2017-04-16 07:11:03,996 Debug mode activated
2017-04-16 07:11:03,998 Feeding Ocean from git (/tmp/git-data.json)
2017-04-16 07:11:04,553 Can't create index http://localhost:9200/git_raw (407)
2017-04-16 07:11:04,553 Error feeding ocean from git (/tmp/git-data.json):
Traceback (most recent call last):
  File "/usr/local/lib/python3.4/dist-packages/grimoire_elk/arthur.py", line 67, in feed_backend
    elastic_ocean = get_elastic(url, es_index, clean, ocean_backend)
  File "/usr/local/lib/python3.4/dist-packages/grimoire_elk/utils.py", line 177, in get_elastic
    elastic = ElasticSearch(url, es_index, mapping, clean, insecure, analyzers)
  File "/usr/local/lib/python3.4/dist-packages/grimoire_elk/elk/elastic.py", line 83, in __init__
    raise ElasticWriteException()
grimoire_elk.elk.elastic.ElasticWriteException
2017-04-16 07:11:04,557 Can't add repo to conf. Ocean elastic is not configured
2017-04-16 07:11:04,557 Done git
2017-04-16 07:11:04,557 Backed feed completed
fatal: Not a git repository (or any of the parent directories): .git
2017-04-16 07:11:04,561 Can't get the gelk version. /usr/local/lib/python3.4/dist-packages/grimoire_elk/elk/enrich.py
2017-04-16 07:11:05,097 Can't create index http://localhost:9200/git (407)
Traceback (most recent call last):
  File "/usr/local/lib/python3.4/dist-packages/grimoire_elk/arthur.py", line 426, in enrich_backend
    elastic_enrich = get_elastic(url, enrich_index, clean, enrich_backend)
  File "/usr/local/lib/python3.4/dist-packages/grimoire_elk/utils.py", line 177, in get_elastic
    elastic = ElasticSearch(url, es_index, mapping, clean, insecure, analyzers)
  File "/usr/local/lib/python3.4/dist-packages/grimoire_elk/elk/elastic.py", line 83, in __init__
    raise ElasticWriteException()
grimoire_elk.elk.elastic.ElasticWriteException
2017-04-16 07:11:05,100 Error enriching ocean from git (/tmp/git-data.json):
2017-04-16 07:11:05,100 Done git
2017-04-16 07:11:05,100 Enrich backend completed
2017-04-16 07:11:05,101 Finished in 0.02 min
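HTTP 407 is "Proxy Authentication Required", which suggests the index-creation request never reached Elasticsearch at all but was intercepted by a proxy. A small sketch (the table and helper are illustrative, not grimoire_elk code) of interpreting the status:

```python
# Hypothetical mapping from the HTTP status seen on "Can't create index"
# to a probable cause; 407 points at a proxy, not at Elasticsearch itself.
HINTS = {
    400: 'bad request: the mapping sent to Elasticsearch was rejected',
    401: 'Elasticsearch requires authentication',
    403: 'the user lacks permission to create the index',
    407: 'proxy authentication required: check http_proxy/https_proxy '
         'settings between p2o and Elasticsearch',
}

def explain_index_create_error(status):
    return HINTS.get(status, 'unexpected HTTP status {}'.format(status))

print(explain_index_create_error(407))
```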

Latest perceval seems to break grimoire_elk

When trying to use master/HEAD for perceval, perceval-mozilla and perceval-opnfv, grimoire_elk seems to fail. For example, when running p2o.py:

2018-02-23 17:08:00,421 - grimoire_elk.arthur - ERROR - Error feeding ocean 'GitCommand' object has no attribute 'backend'
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/grimoire_elk/arthur.py", line 121, in feed_backend
    backend = backend_cmd.backend
AttributeError: 'GitCommand' object has no attribute 'backend'

I can patch grimoire_elk/arthur.py to get rid of this error:

--- a/grimoire_elk/arthur.py
+++ b/grimoire_elk/arthur.py
@@ -118,7 +118,7 @@ def feed_backend(url, clean, fetch_cache, backend_name, backend_params,
     try:
         backend_cmd = klass(*backend_params)
-        backend = backend_cmd.backend
+        backend = backend_cmd.BACKEND
         ocean_backend = connector[1](backend, fetch_cache=fetch_cache, project=project)
         logger.info("Feeding Ocean from %s (%s)", backend_name, backend.origin)

But I still get more errors:

p2o.py --enrich --index git_raw --index-enrich git -e https://admin:admin@localhost:9200 --no_inc --debug git https://github.com/grimoirelab/perceval.git
....
2018-02-23 23:27:45,554 Error feeding ocean from git (<property object at 0x7f188b7b6778>): fetch() missing 1 required positional argument: 'self'
Traceback (most recent call last):
  File "/home/jgb/src/jgbarah/mordred/grimoire_elk/arthur.py", line 197, in feed_backend
    ocean_backend.feed()
  File "/home/jgb/src/jgbarah/mordred/grimoire_elk/ocean/elastic.py", line 202, in feed
    items = self.perceval_backend.fetch()
TypeError: fetch() missing 1 required positional argument: 'self'
Traceback (most recent call last):
  File "/tmp/gl/bin/p2o.py", line 6, in <module>
    exec(compile(open(__file__).read(), __file__, 'exec'))
  File "/home/jgb/src/jgbarah/GrimoireELK/utils/p2o.py", line 73, in <module>
    args.arthur)
  File "/home/jgb/src/jgbarah/mordred/grimoire_elk/arthur.py", line 219, in feed_backend
    unique_id = es_index + "_" + backend.origin
TypeError: Can't convert 'property' object to str implicitly

Any idea?
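The two tracebacks are consistent with grimoire_elk reading attributes off the backend class where newer Perceval only provides them on instances. A toy reproduction (class and attribute names are illustrative):

```python
class GitCommand:
    # Stand-in for a Perceval BackendCommand; names are illustrative.
    BACKEND = 'Git'                 # class-level attribute, as in the patch

    def __init__(self):
        self._origin = 'https://example.org/repo.git'

    @property
    def origin(self):
        return self._origin

# Read through the class and you get the property object itself, which is
# exactly what "<property object at 0x...>" in the log above shows:
print(GitCommand.origin)
# Read through an instance and you get the value:
print(GitCommand().origin)
```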

Question: does slack deal differently with incrementals

I noticed that I was not getting all messages from the channels I load. Watching the terminal as my script runs, every p2o command I use (for Git, GitHub and Jira) prints "Incremental from: <date>" right after the line saying it is feeding Ocean.

Now for my Slack command I am using:

p2o.py --enrich --index slack_raw --index-enrich slack -e http://localhost:9200 slack channelid -t myslacktoken

that line is missing: it does not print an incremental date.

If I then change my command to:

p2o.py --enrich --index slack_raw --index-enrich slack -e http://localhost:9200 slack channelid --from-date 2017-06-01 -t myslacktoken

it does show an incremental date, which of course matches the 2017-06-01.

This leads me to think that the Slack backend handles incrementals differently, or perhaps incorrectly. If I add --from-date, all messages do seem to get loaded.

Removal of the code that generates the "conf" index

Requirement source:
mozilla/participation-metrics-org#157

Short Description:
A vulnerability assessment conducted by Mozilla's Enterprise Information Security team revealed that

https://analytics.mozilla.community/elasticsearch/conf/_search leaks some configuration data, e.g.:
"git command - fatal: could not create leading directories of '/home/bitergia/.perceval/repositories/https://github.com/MozillaFoundation/Design.git-git'\n"

Potential Solution:
According to @acs:

This data is not used anymore. It is legacy stuff that must be removed.

/cc @gdestuynder @sanacl

Jira, incremental load not working?

If I run that command to load Jira issues twice, first for project EZS and then with the same command but for project EZP, it should add all Jira issues to the same indexes, right?

p2o.py --enrich --index jira_raw --index-enrich jira -e http://localhost:9200 --debug jira https://jira.ez.no --project EZS

and

p2o.py --enrich --index jira_raw --index-enrich jira -e http://localhost:9200 --debug jira https://jira.ez.no --project EZP

For some reason, it only loads data on the first command. On the second run, it says no issues were found. Could it be due to an incorrect date somewhere, as in the index being somehow marked for incremental import?

I had this in my terminal:

(grimoirelab) gelk@grim:~$ p2o.py --enrich --index jira_raw --index-enrich jira -e http://localhost:9200 --no_inc --debug jira https://jira.ez.no --project EZP -u ****** -p '******'
2017-03-02 20:22:59,804 Debug mode activated
2017-03-02 20:23:18,942 Feeding Ocean from jira (https://jira.ez.no)
2017-03-02 20:23:19,253 http://localhost:9200/jira_raw/_search
{
  "size": 0,
  "query": {
    "term": { "tag": "https://jira.ez.no" }
  },
  "aggs": {
    "1": {
      "max": {
        "field": "metadata__updated_on"
      }
    }
  }
}

2017-03-02 20:23:19,316 Incremental from: 2017-03-02 16:51:41+00:00
2017-03-02 20:23:19,317 Looking for issues at site 'https://jira.ez.no', in project 'EZP' and updated from '2017-03-02 16:51:41'
2017-03-02 20:23:20,499 No issues were found.

Am I doing something wrong? I also tried adding --no_inc as a parameter on both commands.
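One plausible explanation, judging from the query shown above: the incremental start date is the max metadata__updated_on over everything sharing the same tag (the Jira site URL), not per --project, so the EZP run resumes from EZS's last update. A sketch of that behaviour (the data and helper are illustrative, not p2o code):

```python
def incremental_from(raw_items, tag):
    # Max updated_on across ALL items with this tag, regardless of project.
    dates = [i['metadata__updated_on'] for i in raw_items if i['tag'] == tag]
    return max(dates) if dates else None

raw = [{'tag': 'https://jira.ez.no', 'project': 'EZS',
        'metadata__updated_on': '2017-03-02T16:51:41+00:00'}]
# The EZP run would only ask Jira for issues updated after this date,
# which can easily yield "No issues were found.":
print(incremental_from(raw, 'https://jira.ez.no'))
```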

'GitHubCommand' object has no attribute 'offset'

Hi,

I'm trying to import a simple project with your tool to see some git/github statistics.
I was following your guide at https://github.com/grimoirelab/use_cases/tree/master/documentfoundation

I can't import a simple git repo from GitHub into Elasticsearch because of the index creation; something is not working out.
Do you have any experience with this issue, or can you point me to the place in your code where the index is created?

bitergia@1cd265275072:~/GrimoireELK/utils$ ./p2o.py -e http://elasticsearch:9200 --index git_TDF -g git https://github.com/mlem/bankaustria-trend-analyzer.git --from-date "2014-01-01"
2016-07-21 08:32:17,507 Debug mode activated
2016-07-21 08:32:17,513 Feeding Ocean from git (https://github.com/mlem/bankaustria-trend-analyzer.git)
2016-07-21 08:32:17,714 Created index http://elasticsearch:9200/git_tdf
2016-07-21 08:32:17,851 Error creating ES mappings {"error":{"root_cause":[{"type":"mapper_parsing_exception","reason":"malformed mapping no root object found"}],"type":"mapper_parsing_exception","reason":"malformed mapping no root object found"},"status":400}
2016-07-21 08:32:17,940 Error feeding ocean from git (https://github.com/mlem/bankaustria-trend-analyzer.git): 'GitCommand' object has no attribute 'offset'
2016-07-21 08:32:17,940 Adding repo to Ocean http://elasticsearch:9200/conf/repos/git_tdf_https:__github.com_mlem_bankaustria-trend-analyzer.git {'backend_params': ['https://github.com/mlem/bankaustria-trend-analyzer.git', '--from-date', '2014-01-01'], 'error': "'GitCommand' object has no attribute 'offset'", 'repo_update_start': '2016-07-21T08:32:17.940200', 'index_enrich': None, 'repo_update': '2016-07-21T08:32:17.940496', 'backend_name': 'git', 'success': False, 'index': 'git_TDF', 'project': None}
2016-07-21 08:32:17,985 Done git 
2016-07-21 08:32:17,989 Queued feed_backend job
2016-07-21 08:32:17,990 <Job 80587e61-314b-46fa-bff4-d8f44885d4a1: grimoire.arthur.feed_backend('http://elasticsearch:9200', False, False, 'git', ['https://github.com/mlem/bankaustria-trend-analyzer.git', '--from-date', '2014-01-01', '-t', '24df6490c0d8b4e7d474c492fe9872829770e5e2'], 'git_TDF', None, None)>
2016-07-21 08:32:17,990 Finished in 0.01 min

Error on elasticsearch_1:

elasticsearch_1 | [2016-07-21 08:24:33,187][DEBUG][action.admin.indices.mapping.put] [Korath the Pursuer] failed to put mappings on indices [[git_tdf]], type [items]
elasticsearch_1 | MapperParsingException[malformed mapping no root object found]
elasticsearch_1 |   at org.elasticsearch.index.mapper.DocumentMapperParser.extractMapping(DocumentMapperParser.java:208)
elasticsearch_1 |   at org.elasticsearch.index.mapper.DocumentMapperParser.parse(DocumentMapperParser.java:93)
elasticsearch_1 |   at org.elasticsearch.index.mapper.MapperService.parse(MapperService.java:435)
elasticsearch_1 |   at org.elasticsearch.cluster.metadata.MetaDataMappingService$PutMappingExecutor.applyRequest(MetaDataMappingService.java:257)
elasticsearch_1 |   at org.elasticsearch.cluster.metadata.MetaDataMappingService$PutMappingExecutor.execute(MetaDataMappingService.java:230)
elasticsearch_1 |   at org.elasticsearch.cluster.service.InternalClusterService.runTasksForExecutor(InternalClusterService.java:458)
elasticsearch_1 |   at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:762)
elasticsearch_1 |   at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:231)
elasticsearch_1 |   at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:194)
elasticsearch_1 |   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
elasticsearch_1 |   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
elasticsearch_1 |   at java.lang.Thread.run(Thread.java:745)

The response from the Elasticsearch server for conf/repos:

{
  "took": 11,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "conf",
        "_type": "repos",
        "_id": "git_tdf_https:__github.com_mlem_bankaustria-trend-analyzer",
        "_score": 1,
        "_source": {
          "error": "'GitHubCommand' object has no attribute 'offset'",
          "success": false,
          "repo_update_start": "2016-07-20T15:24:04.965358",
          "backend_params": [
            "--owner",
            "mlem",
            "--repository",
            "bankaustria-trend-analyzer",
            "--from-date",
            "2014-01-01"
          ],
          "index_enrich": null,
          "index": "git_tdf",
          "backend_name": "github",
          "project": null,
          "repo_update": "2016-07-20T15:24:04.965715"
        }
      },
      {
        "_index": "conf",
        "_type": "repos",
        "_id": "git_tdf_https:__github.com_mlem_bankaustria-trend-analyzer.git",
        "_score": 1,
        "_source": {
          "backend_params": [
            "https://github.com/mlem/bankaustria-trend-analyzer.git",
            "--from-date",
            "2014-01-01"
          ],
          "error": "'GitCommand' object has no attribute 'offset'",
          "repo_update_start": "2016-07-21T08:32:17.940200",
          "index_enrich": null,
          "repo_update": "2016-07-21T08:32:17.940496",
          "backend_name": "git",
          "success": false,
          "index": "git_TDF",
          "project": null
        }
      }
    ]
  }
}

Elasticsearch 5.4: the index directory name is displayed abnormally

Error message:

[2017-08-03T10:49:40,405][INFO ][o.e.c.m.MetaDataCreateIndexService] [ES-node] [logstash-tt-2017.08.03] creating index, cause [auto (bulk API)], templates [logstash], shards [5]/[1], mappings [default]

[2017-08-03T10:49:40,936][INFO ][o.e.c.m.MetaDataMappingService] [ES-node] [logstash-tt-2017.08.03/AL0czI5qSKWb3CIVzM_oXw] create_mapping [system]

Under the Elasticsearch data directory /data, the index directory is shown as AL0czI5qSKWb3CIVzM_oXw (an unrecognizable name).

The actual index name, as sent from the Logstash output to Elasticsearch, should be logstash-tt-2017.08.03.

How can I get it to display normally?

[enrich] Getting the following error with Meetup items

Using elasticgirl.19 release, I'm getting the following error:

2017-11-08 10:43:41,105 - grimoire_elk.arthur - ERROR - Traceback (most recent call last):
  File "./grimoire_elk/arthur.py", line 486, in enrich_backend
    enrich_count = enrich_items(ocean_backend, enrich_backend)
  File "./grimoire_elk/arthur.py", line 301, in enrich_items
    total= enrich_backend.enrich_items(ocean_backend)
  File "./grimoire_elk/elk/meetup.py", line 326, in enrich_items
    super(MeetupEnrich, self).enrich_items(ocean_backend)
  File "./grimoire_elk/elk/enrich.py", line 376, in enrich_items
    rich_item = self.get_rich_item(item)
  File "./grimoire_elk/elk/enrich.py", line 80, in decorator
    eitem = func(self, *args, **kwargs)
  File "./grimoire_elk/elk/meetup.py", line 168, in get_rich_item
    eitem['time_date'] = unixtime_to_datetime(event['time']/1000).isoformat()
KeyError: 'time'
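A guarded version of the failing enrichment step would skip events without a scheduled time; this is a hypothetical sketch of the idea, not the fix that was actually merged:

```python
from datetime import datetime, timezone

def event_time_iso(event):
    # Meetup events may lack the 'time' field (e.g. not yet scheduled);
    # return None instead of raising KeyError like the line above.
    if 'time' not in event:
        return None
    return datetime.fromtimestamp(event['time'] / 1000,
                                  tz=timezone.utc).isoformat()

print(event_time_iso({}))                      # None instead of a crash
print(event_time_iso({'time': 1510137821000}))
```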

p2o.py github backend: issue with reactions

When fetching issues from GitHub I have the following issue:
Error feeding ocean from github (https://github.com/elastic/logstash): 'reactions'

I am using the latest version of GrimoireLab, elasticgirl 18.1.

Here is the complete log from the terminal:

(grimoireelk) assadm@assadm-ThinkPad-T470p:~$ p2o.py --enrich --index github_raw --index-enrich github -e http://localhost:9200 --no_inc --debug   github elastic logstash --from-date '2017-01-01' --sleep-for-rate
2017-10-25 14:13:58,035 Debug mode activated
2017-10-25 14:13:58,041 Feeding Ocean from github (https://github.com/elastic/logstash)
2017-10-25 14:13:58,126 Incremental from: 2017-01-01 00:00:00+00:00
2017-10-25 14:13:58,127 Get GitHub paginated items from https://api.github.com/repos/elastic/logstash/issues
2017-10-25 14:13:59,081 Rate limit: 59
2017-10-25 14:13:59,083 Page: 1/89
2017-10-25 14:13:59,246 Getting info for https://api.github.com/users/thmoeller
2017-10-25 14:13:59,651 Rate limit: 58
2017-10-25 14:14:00,036 Rate limit: 57
2017-10-25 14:14:00,152 Get GitHub paginated items from https://api.github.com/repos/elastic/logstash/issues/5116/comments
2017-10-25 14:14:00,614 Rate limit: 56
2017-10-25 14:14:00,660 Getting info for https://api.github.com/users/ph
2017-10-25 14:14:01,049 Rate limit: 55
2017-10-25 14:14:01,507 Rate limit: 54
2017-10-25 14:14:01,585 Error feeding ocean from github (https://github.com/elastic/logstash): 'reactions'
Traceback (most recent call last):
  File "/home/assadm/venvs/grimoireelk/lib/python3.5/site-packages/grimoire_elk/arthur.py", line 126, in feed_backend
    ocean_backend.feed(from_date)
  File "/home/assadm/venvs/grimoireelk/lib/python3.5/site-packages/grimoire_elk/ocean/elastic.py", line 201, in feed
    for item in items:
  File "/home/assadm/venvs/grimoireelk/lib/python3.5/site-packages/perceval/backend.py", line 360, in decorator
    for data in func(self, *args, **kwargs):
  File "/home/assadm/venvs/grimoireelk/lib/python3.5/site-packages/perceval/backends/core/github.py", line 175, in fetch
    issue[field + '_data'] = self.__get_issue_comments(issue['number'])
  File "/home/assadm/venvs/grimoireelk/lib/python3.5/site-packages/perceval/backends/core/github.py", line 280, in __get_issue_comments
    self.__get_issue_comment_reactions(comment_id, comment['reactions']['total_count'])
KeyError: 'reactions'
2017-10-25 14:14:01,588 Adding repo to Ocean http://localhost:9200/conf/repos/github_raw_https:__github.com_elastic_logstash {'index': 'github_raw', 'success': False, 'repo_update': '2017-10-25T14:14:01.588240', 'backend_params': ['elastic', 'logstash', '--from-date', '2017-01-01', '--sleep-for-rate'], 'project': None, 'repo_update_start': '2017-10-25T14:13:58.125662', 'error': "'reactions'", 'index_enrich': 'github', 'backend_name': 'github'}
2017-10-25 14:14:01,669 Done github 
2017-10-25 14:14:01,669 Backed feed completed
fatal: Not a git repository (or any of the parent directories): .git
2017-10-25 14:14:01,699 Can't get the gelk version. /home/assadm/venvs/grimoireelk/lib/python3.5/site-packages/grimoire_elk/elk/enrich.py
2017-10-25 14:14:01,739 Last enrichment: None
2017-10-25 14:14:01,749 Adding enrichment data to http://localhost:9200/github
2017-10-25 14:14:01,750 Adding items to http://localhost:9200/github/items/_bulk (in 1000 packs)
2017-10-25 14:14:01,750 Creating a elastic items generator.
2017-10-25 14:14:01,750 http://localhost:9200/github_raw/_search?scroll=10m&size=100
{
    "query": {
        "bool": {
            "must": [
                {
                    "term": {
                        "origin": "https://github.com/elastic/logstash"
                    }
                }
            ]
        }
    },
    "sort": {
        "metadata__timestamp": {
            "order": "asc"
        }
    }
}
2017-10-25 14:14:01,767 No results found from http://localhost:9200/github_raw
2017-10-25 14:14:01,767 Updating GitHub users geolocations in Elastic
2017-10-25 14:14:01,767 Adding geoloc to http://localhost:9200/github/geolocations/_bulk (in 1000 packs)
2017-10-25 14:14:01,769 Adding geoloc to ES Done
2017-10-25 14:14:01,769 Total items enriched 0 
2017-10-25 14:14:01,769 Done github 
2017-10-25 14:14:01,770 Enrich backend completed
2017-10-25 14:14:01,770 Finished in 0.06 min
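The failing line in perceval's github backend assumes every comment carries a 'reactions' field. A guarded lookup (a sketch of the idea only; the real fix belongs in perceval, not grimoire_elk) would be:

```python
def comment_reaction_count(comment):
    # Older GitHub API payloads may lack 'reactions' entirely;
    # default to zero instead of raising KeyError.
    return comment.get('reactions', {}).get('total_count', 0)

print(comment_reaction_count({'body': 'LGTM'}))                   # 0
print(comment_reaction_count({'reactions': {'total_count': 3}}))  # 3
```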

fatal: Not a git repository

As reported by e-mail already, I had one harmless error report. I am running the following command:
p2o.py --enrich --index jira_raw --index-enrich jira -e http://localhost:9200 --no_inc --debug jira https://jira.ez.no --project EZP -u my-username -p 'my-password'

I am currently on:
perceval 0.5.0
grimoire-elk 0.22.1
grimoire-kidash 0.22.1
Python 3.4

The error report:

2017-03-02 17:48:40,097 Adding repo to Ocean http://localhost:9200/conf/repos/jira_raw_https:__jira.ez.no {'project': None, 'backend_params': ['https://jira.ez.no', '--project', 'EZS', '-u', '', '-p', '', '--from-date', '2017-02-28'], 'index': 'jira_raw', 'index_enrich': 'jira', 'success': True, 'repo_update': '2017-03-02T17:48:40.097873', 'backend_name': 'jira', 'repo_update_start': '2017-03-02T17:48:14.893793'}
2017-03-02 17:48:40,106 Done jira
2017-03-02 17:48:40,106 Backed feed completed
fatal: Not a git repository (or any parent up to mount point /home/gelk)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
2017-03-02 17:48:58,348 Can't get the gelk version. /home/gelk/venvs/grimoirelab/lib/python3.4/site-packages/grimoire_elk/elk/enrich.py
2017-03-02 17:48:58,480 Created index http://localhost:9200/jira
2017-03-02 17:48:58,493 Last enrichment: 2017-02-28 00:00:00+00:00
2017-03-02 17:48:58,511 Adding enrichment data to http://localhost:9200/jira
2017-03-02 17:48:58,511 Adding items to http://localhost:9200/jira/items/_bulk (in 100 packs)
2017-03-02 17:48:58,512 http://localhost:9200/jira_raw/_search?scroll=10m&size=10
{
  "query": {
    "bool": {
      "must": [
        { "term": { "tag": "https://jira.ez.no" } },
        { "range": { "metadata__updated_on": { "gte": "2017-02-28T00:00:00+00:00" } } }
      ]
    }
  },
  "sort": { "metadata__timestamp": { "order": "asc" } }
}

2017-03-02 17:48:58,793 Total items enriched 27
2017-03-02 17:48:58,793 Done jira
2017-03-02 17:48:58,794 Enrich backend completed
2017-03-02 17:48:58,794 Finished in 1.05 min
(grimoirelab) gelk@grim:~$

Your reply (Alvaro del Castillo) to this by e-mail:

This is how gelk tries to get its version when it is run inside a git
clone. It is a pre-PyPI feature that must be changed, but it is
harmless.
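A cleaner variant of that lookup would try `git describe` but fall back silently outside a clone; the sketch below is hypothetical (helper name and fallback value are illustrative), assuming the version string is only informational:

```python
import subprocess

def gelk_version(path='.', default='unknown'):
    # Try 'git describe' (what gelk does when run from a git clone) but
    # swallow the failure when 'path' is not a repository or git is absent.
    try:
        out = subprocess.run(['git', 'describe', '--tags'],
                             cwd=path, capture_output=True,
                             text=True, check=True)
        return out.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return default
```

Outside a repository this returns 'unknown' instead of spilling "fatal: Not a git repository" onto the console.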

Incorrect FSF address

As pointed out by rpmlint during packaging, all the FSF addresses in the license headers in all files are outdated.

Add Archive mode

Perceval 0.9.11 adds Archive mode, removing the Cache one. Grimoire ELK still uses the Cache, so newer versions of Perceval cannot be used.

The purpose of this issue is to integrate Archive in GELK and remove the Cache. Cache never worked as expected so it's safe to remove it.

JSON serializing crash

Hello,

I'm using the fosdem16 branch of Perceval as outlined in the first use case. While trying to use GrimoireELK to bring data from the Linux Kernel git log into elasticsearch I ran into a single commit that was causing Perceval to crash.

Here is the beginning of the error log:
Error feeding ocean from git (linux.gitlog): 'utf-8' codec can't decode byte 0xf6 in position 5329: invalid start byte
Traceback (most recent call last):
  File "GrimoireELK/utils/grimoire/arthur.py", line 76, in feed_backend
    ocean_backend.feed()
  File "GrimoireELK/utils/grimoire/ocean/elastic.py", line 111, in feed
    for item in items:
  File "gitlab/lib/python3.4/site-packages/perceval/backend.py", line 161, in decorator
    for item in func(self, *args, **kwargs):
  File "gitlab/lib/python3.4/site-packages/perceval/backends/git.py", line 73, in fetch
    commits = [commit for commit in self.parse_git_log(self.gitlog)]
  File "gitlab/lib/python3.4/site-packages/perceval/backends/git.py", line 73, in <listcomp>
    commits = [commit for commit in self.parse_git_log(self.gitlog)]
  File "gitlab/lib/python3.4/site-packages/perceval/backends/git.py", line 102, in parse_git_log
    for commit in parser.parse():
  File "gitlab/lib/python3.4/site-packages/perceval/backends/git.py", line 277, in parse
    for line in self.stream:
  File "gitlab/lib/python3.4/codecs.py", line 319, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf6 in position 5329: invalid start byte

I identified the culprit to be this commit.

The line that causes it is this: "From "Uwe Kleine-K�nig" [email protected]:"

Deleting the � (which should be an o umlaut, ö) resolves the problem.
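The stray byte is 0xf6, which is 'ö' in Latin-1 embedded in an otherwise UTF-8 git log. Rather than editing history, decoding defensively keeps the parser alive; a small demonstration (assuming the fallback encoding really is Latin-1):

```python
raw = b'From "Uwe Kleine-K\xf6nig"'   # 0xf6 is invalid as a UTF-8 start byte

# Option 1: replace undecodable bytes with U+FFFD and keep parsing:
print(raw.decode('utf-8', errors='replace'))

# Option 2: if the stray bytes are known to be Latin-1, recover the text:
print(raw.decode('latin-1'))
```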

pip3 install/upgrade does not fetch latest versions

I installed the grimoire-elk and grimoire-kidash python packages earlier (through pip package manager). This also pulled in the perceval package.

At that time I was on versions: grimoire-elk 0.22.1, grimoire-kidash 0.22.1, and perceval 0.5.0.

I wanted to update the packages so I could use the Slack backend which is now available in perceval 0.7.0. So I ran the following commands:

pip3 install grimoire-elk --upgrade
pip3 install grimoire-kidash --upgrade

The first command showed me some upgrade messages in the terminal, no warning/errors. The second command resulted in 'already up to date' messages in the terminal.

Afterwards, I am on versions: grimoire-elk 0.22.1, grimoire-kidash 0.22.1, and perceval 0.6.0.

It looks like grimoire-elk and grimoire-kidash were not updated. I was expecting the latest release, 0.26, as available on GitHub.

Perceval did get upgraded to 0.6.0, but not to 0.7.0.

The question is whether this is due to something in the pip package configuration, or whether I need to upgrade in a different way.

Question: how to use Slack backend and p2o

I got Slack working, or at least partially. I run the following command:

p2o.py --enrich --index slack_raw --index-enrich slack -e http://localhost:9200 --debug slack C06BF20AH --from-date 2015-06-01 -t mySlackToken

What this does is load 100 messages from the Slack channel, but starting from today and going backwards. It does not start at the given date 2015-06-01, so I can't seem to load the full history of that channel.

Am I doing something wrong?

And additionally, if I want to load more channels of that Slack team into the index, do I run separate commands and do I need to use --project at all?

Not getting Kibana panels "title"

I have an ES 5.5.2 with 4 dashboards. I am trying to get them using the panels.py module, but I only get the ids, not the titles.

In[1]: from grimoire_elk.panels import import_dashboard, export_dashboard, list_dashboards
In[2]: list_dashboards('http://localhost:9200')
http://localhost:9200/.kibana/dashboard/_search?size=10000
8d178c90-8be2-11e7-a33b-aba58e02c4ed
af7293f0-915c-11e7-a4b2-6f5b77503826
e551e3e0-b3f8-11e7-ac51-73a5a3d76528
cfc1b960-be20-11e7-8893-49a3b38999ba

If I run the query by hand (http://localhost:9200/.kibana/dashboard/_search?size=10000), I get this:

{"took":9,"timed_out":false,"_shards":{"total":1,"successful":1,"failed":0},"hits":{"total":4,"max_score":1.0,"hits":[
  {"_index":".kibana","_type":"dashboard","_id":"8d178c90-8be2-11e7-a33b-aba58e02c4ed","_score":1.0,"_source":{"title":"CONSUL Dashboard","hits":0,"description":"", ...}},
  {"_index":".kibana","_type":"dashboard","_id":"af7293f0-915c-11e7-a4b2-6f5b77503826","_score":1.0,"_source":{"title":"CTT Gob Es Dashboard", ...}},
  {"_index":".kibana","_type":"dashboard","_id":"e551e3e0-b3f8-11e7-ac51-73a5a3d76528","_score":1.0,"_source":{"title":"Liferay Portal Git", ...}},
  {"_index":".kibana","_type":"dashboard","_id":"cfc1b960-be20-11e7-8893-49a3b38999ba","_score":1.0,"_source":{"title":"SWO2", ...}}
]}}

(panelsJSON, uiStateJSON and the other _source fields elided; the original response was truncated)
2,250\":\"rgb(116,196,118)\",\"2,250 - 3,000\":\"rgb(35,139,69)\"},\"legendOpen\":false}},\"P-6\":{\"vis\":{\"legendOpen\":false}}}","version":1,"timeRestore":true,"timeTo":"now","timeFrom":"now-10y","refreshInterval":{"display":"Off","pause":false,"value":0},"kibanaSavedObjectMeta":{"searchSourceJSON":"{\"filter\":[{\"query\":{\"match_all\":{}}}],\"highlightAll\":true,\"version\":true}"}}}]}}

Error enriching Phabricator with elasticgirl.15

I've found the following error while trying to analyze a Phabricator (Maniphest) instance with the release elasticgirl.15

2017-09-15 18:15:43,690 - urllib3.connectionpool - DEBUG - https://alexandria.****:443 "POST /data/maniphest_bitergia_170915/_search?scroll=10m&size=1000 HTTP/1.1" 200 26668191
2017-09-15 18:15:45,160 - grimoire_elk.arthur - ERROR - Traceback (most recent call last):
  File "./grimoire_elk/arthur.py", line 522, in enrich_backend
    enrich_count = enrich_items(ocean_backend, enrich_backend)
  File "./grimoire_elk/arthur.py", line 287, in enrich_items
    total= enrich_backend.enrich_items(ocean_backend)
  File "./grimoire_elk/elk/enrich.py", line 344, in enrich_items
    rich_item = self.get_rich_item(item)
  File "./grimoire_elk/elk/phabricator.py", line 249, in get_rich_item
    self.__fill_phab_ids(item['data'])
  File "./grimoire_elk/elk/phabricator.py", line 235, in __fill_phab_ids
    self.phab_ids_names[item['fields']['ownerData']['phid']] = item['fields']['ownerData']['userName']
TypeError: 'NoneType' object is not subscriptable
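The traceback shows that `item['fields']['ownerData']` is `None` for Maniphest tasks that have no owner assigned. A minimal defensive sketch (not necessarily the upstream fix) would guard against that before indexing into it:

```python
def fill_phab_ids(phab_ids_names, item):
    # Sketch of a defensive version of __fill_phab_ids: tasks with no
    # owner carry ownerData = None, which made the original code raise
    # "'NoneType' object is not subscriptable".
    owner = item['fields'].get('ownerData')
    if owner is not None:
        phab_ids_names[owner['phid']] = owner['userName']

ids = {}
fill_phab_ids(ids, {'fields': {'ownerData': None}})  # no crash
fill_phab_ids(ids, {'fields': {'ownerData': {'phid': 'PHID-1',
                                             'userName': 'alice'}}})
print(ids)  # → {'PHID-1': 'alice'}
```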

Unable to create indexes in Elasticsearch

I am following the commands mentioned in the docs

$ p2o.py --enrich --index git_raw --index-enrich git \
  -e http://localhost:9200 --no_inc --debug \
  git https://github.com/grimoirelab/perceval.git

$ p2o.py --enrich --index git_raw --index-enrich git \
  -e http://localhost:9200 --no_inc --debug \
  git https://github.com/grimoirelab/GrimoireELK.git

The output I should get:
elasticsearch-index
The output I am getting:
screenshot from 2018-02-20 23-49-07
There is no index for the data. How can I resolve this issue?
I am using docker run -d -p 9200:9200 -p 5601:5601 nshou/elasticsearch-kibana to run Elasticsearch and Kibana.

Question: correct p2o command to load Discourse data?

Hi,

I want to add Discourse data to my dashboard. I read some of the doc files in the perceval and grimoire-elk repos, but they are not consistent. Would the following command work?

p2o.py --enrich --index discourse_raw --index-enrich discourse -e http://localhost:9200 discourse 'https://my.discourse.com' --from-date '2017-01-01'

In the backend file for Discourse, I read about the param api_key. Do I need to add this, or not?

Thanks

grimoire-kidash pip package not working

Trying to use the pip package grimoire-kidash:

kidash.py 
Traceback (most recent call last):
  File "/usr/local/bin/kidash.py", line 4, in <module>
    __import__('pkg_resources').run_script('grimoire-kidash==0.22.1', 'kidash.py')
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 2927, in <module>
    @_call_aside
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 2913, in _call_aside
    f(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 2940, in _initialize_master_working_set
    working_set = WorkingSet._build_master()
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 635, in _build_master
    ws.require(__requires__)
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 943, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 829, in resolve
    raise DistributionNotFound(req, requirers)
pkg_resources.DistributionNotFound: The 'grimoire-elk==0.22.1' distribution was not found and is required by grimoire-kidash

pip package 0.26 up to date? missing backend in util.py (slack)

I upgraded my Python packages yesterday, and I am now on grimoire-elk 0.26.5 and perceval 0.7.0.

I upgraded so I can make use of the Slack backend. If I run my import script, everything works, so I know the updates should be OK. Now I want to try this:

p2o.py --enrich --index slack_raw --index-enrich slack -e http://localhost:9200 --debug slack C06BF20AH --from-date 2015-06-01 -t my-slack-token

I get the following errors:

2017-04-17 08:36:20,130 Debug mode activated
Traceback (most recent call last):
  File "/home/gelk/venvs/grimoirelab/bin/p2o.py", line 127, in <module>
    args.index, args.index_enrich, args.project)
  File "/home/gelk/venvs/grimoirelab/lib/python3.4/site-packages/grimoire_elk/arthur.py", line 53, in feed_backend
    raise RuntimeError("Unknown backend %s" % backend_name)
RuntimeError: Unknown backend slack
(grimoirelab) gelk@grim:~$

If I look in arthur.py, this gets its connector through utils.py. If I check this file, there is no import for slack. I can see this import on GitHub.

I am wondering if the grimoire-elk package on pypi contains this new file utils.py? Or am I doing something else wrong?

It looks like my utils.py is of an older version.
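The "Unknown backend" error comes from a connector lookup: arthur.py resolves backend names through a table built in utils.py, so if the installed package's utils.py predates the Slack backend, the lookup fails exactly as shown above. A rough sketch of that dispatch (the table contents here are illustrative, not the exact grimoire-elk code):

```python
# Illustrative connector table, mirroring what utils.py builds; an older
# installed release simply lacks the "slack" entry.
connectors = {
    "git": ("GitOcean", "GitEnrich"),
    "slack": ("SlackOcean", "SlackEnrich"),
}

def get_connector_from_name(name):
    # Mirrors the check in arthur.py that produced "Unknown backend slack"
    if name not in connectors:
        raise RuntimeError("Unknown backend %s" % name)
    return connectors[name]

print(get_connector_from_name("slack"))  # → ('SlackOcean', 'SlackEnrich')
```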

Elasticsearch 6.x and bulk indexing

Bulk indexing is broken when using ES 6 (6.1.1) chaoss/grimoirelab-perceval#270

The error can be reproduced by executing the method test_run in mordred/tests/test_task_collection.py

INFO:grimoire_elk.arthur:Feeding Ocean from git (https://github.com/grimoirelab/GrimoireELK)
ERROR:grimoire_elk.elk.elastic:Error creating ES mappings {"error":"Content-Type header [] is not supported","status":406}
WARNING:grimoire_elk.elk.elastic:Can add mapping http://localhost:9200/git_test-raw/items/_mapping: 
        {
          "dynamic_templates": [
            { "notanalyzed": {
                  "match": "*",
                  "match_mapping_type": "string",
                  "mapping": {
                      "type":        "string",
                      "index":       "not_analyzed"
                  }
               }
            }
          ]
        } 
INFO:grimoire_elk.ocean.conf:Creating OceanConf index http://localhost:9200/conf
INFO:grimoire_elk.ocean.elastic:Incremental from: None
INFO:perceval.backends.core.git:Fetching commits: 'https://github.com/grimoirelab/GrimoireELK' git repository from 1970-01-01 00:00:00+00:00; all branches
INFO:grimoire_elk.ocean.elastic:Adding items to Ocean for <grimoire_elk.ocean.git.GitOcean object at 0x7fc1928176d8> (100 items)
Traceback (most recent call last):
  File "../grimoire_elk/arthur.py", line 196, in feed_backend
    ocean_backend.feed()
  File "../grimoire_elk/ocean/elastic.py", line 201, in feed
    self.feed_items(items)
  File "../grimoire_elk/ocean/elastic.py", line 219, in feed_items
    self._items_to_es(items_pack)
  File "../grimoire_elk/ocean/elastic.py", line 247, in _items_to_es
    self.elastic.bulk_upload_sync(json_items, field_id)
  File "../grimoire_elk/elk/elastic.py", line 195, in bulk_upload_sync
    total += self.bulk_upload(items_pack, field_id)
  File "../grimoire_elk/elk/elastic.py", line 137, in bulk_upload
    self._safe_put_bulk(url, bulk_json)
  File "../grimoire_elk/elk/elastic.py", line 101, in _safe_put_bulk
    res.raise_for_status()
  File "/home/slimbook/.local/lib/python3.5/site-packages/requests/models.py", line 935, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 406 Client Error: Not Acceptable for url: http://localhost:9200/git_test-raw/items/_bulk
ERROR:grimoire_elk.arthur:Error feeding ocean from git (https://github.com/grimoirelab/GrimoireELK): 406 Client Error: Not Acceptable for url: http://localhost:9200/git_test-raw/items/_bulk
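Elasticsearch 6 rejects requests that do not carry an explicit Content-Type header, and bulk requests must be sent as newline-delimited JSON (application/x-ndjson). A minimal sketch of attaching the header, using the stdlib urllib here for illustration (grimoire-elk itself uses requests):

```python
from urllib.request import Request

# Two-line bulk body: action metadata, then the document (NDJSON).
bulk_json = '{"index": {"_id": "1"}}\n{"field": "value"}\n'
url = "http://localhost:9200/git_test-raw/items/_bulk"

# ES >= 6 answers "406 Not Acceptable" when the Content-Type is missing;
# bulk bodies should be sent as application/x-ndjson.
req = Request(url, data=bulk_json.encode("utf-8"),
              headers={"Content-Type": "application/x-ndjson"},
              method="POST")
print(req.get_header("Content-type"))  # → application/x-ndjson
```

The request is only built, not sent, so the snippet runs without a live Elasticsearch.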

Refactor how items are fetched from Perceval

The current implementation for fetching items from Perceval backends is messy. It uses BackendCommand to parse the arguments and calls the fetch and fetch_from_cache methods of the 'Backend' class directly. See:

https://github.com/chaoss/grimoirelab-elk/blob/master/grimoire_elk/arthur.py#L119
https://github.com/chaoss/grimoirelab-elk/blob/master/grimoire_elk/ocean/elastic.py#L154

There are better ways to support this case. For instance, using only BackendCommand instances, or calling the backend.fetch or backend.fetch_from_archive functions.

The current implementation is also broken for Perceval >= 0.9.11 (see #222).
Issue #224 is also related to this task.
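The proposal boils down to decoupling the Ocean feeder from the backend internals: the feeder only needs an iterator of items, which backend.fetch() already provides. A rough sketch of the target shape (the class and function names here are stand-ins, not Perceval's or grimoire-elk's API):

```python
class FakeBackend:
    """Stand-in for a Perceval backend exposing a fetch() generator."""
    def fetch(self, from_date=None):
        for i in range(3):
            yield {"uuid": str(i), "updated_on": i}

def feed(items):
    # The Ocean side consumes any iterable of items, so it no longer
    # needs to know whether they come from fetch() or fetch_from_archive().
    return [item["uuid"] for item in items]

print(feed(FakeBackend().fetch()))  # → ['0', '1', '2']
```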

KeyError: 'updated_on' from ocean/elastic.py

2016-04-07 12:47:46,309 Error feeding ocean from git (http://git.eclipse.org/gitroot/gemini.naming/org.eclipse.gemini.naming.git): 'updated_on'
Traceback (most recent call last):
  File "/home/bitergia/GrimoireELK/utils/grimoire/arthur.py", line 75, in feed_backend
    ocean_backend.feed()
  File "/home/bitergia/GrimoireELK/utils/grimoire/ocean/elastic.py", line 125, in feed
    self.add_update_date(item)
  File "/home/bitergia/GrimoireELK/utils/grimoire/ocean/elastic.py", line 89, in add_update_date
    entry_lastUpdated = datetime.fromtimestamp(item['updated_on'])
KeyError: 'updated_on'
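The raw item for that repository simply lacks the 'updated_on' field, so datetime.fromtimestamp() never receives a value. A defensive sketch (not necessarily the fix applied upstream) would skip or flag such items instead of crashing:

```python
from datetime import datetime, timezone

def add_update_date(item):
    # Sketch: return False for items missing 'updated_on' instead of
    # raising KeyError as ocean/elastic.py did.
    ts = item.get('updated_on')
    if ts is None:
        return False
    item['metadata__updated_on'] = datetime.fromtimestamp(
        ts, tz=timezone.utc).isoformat()
    return True

print(add_update_date({}))                 # → False (field missing)
print(add_update_date({'updated_on': 0}))  # → True
```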

Error analyzing a git log

I have Perceval, Elasticsearch and Kibana running (all tested).

I am following the use case https://github.com/grimoirelab/use_cases/blob/master/github/README.md. At the step where I clone my own repo and then want to analyze it with the command:

python3 ./p2o.py -e http://localhost:9200 --no_inc --debug git /tmp/bicho-gitlog.log

I get the following errors (I am using the fosdem branch):

2016-06-19 14:13:08,296 Commit 29de50cf830480f0cbada8aed59d127d57ac9727 parsed
2016-06-19 14:13:08,296 Commit 1f9c503bb7e3009b7ce373d1f96c34768e9b7280 parsed
2016-06-19 14:13:08,297 Adding items to Ocean for <grimoire.ocean.git.GitOcean object at 0x7f00b25f9748> (100 items)
2016-06-19 14:13:08,297 Adding items to http://localhost:9200/git__tmp_ezplatform-gitlog.log/items/_bulk (in 100 packs)
2016-06-19 14:13:08,302 Error feeding ocean from git (/tmp/ezplatform-gitlog.log): 'commit'
Traceback (most recent call last):
  File "/home/osboxes/grimoire/GrimoireELK/utils/grimoire/arthur.py", line 76, in feed_backend
    ocean_backend.feed()
  File "/home/osboxes/grimoire/GrimoireELK/utils/grimoire/ocean/elastic.py", line 116, in feed
    self._items_to_es(items_pack)
  File "/home/osboxes/grimoire/GrimoireELK/utils/grimoire/ocean/elastic.py", line 144, in _items_to_es
    self.elastic.bulk_upload_sync(json_items, field_id)
  File "/home/osboxes/grimoire/GrimoireELK/utils/grimoire/elk/elastic.py", line 110, in bulk_upload_sync
    bulk_json += '{"index" : {"_id" : "%s" } }\n' % (item[field_id])
KeyError: 'commit'
2016-06-19 14:13:08,303 Adding repo to Ocean http://localhost:9200//conf/repos/git__tmp_ezplatform-gitlog.log {'backend_name': 'git', 'error': KeyError('commit',), 'backend_params': ['/tmp/ezplatform-gitlog.log'], 'success': False, 'repo_update': '2016-06-19T14:13:08.303627'}
Traceback (most recent call last):
  File "./p2o.py", line 99, in <module>
    args.backend, args.backend_args)
  File "/home/osboxes/grimoire/GrimoireELK/utils/grimoire/arthur.py", line 94, in feed_backend
    ConfOcean.add_repo(es_index, repo)
  File "/home/osboxes/grimoire/GrimoireELK/utils/grimoire/ocean/conf.py", line 67, in add_repo
    requests.post(url, data = json.dumps(repo))
  File "/usr/lib64/python3.4/json/__init__.py", line 230, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib64/python3.4/json/encoder.py", line 192, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib64/python3.4/json/encoder.py", line 250, in iterencode
    return _iterencode(o, 0)
  File "/usr/lib64/python3.4/json/encoder.py", line 173, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: KeyError('commit',) is not JSON serializable
osboxes@osboxes:~/grimoire/GrimoireELK/utils>
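The bulk uploader assumes every item carries the id field ('commit' for git), so one malformed item aborts the whole pack. A tolerant sketch of the bulk body builder (not the actual upstream fix) would skip such items:

```python
import json

def build_bulk_body(items, field_id):
    # Sketch: skip items lacking the id field instead of raising
    # KeyError mid-upload, as bulk_upload_sync() in elk/elastic.py did.
    lines = []
    for item in items:
        if field_id not in item:
            continue
        lines.append(json.dumps({"index": {"_id": item[field_id]}}))
        lines.append(json.dumps(item))
    return "\n".join(lines) + "\n" if lines else ""

items = [{"commit": "29de50c"}, {"note": "malformed, no commit hash"}]
body = build_bulk_body(items, "commit")
print(body)
```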

The TypeError using gerrit and p2o.py

I'm using p2o.py with the gerrit parameter to retrieve and enrich data from my Gerrit repo, but I receive a TypeError in grimoire_elk/ocean/gerrit.py, line 35:

File "/usr/local/lib/python3.5/dist-packages/grimoire_elk/ocean/gerrit.py", line 35, in _fix_item item["ocean-unique-id"] = item["data"]["number"]+"_"+item['origin'] TypeError: unsupported operand type(s) for +: 'int' and 'str'

Has anybody had the same issue?
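The error happens because Gerrit returns the review 'number' as an integer, and concatenating an int with the origin string fails. Casting it to str first reproduces the shape of the fix (the sample values below are illustrative):

```python
def fix_item(item):
    # Sketch of the fix for _fix_item in ocean/gerrit.py: cast the
    # review number to str before concatenating it with the origin.
    item["ocean-unique-id"] = str(item["data"]["number"]) + "_" + item["origin"]
    return item

fixed = fix_item({"data": {"number": 12345},
                  "origin": "https://gerrit.example.com"})
print(fixed["ocean-unique-id"])  # → 12345_https://gerrit.example.com
```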

Error when trying to install dependencies for p2o

When trying to run p2o (just up to the point of showing the --help output) I don't see how to fulfill the dependency on MySQLdb:

$ python3 ./p2o.py --help
Traceback (most recent call last):
  File "./p2o.py", line 31, in <module>
    from grimoire.arthur import feed_backend, enrich_backend
  File "/home/jgb/src/grimoirelab/GrimoireELK/utils/grimoire/arthur.py", line 34, in <module>
    from grimoire.utils import get_elastic
  File "/home/jgb/src/grimoirelab/GrimoireELK/utils/grimoire/utils.py", line 49, in <module>
    from grimoire.elk.bugzilla import BugzillaEnrich
  File "/home/jgb/src/grimoirelab/GrimoireELK/utils/grimoire/elk/bugzilla.py", line 32, in <module>
    from .enrich import Enrich
  File "/home/jgb/src/grimoirelab/GrimoireELK/utils/grimoire/elk/enrich.py", line 28, in <module>
    import MySQLdb
ImportError: No module named 'MySQLdb'

But AFAIK, MySQLdb cannot be installed for Python 3 (see this question on Stack Overflow). And in fact, when I try to install MySQL-python, I get errors that I cannot solve. I can install PyMySQL, but it seems that p2o is looking specifically for MySQLdb. So, what can I do?
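PyMySQL ships a drop-in shim for exactly this case: calling pymysql.install_as_MySQLdb() registers pymysql under the name MySQLdb in sys.modules, so code doing `import MySQLdb` keeps working on Python 3. The sketch below demonstrates the aliasing mechanism with a stand-in module, so it runs even without pymysql installed:

```python
import sys
import types

def install_as(module, alias):
    # Same trick pymysql.install_as_MySQLdb() uses: register the module
    # under another name so later imports resolve to it.
    sys.modules[alias] = module

pymysql_standin = types.ModuleType("pymysql_standin")
install_as(pymysql_standin, "MySQLdb")

import MySQLdb  # now resolves to the stand-in
print(MySQLdb is pymysql_standin)  # → True
```

In practice, adding `import pymysql; pymysql.install_as_MySQLdb()` near the top of the entry script (before anything imports MySQLdb) is the usual pattern.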

Querying using python while running grimoire in Docker

I was working on calculating the time to close issues.

When I use p2o.py, I am able to load a particular repository into my database, and the items have all the required fields to work with. This accomplishes the task easily.

When I run grimoirelab/full in Docker and expose port 9200, I am able to access the ES database from my py/ipynb script. But by default, the entire GrimoireLab project is loaded. I tried reading through the settings but could not find a way to restrict it to loading a particular repository.
I only want to produce dashboards and a database for grimoire-perceval, not the full project.
What should the command be?

Current Command:
docker run -p 127.0.0.1:5601:5601 -p 9200:9200 -v $(pwd)/credentials.cfg:/mordred-override.cfg -t grimoirelab/full

Missing module 'sortinghat'

I've tried to use p2o without Sorting Hat, and I get this error message:

python3 p2o.py --enrich --index git_yarn -e http://localhost:9200 --no_inc --debug git https://github.com/yarnpkg/yarn.git
Traceback (most recent call last):
  File "p2o.py", line 31, in <module>
    from grimoire.arthur import feed_backend, enrich_backend
  File "/home/jsmanrique/devel/GrimoireELK/utils/grimoire/arthur.py", line 35, in <module>
    from grimoire.utils import get_elastic
  File "/home/jsmanrique/devel/GrimoireELK/utils/grimoire/utils.py", line 59, in <module>
    from grimoire.elk.git import GitEnrich
  File "/home/jsmanrique/devel/GrimoireELK/utils/grimoire/elk/git.py", line 34, in <module>
    from grimoire.elk.sortinghat import SortingHat
  File "/home/jsmanrique/devel/GrimoireELK/utils/grimoire/elk/sortinghat.py", line 30, in <module>
    from sortinghat import api
ImportError: No module named 'sortinghat'

Docker image fails to create Elastic Search index

sudo docker run --name gelk -it --rm --link elasticsearch:elasticsearch --link mariadb:mariadb --link redis:redis bitergia/gelk

Traceback (most recent call last):
  File "arthur/bin/arthurd", line 209, in <module>
    main()
  File "arthur/bin/arthurd", line 84, in main
    writer = ElasticItemsWriter(args.es_index)
  File "/home/bitergia/arthur/arthur/writers.py", line 98, in __init__
    was_created = self.create_index(self.idx_url, clean=clean)
  File "/home/bitergia/arthur/arthur/writers.py", line 173, in create_index
    raise ElasticSearchError(cause=cause)
arthur.writers.ElasticSearchError: Error creating Elastic Search index http://elasticsearch:9200/ocean

Add gender support in the identities module in GELK

According to chaoss/grimoirelab#69, there is a need to support gender in the GrimoireLab toolchain. And grimoirelab-elk is one of the pieces of this toolchain.

With this respect, the expected outcome of this task is an update to the identities module, where GELK asks SortingHat through its API about certain aspects of each community member.

These are just the initial requirements, so further discussion can be part of this ticket.

mbox backend and p2o.py

I followed the Perceval tutorials and created mbox files in a directory named "archives" (https://grimoirelab.gitbooks.io/training/content/perceval/mail.html).

I would like to use p2o.py to create an index from the mbox backend.
Unfortunately, I get this error:

(grimoireelk) assadm@assadm-ThinkPad-T470p:~$ p2o.py --enrich --index mbox_raw --index-enrich mbox e http://localhost:9200 --no_inc --debug mbox httpd-announce archives
Traceback (most recent call last):
  File "/home/assadm/venvs/grimoireelk/bin/p2o.py", line 132, in <module>
    args.index, args.index_enrich, args.project)
  File "/home/assadm/venvs/grimoireelk/lib/python3.5/site-packages/grimoire_elk/arthur.py", line 56, in feed_backend
    raise RuntimeError("Unknown backend %s" % backend_name)
RuntimeError: Unknown backend e

For the record, I have no issues with the git backend, as in the tutorial.

File docker/compose/gelk.yml has developer-dependent files

The following lines in docker/compose/gelk.yml:

nginx:
    image: nginx
    links:
        - gelk
    ports:
        - 9181:9181
    volumes:
        - ~/devel/GrimoireELK/docker/compose/rq-dash-nginx.conf:/etc/nginx/nginx.conf:ro
        - ~/devel/GrimoireELK/docker/compose/rq-dash-password:/etc/nginx/rq-dash-password:ro

The two lines in volumes assume that this repository has been cloned under ~/devel/GrimoireELK, but that's probably only the place where the developer cloned it. In general, it should be relative to the place where it is cloned in each environment.

I couldn't find a way for specifying relative paths for this file (or at least, none that worked for me), that's why I don't include a pull request. But when I changed those two paths to the actual location of my repo, I could finally run the docker containers as intended.
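For what it's worth, Compose normally resolves relative host paths against the directory containing the YAML file, so (assuming gelk.yml stays in docker/compose/, and depending on the Compose version in use) the volumes might be written as:

```yaml
volumes:
    - ./rq-dash-nginx.conf:/etc/nginx/nginx.conf:ro
    - ./rq-dash-password:/etc/nginx/rq-dash-password:ro
```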

Error finding out Gerrit version due to port parameter

Using elasticgirl.19, I've seen that the Gerrit collection and enrichment are not working. This is due to the following GrimoireELK error:

2017-11-14 10:49:18,497 - perceval.backends.core.gerrit - ERROR - gerrit cmd ssh  -p 29418 [email protected] gerrit  version  failed: Command 'ssh  -p 29418 [email protected] gerrit  version ' returned non-zero exit status 255                                           
ssh: connect to host review.gluster.org port 29418: No route to host                           
2017-11-14 10:50:18,676 - perceval.backends.core.gerrit - ERROR - gerrit cmd ssh  -p 29418 [email protected] gerrit  version  failed: Command 'ssh  -p 29418 [email protected] gerrit  version ' returned non-zero exit status 255                                           
2017-11-14 10:52:18,771 - grimoire_elk.arthur - ERROR - Error feeding ocean from gerrit (review.gluster.org): ssh  -p 29418 [email protected] gerrit  version  failed 3 times. Giving up!                                            
Traceback (most recent call last):             
  File "./grimoire_elk/arthur.py", line 130, in feed_backend                                   
    ocean_backend.feed()                       
  File "./grimoire_elk/ocean/elastic.py", line 204, in feed                                    
    for item in items:                         
  File "/usr/local/lib/python3.4/dist-packages/perceval-0.9.3-py3.4.egg/perceval/backend.py", line 360, in decorator                                                                          
    for data in func(self, *args, **kwargs):   
  File "/usr/local/lib/python3.4/dist-packages/perceval-0.9.3-py3.4.egg/perceval/backends/core/gerrit.py", line 88, in fetch                                                                  
    if self.client.version[0] == 2 and self.client.version[1] == 8:                            
  File "/usr/local/lib/python3.4/dist-packages/perceval-0.9.3-py3.4.egg/perceval/backends/core/gerrit.py", line 339, in version                                                               
    raw_data = self.__execute(cmd)             
  File "/usr/local/lib/python3.4/dist-packages/perceval-0.9.3-py3.4.egg/perceval/backends/core/gerrit.py", line 325, in __execute                                                             
    raise RuntimeError(cmd + " failed " + str(self.MAX_RETRIES) + " times. Giving up!")        
RuntimeError: ssh  -p 29418 [email protected] gerrit  version  failed 3 times. Giving up!  

I've manually executed the command, with the same result. But if we remove the port, it works:

$ docker exec -it glusterfs_mordred_1 ssh  -p 29418 [email protected] gerrit  version                                        
ssh: connect to host review.gluster.org port 29418: No route to host                           
$ docker exec -it glusterfs_mordred_1 ssh [email protected] gerrit  version                                                  
gerrit version 2.13.9      
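A workaround sketch: build the ssh command with the port flag only when an explicit port is actually wanted, which is what made the manual command above succeed (the function name and defaults here are illustrative, not Perceval's API):

```python
def gerrit_ssh_cmd(user, host, port=None):
    # Sketch: omit "-p" entirely when no explicit port is configured,
    # so the host's default ssh setup is used.
    parts = ["ssh"]
    if port is not None:
        parts += ["-p", str(port)]
    parts += ["%s@%s" % (user, host), "gerrit", "version"]
    return " ".join(parts)

print(gerrit_ssh_cmd("user", "review.gluster.org", 29418))
print(gerrit_ssh_cmd("user", "review.gluster.org"))
```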

Imports perceval.mozilla backends, which don't seem to exist in the perceval repository?

Running p2o.py from a git checkout is looking for perceval.backends.mozilla:

Traceback (most recent call last):
  File "./GrimoireELK/utils/p2o.py", line 31, in <module>
    from grimoire_elk.arthur import feed_backend, enrich_backend
  File "/home/aquarius/Work/LQ/PercevalBackends/GrimoireELK/utils/grimoire_elk/arthur.py", line 35, in <module>
    from .utils import get_elastic
  File "/home/aquarius/Work/LQ/PercevalBackends/GrimoireELK/utils/grimoire_elk/utils.py", line 116, in <module>
    from perceval.backends.mozilla.kitsune import Kitsune, KitsuneCommand
ImportError: No module named 'perceval.backends.mozilla'

but as far as I can tell, the perceval repository at https://github.com/grimoirelab/perceval/tree/master/perceval/backends doesn't have such a thing at all; there's perceval.backends.core, but no perceval.backends.mozilla. Do I have some sort of version mismatch?

Error enriching a Git item

Using elasticgirl.23. Traceback:

2017-11-27 04:17:42,216 - grimoire_elk.arthur - ERROR - Traceback (most recent call last):
  File "./grimoire_elk/arthur.py", line 445, in enrich_backend
    filter_raw_should)
  File "./grimoire_elk/arthur.py", line 316, in get_ocean_backend
    last_enrich = get_last_enrich(backend_cmd, enrich_backend)
  File "./grimoire_elk/elk/utils.py", line 167, in get_last_enrich
    last_enrich = enrich_backend.get_last_update_from_es([filter_])
  File "./grimoire_elk/elk/enrich.py", line 447, in get_last_update_from_es
    last_update = self.elastic.get_last_date(self.get_incremental_date(), _filters)
  File "./grimoire_elk/elk/elastic.py", line 211, in get_last_date
    last_date = self.get_last_item_field(field, _filters=_filters)
  File "./grimoire_elk/elk/elastic.py", line 264, in get_last_item_field
    res_json = res.json()
  File "/usr/local/lib/python3.4/dist-packages/requests/models.py", line 892, in json
    return complexjson.loads(self.text, **kwargs)
  File "/usr/lib/python3.4/json/__init__.py", line 318, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.4/json/decoder.py", line 343, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.4/json/decoder.py", line 361, in raw_decode
    raise ValueError(errmsg("Expecting value", s, err.value)) from None
ValueError: Expecting value: line 1 column 1 (char 0)

2017-11-27 04:17:42,231 - grimoire_elk.arthur - ERROR - Error enriching ocean from git (https://github.com/fluent/fluent-plugin-s3.git): Expecting value: line 1 column 1 (char 0)

Add gelk version to panel JSON exported by Kidash

It would be very useful to store the Gelk version together with panels. This way we will know whether the data in an index is fully compatible with that panel.

Kidash could export the highest version found in the index the panel is exported from and store it as a new field in the panel's JSON file. As one panel can deal with several indices, the object could be something like:

'gelk_version': {
    'git': 'x.x.x',
    'gerrit': 'x.x.x',
    ...
}

The outcome of this feature will make it possible to develop chaoss/grimoirelab-sigils#108.
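On the export side this would be a small addition: after collecting the highest version per data source, Kidash would attach the object to the panel JSON. A sketch (the field name follows the proposal above; function name and values are illustrative):

```python
import json

def attach_gelk_versions(panel, versions):
    # Sketch: store the per-data-source Gelk versions alongside the
    # panel so importers can check index/panel compatibility.
    panel = dict(panel)
    panel["gelk_version"] = versions
    return panel

panel = attach_gelk_versions({"title": "Git panel"},
                             {"git": "x.x.x", "gerrit": "x.x.x"})
print(json.dumps(panel, sort_keys=True))
```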

Project mapping is not working for Bugzilla (it shows unknown project)

The project widget shows a project named unknown instead of the project name defined in the project spreadsheet.

screenshot png

#!/bin/bash

# elasticgirl.24 + https://phabricator.bitergia.net/T4611#117246
# + latest client for meetup + meetup_enrich fix

SORTINGHAT='61b1c8930a427e54b947831b346ec09ef54ce11c'
GRIMOIREELK='2327d83f1f380427a6c1acf4e4ec8972dbef7385'
PERCEVAL='17d9cb891fed0c5cf584d9a370021ef48190115c'
KIBITER='f7bf173cb1d79418b9909d877adfb998ce7b52da'
GRIMOIRELAB_GITHUB_IO='c416f5ccffce41c28e980ff8d882ce91673bf90b'
ARTHUR='3efff3311b52222b7d06bf63ce97f8a42572bc06'
USE_CASES='31ea49ecc28f3e27aa29fddb0e01b7f475069b6a'
PANELS='8e0ed4766e540ba4aac23e401fdfc57d59fda4c6'
MORDRED='ea3335e4983ed39d3a537592af079f75f70b7863'
TRAINING='9df30c3ec955387a423f261bb141f74133bf7ff7'
PERCEVAL_MOZILLA='a7a3e2d79613e6ef5f36a34129b38fae0c7acae8'
PERCEVAL_PUPPET='9c0d4c72a0e9b3c7c1e3295f72ebe70c5c874084'
REPORTS='b965c06cd7c442c801c09e0447c5cbd8a90060d3'
GRIMOIRELAB_TOOLKIT='49da2a7efd4605f8ca71b3cf283e9d62c3f02647'
PERCEVAL_OPNFV='0bec845df7e17965892dfb1181da0ab1c77168dc'
GRIMOIRELAB='0799e29c8e74847b25577c4ba7892221dad8189c'
