Dear Achilles Team,
Please forgive me for taking the liberty to do some "pre-release" testing on your code. The truth is, I am working on an ETL for a nursing home dataset and could not resist the temptation tao test Achilles because I think that its summary reports could help me more rapidly see the influence of some of the decisions I have to make while loading the dataset.
In case its helpful, I have written below a brief account of my experience with installing and running the tool in its current state:
Installation
I installed Achilles and its dependencies on a machine running Ubuntu Precise (12.04). I had to upgrade my R from 2.x to 3.1. The instructions here were helpful (http://stackoverflow.com/questions/10476713/how-to-upgrade-r-in-ubuntu)
The dependencies that I noted for Achilles were:
-- Rcpp (Cran)
-- RJDBC (Cran)
-- rJava (Cran)
-- rjson (Cran)
-- DBI (Cran)
-- SqlRender (OHDSI)
-- DatabaseConnector (OHDSI)
Most of these installed from the command line with no issues. For Cran, I installed in R running as root like so:
# install.packages("Rcpp", lib="/usr/local/lib/R/site-library/")
For OHDSI packages, I installed like so:
sudo R CMD INSTALL SqlRender/ -l /usr/local/lib/R/site-library/
In SqlRender I had to comment out the include for _mingw.h present in SqlTranslate.cpp and SqlSplit.cpp. This seems to be a Windows compatability requirement? Another issue was that there was an include in SqlRender.h that seemed spelled incorrectly:
/*#include "SQLRender.h"*/
#include "SqlRender.h"
Finally, I installed Achilles:
sudo R CMD INSTALL Achilles/ -l /usr/local/lib/R/site-library/
Running Achilles
After installation, I followed the example provided in the help page and figured out how to connect to the DB with my nursing home data
> library("Achilles")
> connectionDetails <- createConnectionDetails(dbms="oracle",user="...",password="...",server="...",schema="DIKB_DEV")
> achillesResults <- achilles(connectionDetails, "DIKB_DEV", "DIKB_DEV", "TestDB")
Connecting using Oracle driver
Executing multiple queries. This could take a while
|======================================================================| 100%
Analysis took 16.4 secs
Done. Results can now be found in DIKB_DEV
The database had new tables loaded with data and R held a loaded achillesResults dataframe.
I then tried to run 'plot' per the example in the help page but ran into an error:
> plot(achillesResults, "population")
Error in xy.coords(x, y, xlabel, ylabel, log) :
'x' and 'y' lengths differ
Not sure what happened here - any ideas?
Next, I wanted to play with AchillesWeb. I figured the exportToJson function was important for generating data that the web app could load so gave it a try. It almost worked:
> exportToJson(connectionDetails, "DIKB_DEV", "DIKB_DEV")
Warning: folder /home/PITT/rdb20/GeriOMOP already exists
Connecting using Oracle driver
Generating person reports
|======================================================================| 100%
Generating observation period reports
|======================================================================| 100%
Generating condition treemap
|======================================================================| 100%
Generating condition reports
| | 0%
Error executing SQL: Error in .verify.JDBC.result(r, "Unable to retrieve JDBC result set for ", : Unable to retrieve JDBC result...
The error file held the SQL query that failed (below). Running it against the DB yielded the following error:
ORA-01722: invalid number
01722. 00000 - "invalid number"
*Cause:
*Action:
Its no clear to me what happened. One issue I had with OSIM2 that might be relevant here is that the nursing home population does not have certain age groups. This can trip up code that assumes all age groups will have some members. Anything else I can test to help find out whats going on?
SQL Query that failed:
DBMS:
oracle
Error:
Unable to retrieve JDBC result set for select c1.concept_id as condition_concept_id,
c1.concept_name as condition_concept_name,
c2.concept_group_id as concept_id,
c2.concept_group_name as concept_name,
sum(ar1.count_value) as count_value
from ACHILLES_results ar1
inner join
DIKB_DEV.concept c1
on ar1.stratum_1 = c1.concept_id
inner join
(
select concept_id,
case when concept_name like 'Inpatient%' then 10
when concept_name like 'Outpatient%' then 20
else concept_id end
+
case when (concept_name like 'Inpatient%' or concept_name like 'Outpatient%' ) and (concept_name like '%primary%' or concept_name like '%1st position%') then 1
when (concept_name like 'Inpatient%' or concept_name like 'Outpatient%' ) and (concept_name not like '%primary%' and concept_name not like '%1st position%') then 2
else 0 end as concept_group_id,
case when concept_name like 'Inpatient%' then 'Claim- Inpatient: '
when concept_name like 'Outpatient%' then 'Claim- Outpatient: '
else concept_name end
+
case when (concept_name like 'Inpatient%' or concept_name like 'Outpatient%' ) and (concept_name like '%primary%' or concept_name like '%1st position%') then 'Primary diagnosis'
when (concept_name like 'Inpatient%' or concept_name like 'Outpatient%' ) and (concept_name not like '%primary%' and concept_name not like '%1st position%') then 'Secondary diagnosis'
else '' end as concept_group_name
from DIKB_DEV.concept
where vocabulary_id = 37
) c2
on ar1.stratum_2 = c2.concept_id
where ar1.analysis_id = 405
group by c1.concept_id,
c1.concept_name,
c2.concept_group_id,
c2.concept_group_name (ORA-01722: invalid number
)
SQL:
select c1.concept_id as condition_concept_id,
c1.concept_name as condition_concept_name,
c2.concept_group_id as concept_id,
c2.concept_group_name as concept_name,
sum(ar1.count_value) as count_value
from ACHILLES_results ar1
inner join
DIKB_DEV.concept c1
on ar1.stratum_1 = c1.concept_id
inner join
(
select concept_id,
case when concept_name like 'Inpatient%' then 10
when concept_name like 'Outpatient%' then 20
else concept_id end
+
case when (concept_name like 'Inpatient%' or concept_name like 'Outpatient%' ) and (concept_name like '%primary%' or concept_name like '%1st position%') then 1
when (concept_name like 'Inpatient%' or concept_name like 'Outpatient%' ) and (concept_name not like '%primary%' and concept_name not like '%1st position%') then 2
else 0 end as concept_group_id,
case when concept_name like 'Inpatient%' then 'Claim- Inpatient: '
when concept_name like 'Outpatient%' then 'Claim- Outpatient: '
else concept_name end
+
case when (concept_name like 'Inpatient%' or concept_name like 'Outpatient%' ) and (concept_name like '%primary%' or concept_name like '%1st position%') then 'Primary diagnosis'
when (concept_name like 'Inpatient%' or concept_name like 'Outpatient%' ) and (concept_name not like '%primary%' and concept_name not like '%1st position%') then 'Secondary diagnosis'
else '' end as concept_group_name
from DIKB_DEV.concept
where vocabulary_id = 37
) c2
on ar1.stratum_2 = c2.concept_id
where ar1.analysis_id = 405
group by c1.concept_id,
c1.concept_name,
c2.concept_group_id,
c2.concept_group_name