Comments (18)
That's exactly why I wanted to avoid the overhead of multiple packages. But you convinced me. We could call it the bazAARverse with the packages
- c14bazAAR
- oslbazAAR
- urathorbazAAR
- isotopebazAAR
- adnabazAAR
- ...
and the general helper package bazAAR.
Now we only need a team of 5 and 3 weeks of time 👍. If only we could justify this investment...
from c14bazaar.
We can look into it, perhaps experiment with data import from existing databases for the Palaeolithic, as part of @yesdavid's project. We'll put it on our to-do list.
from c14bazaar.
I will volunteer for the dendro part! Count me in!
from c14bazaar.
From my perspective, it does not actually make much sense to put OSL and Dendro into a package that is called c14bazaar. They would probably fit much better into an oslbazaar and a dendrobazaar package. Also, one could envisage extracting the filter functions (which might be useful for all the bazaars) into another package, bazaarSanitizaar?
from c14bazaar.
I generally agree with you. My experience with neolithicRC was that somehow nobody considered it for Bronze Age dates. Weird 😁.
On the other hand it's tedious to maintain and establish multiple packages. I would only invest this time if we have a solid number of expected users.
from c14bazaar.
Admittedly, multiple packages require a bit more coordination. But on the other hand, the amount of code to maintain should stay approximately the same, shouldn't it?
from c14bazaar.
Can I detect a hint of sarcasm in those lines? ;-)
from c14bazaar.
Maybe a pinch. But seriously: This would be fantastic. A great standardization challenge, that could advance the field. Some discussions I recently shared with @stschiff inspired me to think in bigger dimensions again.
I only fear we're doing this mostly for ourselves at the moment. Is there a possibility that we reach the critical mass to make this an established tool in our field? Some developments in the last weeks give me hope, but I'm still wondering.
from c14bazaar.
This sounds mysterious. Nevertheless, starting with OSL and dendro would already be a (maybe manageable?) enterprise and a 'leap forward'. Although aDNA is also very tempting...
Maybe some other parties might be interested in joining in, if you agree on that? I will not name them just now, but I could think of some UK- or Iberia-based scientists who would also benefit from the interface and might invest some time to align it with existing or future packages. What do you think?
from c14bazaar.
I started to work on c14bazAAR because I profited from it for my own research projects (although the work went way beyond that at some points). I think this connection is crucial to do this in a feasible way.
@yesdavid Do you think you would need OSL (and/or uranium-thorium etc.) dates for your research? If yes, could you imagine creating and maintaining such packages if you receive proper support from us? Maybe this is interesting for @felixriede as well?
@MartinHinz Are you in a position where this would apply to you concerning dendro dates? Are there even open (!) databases out there for this kind of data?
I could volunteer to coordinate the process and apply the necessary changes to c14bazAAR to disentangle 14C-related functions from general functions. I'm sure @dirkseidensticker would be on board as well.
Is this a good way to approach this? A good investment of our time? I see this as a long-term, slow-paced transformation.
from c14bazaar.
What a great discussion! I see great value in the way we approached standardization within c14bazAAR, which could be translated to other kinds of data as well.
> That's exactly why I wanted to avoid the overhead of multiple packages. But you convinced me. We could call it the bazAARverse with the packages
> - c14bazAAR
> - oslbazAAR
> - urathorbazAAR
> - isotopebazAAR
> - adnabazAAR
> - ...
>
> and the general helper package bazAAR.
> Now we only need a team of 5 and 3 weeks of time 👍. If only we could justify this investment...
I am very much thrilled about such an approach! We would need to discuss how the logic we already have should/could be split. A lot of our efforts with regard to standardization were directed at the metadata associated with 14C dates. Our approaches towards 'thesaurification' in particular are only scratching the surface so far.
@nevrome is right, a critical mass is important as well as a focus on research questions that benefit from such 'investments' of time and energy.
Two action points from me:
- Should we introduce this as a possible topic for the hackathon at CAA?
- How many datasets are out there? If most data are only published behind paywalls as supplementary to papers we would not get very far.
Btw: aDRAC contains a few OSL dates as well ... might be a good time to turn myself in 😉
from c14bazaar.
> Our approaches towards 'thesaurification' in particular are only scratching the surface so far.

One interesting aspect @yesdavid brought forward: The amount of data we have manually compiled to simplify oddly specific sample material descriptions could be enough to try machine learning. I guess he was joking, but I would love to give this a try one day.

> Should we introduce this as a possible topic for the hackathon at CAA?

I got the impression the topic for the hackathon is already pretty much fixed. But as this might not be the last of these events, I think this is a good idea.

> How many datasets are out there? If most data are only published behind paywalls as supplementary to papers we would not get very far.

I think this is the kind of data where it might be easiest to contact the authors and ask for a data publication in a long-term archive.

> Btw: aDRAC contains a few OSL dates as well ... might be a good time to turn myself in 😉

Off with his head! But seriously: We should check all databases for this: #92
from c14bazaar.
Machine learning sounds good! The only downside is that the result should be consistent for every user, and you cannot trust the machines to guarantee that on the client side. But on the 'server side' this might be worthwhile.
Hackathon: not at CAA, but what about a virtual hackathon, or let's call it a sprint on the GitHub repo, some day?
Paywall: whoever is not in with open science is out.
from c14bazaar.
> virtual hackathon

I like this idea. Maybe we could all try to free up one day in February to lay the foundation in a concerted attack.
from c14bazaar.
I am in!
from c14bazaar.
This all sounds great. Regarding aDNA data, just my five cents: This is so high-dimensional (a million genetic markers aren't uncommon) that it wouldn't fit into the exact same framework as the other data you have (C14, dendro, isotopes), which mostly come down to one number (plus extensive meta-information). One could of course think about summary stats (like principal component coordinates or something).
But I like the generic setup of these bazAARverse packages, which would basically try to offer a consistent API into such datasets, perhaps even with cross-compatibility of at least overlapping meta-info fields (say, longitude, latitude, or even somehow universal individual IDs that would link experimental data for the same individual burial).
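To make the idea of overlapping meta-info fields a bit more concrete, here is a minimal sketch in Python (not R, and not taken from c14bazAAR or any existing bazAARverse package). The field name `individual_id`, the helper `link_by_individual`, and all example values are assumptions made up for illustration; the sketch only shows how records from different datasets could be linked through a shared individual ID while keeping their dataset-specific measurements separate.

```python
# Hypothetical field names shared by all datasets (illustrative only).
SHARED_FIELDS = ("individual_id", "lat", "lon")


def link_by_individual(tables):
    """Group rows from several datasets under their shared individual ID."""
    linked = {}
    for dataset_name, rows in tables.items():
        for row in rows:
            # Coordinates are taken from the first row seen for an individual.
            entry = linked.setdefault(
                row["individual_id"],
                {"lat": row["lat"], "lon": row["lon"], "data": {}},
            )
            # Keep only the dataset-specific measurements per dataset.
            specific = {k: v for k, v in row.items() if k not in SHARED_FIELDS}
            entry["data"].setdefault(dataset_name, []).append(specific)
    return linked


# Made-up example rows; the values are not real data.
c14 = [{"individual_id": "burial_1", "lat": 48.1, "lon": 11.5,
        "c14age": 4500, "c14std": 30}]
adna = [{"individual_id": "burial_1", "lat": 48.1, "lon": 11.5,
         "mt_haplogroup": "H1"}]

linked = link_by_individual({"c14": c14, "adna": adna})
print(linked["burial_1"]["data"])
```

A real implementation would also have to decide what happens when two datasets disagree on the coordinates of the same individual; this sketch simply keeps the first value it sees.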
from c14bazaar.
Thanks for the clarification, I think you are absolutely right. On the other hand, things like haplogroups (mt or Y) could be made accessible that way, or just a link for downloading the original high-dimensional data. I am still not very familiar with that topic, but eager to learn, and I would already benefit from such a possibility.
from c14bazaar.
Yes, good point. Haplogroups could go into this, for sure, and they are already quite interesting. And of course, if we could even automatically download the full data somehow through a function, that would make people's lives a lot easier. I think a lot of this depends on the development of an open and consistent data format for aDNA data and its metadata, and we're working on that with @nevrome and others. So he's in the right position to help push this on the frontend side once we're making progress on the backend side.
from c14bazaar.