datalad.org's Introduction

Source datalad.org

This repository contains the source code for the DataLad website: https://www.datalad.org/

It is built with Pelican, a Python-powered static site generator.

Run locally

First clone the repository and install submodules:

git clone https://github.com/datalad/datalad.org.git
cd datalad.org
git submodule update --init --recursive

Then create a virtual environment and install pelican:

python3 -m venv ~/.venvs/pelican
source ~/.venvs/pelican/bin/activate
pip3 install pelican

Then serve the website locally:

pelican --autoreload --listen

The local website updates in real time whenever the source code changes.

Contributing

Contributions are welcome! Please:

  • fork this repository
  • create a new branch
  • add and commit your changes
  • ensure that your changes render locally
  • push your commits to your fork
  • create a pull request to the upstream master branch, with a description of your proposed changes

If your contributions do not involve specific changes to the code, please create an issue.

datalad.org's People

Contributors

adswa, aqw, bpoldrack, chrhaeusler, erramuzpe, jsheunis, loj, maarten-vermeyen, mih, tamaracha, yarikoptic

datalad.org's Issues

Update datalad.org with comprehensive picture on ecosystem size/versatility

In the spirit of datalad/datalad#4490 we should maintain a central place with information that lets people grasp what datalad is about -- which includes the various side projects, efforts, and use cases that are floating around.

ATM we have lots of information, but it is scattered, and one of us has to explain manually what is around.

It makes sense to put that information on datalad.org.

Offer promotional material for download

I have been asked for something like a "DataLad-managed" badge that could be put on a poster, but it would also make sense to have this be available for github repos etc.

"[email protected]" help email not working?

Hi-

I've emailed from two separate accounts to the email "[email protected]", which is listed here:
https://www.datalad.org/development.html
... and both times, I receive bounceback messages after a couple days:
The following message to [email protected] was undeliverable.
The reason for the problem:
5.4.7 - Delivery expired (message too old) 'timeout'

Is that email not working, since the same thing is happening from separate accounts of mine?

--pt

apt-get datalad gets version 11

Could you update the apt-get package? It is difficult to teach datalad when I cannot easily get an up-to-date package ; (

Replace/upgrade www.datalad.org

The current page is outdated and could do with a revamp. Keep the new page lean, possibly single page.

Ideas:

  • Remove unnecessary content
  • Add pointers to:
    • the handbook
    • datasets.datalad.org
    • github.com/datalad/datalad
    • the publications
    • "try on binder" demo
  • the integrations page's content is important, but not its composition
  • there should be room for a future pointer to a greater dataset catalog
  • a comprehensive list of support/interoperable services

Comments welcome.

Example involving existing source code?

As a programmer doing some machine learning and interested in datalad for reproducible work and some annex features, what would encourage me to get my feet wet with datalad is an example showing how to import some existing source code (presumably using datalad install?) without my code repo getting too entangled with datalad / git annex.

I want datalad to do its thing by complaining if I'm running steps I want to be reproducible but haven't committed my code -- I guess datalad's answer to this is to make the source code repo a subdataset of my datalad repo(s)? However, if I can avoid it I don't want my source code github repos ending up with a lot of git annex / datalad branches & submodules & metadata that I don't yet understand, or accidentally add code files annexed when they should be unannexed, or get in the way of working with the source code as a separate repo rather than as a datalad "subdataset" submodule.

Does datalad support that? If so, a short example along these lines would go a long way towards "yes this is the right tool for me, let's try it" :-)

publishing rules require root login, would remove _files, etc

There was a substantial refactoring of the Makefile that hardcodes settings which IMHO should be left to ~/.ssh/config instead. The Makefile should just know "whereto". The rest (the specifics of the ssh access) should live in ~/.ssh/config, since they may vary across people/networks/situations.
I also feel real unease about any automated action accessing critical servers as root -- was there a real demand for that?
It is somewhat of a convention now that we have "free floating" _files on our projects' websites (to host supplemental materials outside of VCS/generation), so they must not be removed, and if the hosting is moved, they should move along.

Additional page with asciinemas

The Features page serves as a tutorial and eventually wouldn't/shouldn't be used for all demos. We need an additional page with more demos, which we could reference from Features if someone wants more examples on a specific topic.

Improve our screencast/asciinemas "presentation"

Since the idea is not to have a duplicate presence on the asciinema.org website, it would be nice if our "version" of the casts were more featureful:

  • It would be nice to show a background/thumbnail (a frame around 3 sec from the beginning should probably work) instead of the black rectangle shown currently. Done in #26
  • We need to allow some scaling, either via dedicated buttons (e.g. 80x24, 120x36) or automagically by desktop-vs-mobile rendering. IMHO this is somewhat critical, so that visitors do not feel crippled while trying to unwrap the very long lines in datalad logs/output. We could probably achieve this easily with the same mechanism asciinema uses to specify the terminal size: https://asciinema.org/a/134921?cols=120&rows=36 . This was done and then reverted for now in #26
  • It is great to have "documented and ready to use" script renderings of the casts, but I find them too big to quickly get an overview of the commands. For that, a rendered "list of commands" with timing and URLs to their help comes in really handy; see the asciinema above for an example. Knowing the timing helps to assess whether a running command is stuck or should indeed take a long time. It is also useful for planning presentations/live demos (e.g. if I know that the next command will take a while, I had better start it while still talking). It would also be nice to complement this with a "full run time" giving an estimate of how much wall time the demo would take. The cast2asciinema script could easily be adjusted to generate .rst or anything else. Eventually we might even extend it with the "narrated" version Michael originally worked out. An example of the .cmds rendered within an asciinema description is https://asciinema.org/a/134832
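
As a rough illustration of the "list of commands with timing" idea, here is a minimal Python sketch that renders such a list as reStructuredText. It does not reflect how cast2asciinema works; the function names, the input format (command, start offset, duration) and the sample values are all assumptions made for this example.

```python
def format_duration(seconds):
    """Format a number of seconds as M:SS."""
    minutes, secs = divmod(int(round(seconds)), 60)
    return f"{minutes}:{secs:02d}"


def render_timings_rst(timings):
    """Render (command, start_seconds, duration_seconds) tuples as an RST list.

    The input format is hypothetical; a real script would have to record
    these values while the cast is being produced.
    """
    lines = []
    for command, start, duration in timings:
        lines.append(
            f"- ``{command}`` (starts at {format_duration(start)}, "
            f"takes {format_duration(duration)})"
        )
    # Full run time: the moment the last command finishes.
    total = max((start + duration for _, start, duration in timings), default=0)
    lines.append("")
    lines.append(f"Full run time: {format_duration(total)}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative values only, not measured from a real cast.
    demo = [
        ("datalad create demo", 0, 2.4),
        ("datalad get big.dat", 5, 70),
    ]
    print(render_timings_rst(demo))
```

A presenter could scan such a list to see which commands are slow and start them while still talking.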

Github pages sanitizer strips complete JS if `intersectionobserver` is included

This piece of code was included at first:

// Add observer to check when typewriter element scrolls into view and add/remove animation class
const observer = new IntersectionObserver(entries => {
  entries.forEach(entry => {
    const console = entry.target.querySelector('.typewriter-text');
    // At intersection, add animation class and return
    if (entry.isIntersecting) {
      console.classList.add('typing');
      return;
    }
    // Not intersecting: remove class
    console.classList.remove('typing');
  });
});
// Tell observer to track the correct element
observer.observe(document.querySelector('.card-img.use-datalad'));

But it seems that Github pages doesn't like this and strips the complete JS script, resulting in other functionality failing as well. I removed it and then the site deployed without the typewriter animation, but with everything else intact.

So we either need a workaround to achieve the same (i.e. getting the typewriter effect to start only when in view), or we need to forego that functionality, fully or partly.

Add custom 404 page

At very least it should allow users to easily flow back to the site. Even better if it's somewhat amusing or insightful.

New site CSS to be refactored using a mobile-first approach

Much of the CSS content was not built using a mobile-first approach. This mostly applies to the @media and min/max-width selectors.

Ideally, a common standard should be used consistently, which requires some refactoring of the CSS code.

Accompany each screencast with the extract of the most relevant commands

so that they are visible on the page.

ATM it is not easy to grasp which commands are involved by just getting to the page like https://www.datalad.org/for/data-publication .

There is a static screenshot of the screencast and a button to open the script in some editor. In my case the browser just offers to save it, so I then have to open it in my favorite editor and visually skip lots of informative comments to get to the "pearl" of the script: the few commands of primary interest in the example.

Let's take for an example the box.com publishing example for which the full script is:

$> grep -v -e '^#' -e '^ *$' /tmp/boxcom.sh
set -e -u
export GIT_PAGER=cat
datalad create demo
cd demo
datalad run dd if=/dev/urandom of=big.dat bs=1M count=1
. ~/box.com_work.sh
git annex initremote box.com type=webdav url=https://dav.box.com/dav/team/project_one chunk=50mb encryption=none
datalad create-sibling-github --github-organization datalad --publish-depends box.com --access-protocol ssh exchange-demo
datalad publish --to github big.dat
git annex whereis
cd ../
datalad install -s [email protected]:datalad/exchange-demo.git fromgh
datalad siblings -d ~/fromgh enable -s box.com
cd fromgh
git remote -v
datalad get big.dat
ls -sLh big.dat

I think it would be useful to have an excerpt such as the one below posted right before or after the screencast. We could annotate the commands in the screencast script for automated extraction. I am not sure about nice formatting for the long lines, but that could probably be done as well:

# Publish repository to github and data to box.com
git annex initremote box.com type=webdav \
   url=https://dav.box.com/dav/team/project_one \
   chunk=50mb encryption=none
datalad create-sibling-github --github-organization datalad \
   --publish-depends box.com \
   --access-protocol ssh exchange-demo
datalad publish --to github big.dat

If datalad commands (e.g. create-sibling-github) could also be hyperlinked to their pages on docs.datalad.org, that would be just great!
What do you think?
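
To make the automated-extraction idea concrete, here is a minimal Python sketch. It drops comment and blank lines (similar to the grep call above, though it also drops indented comments) and wraps long commands with backslash continuations. The function names are hypothetical. The wrapping splits on single spaces, so it assumes no command has a quoted argument containing spaces.

```python
def extract_commands(script_text):
    """Return the non-comment, non-blank lines of a shell script."""
    commands = []
    for line in script_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        commands.append(stripped)
    return commands


def wrap_command(command, width=60, indent="   "):
    """Break a long command at spaces, joining pieces with ' \\' + newline.

    Assumption: no quoted argument contains spaces, so splitting on
    single spaces never cuts through a quoted string.
    """
    lines = []
    current = ""
    for word in command.split(" "):
        candidate = word if not current else current + " " + word
        # Reserve two characters for the trailing " \" on continued lines.
        if current and len(candidate) + 2 > width:
            lines.append(current)
            current = indent + word
        else:
            current = candidate
    lines.append(current)
    return " \\\n".join(lines)


if __name__ == "__main__":
    script = (
        "# Publish data to box.com\n"
        "\n"
        "git annex initremote box.com type=webdav "
        "url=https://dav.box.com/dav/team/project_one "
        "chunk=50mb encryption=none\n"
        "datalad publish --to github big.dat\n"
    )
    for cmd in extract_commands(script):
        print(wrap_command(cmd))
```

The output can be pasted back into a shell because each continued line ends with a backslash directly followed by the newline.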

Reshare examples on reddit

Thanks @vsoch for the idea, which I like

To promote the greatness of DataLad we should spread the word about its features on popular and generic media. https://www.reddit.com/r/programming/ is one such portal worth sharing on. It should be easy to take our examples and post them there (possibly condensed a bit, or just recast into whatever markup language reddit takes; I have never posted there yet), and to link to the posts at the end of each screencast section. This way we could also get feedback from reddit folks.

Could also be posted as gist on github I guess. Anywhere else?

Come up with a better slogan

In the (current) tagline, the term Data Portal is used. I assume there will be resistance to this, and the term is also used for other things. Unfortunately, the initial suggestion of "Data Store" is even worse IMO, as it implies 1) cost and 2) a completely different noun.[1]

---Alex

[1] The noun I thought of is also apparently what Wikipedia's definition of a Data Store is:

A data store is a repository for persistently storing and managing collections of data which include not just repositories like databases, but also simpler store types such as simple files, emails etc.[1]

A database is a series of bytes that is managed by a database management system (DBMS). A file is a series of bytes that is managed by a file system. Thus, any database or file is a series of bytes that, once stored, is called a data store.

MATLAB[2] and Cloud Storage systems like VMware,[3] Firefox OS[4] use datastore as a term for abstracting collections of data inside their respective applications.

Point to https://github.com/datalad/faq

A while back I created that repo to absorb questions people might have. We have not advocated it in any way, so naturally no one has really asked anything (besides Satra, who suggested the idea). I think such a FAQ repository is a good way to collect and answer generic questions, and we should use it. But we need to make it visible.

Add humans.txt

Why should robots have all the fun?

I'm considering another recipe this time...

Why bother with two columns on the front page?

The left one is already a cripple in comparison to the right one. With everything in one column, the content currently on the right could be enhanced with asciinemas or whatnot to immediately demo the main glorious features of datalad (which is impossible in that skinny column now).

Data Discovery tutorial uses incompatible search syntax

Using current master

me@christop ~ $ datalad search -d /// -s author -R haxby
[INFO] Changing // back to /// as it was probably changed by MINGW/MSYS, see http://www.mingw.org/wiki/Posix_path_conversion
[ERROR] unknown arguments: ['-s', '-R', 'haxby']
usage: datalad search [-h] [-d DATASET] [--reindex]
                      [--max-nresults MAX_NRESULTS]
                      [--mode {egrep,textblob,autofield}] [--full-record]
                      [--show-keys {name,short,full}] [--show-query]
                      [QUERY [QUERY ...]]

Data handles?

In the second paragraph of "What is datalad?" we are currently reintroducing the term "data handle". I have no strong opinion on that term, but is this intentional? As far as I'm aware, we haven't used it in any recent docs.

v7 repositories?

Is there any documentation on the interaction between git annex repo format and datalad? In particular, is datalad happy with annex v7? Are there any gotchas with using datalad with annex v7 format?

make use/publish animated datalad logo

@mih showed a nice animated datalad logo, but I do not think it is used/published anywhere on the datalad.org website. Maybe it could be used instead of the static one in the top left corner, or maybe only upon the initial opening of the website (per session, so there should be no need for a cookie), and then get replaced with the static one?

Host on GitHub Pages

@bpoldrack has figured out most of the details for hosting sites on GitHub Pages, including auto-publish on merge/push.

Once we're confident that everything is behaving well, we should use the same technique for datalad.org and take advantage of their CDN.

---Alex

Navbar collapse show/hide functionality for small screens

This functionality does not work correctly yet. I had some trouble with :hover effects not reacting as expected, and then tabled that for later. Now it's later.

Expected functionality:

  • for small screens, all menu items collapse into a vertical list that is initially hidden
  • for small screens, a menu button appears that needs to be hovered over or clicked in order to view the vertical list of menu items.
  • another todo is to add the correct menu icon to the fonts

Tutorial search query does not yield any results

I'm following http://www.datalad.org/for/data-discovery using current master on Windows 10:

me@christop ~ $ datalad search raiders neuroimaging
No DataLad dataset found at current location
Would you like to search the DataLad superdataset at 'C:\\Users\\me\\datalad'? (choices: yes, no):
[INFO] Performing search using DataLad superdataset 'C:\\Users\\me\\datalad'

but independently the search words work:

me@christop ~ $ datalad search raiders
No DataLad dataset found at current location
Would you like to search the DataLad superdataset at 'C:\\Users\\me\\datalad'? (choices: yes, no):
[INFO] Performing search using DataLad superdataset 'C:\\Users\\me\\datalad'
search(ok): C:\Users\me\datalad\hbnssi (dataset)
search(ok): C:\Users\me\datalad\labs\haxby (dataset)
search(ok): C:\Users\me\datalad\labs\haxby\raiders (dataset)
action summary:
  search (ok: 3)
me@christop ~ $ datalad search neuroimaging
No DataLad dataset found at current location
Would you like to search the DataLad superdataset at 'C:\\Users\\me\\datalad'? (choices: yes, no):
[INFO] Performing search using DataLad superdataset 'C:\\Users\\me\\datalad'
[INFO] Reached the limit of 20 top matches, there could be more which were not reported.
search(ok): C:\Users\me\datalad\corr (dataset)
search(ok): C:\Users\me\datalad\crcns (dataset)
search(ok): C:\Users\me\datalad\dbic (dataset)
search(ok): C:\Users\me\datalad\dbic\QA (dataset)
search(ok): C:\Users\me\datalad\dicoms (dataset)
search(ok): C:\Users\me\datalad\dicoms\dartmouth-phantoms (dataset)
search(ok): C:\Users\me\datalad\hbnssi (dataset)
search(ok): C:\Users\me\datalad\indi (dataset)
search(ok): C:\Users\me\datalad\indi\HypnosisBarrios\RawData (dataset)
search(ok): C:\Users\me\datalad\indi\PRIME\mountsinai-P (dataset)
search(ok): C:\Users\me\datalad\indi\PRIME\mountsinai-S (dataset)
search(ok): C:\Users\me\datalad\indi\PRIME\newcastle (dataset)
search(ok): C:\Users\me\datalad\indi\PRIME\nki (dataset)
search(ok): C:\Users\me\datalad\indi\PRIME\princeton (dataset)
search(ok): C:\Users\me\datalad\indi\PRIME\rockefeller (dataset)
search(ok): C:\Users\me\datalad\indi\PRIME\ucdavis (dataset)
search(ok): C:\Users\me\datalad\indi\PRIME\uminn (dataset)
search(ok): C:\Users\me\datalad\labs\gobbini (dataset)
search(ok): C:\Users\me\datalad\labs\gobbini\famface\data (dataset)
search(ok): C:\Users\me\datalad\labs\haxby (dataset)
action summary:
  search (ok: 20)
