learning's Introduction

Data Together

Data Together empowers people to create a decentralized civic layer for the web, leveraging community, trust, and shared interest to steward data they care about.

Find out who we are, what we do, and how to get involved at https://datatogether.org/!

Organizational structure

We maintain pretty light governance but commit to an annual in-person meeting and quarterly calls:

Quarterly Calls

Quarterly calls are open to everyone, but are especially meant for Data Together partners to sync up on ongoing projects, what is going on in their organizations, and more.

📅 Once per quarter
▶️ Call Playlist: youtube.com/playlist?list=PLtsP3g9LafVul1gCctMYGm9sz5FUWr5bu

Working Openly

We have developed guidelines for working as an open project; these are all contained in this repo.

License

Data Together Documentation Materials are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

learning's People

Contributors

allenpg, dcwalk, ebarry, flyingzumwalt, jeffreyliu, mhucka, titaniumbones

learning's Issues

Advance Packet for 4S Workshop Attendees

Prepare advance packet for workshop attendees and send it out.

Draft of Pre-materials to send out:

  • Join Slack, get it on your computer / phone
  • Visit datatogether.org?
  • Read EDGI Blog post on “introducing data together”?
  • Be aware that there will be a follow-up Google Form asking what data you rely on, who you reached out to about it, and what you talked about.
  • Fill out RSVP survey (see #2)

Supplies to bring:

  • people could bring their own hardware? if so, what requirements?

Explain the value of signing up

The gitbook lesson we're putting together suggests that people can optionally register on Data Together, but doesn't explain why they would want to do that. What are the benefits of signing up?

Pre-Workshop Survey for Workshop Attendees

Post a survey for workshop attendees to fill out in advance. When ready, send out an email inviting attendees to fill it out.

Questions:
What datasets do you rely on? keywords + 2 or 3 URLs

Tutorial: Replicate a dataset you care about onto hardware that you control

Write a tutorial, based on the style of the dweb primer, showing how to replicate a dataset.

General steps

  1. Install IPFS
  2. Make sure IPFS is installed and working properly
  3. Get the hash of the data
  4. Make sure you have enough storage space to hold the data
  5. Pin the data onto your machine
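
As a rough illustration of steps 2, 4, and 5, here is a minimal Python sketch that wraps the `ipfs` command-line tool. It assumes `ipfs` is already installed and on the PATH, and that you already know the dataset's hash and approximate size. The function names, the `required_bytes` parameter, and the `repo_path` parameter are hypothetical and not part of any Data Together tooling.

```python
import shutil
import subprocess


def ipfs_installed() -> bool:
    """Step 2: report whether an `ipfs` executable is on the PATH."""
    return shutil.which("ipfs") is not None


def has_enough_space(path: str, required_bytes: int) -> bool:
    """Step 4: compare free disk space at `path` with the dataset size."""
    return shutil.disk_usage(path).free >= required_bytes


def pin_command(dataset_hash: str) -> list:
    """Step 5: build the `ipfs pin add` command for the given hash."""
    if not dataset_hash:
        raise ValueError("a dataset hash is required")
    return ["ipfs", "pin", "add", dataset_hash]


def replicate(dataset_hash: str, required_bytes: int, repo_path: str = ".") -> None:
    """Run the checks in order, then pin the dataset onto the local node.

    `required_bytes` is supplied by the caller (an assumption of this
    sketch); `repo_path` should point at the disk that holds the IPFS repo.
    """
    if not ipfs_installed():
        raise RuntimeError("ipfs was not found on the PATH; install it first")
    # Printing the version is a quick check that the binary actually runs.
    subprocess.run(["ipfs", "version"], check=True)
    if not has_enough_space(repo_path, required_bytes):
        raise RuntimeError("not enough free disk space for this dataset")
    subprocess.run(pin_command(dataset_hash), check=True)
```

A call might look like `replicate("<hash of the data>", 5 * 1024**3)`, where the hash placeholder is replaced with the value found in step 3. Pinning fetches the data from the network, so a running IPFS daemon is normally needed for it to complete.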

Follow-up info:

  • Pinning rings, ipfs-cluster, and pinning services
  • A recommendation for libraries to run ipfs nodes and to treat pin sets as part of their collections, and to treat nomination of datasets for harvest/replication as part of collection development activities.

Add a Glossary Section

  • Adding a dataset means [...]

  • Harvesting a dataset means [...]

  • Storing a dataset means [...]

  • Public Record means [...]

  • Data Together Nodes means [...]

  • Distributed Data Stewardship means [...]

In the short term we should just add these definitions inline into #17 in steps 4 & 5.

identify hands-on track leaders

Identify people to be "station" leads for the hands-on portion of the workshop.

We must make sure we have people who can help participants use the command line on Windows. Stations:

  • using the command line (must be prepared to support windows, mac and linux)
  • installing IPFS
  • finding the hash you want to replicate and making sure you have enough storage
  • pinning the data on your IPFS node
  • playing with the pinned copy of the data

Note: We know that some people won't get past station 1. That's ok. Getting exposure to the command line in an encouraging, supportive environment is an extremely valuable and empowering learning experience.

Tutorial: Browse datasets that have been backed up

Write a tutorial, based on the style of the dweb primer, showing how to browse through the backed-up datasets and how to make sense of the information you see.

This tutorial will need to be updated as the tools and UI evolve...

Privacy of user information on Data Together

If someone signs up on archivers.co, there is currently no explanation of a privacy and security policy. We should probably have one.

There are probably plenty of starting points. Here's one I wrote for something else, to give us some ideas about the topics that probably should be discussed: http://sbml.org/Facilities/Documentation/Privacy_notice_and_terms_of_service_for_the_Online_SBML_Validator

Also, such things typically need to be vetted by lawyers (or at least that's a requirement whenever I put up an online service at my institution – might be different in this context).

Custom Crawls Chapter

Create a chapter introducing custom crawls on Data Together

Sections:

  1. What is custom crawling?
    • Why do some websites need custom crawls?
    • What should your custom crawler extract from the webpage?
    • Examples of sites needing custom crawlers
  2. Introduction/tutorial for Morph
    • What is morph?
    • Make a Morph acct
    • Getting a Data Together API key, and making sure morph can access it
  3. Tutorial for Archivertools package
    • What does it do?
    • Installing package
    • Using Archiver class
  4. Examples/case studies of custom crawls

cc @ebenp

Tutorial: Nominate a dataset to Data Together

Write a tutorial, modeled on the dweb primer, showing how to add a dataset to Data Together

Things to address in addition to the how:

  • how is this an example of stewarding Data Together?
  • what happens when I nominate a dataset?
  • can I harvest a dataset myself?
  • can my community run their own data together harvester and storage nodes?

Identify audience for "Replicate a dataset"

The "Replicate a dataset" tutorial presents an essential part of the DT platform -- a mechanism that allows an individual or entity to assume direct responsibility for the health of a dataset or collection.

This is conceptually important, and without it we can't give a complete account of the DT vision. However, the current implementation is difficult to work with, for at least the following reasons:

  • it requires command-line knowledge, something especially rare among Windows users
  • the IPFS install (again, especially on Windows) can be finicky; for this reason it's not a very good introduction to command-line work. While an introduction to the command line can be powerful (cf. Software Carpentry), we are not setting up beginners for success, and their experience may actually lead them to AVOID future contact with the CLI
  • guest networks often lock down the IPFS default ports, so the demo may not even work for most people!
  • in future versions of DT, the CLI will not be necessary, as @b5 is building an electron app that will run the IPFS daemon in the background
  • most end users probably don't care about IPFS per se, even if they're interested in learning about distributed data curation, and want to contribute somehow
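
The blocked-ports concern can be checked before a workshop with a small pre-flight test. The sketch below is a generic TCP reachability check in Python, not an IPFS-specific tool. The `can_reach` helper and the example hostname are hypothetical. It assumes the node uses the default IPFS swarm port, 4001.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, filtered, timed out, or unresolvable: treat as unreachable.
        return False
```

For example, `can_reach("peer.example.org", 4001)` run from the venue's guest network gives a first indication of whether outbound swarm connections are allowed; substitute the address of a peer you know is online.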

Proposal: let's keep this tutorial around but only break it out when we're talking to people who are directly concerned with computing infrastructure. This means people like sysadmins, data managers, and maybe digital project archivists & librarians. This audience can really benefit from a more technical introduction.

Meanwhile, for other audiences, let's craft a new tutorial as soon as the app-internal IPFS node is implemented. We can walk through similar tasks, invite participants to start contributing to the distributed web via DT, and point the enthusiastic to the command-line version for an in-depth look.

Add README and Templates

Make sure this repo has the following files:

  • Readme -- README.md
    • Repo Badges for: Github Project, Slack, License
    • 1-3 sentence description of repository contents
    • Getting Involved section
    • Development section
  • License -- LICENSE
  • Contributing Guidelines (minimal and pointing to org-wide) -- .github/CONTRIBUTING.md
  • Issue Template -- .github/ISSUE_TEMPLATE.md
  • GitHub Description from 1-3 sentence readme blurb

This issue forms part of a project-wide meta-issue
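
The file checklist above could be verified with a short script. This is only a sketch: the file paths come from the list in this issue, while the `missing_files` helper and its `repo_root` argument are hypothetical names chosen for illustration.

```python
from pathlib import Path

# Paths taken from the checklist in this issue.
REQUIRED_FILES = [
    "README.md",
    "LICENSE",
    ".github/CONTRIBUTING.md",
    ".github/ISSUE_TEMPLATE.md",
]


def missing_files(repo_root: str) -> list:
    """Return the required files that are absent from repo_root."""
    root = Path(repo_root)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

Running `missing_files(".")` from a local checkout returns an empty list once every file is in place. It does not check the contents of the README (badges, description, sections), which still need a manual review.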

Adjust URL for learning materials

Currently the materials are active at:

It doesn't look like adding a CNAME with archivers.co to the gh-pages branch helps us maintain the /learning/ path. I'm not too sure, but I think we will soon want to change where we serve content and rethink the redirect structure.

TODOs to mark as complete:

  • Determine preferred URL (I am thinking something like datatogether.org/learning?)
  • Adjust how content is served
  • Clean up existing redirect (Cloudflare)
  • Clean up existing content serving (GitHub pages)
