The learning from datatogether

Advance Packet for 4S Workshop Attendees

Prepare advance packet for workshop attendees and send it out.

Draft of Pre-materials to send out:

Join Slack, get it on your computer / phone
Visit datatogether.org?
Read EDGI Blog post on “introducing data together”?
Be aware that there will be a follow-up Google Form asking what data you rely on, whom you reached out to about it, and what you talked about.
Fill out RSVP survey (see #2)

Supplies to bring:

Could people bring their own hardware? If so, what are the requirements?

Reach out to Boston-area library people who can help run the 4S workshop

Explain the value of signing up

The gitbook lesson we're putting together suggests that people can optionally register on Data Together, but doesn't explain why they would want to do that. What are the benefits of signing up?

Pre-Workshop Survey for Workshop Attendees

Post a survey for workshop attendees to fill out in advance. When ready, send out an email inviting attendees to fill it out.

Questions:
What datasets do you rely on? keywords + 2 or 3 URLs

Tutorial: Replicate a dataset you care about onto hardware that you control

Write a tutorial, based on the style of the dweb primer, showing how to replicate a dataset.

General steps

Install ipfs
Make sure ipfs is installed and working properly
get the hash of the data
Make sure you have enough storage space to hold the data
pin the data onto your machine.
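
As a starting point for the tutorial, the general steps above could be sketched as follows. This is a minimal sketch, not the finished tutorial: it assumes the `ipfs` binary is on the PATH, that the dataset hash and its size in bytes are already known (both are supplied by the caller here), and the function names and the 10% headroom are illustrative choices.

```python
import shutil
import subprocess


def has_enough_space(required_bytes, free_bytes, headroom=0.10):
    """Return True if free space covers the dataset plus a safety margin."""
    # headroom is an assumed margin, not a Data Together requirement.
    return free_bytes >= required_bytes * (1 + headroom)


def pin_command(dataset_hash):
    """Build the ipfs CLI invocation that pins a hash on the local node."""
    return ["ipfs", "pin", "add", dataset_hash]


def replicate(dataset_hash, dataset_bytes, repo_path="."):
    """Walk the general steps: verify ipfs, check space, pin."""
    # Confirm the ipfs binary is installed and responds.
    if shutil.which("ipfs") is None:
        raise RuntimeError("ipfs not found on PATH; install it first")
    subprocess.run(["ipfs", "version"], check=True)

    # Compare the dataset size with free space on the disk holding repo_path.
    free = shutil.disk_usage(repo_path).free
    if not has_enough_space(dataset_bytes, free):
        raise RuntimeError("not enough free disk space to pin this dataset")

    # Pin the data. The node must be able to fetch it, so the ipfs
    # daemon should already be running.
    subprocess.run(pin_command(dataset_hash), check=True)
```

Keeping the space check and the command construction in small separate functions makes each step easy to explain on its own and to try without a running node.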

Follow-up info:

Pinning rings, ipfs-cluster, and pinning services
A recommendation for libraries to run ipfs nodes and to treat pin sets as part of their collections, and to treat nomination of datasets for harvest/replication as part of collection development activities.

Add a Glossary Section

Adding a dataset means [...]
Harvesting a dataset means [...]
Storing a dataset means [...]
Public Record means [...]
Data Together Nodes means [...]
Distributed Data Stewardship means [...]

In the short term we should just add these definitions inline into #17 in steps 4 & 5.

Design icebreaker / interactive performance

Design icebreaker / interactive performance around an abstract concept like "where the data goes," "distributed hashtables," or other.

Takeaway for participants: ability to see your own role in a larger ecosystem where people hold the data that they care about rather than relying on current institutions and structures

prior art by @dcwalk : https://github.com/dcwalk/performingmesh
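
For facilitators who want a concrete picture of "where the data goes" before designing the activity, here is a toy sketch. It is loosely modeled on the XOR-distance idea used by Kademlia-style distributed hash tables and is not how IPFS is actually implemented; the function names and the participant names are invented for illustration.

```python
import hashlib


def key_for(data):
    """Turn bytes into a large integer key using SHA-256."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def closest_node(node_names, content):
    """Pick the node whose key is nearest (by XOR) to the content's key."""
    content_key = key_for(content)
    return min(node_names, key=lambda name: key_for(name.encode()) ^ content_key)


# Each participant plays a node; the content itself decides who holds it.
participants = ["alice", "bob", "carol"]
holder = closest_node(participants, b"a dataset someone cares about")
```

In a room, the same rule can be acted out with name cards: nobody assigns the data to a person, the data's own hash does.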

identify hands-on track leaders

Identify people to be "station" leads for the hands-on portion of the workshop.

We must make sure we have people who can help participants use the command line on Windows

using the command line (must be prepared to support windows, mac and linux)
installing IPFS
finding the hash you want to replicate and making sure you have enough storage
pinning the data on your IPFS node
playing with the pinned copy of the data

Note: We know that some people won't get past station 1. That's ok. Getting exposure to the command line in an encouraging, supportive environment is an extremely valuable and empowering learning experience.
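
For the last two stations, a lead may want a quick way to confirm that a participant's pin succeeded. The sketch below assumes a running local node and relies on `ipfs pin ls --type=recursive` printing one `<hash> <type>` pair per line; the helper names are illustrative.

```python
import subprocess


def parse_pin_ls(output):
    """Extract hashes from `ipfs pin ls` output (one '<hash> <type>' per line)."""
    hashes = set()
    for line in output.splitlines():
        parts = line.split()
        if parts:  # skip blank lines
            hashes.add(parts[0])
    return hashes


def is_pinned(dataset_hash):
    """Ask the local node whether a hash is recursively pinned."""
    result = subprocess.run(
        ["ipfs", "pin", "ls", "--type=recursive"],
        capture_output=True, text=True, check=True,
    )
    return dataset_hash in parse_pin_ls(result.stdout)
```

Once a pin is confirmed, `ipfs ls <hash>` and `ipfs cat <hash>` are natural next commands for exploring the pinned copy.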

Workaround needed for situation where IPFS ports are locked down on network

On guest networks, the IPFS ports are likely to be locked down. This makes "Replicate a Dataset" hard to do. Should we figure out a workaround?
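
Before settling on a workaround, it may help to confirm whether the venue network actually blocks the ports. The sketch below only tests whether an outbound TCP connection succeeds; it does not solve the problem. IPFS uses port 4001 for swarm traffic by default, and the host to probe is an assumption (any peer known to be online would do).

```python
import socket


def port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


# Hypothetical usage on the guest network:
#   port_reachable("peer.example.org", 4001)
```

Running this from the workshop room ahead of time would tell us whether a workaround is needed at all.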

Tutorial: Browse datasets that have been backed up

Write a tutorial, based on the style of the dweb primer, showing how to browse through the backed-up datasets and how to make sense of the information you see.

This tutorial will need to be updated as the tools and UI evolve...

Privacy of user information on Data Together

If someone signs up on archivers.co, there is currently no explanation of a privacy and security policy. We should probably have one.

There are probably plenty of starting points. Here's one I wrote for something else, to give us some ideas about the topics that probably should be discussed: http://sbml.org/Facilities/Documentation/Privacy_notice_and_terms_of_service_for_the_Online_SBML_Validator

Also, such things typically need to be vetted by lawyers (or at least that's a requirement whenever I put up an online service at my institution – might be different in this context).

Custom Crawls Chapter

Create a chapter introducing custom crawls on Data Together

Sections:

What is custom crawling?

Why do some websites need custom crawls?
What should your custom crawler extract from the webpage?
Examples of sites needing custom crawlers

Introduction/tutorial for Morph

What is morph?
Make a Morph acct
Getting a Data Together API key, and making sure morph can access it

Tutorial for Archivertools package

What does it do?
Installing package
Using Archiver class

Examples/case studies of custom crawls
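
To make "what should your custom crawler extract" concrete, the chapter could open with a small generic example before introducing Morph or Archivertools. The sketch below uses only the Python standard library and does not reflect the Archivertools API; the class name, function name, and list of file extensions are assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class DataLinkParser(HTMLParser):
    """Collect absolute URLs of links that look like downloadable data files."""

    def __init__(self, base_url, extensions=(".csv", ".zip", ".json")):
        super().__init__()
        self.base_url = base_url
        self.extensions = extensions  # assumed set of interesting file types
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Match on the file extension, ignoring case.
        if href and href.lower().endswith(self.extensions):
            self.links.append(urljoin(self.base_url, href))


def extract_data_links(html, base_url):
    """Parse an HTML string and return the data-file links it contains."""
    parser = DataLinkParser(base_url)
    parser.feed(html)
    return parser.links
```

Sites that need custom crawls are often the ones where this simple approach fails, for example when links are generated by scripts or hidden behind forms, which gives the chapter a natural way to motivate the later sections.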

cc @ebenp

Invite Boston-based collaborators

We know there are some Boston-based Data Together people (cc @jeffreyliu) or those that are interested in participating/learning more (@titaniumbones).

We should reach out directly about attending the event

Tutorial: Nominate a dataset to Data Together

Write a tutorial, modeled on the dweb primer, showing how to add a dataset to Data Together

Things to address in addition to the how:

how is this an example of stewarding Data Together?
what happens when I nominate a dataset?
can I harvest a dataset myself?
can my community run their own data together harvester and storage nodes?

Identify audience for "Replicate a dataset"

The "Replicate a dataset" tutorial presents an essential part of the DT platform -- a mechanism that allows an individual or entity to assume direct responsibility for the health of a dataset or collection.

This is conceptually important, and without it we can't give a complete account of the DT vision. However, the current implementation is difficult to work with, for at least the following reasons:

it requires command-line knowledge, something especially rare among Windows users
the IPFS install (again, especially on Windows) can be finicky; for this reason it's not a very good introduction to command-line work. While an introduction to the command line can be powerful (cf. Software Carpentry), we are not setting up beginners for success, and their experience may actually lead them to AVOID future contact with CLI
guest networks often lock down the IPFS default ports, so the demo may not even work for most people!
in future versions of DT, the CLI will not be necessary, as @b5 is building an electron app that will run the IPFS daemon in the background
most end users probably don't care about IPFS per se, even if they're interested in learning about distributed data curation, and want to contribute somehow

Proposal: let's keep this tutorial around but only break it out when we're talking to people who are directly concerned with computing infrastructure. This means people like sysadmins, data managers, and maybe digital project archivists & librarians. This audience can really benefit from a more technical introduction.

Meanwhile, for other audiences, let's craft a new tutorial as soon as the app-internal ipfs node is implemented. We can walk through similar tasks, invite participants to start contributing to the distributed web via DT, and point the enthusiastic to the command-line version for an in-depth look.

Add README and Templates

Make sure this repo has the following files:

Readme README.md
- Repo Badges for: Github Project, Slack, License
- 1-3 sentence description of repository contents
- Getting Involved section
- Development section
License -- LICENSE
Contributing Guidelines (minimal and pointing to org-wide) .github/CONTRIBUTING.md
Issue Template -- .github/ISSUE_TEMPLATE.md
GitHub Description from 1-3 sentence readme blurb

This issue forms part of a project-wide meta-issue
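
The file checklist above could be verified with a short script. This is a sketch that assumes it is pointed at the root of a local checkout; it checks only for the files' presence, not for badges or section contents.

```python
from pathlib import Path

# Paths taken from the checklist above, relative to the repo root.
REQUIRED_FILES = [
    "README.md",
    "LICENSE",
    ".github/CONTRIBUTING.md",
    ".github/ISSUE_TEMPLATE.md",
]


def missing_files(repo_root):
    """Return the required files that are absent from repo_root."""
    root = Path(repo_root)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

An empty list from `missing_files(".")` means every required file exists; anything else lists what still needs to be added.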

Pre-Harvest URLs identified in RSVP responses for Workshop

Pre-Harvest URLs identified in RSVP responses, so people have something to find & replicate in the tutorials.
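
One way to get from survey responses to a harvest list is sketched below. It assumes the RSVP responses are exported as CSV and that the dataset question lands in a column named `datasets`; both the export format and the column name are guesses, and the URL pattern is deliberately simple.

```python
import csv
import io
import re

# Simple pattern: a URL runs until whitespace, a comma, or a semicolon.
URL_PATTERN = re.compile(r"https?://[^\s,;]+")


def urls_from_responses(csv_text, column="datasets"):
    """Return the unique URLs found in one column, in first-seen order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = []
    for row in reader:
        for url in URL_PATTERN.findall(row.get(column) or ""):
            url = url.rstrip(".)")  # drop trailing sentence punctuation
            if url not in seen:
                seen.append(url)
    return seen
```

The resulting list still needs a human pass, since free-text answers may contain keywords without URLs or links to pages rather than datasets.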

Adjust URL for learning materials

Currently the materials are active at:

It doesn't look like adding a CNAME for archivers.co to the gh-pages branch lets us keep the /learning/ path. I'm not sure, but I think we will soon want to change where we serve content and rethink the redirect structure.

TODOs to mark as complete:

Determine preferred URL (I am thinking like datatogether.org/learning?)
Adjust how content is served
Clean up existing redirect (cloudflare)
Clean up existing content serving (GitHub pages)

make gh issues for 4s event

Copy action items and deliverables from https://docs.google.com/document/d/1AMkWFA-4I6y1_IpG_heVxqaDMFB3YxPDuZ966NzXcsA/edit# and assign to people
Set up a waffle board

Data Together

Data Together empowers people to create a decentralized civic layer for the web, leveraging community, trust, and shared interest to steward data they care about.
