datasciencemasters / go
The Open Source Data Science Masters
Home Page: datasciencemasters.org
License: The Unlicense
I'm glad to finally find a real-life application of my math skills.
It's unfortunate that a project touting "ebooks... [that are] all free and open" links almost exclusively to for-purchase texts using Amazon affiliate linking, rather than actually linking to the many free and open texts out there.
I like that you note at the bottom of this that others should contribute. Perhaps we can expand upon that a little further, and add a Contributing.md file with instructions?
I'm curious about whether contributions are preferred in a certain format, say forking the repo, then making a change, and submitting it as a pull request? Or is it more open than that?
More importantly, what kind of contributions are welcome or needed? Are there areas that have been identified as needing further development? I don't necessarily agree with all of the items here, and I have come across my own books and courses that I think are better suited, but is this purely a big list that we add to, and not replace or substitute existing items?
Thanks for putting this repo together, it is very valuable and I refer to it often :)
In readme.md, the link in "Datasets are now here" (i.e., bit.ly/osdsm-datasets) leads to https://github.com/datasciencemasters/go/edit/master/datasets.md when it should lead to https://github.com/datasciencemasters/go/blob/master/datasets.md
I get emails from people thanking me for the OSDSM. Many also ask what to do next, or what career they can choose after studying pieces of the curriculum.
Let's open the conversation:
I have gone through this curriculum and found it very very helpful. I really appreciate your effort of gathering all these for everyone.
It looks like a lot to cover, but I believe I can get through it all. Everything seems very clear from the middle onward, mostly after the Maths section. My main issue is getting started. In the first part (Start Here), are you meant to pick one of the 3 options? Or go through all the courses there and focus on the particular topics provided? Or everything in general?
The same applies to the Maths section. Pick one or two books, or go through them all?
A clear description or recommendation on how to approach it would help, so that we can focus on the important things rather than beating around the bush.
The computing section is very clear, as there are about 2 resources on average for each aspect of computing. Only the intro seems a little confusing, as each one of them offers almost the same thing.
This is a wonderful place to look for resources about data science education.
I really hope that the author is still active, because some links in the README are broken. There are quite a number of pull requests that have not been addressed, and I wonder whether I should contribute to this repo.
Maybe the Stanford MOOC (https://class.stanford.edu/courses/HumanitiesScience/StatLearning/Winter2014/) should be added to the list element which mentions the book "The Elements Of Statistical Learning".
Hi, there is a dead link in README.md for Differential Equations in Data Science "Python Tutorial".
It takes the user to an online jupyter notebook link with a 404 Error.
Include this resource: http://bactra.org/notebooks/causality.html
(Full disclosure: I was one of the creators of each of these courses)
This is a class that we taught at MIT called "From ASCII to Answers: Advanced Topics in Data Processing". The course website is http://db.csail.mit.edu/6.885/, but the entirety of the lectures and 8 labs are available on github: https://github.com/mitdbg/asciiclass/
Another class that Eugene Wu and I taught on Data Literacy: http://dataiap.github.io/dataiap/. Content is here: https://github.com/dataiap/dataiap. This class was more introductory, and gives students six three-hour labs to walk them through data cleaning/visualization/statistics/text processing/mapreduce in Python.
If these can be of any help here, let us know!
The Harvard videos won't work for me at all--they give error messages on trying to watch. The slides are still ok, but the videos don't work. Are there system specific issues with this? I was only able to test them on OS X.
This is a charming example and lab based book on data mining for beginners. It was useful to me in the past.
The Harvard Intro to Data Science Video Lectures link is broken. Can you replace it with this link?
In README.md, the link in "How To Hire A Data Scientist" (i.e., bit.ly/howtohireadatascientist) fails to load. The article can be found here: https://brightemployers.wordpress.com/2012/11/13/how-to-hire-a-data-scientist/
Hi Clare and all.
As I mentioned against another issue - I think that we could do a better job in facilitating team capstone projects. We would just need a system for proposing a project / raising a request for joining one.
Perhaps this could be done through the wiki or via issues?
Any other thoughts?
Hi!
I just went through the README and noticed that the link "A Software Engineer's Guide to Getting Started with Data Science" under Resources / Read at the bottom is dead.
I feel like any training resources provided by Google are worth mentioning.
https://developers.google.com/machine-learning/crash-course/
Thank you for your consideration!
-Tommy
https://spacy.io/
This link can also be added under Natural Language Processing & Understanding in readme.
A #Markdown Table of Contents with permalinks would be helpful.
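One way such a table of contents could be generated is sketched below. This is only a rough illustration, not the repo's tooling: the function names `github_anchor` and `build_toc` are made up for this example, and the anchor rule is an approximation of how GitHub turns headings into permalinks (it does not handle duplicate headings, which GitHub disambiguates with a numeric suffix).

```python
import re


def github_anchor(heading: str) -> str:
    """Approximate GitHub's heading anchor: lowercase, drop punctuation, spaces to hyphens."""
    slug = heading.strip().lower()
    slug = re.sub(r"[^\w\- ]", "", slug)  # keep letters, digits, underscores, hyphens, spaces
    return slug.replace(" ", "-")


def build_toc(markdown: str) -> str:
    """Build a nested Markdown list linking to every ATX heading (# ... ######)."""
    entries = []
    in_fence = False
    for line in markdown.splitlines():
        if line.startswith("```"):
            in_fence = not in_fence  # skip anything inside fenced code blocks
            continue
        match = re.match(r"^(#{1,6})\s+(.*?)\s*#*\s*$", line)
        if in_fence or not match:
            continue
        level = len(match.group(1))
        title = match.group(2)
        indent = "  " * (level - 1)
        entries.append(f"{indent}- [{title}](#{github_anchor(title)})")
    return "\n".join(entries)
```

Running `build_toc` over the README text and pasting the result at the top of the file would give readers clickable section links; the output should be checked against the rendered page, since the anchor rule here is an assumption.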
I was working my way through the first course which was brilliant! It was an excellent level for beginners but now the link points to a new intermediate course which is unfortunately not free.
Add data science cheat sheets. Such as:
https://www.datacamp.com/community/data-science-cheatsheets
The link to the resources for Linear Regression in the machine-learning.md file appears to be broken as it redirects to a 404 not found page.
Hi!
These two libraries are highly recommended for graph processing, since you already mention NetworkX. We have worked with both of them on graphs with over 5M nodes without any problem.
Graph tool (fast and efficient Python library):
https://graph-tool.skewed.de/
iGraph (with many algorithms implemented, and available in C, python and R):
http://igraph.org/
Regards
Hi, Udacity (free online course) has some upcoming data science courses too. Perhaps add in an 'Upcoming Courses' section?
Here's the list:
The link to the Statistics course by Princeton on Coursera is broken. The course is not available anymore.
Git repo of Data Science Specialization at Coursera
https://github.com/DataScienceSpecialization/courses
The lecture slides for Coursera's Data Analysis class
https://github.com/jtleek/dataanalysis
Introduction to Applied Data Analysis for Social Science
https://github.com/christophergandrud/Introduction_to_Statistics_and_Data_Analysis_Yonsei
When entering the website there is an insecure site message.
@clarecorthell I wanted to recommend reviewing the online Data Science courses the University of Wisconsin offers and adding any that may be appropriate!
http://datasciencedegree.wisconsin.edu/data-science-program/data-science-courses/
Hi!
I'm a third-year university student majoring in neuroscience with a focus on the pathophysiology of neurodegenerative diseases.
I've recently begun teaching myself Python and have gotten fairly comfortable with it, and I would love to contribute to datasciencemasters, as it looks like something I would be interested in working on!
The only issue is that I've never contributed to anything on GitHub. I know enough git to help myself, but am a fair beginner outside of that.
Could someone point me in the right direction as to a) what needs work and b) in what manner?
General advice is also more than welcome!
Cheers =)
This one starts in a few days and I didn't see it listed in your document (which is awesome btw).
From Stanford:
http://online.stanford.edu/Mining_Massive_Datasets_Fall_2014
The Analytics Edge - https://www.edx.org/course/mitx/mitx-15-071x-analytics-edge-1416
Networks, Crowds, and Markets - https://www.edx.org/course/cornellx/cornellx-info2040x-networks-crowds-1354
Of course, these classes have not started yet, but they seem to be promising (one of the instructors of the Networks course is Jon Kleinberg).
Hi,
Instead of listing all the Python, R, and other packages to install, why not create a set of scripts (dotfiles) that install all the packages at once on your system (or in a Vagrant box or Python virtualenv) [1]? I think it would be a good idea for newcomers.
See, for example, these two repos:
Update: https://yhathq.com/products/sciencebox (instructions at https://docs.yhathq.com/sb/setup)
PS: I am currently doing this in my dotfiles repo.
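As a rough sketch of what such an install script might look like with a Python virtual environment: the package list, the environment directory name `osdsm-env`, and the helper names below are all hypothetical choices for illustration, not something taken from the curriculum or from the repos mentioned above.

```python
import subprocess
import sys
import venv
from pathlib import Path

# Hypothetical package list; replace with whatever the curriculum actually needs.
PACKAGES = ["numpy", "pandas", "matplotlib", "scikit-learn", "networkx"]


def env_python(env_dir: Path) -> Path:
    """Return the path of the Python interpreter inside the virtual environment."""
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def install_command(env_dir: Path, packages: list) -> list:
    """Build the pip command that installs the packages into the environment."""
    return [str(env_python(env_dir)), "-m", "pip", "install", *packages]


def bootstrap(env_dir: Path, packages: list = PACKAGES) -> None:
    """Create the virtual environment, then install every package in one go."""
    venv.create(env_dir, with_pip=True)
    subprocess.run(install_command(env_dir, packages), check=True)


# Example usage (needs network access, so it is left commented out):
# bootstrap(Path("osdsm-env"))
```

Keeping the command construction in its own function makes it easy to inspect what would be run before anything is installed.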
Light-weight Python framework and OLAP HTTP server for easy development of reporting applications and aggregate browsing of multi-dimensionally modeled data.
The Intro to Data Science (https://www.coursera.org/specializations/data-science) course now costs EUR41/month
Hey, since LPTHW warns against using Python 3, the Python subreddit removed it from its sidebar (https://www.reddit.com/r/Python/comments/40s6dm/meta_can_we_take_learn_python_the_hard_way_off/). Maybe we should also remove it from the list until a Python 3 version is available.
We should also note that the Jupyter notebook server will only support Python 3 in the future (http://blog.jupyter.org/2016/07/08/ipython-5-0-released/).
Error > This site can't be reached
On the home page of http://datasciencemasters.org/ there is a quote followed by a citation
David Hardtke How To Hire A Data Scientist 13 Nov 2012
And that citation has a bitly link embedded in it...
http://bit.ly/howtohireadatascientist
...and that bitly link resolves to...
http://blog.bright.com/2012/11/13/how-to-hire-a-data-scientist/
...which is broken.
(FWIW I see that this link was removed entirely from the README.md page)
The link is not lost, however, and the new link to that same article is here:
https://brightemployers.wordpress.com/2012/11/13/how-to-hire-a-data-scientist/
I'd recommend updating the README.md page, and also the datasciencemasters.org page so that the quote points to the new link, but the less attractive alternative would be to remove the broken link from datasciencemasters.org
This link for 'Differential Equations in Data Science' under the Math section of the website throws a 404 error, looks like the notebook was deleted from the original repository.
Thanks for putting this together! 3 quick thoughts on helping people find cool data to get started with:
Another category could be "dataset newsletters", as Jeremy Singer-Vine's weekly newsletter features new ones every week. http://tinyletter.com/data-is-plural/archive .
Potentially add this resource for visualizing graphs https://python-graph-gallery.com/