breck7 / ohayo
A free, fast, public domain data science studio
Home Page: https://ohayo.breckyunits.com
title Ohayo

* Ohayo is a fast and free tool for data science. Ohayo consists of a very high-level programming language and a visual web studio for that language. The goal of Ohayo is to enable people to do data science at the speed of voice. You can see a short clip of Ohayo in action here.
 https://youtube.com/watch?v=qqyGHmUlKoc here

* You can try Ohayo at https://ohayo.breckyunits.com, download Ohayo on GitHub, try Ohayo hosted on GitHub, or install it using `npm install ohayo`.
 https://ohayo.breckyunits.com https://ohayo.breckyunits.com
 https://github.com/breck7/ohayo GitHub
 https://github.ohayo.breckyunits.com GitHub

image slides.gif

# Key Concepts

## OhayoLang
* Ohayo the language is a Tree Language, built using TreeNotation. Ohayo is a dataflow language.
 https://github.com/breck7/ohayo/tree/main/ohayo language
 https://treenotation.org/ TreeNotation

## Scripts
* OhayoLang is a scripting language like any other and you can write programs in it by hand or using the Ohayo Studio. OhayoLang scripts generally have the file extension `.ohayo`.

## Tiles
* An Ohayo program is composed of Tiles. Tiles can display UI to the user. Tiles are recursive and can be the parent of other tiles. Tiles are namespaced and all must contain at least one `.`.

## Tile Properties
* Tiles can define and use their own Properties. The names of Tile Properties cannot contain a `.`.

## DataTables
* All Tiles can access the tables of their ancestor tiles and also pass on a new table to their descendants. The data tables currently use the jTable library.
 https://github.com/breck7/jtree/tree/main/jtable jTable

## Common Tile Types
* All Tiles extend from a base class. The three most common core Tile Types are Provider, Transformer, and Chart. In data science you have 3 main kinds of things: datasets, data transformations, and visualizations. Datasets include everything from weather forecasts to emails to business transactions. There are millions of possible datasets. In Ohayo, tiles that provide datasets generally extend the Provider base tile type. Transformations are things like filtering, grouping, and joining. In Ohayo, tiles that transform data generally extend the Transformer tile type. Charts include bar charts, line charts, scatterplots and word clouds. In Ohayo, charts generally extend the Chart base tile type.

## Creating Tiles
* If you need a new tile (to add a new user-friendly data source or visualization type, for example), you can implement it using TypeScript/JavaScript/Grammar language. See the packages folder for examples. Documentation for this will come out later in 2020.
 https://github.com/breck7/ohayo/tree/main/ohayo/packages packages

# BETA!
* Ohayo is still beta and iterating frequently. Post feedback here or on the TreeNotation subreddit. Ohayo hopefully will be stable by July 2023.
 https://www.reddit.com/r/treenotation/ subreddit

# Marketing Jumbo
* If you are looking for some marketing-speak, here you go:

orderedList
 1. The simplest syntax possible. No parentheses, no brackets, no semicolons. Just words you can speak.
 2. Write by hand or program visually. The first visual editor that generates perfectly clean code.
 3. Autocomplete like you've never seen before. AI powered autocomplete that keeps getting better.
 4. Free and open source. The price is $0, and extensions and collaboration are welcome.
 5. No installing required. Run Ohayo instantly in your browser, even on your mobile device.
 6. No tracking, no cookies. Ohayo doesn't track users, use cookies, or store your data.
 7. Secure by design. Your data stays on your machines, we never see it.
 8. Runs anywhere. Run it from our sites, host it yourself, or run it on your local machine.

# Other Tools For Data Scientists
* Ohayo is just one of many tools that are trying to make data science easier.
Here's a list of related products:

pipeTable
 Name|NameLink|Year|Wikipedia|WikipediaLink
 Rows.com|https://rows.com/|2020||
 Explo.co|https://explo.co/|2020||
 Arquero|https://github.com/uwdata/arquero|2020||
 Basedash|https://www.basedash.com/|2019||
 Grid Studio|https://github.com/ricklamers/gridstudio|2019||
 Workbench|https://workbenchdata.com/|2018||
 ActionDesk|https://www.actiondesk.io/|2018||
 Data Illustrator|https://data-illustrator.com/|2018||
 Observable|https://observablehq.com/|2017||
 Idyll|https://idyll-lang.org/|2017||
 VisiData|https://www.visidata.org/|2017||
 Google Data Studio|https://datastudio.google.com/overview|2016|W|https://de.wikipedia.org/wiki/Google_Data_Studio
 Flourish|https://flourish.studio/|2016||
 Tidyverse|https://www.tidyverse.org/|2016|W|https://en.wikipedia.org/wiki/Tidyverse
 Vega Editor|https://vega.github.io/editor/|2015||
 Amazon QuickSight|https://aws.amazon.com/quicksight/|2015||
 GapMinder Vizabi|https://vizabi.org/|2015||
 Toucan|https://toucantoco.com/en/|2015||
 xsv|https://github.com/BurntSushi/xsv|2014||
 metabase|https://www.metabase.com/|2014||
 dplyr|https://dplyr.tidyverse.org/|2014||
 JupyterLab|https://github.com/jupyterlab/jupyterlab|2014|W|https://en.wikipedia.org/wiki/Project_Jupyter
 OmniSci|https://www.omnisci.com/|2013|W|https://en.wikipedia.org/wiki/OmniSci
 xlwings|https://www.xlwings.org/|2013||
 redash|https://redash.io/|2013||
 RAWGraphs|https://github.com/rawgraphs/raw|2013||
 DataBricks|https://databricks.com/|2013|W|https://en.wikipedia.org/wiki/Databricks
 Quadrigram|https://www.quadrigram.com/|2012||
 Snowflake|https://www.snowflake.com/|2012|W|https://en.wikipedia.org/wiki/Snowflake_Inc.
 Julia|https://julialang.org/|2012|W|https://en.wikipedia.org/wiki/Julia_(programming_language)
 Looker|https://looker.com/|2012|W|https://en.wikipedia.org/wiki/Looker_(company)
 AirTable|https://airtable.com/|2012|W|https://en.wikipedia.org/wiki/Airtable
 Anaconda|https://www.anaconda.com/|2012|W|https://en.wikipedia.org/wiki/Anaconda_(Python_distribution)
 Plotly|https://plot.ly/|2012|W|https://en.wikipedia.org/wiki/Plotly
 DataWrapper|https://www.datawrapper.de/|2012||
 ThoughtSpot|https://www.thoughtspot.com/|2012|W|https://en.wikipedia.org/wiki/ThoughtSpot
 Infogram|https://infogram.com/|2012|W|https://en.wikipedia.org/wiki/Infogram
 RStudio|https://www.rstudio.com/|2011|W|https://en.wikipedia.org/wiki/RStudio
 Microsoft SandDance|https://github.com/microsoft/SandDance|2011|W|https://en.wikipedia.org/wiki/Microsoft_Garage
 Microsoft PowerBI|https://powerbi.microsoft.com/en-us/|2011|W|https://en.wikipedia.org/wiki/Power_BI
 d3|https://d3js.org/|2011|W|https://en.wikipedia.org/wiki/D3.js
 piktochart|https://piktochart.com/|2011|W|https://en.wikipedia.org/wiki/Piktochart
 Google Kaggle|https://www.kaggle.com/|2010|W|https://en.wikipedia.org/wiki/Kaggle
 ChartIO|https://chartio.com/|2010||
 Google BigQuery|https://cloud.google.com/bigquery/|2010|W|https://en.wikipedia.org/wiki/BigQuery
 OpenRefine|https://github.com/OpenRefine/OpenRefine|2010|W|https://en.wikipedia.org/wiki/OpenRefine
 Zoho Analytics|https://www.zoho.com/analytics/|2009||
 Wolfram Alpha|https://www.wolframalpha.com/|2009|W|https://en.wikipedia.org/wiki/Wolfram_Alpha
 HighCharts|https://www.highcharts.com/|2009|W|https://en.wikipedia.org/wiki/Highcharts
 LucidChart|https://www.lucidchart.com/|2008|W|https://en.wikipedia.org/wiki/Lucidchart
 Pandas|https://pandas.pydata.org/|2008|W|https://en.wikipedia.org/wiki/Pandas_(software
 Apple Numbers|https://www.apple.com/numbers/|2007|W|https://en.wikipedia.org/wiki/Numbers_(spreadsheet)
 scikit-learn|https://scikit-learn.org/stable/|2007|W|https://en.wikipedia.org/wiki/Scikit-learn
 Smartsheet|https://www.smartsheet.com/|2006|W|https://en.wikipedia.org/wiki/Smartsheet
 Google Sheets|https://www.google.com/sheets/about/|2006|W|https://en.wikipedia.org/wiki/Google_Sheets
 Alteryx|https://www.alteryx.com/|2006|W|https://en.wikipedia.org/wiki/Alteryx
 RapidMiner|https://rapidminer.com/|2006|W|https://en.wikipedia.org/wiki/RapidMiner
 Sisense|https://www.sisense.com/|2004|W|https://en.wikipedia.org/wiki/Sisense
 KNIME|https://www.knime.com/|2004||
 Matplotlib|https://matplotlib.org/|2003|W|https://en.wikipedia.org/wiki/Matplotlib
 Tableau|https://www.tableau.com/|2003|W|https://en.wikipedia.org/wiki/Tableau_Software
 Visual-Paradigm Chart Maker|https://online.visual-paradigm.com/features/chart-maker/pyramid-chart-maker/|2002|W|https://en.wikipedia.org/wiki/Visual_Paradigm
 NumPy|https://www.numpy.org/|1995|W|https://en.wikipedia.org/wiki/NumPy
 Qlik|https://www.qlik.com/|1993|W|https://en.wikipedia.org/wiki/Qlik
 JMP|https://www.jmp.com/|1989|W|https://en.wikipedia.org/wiki/JMP_(statistical_software)
 Mathematica|https://www.wolfram.com/mathematica/|1988|W|https://en.wikipedia.org/wiki/Wolfram_Mathematica
 Microsoft Excel|https://products.office.com/en-us/excel|1987|W|https://en.wikipedia.org/wiki/Microsoft_Excel
 MATLAB|https://mathworks.com/products/matlab|1984|W|https://en.wikipedia.org/wiki/MATLAB
 SAS|https://www.sas.com/|1976|W|https://en.wikipedia.org/wiki/SAS_language
 SPSS|https://www.ibm.com/us-en/marketplace/spss-statistics|1968|W|https://en.wikipedia.org/wiki/SPSS

# How to Give Feedback
* Open an issue here, post on the Tree Notation subreddit, or email [email protected].
 https://www.reddit.com/r/treenotation/ subreddit

# ❤️ Public Domain ❤️

import settings.scroll
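The naming rules from Key Concepts (tile names contain at least one `.`, tile property names contain none) can be sketched as a small checker. This is a Python illustration only, not Ohayo's actual parser: the function names `classify_word` and `first_words` are hypothetical, the `title` property in the sample is made up, and the sketch assumes one space of indentation per nesting level.

```python
from typing import List, Tuple


def classify_word(word: str) -> str:
    """Classify a line's first word as a tile name or a tile property name.

    Assumption: the only rule applied is the one stated in the README,
    i.e. tile names contain at least one "." and property names contain none.
    """
    return "tile" if "." in word else "property"


def first_words(program: str) -> List[Tuple[int, str, str]]:
    """Return (indent depth, first word, kind) for every non-blank line."""
    result = []
    for line in program.splitlines():
        if not line.strip():
            continue
        # Assumption: one leading space per level of nesting.
        depth = len(line) - len(line.lstrip(" "))
        word = line.strip().split(" ")[0]
        result.append((depth, word, classify_word(word)))
    return result


# "title" is a made-up property used only for illustration.
sample = "samples.portal\n vega.bar\n  title Example"
parsed = first_words(sample)
```

Here `vega.bar` sits one level under `samples.portal`, which is how a chart tile would reach its ancestor's table.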
There's a cool new Python word cloud package. Would be neat to implement this as an Ohayo tile (or tiles) to replace the current wordcloud.
Would it be useful to make tile(s) for this:
https://cloud.google.com/bigquery/public-data/
Should have a show static tile where you can just plug in a hard-coded number.
If you just want to drop a couple of columns and keep most of them, it would be nice to have the inverse of columns.keep
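To make the request concrete, here is a rough Python sketch of "keep" next to its inverse. The function names `keep_columns` and `drop_columns` are hypothetical and rows are modeled as plain dictionaries; this is not how Ohayo's tiles are implemented.

```python
from typing import Dict, Iterable, List

Row = Dict[str, object]


def keep_columns(rows: Iterable[Row], names: List[str]) -> List[Row]:
    """Keep only the named columns, in the order given."""
    return [{name: row[name] for name in names if name in row} for row in rows]


def drop_columns(rows: Iterable[Row], names: List[str]) -> List[Row]:
    """Inverse of keep_columns: keep every column except the named ones."""
    dropped = set(names)
    return [{k: v for k, v in row.items() if k not in dropped} for row in rows]


rows = [{"name": "Ada", "age": 36, "email": "ada@example.com"}]
without_email = drop_columns(rows, ["email"])
only_name = keep_columns(rows, ["name"])
```

With a wide table, dropping two columns by name is much shorter than listing every column to keep.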
Let's add a spotlight tree component that shows available commands and has autocomplete, like Sublime Text. This would be especially helpful as the number of commands grows.
Could be a generic reusable tree component in jtree
Trying to add the Gutenberg tile (#24), I realize for most interesting public datasets even if you just keep the minimum in CSV it's still going to be a huge download which means huge parse time as well (at least 3MB uncompressed min for this particular set).
As the saying goes "easier to move the compute to the data than the other way around".
I wonder if there's a simple way to turn any CSV into a web service? Perhaps using BurntSushi's awesome xsv library or some equivalent in Go.
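One possible shape for such a service, sketched with only the Python standard library rather than xsv or Go: the server keeps the CSV and returns just the matching rows as JSON. The names `query_rows` and `make_handler`, the exact-match filter semantics, and the row limit are all assumptions for illustration.

```python
import csv
import io
import json
from http.server import BaseHTTPRequestHandler
from typing import Dict, List
from urllib.parse import parse_qs, urlparse


def query_rows(csv_text: str, filters: Dict[str, str], limit: int = 100) -> List[Dict[str, str]]:
    """Return up to `limit` rows whose columns equal every filter value."""
    matches: List[Dict[str, str]] = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if len(matches) >= limit:
            break
        if all(row.get(col) == value for col, value in filters.items()):
            matches.append(dict(row))
    return matches


def make_handler(csv_text: str):
    """Build a handler class that answers GET /?column=value with JSON rows."""

    class CsvHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            params = parse_qs(urlparse(self.path).query)
            # Only the first value of each query parameter is used.
            filters = {key: values[0] for key, values in params.items()}
            body = json.dumps(query_rows(csv_text, filters)).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return CsvHandler


# To serve a file (hypothetical file name and port):
#   from http.server import HTTPServer
#   text = open("books.csv", newline="").read()
#   HTTPServer(("localhost", 8000), make_handler(text)).serve_forever()
```

The client then downloads only the filtered rows instead of the whole multi-megabyte file.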
Would be great to have access to all Project Gutenberg books right in Ohayo
Would love tiles that suggest ways to correct missing data (potentially they could even choose the best correction method and apply it).
https://cran.r-project.org/web/views/MissingData.html
Perhaps the flow might be something like:
web.get somefilewithmissingdata.csv
table.basic
missing.fix
table.basic
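As a rough idea of what a `missing.fix` step could do between the two tables, here is a Python sketch that fills blank numeric cells with the column mean. Mean imputation is only one of the methods covered in the CRAN task view and is chosen here as an assumption; the function name `fix_missing` and the sample data are hypothetical.

```python
import csv
import io
from statistics import StatisticsError, mean
from typing import Dict, List


def fix_missing(csv_text: str) -> List[Dict[str, object]]:
    """Fill blank cells in numeric columns with that column's mean.

    Assumptions: a blank cell means "missing", and a column counts as
    numeric only if every non-blank value parses as a float.
    """
    rows: List[Dict[str, object]] = [dict(r) for r in csv.DictReader(io.StringIO(csv_text))]
    if not rows:
        return rows
    for column in list(rows[0].keys()):
        present = [r[column] for r in rows if r[column] not in ("", None)]
        try:
            fill = mean(float(v) for v in present)
        except (ValueError, StatisticsError):
            continue  # non-numeric or entirely empty column: leave untouched
        for r in rows:
            r[column] = fill if r[column] in ("", None) else float(r[column])
    return rows


sample = "city,temp\nOslo,4\nLima,\nCairo,30\n"
fixed = fix_missing(sample)
```

A real tile would probably want to show which cells were filled so the second table can be compared with the first.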
Would be nice to be able to compile OhayoLang programs to:
[] when running tests should we check to see if fabServer is started? perhaps start it for having the web.get test pass?
[] have tests for server methods for desktop version. perhaps create a temp project directory?
Would be cool to have a tile(s) for this:
Should it be called Flow? Forest?
perhaps some tiles from this project:
In particular, the 3.5m book dataset (with 1m full text) on Google BigQuery.
Would be nice to remove as many dependencies as possible from the core.
These are good for starters:
breck7/jtree#99
breck7/jtree#100
We should also be able to remove a number of others.
Dependencies in package folders are fine and not something to worry about.
The link https://github.com/breck7/ohayo/blob/master/2017-06-21-show-hn-programming-is-now-two-dimensional.html points to a 404.
Lots of Wikipedia tiles would be great
Would be neat to have some tiles for this:
https://aws.amazon.com/blogs/aws/aws-data-exchange-find-subscribe-to-and-use-data-products/
A sample program might look something like this:
html.h1 Amazon Data Exchange Providers
amazon.dataExchange.providers
table.list
Would be cool to have a tile for this:
https://github.com/awesomedata/awesome-public-datasets
Perhaps the best way to deal with breaking changes to Maia is:
Idea: scan the program. If you encounter a "hidden", then default program tile is visible. If you encounter a "visible", then the default is hidden.
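The scan described above could look roughly like this in Python. The function name `default_visibility` is hypothetical, and the sketch assumes that the first keyword encountered wins, that keywords appear as whole words on a line, and that tiles default to visible when neither keyword is present.

```python
def default_visibility(program: str) -> str:
    """Infer the default tile visibility for a program.

    If the program mentions "hidden", tiles are visible by default;
    if it mentions "visible", tiles are hidden by default.
    """
    for line in program.splitlines():
        words = line.split()
        if "hidden" in words:
            return "visible"
        if "visible" in words:
            return "hidden"
    # Assumption: with no keyword at all, tiles are shown.
    return "visible"


older_program = "samples.portal\n hidden\n vega.bar"
newer_program = "samples.portal\n vega.bar\n  visible"
```

This lets programs written under either convention keep rendering the same way without a version marker.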
The autocomplete is pretty dumb right now. Let's move some stuff from the prototypes over to make it smarter.
Add an easy command to export/import all data from localStorage to facilitate switching between servers and making backups
experiment with sibling data tile flow
instead of:
samples.portal
 vega.bar
should we allow:
samples.portal
vega.bar
Could potentially use newlines as sort of a "clear".
Of what the rows are.
The vega tiles are obviously critical. Would be good to get more help with them.
For meta studies, it seems like it would be cool to import multiple Maia programs. Perhaps it's best to do this as a build script, and create a simple meta language for it?
So, out of curiosity, how is this different from things like TreeSheets, or "sweet syntax" or various lisps that use deeper tree notations (think SRFI-101-based lisps) or things like Lush or Klone? YAML and Rebol seem to be similar predecessors…
I'm just curious if I missed something obvious, but the papers didn't demonstrate anything that made these wholly unique compared to other previous attempts.
Thanks!
Add a better auto height layout that just uses browser if possible for computing height.
Datasette has some great datasettes. https://github.com/simonw/datasette
Also might make more sense to use that for this: http://datasets.ohayo.computer/
Or just improve experience/design so we need less scrolling?
This is a neat project and there are perhaps templates + plugins to share:
Would be cool to have a tile(s) for this:
let's rigorously explore and define the "." qualifier and how it will work going forward, and how contributions will work
experiment with a "doc.use samples vega" syntax for shortening tile names?
experiment with short dictation dialect
FlowTiles is the main focus at the moment, but I think having some basic editing scenarios for tiles files could lead to code improvements and technical clarity.
Imagine having every academic paper ever, at your fingers in Ohayo.
Might be fun to have wayback machine tiles and do things like:
wayback.get http://nytimes.com 1999 2019
wordCloud