Comments (17)
I consider this cheatsheet very important because:
- It would help to spread the word about h2o in the R Community, as most R users use RStudio.
- It is important to have concepts clear and summarized.
from h2o-tutorials.
Yes, indeed, you can check out what I've done till now here:
https://github.com/jtelleria/H2O-Cheatsheet
Juan
from h2o-tutorials.
I have already finished editing the h2o R Front-End Cheatsheet:
PDF:
https://github.com/jtelleria/H2O-Cheatsheet/blob/master/H2O%20Cheatsheet%20v2.000.pdf
POWER POINT:
https://github.com/jtelleria/H2O-Cheatsheet/blob/master/H2O%20Cheatsheet%20v2.000.pptx
I will give the final touches next week and submit it to the RStudio Cheatsheet webpage:
https://www.rstudio.com/resources/cheatsheets/
Contributions to polish the final details are always welcome.
Kind regards,
Juan Telleria
from h2o-tutorials.
- You're right, we don't have those functions properly documented. The only way to get to them is through the h2o.abs() version. The functions are identical (and are aliases of each other) so it doesn't matter which one you call in your code. I made a ticket to update the docs.
- Yes, and they can be used with apply on H2OFrames.
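As a minimal sketch of the alias behaviour described above, the snippet below calls both forms on the same column and compares the results. It assumes a local H2O cluster can be started with h2o.init(); the toy data frame and the variable names are made up for illustration.

```r
library(h2o)
h2o.init()  # assumes Java and the h2o package are available locally

# Hypothetical toy data, uploaded as an H2OFrame
hf <- as.h2o(data.frame(x = c(-1.5, 2, -3)))

# Group-generic form and explicit h2o.* form of the same operation
via_generic <- abs(hf$x)
via_h2o     <- h2o.abs(hf$x)

# Pull both results back into R and compare the values
identical_values <- isTRUE(all.equal(
  as.data.frame(via_generic)[[1]],
  as.data.frame(via_h2o)[[1]]
))
print(identical_values)
```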
from h2o-tutorials.
Created an Issue on JIRA:
https://0xdata.atlassian.net/browse/PUBDEV-5627
Thank you.
Juan
from h2o-tutorials.
We have this: https://github.com/h2oai/h2o-tutorials/blob/master/training/h2o_algos/h2o_algos_cheat_sheet_04_25_17.pdf
Is there any appetite for adding R and/or Python Snippets to go along with current graphics and pointers?
from h2o-tutorials.
I was thinking more of a summary of the key functions in the h2o R package, such as the ones you can find in:
http://docs.h2o.ai/h2o/latest-stable/h2o-docs/booklets/RBooklet.pdf
https://h2o-release.s3.amazonaws.com/h2o/rel-slater/9/docs-website/h2o-docs/booklets/DeepLearning_Vignette.pdf
@jphall663 The cheatsheet you mentioned explains the concepts of when to use one model or another. Its complement would be a cheatsheet that explains the functions to use for each modeling technique.
I could help with its development myself.
Juan
from h2o-tutorials.
I started to do the h2o R Command Cheatsheet by myself.
If anyone is interested in collaboration, I attach what I have done till now:
H2O Cheatsheet v1.04.pptx
In addition, I have also uploaded the PowerPoint to my R Github account:
https://github.com/jtelleria/H2O-Cheatsheet
from h2o-tutorials.
@jtelleria Are you interested in converting your cheatsheet into the RStudio template? That would be very nice!
from h2o-tutorials.
looks awesome - @ledell have a look!
from h2o-tutorials.
Hi @jtelleria, I am just seeing this now. Thanks for the contribution! There are a few things that we'd like to edit; here are some things I noticed at first glance:
pg1
- clarify the difference between how h2o.importFile() and h2o.uploadFile() work
- clarify the difference between how h2o.exportFile() and h2o.downloadCSV() work
- remove colnames() as it's a duplicate of names()
It would be nice to add two new functions for "data generation", h2o.target_encode_create() and h2o.target_encode_apply().
pg2
- typo in h2o.group_by(): it's missing the underscore
- add h2o.relevel() and h2o.setLevels() to "factor level manipulations"
- date manipulations section is missing the rest of the ops, like h2o.day(), h2o.month(), h2o.hour(), h2o.dayOfWeek()
- supervised learning section is missing: h2o.stackedEnsemble() and h2o.automl()
- unsupervised modeling section is missing: h2o.glrm() and h2o.svd(), and could also include h2o.word2vec()
- add h2o.getGrid()
- remove (also typo): "ho2.model metrics". This does not exist... I think you mean h2o.make_metrics()
- h2o.performance() should not be in "Classification model helpers"; this works for any type of model
- would be nice to show how to get the train, valid and xval metrics from H2O models using h2o.somemetric(model, xval = TRUE), for example
- add h2o.scoreHistory() near modeling functions
- add h2o.varimp() and h2o.varimp_plot()
- add h2o.removeAll() to cluster operations
- it's missing h2o.download_pojo() and h2o.download_mojo(), which are important because it's how you productionize H2O models. Maybe you could change the "h2o object serialization" section to "h2o model import/export" and put all of these there.
- you don't need to specify nthreads = -1 in h2o.init() because that's the default. I'd also move load balancing to the bottom of the column or remove it. I don't think anyone ever uses this.
I think we should probably add h2o.merge(), h2o.rbind(), h2o.cbind() and h2o.arrange() (sorting) under the data munging section. If there is a way where we can show how to slice rows and columns, that would also be good. All data munging ops are here:
http://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-munging.html
Overall, I think there could be more of an emphasis on the machine learning and less of an emphasis on the basic math operations, but this is a great start! Adding some of the things I recommended will probably require you to remove some less essential things (maybe from the front page).
Again, thank you for this contribution!!
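One way the train, valid and xval metrics mentioned in the checklist could be shown is sketched below, using h2o.rmse() as the "somemetric" accessor. This is only an illustration under assumptions: it uses R's built-in iris data, an arbitrary GBM configuration, and a locally started cluster; none of these choices come from the cheatsheet itself.

```r
library(h2o)
h2o.init()  # assumes Java and the h2o package are available locally

# Illustrative data: R's built-in iris, split into train and validation frames
hf <- as.h2o(iris)
splits <- h2o.splitFrame(hf, ratios = 0.8, seed = 1)

# Small regression GBM with 3-fold cross-validation (arbitrary settings)
model <- h2o.gbm(
  x = setdiff(names(iris), "Sepal.Length"),
  y = "Sepal.Length",
  training_frame = splits[[1]],
  validation_frame = splits[[2]],
  nfolds = 3,
  ntrees = 10,
  seed = 1
)

# Request the same metric on training, validation and cross-validation data
rmse_all <- h2o.rmse(model, train = TRUE, valid = TRUE, xval = TRUE)
print(rmse_all)
```

The same train/valid/xval flags are accepted by the other metric accessors, such as h2o.mse() or, for binary classifiers, h2o.auc().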
from h2o-tutorials.
@ledell I started implementing little by little the checklist you included above:
https://github.com/jtelleria/H2O-Cheatsheet/blob/master/H2O%20Cheatsheet%20v2.018.pptx
However, I have two questions about "Methods from Group Generics":
- In the h2o R package documentation for MATH(H2O), we cannot see, for example, the abs() function, only h2o.abs(). If we call the former on an H2O Parsed Data Frame, would we be invoking the latter because it is a Group Generic?
- Are all these "Group Generic Functions" (MATH, MATH2, etc.) vectorized, or must they be used with apply?
Thank you,
Juan
from h2o-tutorials.
@ledell I noticed that the function count() does not exist, but it is documented as if it did:
https://h2o-release.s3.amazonaws.com/h2o/rel-vajda/1/docs-website/h2o-docs/data-munging/groupby.html
Instead, H2O R users should use nrow():
h2o.group_by(data = starwars, by = "skin_color", nrow("skin_color"), gb.control=list(na.methods="rm"))
I also noticed that with h2o.group_by() we cannot use the full name of some summary functions and must instead use their "Group Generic" version, e.g. sum() instead of h2o.sum().
I do think, however, that it is good coding practice to use functions named h2o.*, to emphasize explicitly that we are working on an "H2O Parsed Data Object" (even if that happens through methods).
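A self-contained sketch of the nrow() workaround is shown below. The starwars data from the example above is replaced by a small made-up data frame so the snippet does not depend on other packages; it assumes a local H2O cluster can be started with h2o.init().

```r
library(h2o)
h2o.init()  # assumes Java and the h2o package are available locally

# Hypothetical stand-in for the starwars data used above
df <- data.frame(
  skin_color = factor(c("fair", "fair", "gold", "green", "gold", "fair")),
  height     = c(172, 180, 167, 66, 170, 188)
)
hf <- as.h2o(df)

# Count rows and sum heights per group, using the generic names
# nrow() and sum() inside h2o.group_by()
per_group <- h2o.group_by(
  data = hf,
  by = "skin_color",
  nrow("skin_color"),
  sum("height"),
  gb.control = list(na.methods = "rm")
)

# Bring the small result back into R for inspection
result <- as.data.frame(per_group)
print(result)
```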
from h2o-tutorials.
@ledell I have just finished the second version of the H2O Cheatsheet with all the changes you suggested.
You can download it here, in case you want to make any final suggestions before I open a pull request to the RStudio Cheatsheet repository:
https://github.com/jtelleria/H2O-Cheatsheet/raw/master/h2o.pdf
Best,
Juan
from h2o-tutorials.
The pull request is done; the updated cheatsheet can be downloaded at:
https://www.rstudio.com/resources/cheatsheets/
from h2o-tutorials.
Thanks @jtelleria!
from h2o-tutorials.