Comments (17)

jtelleria avatar jtelleria commented on June 25, 2024 1

I consider this cheatsheet very important because:

  • It would help to spread the word about h2o in the R Community, as most R users use RStudio.
  • It is important to have concepts clear and summarized.

from h2o-tutorials.

jtelleria avatar jtelleria commented on June 25, 2024 1

Yes, indeed, you can check out what I've done till now here:
https://github.com/jtelleria/H2O-Cheatsheet

Juan

jtelleria avatar jtelleria commented on June 25, 2024 1

I have already finished editing the h2o R Front-End Cheatsheet:

PDF:
https://github.com/jtelleria/H2O-Cheatsheet/blob/master/H2O%20Cheatsheet%20v2.000.pdf

POWER POINT:
https://github.com/jtelleria/H2O-Cheatsheet/blob/master/H2O%20Cheatsheet%20v2.000.pptx

I will add the finishing touches next week and submit it to the RStudio Cheatsheet webpage:

https://www.rstudio.com/resources/cheatsheets/

Contributions to polish the final details are always welcome.

Kind regards,
Juan Telleria

ledell avatar ledell commented on June 25, 2024 1

@jtelleria

  • You're right, we don't have those functions properly documented. The only way to get to them is through the h2o.abs() version. The functions are identical (and are aliases of each other) so it doesn't matter which one you call in your code. I made a ticket to update the docs.
  • Yes and they can be used with apply on H2OFrames.
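
To illustrate why calling abs() on a frame can reach the same code as an h2o.abs()-style function, here is a minimal base R sketch of S3 group generic dispatch. The toy_frame class and toy.abs() are hypothetical stand-ins invented for this example; they are not h2o's actual implementation, which may differ in detail.

```r
# Hypothetical toy class standing in for a parsed frame (not an H2OFrame).
new_toy_frame <- function(values) {
  structure(list(values = values), class = "toy_frame")
}

# A single method on the Math group generic handles abs(), sqrt(), floor(), ...
# .Generic holds the name of the function that was actually called.
Math.toy_frame <- function(x, ...) {
  fn <- get(.Generic, mode = "function")
  new_toy_frame(fn(x$values, ...))  # applied to the whole vector at once
}

# Hypothetical explicitly named alias, analogous to an h2o.* function.
toy.abs <- function(x) abs(x)

tf <- new_toy_frame(c(-2, 3, -5))
abs(tf)$values                     # 2 3 5
identical(abs(tf), toy.abs(tf))    # TRUE
```

Because the method receives the whole object, the call is vectorized over its contents; no explicit apply is needed in this toy case.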

jtelleria avatar jtelleria commented on June 25, 2024 1

Created an Issue on JIRA:
https://0xdata.atlassian.net/browse/PUBDEV-5627

Thank you.
Juan

jphall663 avatar jphall663 commented on June 25, 2024

We have this: https://github.com/h2oai/h2o-tutorials/blob/master/training/h2o_algos/h2o_algos_cheat_sheet_04_25_17.pdf

Is there any appetite for adding R and/or Python Snippets to go along with current graphics and pointers?

jtelleriar avatar jtelleriar commented on June 25, 2024

I was thinking more of a summary of the key functions in the h2o R package, such as the ones you can find in:
http://docs.h2o.ai/h2o/latest-stable/h2o-docs/booklets/RBooklet.pdf
https://h2o-release.s3.amazonaws.com/h2o/rel-slater/9/docs-website/h2o-docs/booklets/DeepLearning_Vignette.pdf

@jphall663 The cheatsheet you mentioned explains the concepts behind when to use one model or another. Its complement would be a cheatsheet that explains the functions to use for each modeling technique.

I could help with such development myself.

Juan

jtelleria avatar jtelleria commented on June 25, 2024

I have started working on the h2o R Command Cheatsheet on my own.
If anyone is interested in collaborating, I attach what I have done so far:
H2O Cheatsheet v1.04.pptx

In addition, I have also uploaded the PowerPoint to my GitHub account:
https://github.com/jtelleria/H2O-Cheatsheet

ledell avatar ledell commented on June 25, 2024

@jtelleria Are you interested in converting your cheatsheet into the RStudio template? That would be very nice!

jphall663 avatar jphall663 commented on June 25, 2024

looks awesome - @ledell have a look!

ledell avatar ledell commented on June 25, 2024

Hi @jtelleria, I am just seeing this now. Thanks for the contribution! There are a few things that we'd like to edit; here are some things I noticed at first glance:

pg1

  • clarify the difference between how h2o.importFile() and h2o.uploadFile() work
  • clarify the difference between how h2o.exportFile() and h2o.downloadCSV() work
  • remove colnames() as it's a duplicate of names()
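
As a rough sketch of the distinction in the first two points: h2o.importFile() and h2o.exportFile() take paths that are resolved on the machine(s) running the H2O cluster, while h2o.uploadFile() and h2o.downloadCSV() move data between the cluster and the machine where R runs. The helper names load_frame() and save_frame() and their arguments are hypothetical, made up for this illustration; the code assumes the h2o package is installed and a cluster has been started with h2o.init().

```r
# Hypothetical helpers; nothing here is part of the h2o package itself.

# Load a file into the cluster as an H2OFrame.
load_frame <- function(path, visible_to_cluster = TRUE) {
  if (visible_to_cluster) {
    # The path is read by the H2O cluster (server side).
    h2o::h2o.importFile(path = path)
  } else {
    # The file is read where R runs and pushed to the cluster (client side).
    h2o::h2o.uploadFile(path = path)
  }
}

# Write an H2OFrame out as a file.
save_frame <- function(frame, path, to_client = FALSE) {
  if (to_client) {
    # Pull the data down to the machine where R runs.
    h2o::h2o.downloadCSV(frame, filename = path)
  } else {
    # The cluster writes to a path it can see itself.
    h2o::h2o.exportFile(frame, path = path)
  }
}
```

When R and H2O run on the same laptop the two variants behave much alike; the difference matters once the cluster runs on remote machines.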

It would be nice to add two new functions for "data generation", h2o.target_encode_create() and h2o.target_encode_apply().

pg2

  • typo in h2o.group_by(): it's missing the underscore
  • add h2o.relevel() and h2o.setLevels() to "factor level manipulations"
  • date manipulations section is missing the rest of the ops, like h2o.day(), h2o.month(), h2o.hour(), h2o.dayOfWeek()
  • supervised learning section is missing: h2o.stackedEnsemble() and h2o.automl()
  • unsupervised modeling section is missing: h2o.glrm() and h2o.svd() and could also include h2o.word2vec().
  • add h2o.getGrid()
  • remove (also typo): "ho2.model metrics"; this does not exist... I think you mean h2o.make_metrics()
  • h2o.performance() should not be in "Classification model helpers" -- this works for any type of model
  • would be nice to show how to get the train, valid and xval metrics from H2O models using h2o.somemetric(model, xval = TRUE) for example.
  • add h2o.scoringHistory() near modeling functions
  • add h2o.varimp() and h2o.varimp_plot().
  • add h2o.removeAll() to cluster operations
  • it's missing h2o.download_pojo() and h2o.download_mojo() which are important because it's how you productionize H2O models. Maybe you could change the "h2o object serialization" section to "h2o model import/export" and put all of these there.
  • you don't need to specify nthreads = -1 in h2o.init() because that's the default. I'd also move load balancing to the bottom of the column or remove it. I don't think anyone ever uses this.
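
The point above about train, valid and xval metrics could be sketched roughly as follows, using h2o.rmse() as one concrete "somemetric". The helper name rmse_by_split() is hypothetical, and the sketch assumes `model` is an H2O model that was trained with both a validation_frame and nfolds > 1, so that all three sets of metrics exist.

```r
# Hypothetical helper: collect the same metric for each data split.
# Assumes the h2o package is installed and `model` has train, valid and
# cross-validation metrics available.
rmse_by_split <- function(model) {
  c(
    train = h2o::h2o.rmse(model, train = TRUE),  # training data
    valid = h2o::h2o.rmse(model, valid = TRUE),  # validation frame
    xval  = h2o::h2o.rmse(model, xval = TRUE)    # cross-validation
  )
}
```

Other metric accessors such as h2o.auc() or h2o.logloss() take the same train/valid/xval flags, so the same pattern should carry over to them.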

I think we should probably add h2o.merge(), h2o.rbind(), h2o.cbind() and h2o.arrange() (sorting) under the data munging section. If there is a way to show how to slice rows and columns, that would also be good. All data munging ops are here:
http://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-munging.html

Overall, I think there could be more of an emphasis on the machine learning and less of an emphasis on the basic math operations, but this is a great start! Adding some of the things I recommended will probably require you to remove some less essential things (maybe from the front page).

Again, thank you for this contribution!!

jtelleria avatar jtelleria commented on June 25, 2024

@ledell I have started working through the checklist you posted above, little by little:
https://github.com/jtelleria/H2O-Cheatsheet/blob/master/H2O%20Cheatsheet%20v2.018.pptx

However, I have two questions about "Methods from Group Generics":

  • In the h2o R package documentation for MATH(H2O), we cannot see, for example, the abs() function, only h2o.abs(). If we call the former on an H2O Parsed Data Frame, would we be invoking the latter because it is a Group Generic?
  • Are all these "Group Generic Functions" (MATH, MATH2, etc.) vectorized, or should they be used with apply?

Thank you,
Juan

jtelleria avatar jtelleria commented on June 25, 2024

@ledell I noticed that the function count() does not exist, but it is documented as if it did:

https://h2o-release.s3.amazonaws.com/h2o/rel-vajda/1/docs-website/h2o-docs/data-munging/groupby.html

Instead, H2O R users should use nrow():

h2o.group_by(data = starwars, by = "skin_color", nrow("skin_color"), gb.control=list(na.methods="rm"))

I also noticed that with h2o.group_by() we cannot use the full name of some summary functions and must use their "Group Generic" version instead, e.g. sum() instead of h2o.sum().

I do think, however, that it is good coding practice to use functions named h2o.*, to make explicit that we are working on an "H2O Parsed Data Object" (even if that happens through methods).

jtelleria avatar jtelleria commented on June 25, 2024

@ledell I have just finished the second version of the H2O Cheatsheet with all the changes you suggested.

You can download it here, in case you want to make any final suggestions before I open a pull request against the RStudio Cheatsheet repository:

https://github.com/jtelleria/H2O-Cheatsheet/raw/master/h2o.pdf

Best,
Juan

jtelleria avatar jtelleria commented on June 25, 2024

I have submitted the pull request; the updated cheatsheet can be downloaded at:

https://www.rstudio.com/resources/cheatsheets/

ledell avatar ledell commented on June 25, 2024

Thanks @jtelleria!
