smarteda's Introduction

Use Google Colab with R Kernel to perform Exploratory Data Analysis and Principal Components Analysis

Overview

This repository contains Jupyter notebooks configured to run R within Google Colab, a platform typically associated with Python. The notebooks demonstrate how to use the SmartEDA and DataExplorer packages for data analysis in R, in conjunction with the tidyverse suite of packages. This setup lets users draw on Google Colab's resources without a local R installation.

Accessing and Running the Notebook

  • Navigate to the Notebook: Locate the Jupyter Notebook file (.ipynb) in this repository.

  • Open in Colab: Click the "Open in Colab" badge at the top of the file to launch the notebook in Google Colab.

  • Change Runtime to R: To ensure the notebook runs with an R kernel:

  1. In Colab, go to the Runtime menu.
  2. Select Change runtime type.
  3. Choose R from the dropdown under Runtime type.
  4. Click Save.

  • Verify the R Environment: Run the first code cell, print(R.version.string), to display the version of R in use and confirm that the R kernel is active.

print(R.version.string)

  • Using Tidyverse: Directly load the pre-installed tidyverse package in the notebook:

library(tidyverse)

  • Install SmartEDA: Follow the notebook instructions to install and load the SmartEDA package:

install.packages("SmartEDA")
library(SmartEDA)
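Once the packages above load, a short pipeline can confirm that the session behaves as expected. The sketch below is only an illustration: it uses R's built-in mtcars data as a stand-in (an assumption, not the notebook's dataset), and the name summary_by_cyl is hypothetical.

```r
library(tidyverse)

# Confirm the kernel is R and report its version.
print(R.version.string)

# Stand-in data: mtcars ships with R; the notebook's own dataset is not assumed here.
summary_by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(n = n(), mean_mpg = mean(mpg), .groups = "drop")

print(summary_by_cyl)
```

If this cell prints one row per cylinder count, both the R kernel and the tidyverse are working.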

Automated Exploratory Data Analysis with SmartEDA

SmartEDA is an R package designed to simplify exploratory data analysis, making it well suited to quickly understanding the structure of a dataset and the relationships within it. It handles both numeric and categorical predictors and remains useful even when a dataset has no categorical variables.

SmartEDA automates several parts of data exploration, including descriptive statistics, information value analysis, and the creation of custom tables and plots. This saves data scientists and analysts time when they need a quick first look at a dataset.
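As a rough sketch of what this automation looks like in practice, the calls below produce an overview, numeric summaries, and frequency tables. The built-in mtcars data is a stand-in chosen for illustration (an assumption, not the notebook's dataset), and the result names overview, num_stats, and freq_table are hypothetical.

```r
library(SmartEDA)

# Dataset-level overview: dimensions, variable types, missing-value counts.
overview <- ExpData(data = mtcars, type = 1)

# Descriptive statistics for the numeric variables, computed over all rows.
num_stats <- ExpNumStat(mtcars, by = "A", Outlier = TRUE, round = 2)

# Frequency tables; numeric columns with at most nlim distinct values
# (such as cyl, gear, am) are treated as categorical.
freq_table <- ExpCTable(mtcars, margin = 1, clim = 10, nlim = 3,
                        bin = NULL, per = FALSE)

print(overview)
print(head(num_stats))
print(head(freq_table))
```

Each call returns a data frame, so the results can be passed on to tidyverse functions for filtering or plotting.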

Features Demonstrated in the Notebook

  • Basic data exploration with SmartEDA.
  • Generating descriptive statistics.
  • Creating custom tables and visualizations.
  • Using tidyverse for data manipulation and visualization.
  • Generating HTML reports.
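The last item, HTML reporting, might look like the following sketch. It assumes that rmarkdown and pandoc are available in the runtime, uses mtcars as a stand-in dataset, and writes to a hypothetical file name, smarteda_report.html; none of these details are taken from the notebook.

```r
library(SmartEDA)

# Hypothetical output name; the file is written to the session's working directory.
report_file <- "smarteda_report.html"

# Assumption: rmarkdown and pandoc are available in the runtime.
ExpReport(
  data = mtcars,      # stand-in data, not the notebook's dataset
  Target = NULL,      # no target variable, so the report is univariate
  op_file = report_file,
  op_dir = getwd()
)

# Check that the report was written.
file.exists(file.path(getwd(), report_file))
```

In Colab the generated file appears in the Files pane, from which it can be downloaded.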

Prerequisites

No local installation of R, SmartEDA, or tidyverse is required, as everything runs in the cloud on Google Colab. However, a Google account is necessary to access Google Colab.

Note

The notebook does not cover weight of evidence (WoE) analysis. Users who need it will have to add those steps themselves.
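For readers who want a starting point, WoE for a categorical predictor can be computed by hand in base R. The sketch below is not part of the notebook: woe_table is a hypothetical helper, mtcars is a stand-in dataset, and the sign convention (share of non-events over share of events) is one of several in use.

```r
# Hypothetical helper: WoE and information value for one categorical predictor.
# Convention assumed here: WoE = ln(share of non-events / share of events).
# Levels with a zero count in either class yield infinite values and
# would need smoothing or merging first.
woe_table <- function(x, y) {
  stopifnot(all(y %in% c(0, 1)))
  counts <- table(x, y)  # rows: levels of x; columns: "0" and "1"
  non_event <- counts[, "0"] / sum(counts[, "0"])
  event <- counts[, "1"] / sum(counts[, "1"])
  woe <- log(non_event / event)
  data.frame(
    level = rownames(counts),
    non_event_share = as.numeric(non_event),
    event_share = as.numeric(event),
    woe = as.numeric(woe),
    iv_part = as.numeric((non_event - event) * woe)
  )
}

# Example: treat am (0/1) as the event flag and cyl as the predictor.
woe_cyl <- woe_table(mtcars$cyl, mtcars$am)
iv_cyl <- sum(woe_cyl$iv_part)  # information value of cyl

print(woe_cyl)
print(iv_cyl)
```

The information value is the sum of the per-level contributions, so a predictor whose levels split events and non-events evenly scores close to zero.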

License

  • This project is licensed under the MIT License. For more details, see the LICENSE file in this repository.

  • Dataset License: The dataset used in this project, "Global Country Information Dataset 2023," is maintained by Kaggle and is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). For more information, see the Creative Commons CC BY 4.0 license page.
