
testingframework's Introduction

Installation | How to use | How to write tests | Code

Google Summer of Code 2019

Mentors

  • Enrico Bocchi
  • Diogo Castro
  • João Vicente
  • Jakub Moscicki
  • Enric Tejedor

Introduction

SWAN (Service for Web-based ANalysis) is a cloud data analysis service developed at CERN that provides Jupyter notebooks on demand. It is based on upstream Jupyter technology but is deeply integrated with CERN-specific services, e.g., EOS, which provides storage to SWAN, and CVMFS, which is used to retrieve software on the fly. Jupyter notebooks, despite being easily accessible from an intuitive web-based interface, form a complex environment, especially when used together with JupyterHub, custom extensions, external storage backends, and computational clusters. This project creates a testing framework for both upstream Jupyter components and SWAN-specific components; it allows new tests to be added as the SWAN service gains new features and is able to run synthetic tests. The testing framework is self-contained and includes functional tests as well as performance tests.

This testing framework covers the following components:

Upstream components:

  1. configurable-http-proxy
  2. JupyterHub API
  3. SQLite database managed by JupyterHub
  4. SWAN docker containers

CERN-specific components:

  1. The EOS filesystem
  2. The CVMFS repositories

Setup Instructions

  • This project assumes a SWAN or ScienceBox setup. ScienceBox bundles SWAN together with the other CERN-specific services (i.e., EOS, CERNBox, CVMFS). To install the required dependencies:

    git clone https://github.com/Divya063/TestingFramework.git
    cd TestingFramework
    pip3 install -r requirements.txt

How to use

There are two testing modes:

  • From the host machine (default)
  • From containers (sciencebox) - to run the tests from a user container, use the flag -u and pass the session name as --session {name}, e.g., --session user2

To run tests from the host machine, use the command python3 run.py --configfile [path of yaml file]. This command assumes that all the parameters required to run the tests are provided in the test.yaml file, or, in some cases (e.g., the JupyterHub API), that the necessary configuration has already been done.

Storage

  • To run the test from a user container, use the command given below (there is no need to provide a path argument, as it is already set in the test.yaml file):
    python3 run.py -u --session user2 --test storage --configfile test.yaml

Parameters

storage:
    mount_sanity:
      timeout: 8
      mountpoints: ['user/u/user2/']
    mount:
      user_mode: 'sciencebox'
    write:
      fileSize: 1M
      filepath: "eos/user/u/user2/0.txt"
    delete:
      filepath: "eos/user/u/user2/0.txt"
    exists:
      filepath: "eos/user/u/user2/0.txt"
    throughput:
      fileNumber: 10
      fileSize: 1M
      filepath: "eos/user/u/user2/"
    checksum:
      fileNumber: 10
      fileSize: 1M
      filepath: "eos/user/u/user2/"
  1. mount_sanity:

    • Test file : test_mount_sanity.py
    • Use case : Given a list of mount points, checks whether any mount point is hanging
    • To run this test explicitly use - python3 test_mount_sanity.py --timeout 5 --mount_points user user/u
  2. mount:

    • Test file : test_mount.py
    • Use case : Checks whether EOS is mounted on the host and in ScienceBox
    • There are two testing modes:
      • "host"
      • "sciencebox"
    • To run this test explicitly use - python3 test_mount.py --mode host
  3. throughput:

    • Test file : test_throughput.py
    • Use case : Benchmarks read-write performance and computes the read and write throughput (a minimal sketch of the idea follows this list).
    • To run this test explicitly use - python3 test_throughput.py --num 10 --file-size 1M --dest eos/user/u/user2
  4. checksum :

    • Test file : test_checksum.py
    • Use case : Calculates the checksum of the test files.
    • To run this test explicitly use - python3 test_checksum.py --num 10 --file-size 1M --dest eos/user/u/user2
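
As an illustration, here is a minimal sketch of how the write side of such a throughput measurement can work (the helper name and defaults are assumptions, not the framework's actual code):

    import os
    import time

    def write_throughput(dest, num_files=10, file_size=1024 * 1024):
        """Write num_files files of file_size bytes each; return MB/s."""
        payload = os.urandom(file_size)
        start = time.time()
        for i in range(num_files):
            path = os.path.join(dest, "{}.txt".format(i))
            with open(path, "wb") as f:
                f.write(payload)
                f.flush()
                os.fsync(f.fileno())  # push data to the mount, not just the page cache
        elapsed = time.time() - start
        return (num_files * file_size) / (1024 * 1024) / elapsed

    # e.g., write_throughput("eos/user/u/user2/", num_files=10)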

CVMFS

  • To run the test from a user container, use the command given below.
    python3 run.py -u --session user2 --test CVMFS --configfile test.yaml

Parameters

cvmfs:
    mount:
      repoName: 'sft.cern.ch'
      repoPath: 'cvmfs/sft.cern.ch/'
    ttfb:
      repoPath: 'cvmfs/sft.cern.ch/'
      filePath: 'cvmfs/sft.cern.ch/lcg/lastUpdate'
    throughput:
      num: 2 #Number of packages you want to read
      repoPath: 'cvmfs/sft.cern.ch/'
      filePath: 'cvmfs/sft.cern.ch/lcg/releases/'
  1. mount :

    • Test file : test_mount.py
    • Use case : Checks whether the cvmfs folder is mounted
    • To run this test explicitly use - python3 test_mount.py --repo sft.cern.ch --path cvmfs/sft.cern.ch/
  2. ttfb:

    • Test file : test_ttfb.py
    • Use case : Evaluates the time needed to get the first byte (TTFB) of a file known to exist (lastUpdate); a minimal sketch of the measurement follows this list.
    • To run this test explicitly use - python3 test_ttfb.py --repo sft.cern.ch --path cvmfs/sft.cern.ch/lcg/lastUpdate
  3. throughput:

    • Test file : test_throughput.py
    • Use case : Benchmarks performance when reading from the repository and computes the read throughput.
    • To run this test explicitly use - python3 test_throughput.py --num 2 --repo_path cvmfs/sft.cern.ch --path cvmfs/sft.cern.ch/lcg/releases/
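
As a rough illustration of the TTFB idea (the function name is hypothetical; the actual test may measure this differently):

    import time

    def time_to_first_byte(path):
        """Seconds elapsed until the first byte of `path` is available."""
        start = time.time()
        with open(path, "rb") as f:
            f.read(1)  # forces CVMFS to fetch and serve the file
        return time.time() - start

    # e.g., time_to_first_byte("cvmfs/sft.cern.ch/lcg/lastUpdate")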

JupyterHub API

  • To run the tests from the jupyterhub container use python3 run.py -u --session {session-name} --test jupyterhub-api --configfile test.yaml, e.g., python3 run.py -u --session user2 --test jupyterhub-api --configfile test.yaml

Parameters

The parameters for these tests are present in test.yaml, as follows:

jupyterhub_api:
    check_api:
      hostname: 'localhost'
      port: '443'
      token: ""
      base_path: ""
      verify: False
    token:
      hostname: 'localhost'
      port: '443'
      token: ""
      users: ['user2']
      base_path: ""
      verify: False
    create_session:
      hostname: 'localhost'
      port: '443'
      token: 'b39639d589c44a2294b3dd1164607287'
      users: ['user2']
      base_path: ""
      params:
        TLS: False
        LCG-rel: "LCG_95a"
        platform: "x86_64-centos7-gcc7-opt"
        scriptenv: "none"
        ncores: 2
        memory: 8589934592
        spark-cluster: "none"
      delay: 30 # maximum wait time in seconds
      verify: False
    check_session:
      hostname: 'localhost'
      port: '443'
      token: ""
      users: ['user2']
      base_path: ""
      verify: False
    stop_session:
      hostname: 'localhost'
      port: '443'
      token: ''
      users: ['user2']
      base_path: ""
      verify: False
  1. check_api

    • Test file : test_check_api.py
    • Use case : Checks the hub's health by making a GET request to "https://localhost:443/hub/api/". On success, the response code should be 200 (a minimal sketch of this request follows this list).
    • To run this test explicitly use - python3 test_check_api.py --token " " --port 443 --base_path ""
  2. token

    • Test file : test_token.py
    • Use case : Checks token validity by making a GET request to "https://localhost:443/hub/api/users/{user}". On success, the response code should be 200.
    • To run this test explicitly use - python3 test_token.py --port 443 --token " " --users user1 --base_path ""
  3. create_session

    • Test file : test_create_session.py
    • Use case : Checks whether a session can be created successfully.
    • Parameters :
      • params : the data that needs to be passed to create a user container
      • timedelay : when creating a session, the user's server is first requested, which is acknowledged with response code 202. The server is then spawned and must respond within at most 30 s; if it does not respond within the stipulated time, it is torn down and response code 500 is returned.
        Example :
        TimeoutError: Server at http://172.18.0.15:8888/user/user0/ didn't respond in 30 seconds
    • To run the test explicitly use the following command: python3 test_create_session.py --port 443 --token " " --users user2 --json '{"LCG-rel": "LCG_95a", "platform": "x86_64-centos7-gcc7-opt", "scriptenv": "none", "ncores": 2, "memory": 8589934592, "spark-cluster": "none"}' --delay 30 --base_path ""
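
For reference, a minimal sketch of the check_api request using the requests library (the parameter defaults mirror the yaml above; this is not the framework's exact code):

    import requests

    def check_hub_api(hostname="localhost", port=443, token="", verify=False):
        """GET the JupyterHub API root; a 200 response means the hub is up."""
        url = "https://{}:{}/hub/api/".format(hostname, port)
        headers = {"Authorization": "token " + token} if token else {}
        r = requests.get(url, headers=headers, verify=verify)
        return r.status_code == 200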

Warning

Before creating servers/sessions, make sure that:

  • You have generated a token by running jupyterhub token dummy_admin, added it to the jupyterhub_config.py file as c.JupyterHub.service_tokens = {'dummy_admin' : '<token_value>'}, and restarted the JupyterHub process with supervisorctl restart jupyterhub. Also add the token to the "token" field in the yaml file.

  • You have created the users, as each server is created for one particular user.

    • Using curl : curl -XPOST -v -k https://localhost:443/hub/api/users/user2 -H "Authorization: token {token from yaml file}"
    • Python code :
        def check_create_users(self):
            """
            Create each user in self.users through the JupyterHub API.
            Inside the container the hub listens on port 443;
            outside the container use port 8443.
            """
            self.log.write("info", "creating users..")
            print(self.users)
            for user in self.users:
                r = None  # keep the last response visible to the error handler
                try:
                    r = self.session.create_users(user)
                except Exception as err:
                    self.exit |= 1
                    self.log.write("error", str(err))
                    self.log.write("error", str(r))
                else:
                    # 201 Created: the user was added successfully
                    if r.status_code == 201:
                        self.log.write("info", user + " successfully created")
                    else:
                        self.log.write("error", r.content.decode('utf-8'))
                        self.exit |= 1
            self.log.write("info", "Exit code " + str(self.exit))
            return self.exit

  4. check_session

    • Test file : test_check_session.py
    • Use case : Checks whether a session is running
    • To run this test explicitly use -
      • For single user - python3 test_check_session.py --port 443 --token " " --users user1 --base_path ""
      • For multiple users - python3 test_check_session.py --port 443 --token " " --users user0 user1 user2 --base_path ""
  5. stop_session

    • Test file : test_stop_session.py
    • Use case : Checks whether a session can be stopped (a minimal sketch of the underlying API call follows this list).
    • To run this test explicitly use -
      • For single user - python3 test_stop_session.py --port 443 --token " " --users user1 --base_path ""
      • For multiple users - python3 test_stop_session.py --port 443 --token " " --users user0 user1 user2 --base_path ""
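
The stop test boils down to the standard JupyterHub REST call sketched below (not the framework's exact code): DELETE /hub/api/users/{user}/server returns 204 when the server has stopped, or 202 when the stop is still pending.

    import requests

    def stop_user_server(user, token, hostname="localhost", port=443, verify=False):
        """Ask the hub to stop a user's single-user server."""
        url = "https://{}:{}/hub/api/users/{}/server".format(hostname, port, user)
        headers = {"Authorization": "token " + token}
        r = requests.delete(url, headers=headers, verify=verify)
        return r.status_code in (202, 204)  # 204 = stopped, 202 = stop pending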

SQLite database

  • This test checks the consistency of the SQLite database stored inside srv/jupyterhub/ as jupyterhub.sqlite.
  • To run this test use python3 run.py -u --session user2 --test database --configfile test.yaml

Parameters:
database:
    token:
      path: "jupyterhub.sqlite"
      user: "user2"
      mode: "active" #active mode
      table_name: "api_tokens"
    servers:
      path: "jupyterhub.sqlite"
      user: "user2"
      mode: "active" #active mode
      table_name: "servers"
    spawners:
      path: "jupyterhub.sqlite"
      user: "user2"
      mode: "active" #active mode
      table_name: "spawners"

There are two modes:

  • active mode - when the server is active
  • delete mode - when the server has been removed or deleted
  1. token

    • Test file : test_token.py
    • Use case : Checks the status of the "api_tokens" table when a session is created or removed (a minimal sketch of such a check follows this list).
    • To run this test explicitly use - python3 test_token.py --path jupyterhub.sqlite -d --user user2 --table api_tokens
  2. servers

    • Test file : test_servers.py
    • Use case : Checks the status of "servers" table when a session is created or removed.
    • To run this test explicitly use - python3 test_servers.py --path jupyterhub.sqlite -d --user user2 --table servers
  3. spawners

    • Test file : test_spawners.py
    • Use case : Checks the status of "spawners" table when a session is created or removed.
    • To run this test explicitly use - python3 test_spawners.py --path jupyterhub.sqlite -d --user user2 --table spawners
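
As an illustration, a minimal sketch of one such consistency check, assuming the table carries a user_id column referencing users.id (true for api_tokens and spawners in the JupyterHub schema; servers is linked indirectly through spawners):

    import sqlite3

    def rows_for_user(db_path, table, user):
        """Count rows in `table` belonging to `user` (joined through users)."""
        assert table in ("api_tokens", "spawners")  # tables with a user_id column
        conn = sqlite3.connect(db_path)
        query = ("SELECT COUNT(*) FROM {} t JOIN users u ON t.user_id = u.id "
                 "WHERE u.name = ?").format(table)
        count = conn.execute(query, (user,)).fetchone()[0]
        conn.close()
        return count

    # active mode: expect at least one row; delete mode: expect zero
    # rows_for_user("jupyterhub.sqlite", "api_tokens", "user2")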

Container

  • To run this test use python3 run.py --configfile test.yaml (this test is not meant to be run from inside the containers)

Parameters:

user_docker:
    docker:
      container_name: 'jupyter-user2'
      timeout: 5

Checks whether a container is healthy; a minimal sketch of the check follows.
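
A minimal sketch of such a check with the Docker SDK for Python (not necessarily how the framework implements it):

    import docker

    def container_is_healthy(name, timeout=5):
        """True if the container is running and, when a healthcheck is
        defined, reports 'healthy'."""
        client = docker.from_env(timeout=timeout)
        try:
            container = client.containers.get(name)
        except docker.errors.NotFound:
            return False
        state = container.attrs["State"]
        health = state.get("Health", {}).get("Status")
        return state.get("Running", False) and health in (None, "healthy")

    # e.g., container_is_healthy("jupyter-user2", timeout=5)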

Future Work

  • Pending Work
    • Filed issues
    • Implement a health-checking mechanism similar to liveness probes and readiness probes

Student

Divya Rani

Contributors

  • divya063
  • ebocchi


testingframework's Issues

exit_code functions

These seem to be everywhere but they do not do much...
Could we fold them into the main test function?

!! Make the framework modular wrt yaml content !!

This is high priority.

You should start from the assumption that the yaml file will only contain the tests that we want to run, along with their required parameters. If a test exists in the framework but is not listed in the yaml, it will not be executed.

As discussed at the very beginning, keep a clear mapping between the test names in the yaml file and the class names in the codebase. This allows you to dynamically import the class with that name and get the results of the test.

From a previous comment by Diogo, a sketch of the idea:

    import importlib

    module = importlib.import_module(test_name)     # e.g., the test_mount module
    test = getattr(module, test_name.capitalize())  # class named after the test
    testing = test(tasks[test_name])
    testing.exit_code()

This makes it very easy to run the tests we want and to add new ones:
for a new test, we just create the .py file with the matching class.

Deduplicate parameters in tests.yaml

There might be cases where a given parameter is repeated over and over in the yaml file, e.g., the filename to read/write/delete or the port number to reach JupyterHub.
One could deduplicate these parameters by declaring common parameters once in the yaml and referencing them in the specific configuration of each test, as in the example below.
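
For example, standard YAML anchors and merge keys could achieve this without any code changes (the keys below are illustrative):

    hub_defaults: &hub_defaults
      hostname: 'localhost'
      port: '443'
      base_path: ""
      verify: False

    jupyterhub_api:
      check_api:
        <<: *hub_defaults
      token:
        <<: *hub_defaults
        users: ['user2']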

Describe requirements to run the framework

It would be good to list in the documentation the packages required to run the framework.
E.g.,

  • docker
  • grafana
  • sqlite3
  • argparse
  • yaml
  • ...

The list is not exhaustive. To be checked.

Duplicated code in storage tests

I see some duplicated code in the storage tests.
For example, the write, throughput, and checksum tests all need to write a file. A similar comment applies to the read operation.

Consider changing this and having minimal functions that accept the relevant parameters and do the job for everyone. IOUtils looks like the logical place to put this; a sketch follows.
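
A minimal sketch of what such a shared helper in IOUtils could look like (the signature is an assumption):

    import os

    def write_file(path, size_bytes, chunk=1024 * 1024):
        """Write size_bytes of random data to path in chunk-sized pieces.
        Reusable by the write, throughput, and checksum tests."""
        with open(path, "wb") as f:
            remaining = size_bytes
            while remaining > 0:
                n = min(chunk, remaining)
                f.write(os.urandom(n))
                remaining -= n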
