
Visualising Optimisation Data

This is the COMP3000 Computing Project on the topic of creating "A Web Platform for Visualising Optimisation Data".

Project Supervisor: Dr. David Walker

Poster

Project poster

Project Vision

This project aims to develop a "Web Platform for Visualising Optimisation Data" for evolutionary computation research at the University of Plymouth, visualising aspects of the optimisation data generated by evolutionary algorithms.

The web platform will provide:

  • the ability for the optimisation data collected from client optimisers to be visualised in real-time,
  • the playback of saved optimisation runs,
  • a deployed web platform acting as a single tool which collects all appropriate visualisations for optimisation runs.

A demonstration of the project is available on YouTube.

Installation

The following pieces of software need to be installed on your machine in order to set up the local development environment:

  • Node.js and Yarn (used to install and run the frontend, backend and plugins),
  • Python 3 (used to run the optimiser client scripts),
  • MongoDB (a local instance, or access to a hosted service such as MongoDB Atlas).

Once the required software has been installed, you can proceed with setting up the codebase and installing packages/dependencies:

  1. Clone the repository from GitHub using the following command:
git clone https://github.com/GoelBiju/Visualising-Optimisation-Data.git
  2. Install the packages required for the frontend and plugins:
yarn install
  3. Install the packages required for the backend by changing the directory into the backend and using the following command:
cd backend && yarn install

Running frontend, backend and plugins

Once all the packages/dependencies have been installed for both the frontend and backend, the stack can be run in the following order:

  1. Start the backend from the root of the project:
yarn backend
  2. Run the frontend from the root using:
yarn frontend
  3. At present there are two plugin examples:

For the Pareto Front plugin, run:

yarn pareto

For the Line Graph plugin, run:

yarn line

To feed data into the database and see real-time visualisation, see the section below on running optimiser clients.

Running client scripts

The current examples provided make use of Python 3.8.10.

(Optional) Set up a virtual environment:

python -m venv --system-site-packages .\venv
.\venv\Scripts\activate

To run and test the sample data already provided, install the Python dependencies from the root of the project:

pip install --user -r requirements.txt

The optimiser client scripts are in the "/scripts" folder; you can run them from the root of the project with the following commands:

Run DTLZ1:

python scripts/dtlz1.py

Run DTLZ2:

python scripts/dtlz2.py

Once a script has been started, the frontend will show all active optimiser client runs as well as any completed runs stored in the database.

Development

Creating Plugins

To create a plugin you can copy any of the existing plugins in the packages folder.

To configure the plugin to work with the frontend, you will need to set its name and description through the files in the plugin folder:

  1. webpackConfig.output.library in craco.config.js to the plugin name,

  2. settings.json/deploy-settings.json to add the plugin details (look at examples already provided),

  3. name in package.json of the plugin project,

  4. div id in index.html in public folder of plugin project to identify where the plugin will load,

  5. registerRouteAction in src/App.tsx for the plugin and provide additional plugin details (see the sketch after this list),

  6. pluginName in src/index.tsx,

  7. plugins command in package.json to add the plugin to start with the command,

  8. web dyno start in Procfile for plugin project (if you are using Heroku).
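
For step 5, a minimal illustrative sketch of what the registerRouteAction dispatch in a plugin's src/App.tsx might look like (the action type, payload fields and store import below are assumptions, not the project's confirmed API; check an existing plugin such as pareto-front for the real details):

```ts
// Illustrative only: the action type, payload fields and store import are assumptions.
import { store } from "frontend-common"; // assumed shared Redux store from the common library

store.dispatch({
  type: "frontend:register_route", // assumed action type handled by the frontend
  payload: {
    plugin: "my-plugin", // should match webpackConfig.output.library in craco.config.js
    displayName: "My Plugin",
    description: "Visualises optimisation data as ...",
    link: "/runs/:runId/visualisations/my-plugin/data",
  },
});
```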

Deploying backend to Heroku

You will need to create a backend application on Heroku and add the GitHub repository to it.

The backend requires you to make use of an external MongoDB service. For this reason you will need to specify a "MONGODB_URI" configuration key in the config vars in Heroku settings for the backend application. You could use MongoDB Atlas and get a URI to access a database on a cluster.
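
A minimal sketch of how the backend might read this config var, assuming it connects through Mongoose (the project's actual database code may differ):

```ts
import mongoose from "mongoose";

// Assumed approach: read the connection string from the MONGODB_URI config var
// (set in the Heroku dashboard, or in the local environment during development).
const uri = process.env.MONGODB_URI;
if (!uri) {
  throw new Error("MONGODB_URI is not set");
}

mongoose
  .connect(uri)
  .then(() => console.log("Connected to MongoDB"))
  .catch((error) => console.error("Failed to connect to MongoDB:", error));
```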

The backend application does not require any further configuration. It uses the default "heroku/nodejs" buildpack, which is recognised automatically when the application is first deployed and triggers npm start on its web dyno.

Deploying frontend and plugins to Heroku

The Heroku deployment is not necessary if the project is to be hosted elsewhere, but if you want to host on Heroku you will need to make use of the Heroku multi-procfile buildpack.

  1. Create your apps for the frontend and plugins,

  2. Ensure you connect the GitHub repository to the frontend and plugin apps,

  3. For each of the apps you will need to add the multi-procfile buildpack and the Node.js buildpack if it is not already present,

  4. In the frontend application add a "PROCFILE" configuration key with the value indicating the location of the frontend Procfile with respect to the repository root (e.g. packages/frontend/Procfile),

  5. In the frontend application add a "REACT_APP_SETTINGS" key with the value set to the name of the settings.json file to use in the deployed environment (i.e. Heroku), e.g. this can be set to the default "deploy-settings.json",

  6. For each plugin application add a "PROCFILE" configuration key and set the value to the location of the Procfile for the plugin (with respect to the repository root), e.g. packages/pareto-front/Procfile or packages/line/Procfile.

After this has been set up the application should be deployed from the main branch of the repository.

Deployments

All of the components of the system have already been deployed as a demonstration of the service. The following table shows the deployed applications and their respective locations:

Application Deployed Location
Frontend http://opt-vis-frontend.herokuapp.com/
Backend http://opt-vis-backend.herokuapp.com/
Pareto-front (plugin) https://opt-vis-pareto-front.herokuapp.com/
Line Graph (plugin) https://opt-vis-line.herokuapp.com/


Issues

Support OneMax optimisation run

Description:

We need to show that the OneMax optimisation run can be supported on this platform. This means creating a new Python script for the run and also a new plugin to visualise 1D data as a line graph with the fitness values increasing over the generations.

We need to look at the types of data that may additionally need to be stored, e.g. fitness values, and add support for storing 1D data.

Acceptance criteria:

  • Send OneMax optimisation run data,
  • Store run information correctly and support 1D/fitness values for generations,
  • Create a line graph visualisation for the data in the frontend as a new plugin.

Run cards should all have same height

Description:

Currently, a run information card on the homepage increases in height when the run title exceeds the max-width of the card. A solution would be to add overflow handling to the run title so that when it exceeds the width we add an ellipsis and cut off the rest of the title. The full title would be displayed on mouseover.

Acceptance criteria:

  • All run cards have the same height.

Add testing frameworks

Description:

Add the appropriate testing frameworks, i.e. Jest and Cypress, for unit and usability/integration testing. Set up backend testing frameworks and overall code coverage.

Acceptance criteria:

  • Add Jest support for frontend,
  • Add Cypress support for frontend,
  • Add nyc/mocha for unit testing backend,
  • Add chai-http for integration testing support on backend,
  • Add support for testing MongoDB through unit and integration testing,
  • Add testing support for socket.io,
  • Add code coverage support (coveralls).

Conduct design study

Description:

Conduct a design study (need to explore what is required for this).

Add support for three-objective visualisation

Description:

Add support for 3D visualisations. A good way of testing this is through three-objective visualisations. This will, however, be difficult (or not possible) with the default D3.js we have bundled with the frontend.

For this, it may be possible to make use of Plotly.js (https://plotly.com/javascript/) and 3D scatter plots (https://plotly.com/javascript/3d-scatter-plots/). Since this library is originally written in JavaScript, it will need the correct typings in order to work with TypeScript; otherwise the plugin will need to use React with JavaScript instead.
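
As an illustration, a hedged sketch of plotting one generation of three-objective data with Plotly's scatter3d trace (the package name, data shape and function below are assumptions for this example):

```ts
import Plotly from "plotly.js-dist-min"; // assumed distribution package

// Plot one generation of three-objective data as a 3D scatter.
// `population` is assumed to be an array of [f1, f2, f3] objective vectors.
function plotThreeObjective(container: HTMLElement, population: number[][]): void {
  const trace = {
    type: "scatter3d" as const,
    mode: "markers" as const,
    x: population.map((p) => p[0]),
    y: population.map((p) => p[1]),
    z: population.map((p) => p[2]),
    marker: { size: 3 },
  };
  Plotly.newPlot(container, [trace], {
    scene: {
      xaxis: { title: "f1" },
      yaxis: { title: "f2" },
      zaxis: { title: "f3" },
    },
  });
}
```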

Acceptance criteria:

  • Backend supports three objective data and stores them correctly,
  • Three objective plugin to visualise the data (scatter graph) using an appropriate library,
  • Frontend and plugin work correctly, as with the other plugins.

Show tooltip for graph data

Description:

As an update, I am currently working on re-organising the content on the visualisation container to show additional run information.

I want the visualisation to also show a D3 tooltip when hovered indicating the point x,y locations, similar to this example:

Acceptance criteria:

  • Show a tooltip over each point in the visualisation with the exact X and Y values.
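
A minimal sketch of how such a tooltip might be attached with D3, assuming the points are rendered as circle elements with bound [x, y] data and a D3 v6+ event signature:

```ts
import * as d3 from "d3";

// Create a single tooltip div and show the exact x/y of the hovered point.
const tooltip = d3
  .select("body")
  .append("div")
  .style("position", "absolute")
  .style("pointer-events", "none")
  .style("background", "rgba(0, 0, 0, 0.7)")
  .style("color", "white")
  .style("padding", "4px 8px")
  .style("visibility", "hidden");

d3.selectAll<SVGCircleElement, [number, number]>("circle")
  .on("mouseover", (event: MouseEvent, d) => {
    tooltip
      .text(`x: ${d[0].toFixed(3)}, y: ${d[1].toFixed(3)}`)
      .style("left", `${event.pageX + 10}px`)
      .style("top", `${event.pageY + 10}px`)
      .style("visibility", "visible");
  })
  .on("mouseout", () => tooltip.style("visibility", "hidden"));
```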

Cache generation data in frontend

Description:

We need to be able to store generation data in the frontend state, especially if we require previous generation data for the visualisation. If we store data in the frontend, we can request just the generation data we need by giving the server a list of generations, then add the data to the state when we receive it.

This avoids re-receiving information we already have and means we won't have to make further requests to the server for the same data during replays or when seeking with the generation slider.

Acceptance criteria:

  • Store generation data in the frontend after requests are made,
  • Use the stored data next time it is needed when seeking or replays are made.
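
A minimal sketch of how this cache might look as a Redux reducer and selector (action and field names are illustrative):

```ts
// Generation data keyed by generation number, kept in the Redux store.
interface GenerationCacheState {
  [generation: number]: number[][]; // population (objective vectors) per generation
}

const initialCache: GenerationCacheState = {};

function generationCacheReducer(
  state: GenerationCacheState = initialCache,
  action: { type: string; generation?: number; data?: number[][] }
): GenerationCacheState {
  if (action.type === "RECEIVE_GENERATION" && action.generation !== undefined && action.data) {
    return { ...state, [action.generation]: action.data };
  }
  return state;
}

// Before requesting data, work out which generations still need to be fetched.
function missingGenerations(state: GenerationCacheState, wanted: number[]): number[] {
  return wanted.filter((g) => !(g in state));
}
```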

Conduct usability testing

Description:

Conduct further usability testing with a greater number of participants.

We are looking to develop a "Visualisation-as-a-Service".

Simplify plugin loading

Description:

Prevent confusion in setting up routes/registerRouteAction and simplify the process.

Store data in MongoDB with Maps

Description:

Make use of MongoDB Maps to store the generation data more easily rather than having one large array.

Acceptance criteria:

  • Stores populations under a generation object using MongoDB Maps,
  • Updates the generation number with each write to the database (without other writes to the database interfering).
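
A hedged sketch of what such a schema and an atomic per-generation write might look like with Mongoose (field names are illustrative, not the project's actual model):

```ts
import mongoose, { Schema } from "mongoose";

// Illustrative run schema: populations are stored in a Map keyed by generation number,
// instead of one large array.
const runSchema = new Schema({
  title: String,
  currentGeneration: { type: Number, default: 0 },
  generations: {
    type: Map,
    of: [[Number]], // each generation holds an array of solutions (objective vectors)
    default: {},
  },
});

export const Run = mongoose.model("Run", runSchema);

// Write one generation and bump the counter in a single atomic update,
// so concurrent writes do not interfere with each other.
export async function saveGeneration(runId: string, generation: number, population: number[][]) {
  await Run.updateOne(
    { _id: runId },
    { $set: { [`generations.${generation}`]: population }, $inc: { currentGeneration: 1 } }
  );
}
```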

Option to set replay speed

Description:

A user may want the replay to be a lot slower, for example displaying one generation per second. At the moment the replay runs as quickly as we can receive and display the data on the frontend.

Acceptance criteria:

  • Field/slider to set the speed of the visualisation, with a toggle to turn the speed limit on or off (if off, it will run as quickly as we receive the data).
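
A minimal sketch of a speed-controlled replay loop (the function names are illustrative stand-ins for the real data access and rendering code):

```ts
// Replay stored generations at a user-configurable speed.
// `getGeneration` and `render` stand in for the real data access and plugin rendering.
async function replayRun(
  totalGenerations: number,
  getGeneration: (g: number) => Promise<number[][]>,
  render: (data: number[][]) => void,
  generationsPerSecond = 1 // 0 (or the toggle being off) means "as fast as the data arrives"
): Promise<void> {
  const delayMs = generationsPerSecond > 0 ? 1000 / generationsPerSecond : 0;
  for (let g = 0; g < totalGenerations; g++) {
    render(await getGeneration(g));
    if (delayMs > 0) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```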

Add Continuous Integration

Add continuous integration using Travis CI, with support for unit testing using Jest and integration testing using Cypress.

Add state to frontend

Description:

Add state management to the frontend using Redux, and support Redux actions which deal with handling WebSockets and HTTP requests.

Acceptance criteria:

  • Add Redux to frontend
  • Configure Redux store, actions, reducers and types
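
A minimal sketch of such a store, assuming redux-thunk is used so actions can perform HTTP requests and WebSocket handling (reducer and action names are illustrative):

```ts
import { createStore, combineReducers, applyMiddleware } from "redux";
import thunk from "redux-thunk";

// Illustrative slice of state: the list of optimisation runs shown on the homepage.
interface RunsState {
  runs: { id: string; title: string; completed: boolean }[];
}

const initialRuns: RunsState = { runs: [] };

function runsReducer(
  state: RunsState = initialRuns,
  action: { type: string; payload?: RunsState["runs"] }
): RunsState {
  switch (action.type) {
    case "RECEIVE_RUNS":
      return { ...state, runs: action.payload ?? state.runs };
    default:
      return state;
  }
}

export const store = createStore(
  combineReducers({ runs: runsReducer }),
  // thunk lets action creators perform HTTP requests and WebSocket handling
  applyMiddleware(thunk)
);
```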

Plugins do an initial mount on any URL and not on their respective URLs

Description:

This may not necessarily be an issue, but it is caused by us loading the plugins initially on a page refresh or load, which makes them mount initially. This does not cause a render of the plugin, but it shows that the plugin has been mounted, and it can prevent the plugin which is supposed to be rendered on the page from rendering until all plugins are loaded.

Alert client when data has been processed

Description:

We are currently sending data to the server, but at the moment the client cannot tell when the operation is complete without manually checking the data in the database. The client needs to know when all the data has been processed.

Acceptance criteria:

  • Client is notified once all data has been processed by the server
  • The processing thread stops once processing is complete.

singleSpa errors when loading plugins

Description:

There are two errors when loading a plugin in the frontend. The first is that it loads multiple times on the page (possibly due to re-renders?) (https://single-spa.js.org/error/?code=41).

The other error is due to the fact that singleSpa.start() is not called. This is most likely because the plugin loads on all pages, but we want it to be rendered only when the user visits the data page.

Acceptance criteria:

  • Plugins are not loaded multiple times on the page,
  • The plugin is not loaded at the home or any other page except for the data page (this should stop the second error from occurring).
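
A minimal sketch of restricting a plugin to the data route with single-spa (the names and route pattern are illustrative):

```ts
import { registerApplication, start } from "single-spa";

// Register the plugin so it is only active on the data route, rather than being
// mounted on every page load.
registerApplication({
  name: "pareto-front",
  // Assumed: the built plugin exposes its single-spa lifecycles on a global.
  app: () => Promise.resolve((window as any)["pareto-front"]),
  activeWhen: (location) =>
    /^\/runs\/[^/]+\/visualisations\/[^/]+\/data/.test(location.pathname),
});

start(); // must be called once, otherwise applications are loaded but never mounted
```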

Load settings from a file

Description:

Load settings from a file; this will hold the configured plugins/visualisations.

Acceptance criteria:

  • Load the settings information from a file
  • The settings include the configured plugins/visualisations
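
A minimal sketch of loading such a file in the frontend (the file name and settings shape are illustrative):

```ts
// The file name and settings shape below are illustrative.
interface PluginSetting {
  name: string;
  src: string; // URL the plugin bundle is served from
  enable: boolean;
}

interface Settings {
  apiUrl: string;
  plugins: PluginSetting[];
}

export async function loadSettings(file = "/settings.json"): Promise<Settings> {
  const response = await fetch(file);
  if (!response.ok) {
    throw new Error(`Failed to load settings from ${file}: ${response.status}`);
  }
  return (await response.json()) as Settings;
}
```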

Add parallel coordinate plots

Description:

Support parallel coordinate plots as a plugin in the frontend. An example of drawing parallel co-ordinate plots in D3.js is shown in this example. We can make use of the current DTLZ data for the time being to draw them.

Acceptance criteria:

  • Parallel co-ordinate plots can be drawn using D3.js,
  • Visualise DTLZ data using parallel coordinate plots.
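
A hedged sketch of drawing a parallel coordinate plot for one generation of objective vectors with D3 (data shape and styling are illustrative):

```ts
import * as d3 from "d3";

// Draw one generation of objective vectors as a parallel coordinate plot.
function drawParallelCoordinates(svgEl: SVGSVGElement, data: number[][]): void {
  const width = 600;
  const height = 400;
  const dimensions = d3.range(data[0].length); // one vertical axis per objective

  const x = d3.scalePoint<number>().domain(dimensions).range([40, width - 40]);
  const y = dimensions.map((i) =>
    d3
      .scaleLinear()
      .domain(d3.extent(data, (d) => d[i]) as [number, number])
      .range([height - 30, 20])
  );

  const svg = d3.select(svgEl);
  svg.selectAll("*").remove(); // clear any previous rendering

  // One polyline per solution, crossing each axis at its objective value.
  const line = d3.line<number>().x((_, i) => x(i) ?? 0).y((v, i) => y[i](v));
  svg
    .selectAll("path.solution")
    .data(data)
    .join("path")
    .attr("class", "solution")
    .attr("fill", "none")
    .attr("stroke", "steelblue")
    .attr("opacity", 0.4)
    .attr("d", (d) => line(d));

  // Draw the axes.
  dimensions.forEach((i) => {
    svg.append("g").attr("transform", `translate(${x(i)},0)`).call(d3.axisLeft(y[i]));
  });
}
```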

Switch to GitHub Actions

Description:

It might be a good idea to move this to GitHub Actions instead of TravisCI. There is a training tutorial by GitHub here to learn how to use GitHub Actions.

Replay optimisation run

Description:

Provide the option for the user to replay the optimisation run, for either a running or a completed run. For a running optimisation run, play the optimisation up until the current generation (or until the current data received). All the data required to replay the run should be fetched.

Acceptance criteria:

  • Provide a button to replay the optimisation,
  • Replay from the start to the current data received,
  • Provide pause/replay options and fetch data accordingly.

Slide between generations

Description:

Add a Slider component (similar to the Material-UI Slider) to show the generations that have been added to the database.

Acceptance criteria:

  • Able to slide between the generations that have been stored in the database,
  • The slider updates when receiving new generation information,
  • Able to scroll back through Pareto front estimations with the plugin loaded,
  • There should be a pause and replay button alongside the slider (left of the slider),
  • Controls include textbox and button to enter generation number to view,
  • Able to enter a generation and click view to load its data,
  • Ensure that there are no side-effects from using controls e.g. hooks firing multiple times.

Visualisations are provided in real-time

Description:

All visualisations need to be provided in real-time; as a result, when optimisation runs are in progress their data will need to be visualised as it comes in. The initial prototype will plot random values on the already generated scatter plot in D3.

We will need to leverage WebSockets (through Socket.IO) in order to communicate data in real-time and to allow two-way communication between the data generation script and the server, and between the frontend and the server.

Task on planner

Acceptance criteria:

  • Data is sent from a Python script to the server
  • Data can be sent from the server to the frontend
  • Frontend visualises data in real-time and updates visualisations as time progresses
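
A minimal sketch of the frontend side of this, assuming Socket.IO is used (event names and payload shapes are illustrative, not the project's actual protocol):

```ts
import { io } from "socket.io-client";

// The frontend subscribes to generation updates from the backend over Socket.IO.
const socket = io("http://localhost:9000"); // illustrative backend URL

socket.on("connect", () => {
  socket.emit("subscribe", { runId: "example-run" }); // illustrative event name/payload
});

socket.on("generation", (payload: { generation: number; data: number[][] }) => {
  // Hand the new generation to the visualisation plugin / Redux store.
  console.log(`Generation ${payload.generation}: ${payload.data.length} solutions`);
});
```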

Pareto-front visualisation

Description:

Make use of the scatter graph to create Pareto front estimation graphs using the DTLZ1 and DTLZ2 sample data.

Acceptance criteria:

  • Shows all the generation data as it comes in
  • Plots the solutions from each generation on the graph as a Pareto front estimation

Fix Heroku deployment issues

Description:

At the moment the build is failing because the frontend package installation cannot find the frontend-common library to install.

Acceptance criteria:

  • Deployment to Heroku functions correctly
  • settings.json and deploy-settings.json are copied into the build folder for the frontend

Deployment options

Description:

Set up a continuous deployment solution. We can make use of Heroku to deploy the React frontend and backend server/API.

Acceptance criteria:

  • Build and deploy React frontend on Heroku
  • Deploy the backend to another Heroku app to serve the API
  • Allow communications between the backend and client optimisers

Load external libraries via CDNs in plugins

Description:

It seems that any libraries we use via a CDN defined in public/index.html in a plugin do not load. Instead, they have to be placed in the frontend in order to load. The libraries that a plugin uses need to be usable either as packages or loaded from a CDN in the plugin.

This is not exactly a bug, since the "index.html" in each plugin exists so the plugin can run independently of the frontend in development (this is why we have separate React CDN links in there). However, for libraries like Plotly, where the example we are using requires a CDN, we may need to use another method to fetch the library.

Create common library

Description:

Create a common library which will hold shared API calls or components.

Acceptance criteria:

  • A shared common library that can be imported for use by visualisation plugins
  • Contains functions for API calls and WebSocket data
  • Contains the creation of a single shared Redux store

End to end tests failing on TravisCI

Description:

One issue is that the settings JSON files in the packages/frontend/public folder are not being found and copied to the build folder. The main error during the end-to-end test stage of the build on TravisCI is that a timeout occurs while waiting for the frontend to start on port 3000.

Acceptance criteria:

  • Settings files are copied successfully to the build output folder (including e2e-settings.json),
  • Frontend build is successful and runs locally on port 3000.

Backend running out of memory

Description:

The backend might crash locally or on Heroku when receiving a large volume of data through WebSockets to write into the database. This appears to be a memory allocation problem: the backend runs out of memory.

Simply sending data one item at a time is no longer appropriate; there needs to be an alternative approach to sending the data.

Acceptance criteria:

  • The backend does not crash when receiving lots of data.

Support multiple optimisation runs with one client instance

Description:

At the moment the client does not exit after one optimisation run finishes; we need to be able to start a new optimisation run after one completes, call createRun again, and clear the queue after a run finishes.

Acceptance criteria:

  • Be able to do additional runs after one is complete.

Visualisations are provided as plugins

Description:

Make use of single-spa to serve visualisations as separate "plugins" to the frontend.

Acceptance criteria:

  • Created plugins are separate from the frontend
  • Plugins are loaded into the frontend
  • Plugins make use of common data available in the frontend

Save optimisation data

Description:
The data received from the Python script needs to be saved into a MongoDB database.

Acceptance criteria:

  • The data sent identifies an optimisation run (i.e. title, date/time, types of graphs to render) and is saved with a run ID
  • The optimisation data itself is stored in the database (for this we are using 2D data with only x,y co-ordinates).

Change method of sending/improve sending time

Description:

The server running out of memory was due to concurrent writes to MongoDB: the pending saves would accumulate memory until the server ran out of heap space to allocate.

We need to either change the way we send the data (currently in batches sized according to the population size) or improve the time it takes to send all the data to the server.

Acceptance criteria:

  • Improve the time it takes to send the data
  • The server does not run out of memory or crash due to writing to the database.
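
One hedged option is to buffer incoming generations and flush them to MongoDB in a single bulk operation, sketched below (the model and field names are assumptions):

```ts
import { Model } from "mongoose";

// Buffer incoming generations and flush them in one bulk operation, rather than
// issuing a separate database write for every item as it arrives.
interface PendingGeneration {
  runId: string;
  generation: number;
  population: number[][];
}

const buffer: PendingGeneration[] = [];

export function enqueueGeneration(item: PendingGeneration): void {
  buffer.push(item);
}

// Flush on a fixed interval so memory use stays bounded under heavy traffic.
export function startFlushing(Run: Model<any>, intervalMs = 1000): void {
  setInterval(async () => {
    if (buffer.length === 0) return;
    const batch = buffer.splice(0, buffer.length);
    await Run.bulkWrite(
      batch.map((b) => ({
        updateOne: {
          filter: { _id: b.runId },
          update: { $set: { [`generations.${b.generation}`]: b.population } },
        },
      }))
    );
  }, intervalMs);
}
```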

react-router links fail to load plugin

Description:

When changing the location to the plugin page (/runs/:runId/visualisations/:visualisationName/data) via a react-router link, the plugin fails to load. This is not the case when visiting the URL directly. This could be because the plugin loads directly onto the page on a full page load, but does not load when the "div" is only inserted into the DOM while navigating to the new page.

We need to work out a way to manually refresh the page or the component, or trigger a re-render of the whole plugin.

Acceptance criteria:

  • The plugin displays when navigating to it from another page in the website.

Update run information when client is disconnected

Description:

When a client is disconnected during a run, the run information (e.g. the total generations and completed status) needs to be updated; otherwise the slider offers the option to scroll to generations that have not been received.

Acceptance criteria:

  • Slider does not show generations that have not been received,
  • Total generations is updated to the actual number received when the client is disconnected,
  • Run status is changed from completed to stopped (or similar).

Plugin re-renders cause vertical scrollbar to appear intermittently

Description:

Viewing at 100% in Chrome (on a 15-inch laptop screen) causes the optimisation run page to shift left and right, because the vertical scrollbar appears every time the plugin is re-rendered on new data.

Acceptance criteria:

  • Prevent the vertical scrollbar from appearing on plugin data re-renders.

List of visualisations

Description:

The frontend needs to show a list of the visualisations available for an optimisation run. This will allow the user to select one and open the graph, from which they can then stream the real-time data to the frontend or play back a completed optimisation run.

Acceptance criteria:

  • List of graphs supported by the optimiser
  • Clicking on a graph brings up the graph with the relevant data in real-time, or a playback of a completed run.

D3 charts re-render on top of each other

The graphs re-render and appear on top of each other (visible when you see multiple x,y axis labels) when parameters are changed. Any currently rendered graphs and data need to be cleared before they are rendered again in the useEffect.
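
A minimal sketch of clearing the SVG before redrawing inside useEffect, which is one way to address this (the component and data shape are illustrative):

```tsx
import React, { useEffect, useRef } from "react";
import * as d3 from "d3";

function ScatterPlot({ data }: { data: [number, number][] }) {
  const svgRef = useRef<SVGSVGElement>(null);

  useEffect(() => {
    if (!svgRef.current) return;
    const svg = d3.select(svgRef.current);
    svg.selectAll("*").remove(); // clear the previous axes and points first
    // ... draw the axes and points for `data` here ...
  }, [data]);

  return <svg ref={svgRef} width={500} height={400} />;
}
```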

Fetch saved visualisation data

Description:

When connecting to an ongoing optimisation run, visualisation data that has already been saved can be fetched and sent to the frontend, after which the real-time data can be streamed.

Acceptance criteria:

  • Data fetched from server based on generations

List of optimisation runs

Description:

Show a list of the currently running client optimisers and the completed optimiser runs on the landing page.

Acceptance criteria:

  • API returns all the optimisation runs (active and completed)
  • There is a list of active optimisation runs
  • There is a list of completed optimisation runs.

Load plugins for frontend

Description:

Allow for a "plugin"/child application to work with the frontend (parent). SciGateway is being used as a reference and some of its single-spa related code is being reused.

Acceptance criteria:

  • The frontend can load plugin applications which have been built and served,
  • The plugin information is loaded from the plugin settings/configuration,
  • Plugins can be served on custom routes.
