
mock-spacestation

What is mock-spacestation?

mock-spacestation is a Bicep template that deploys a Mock Spacestation and Mock Groundstation to Azure to enable developers and enthusiasts to develop and test their own workloads for space with similar constraints to those seen working with the International Space Station (ISS).

The Mock Groundstation and Mock Spacestation virtual machines deployed by this template are how the Azure Space team developed and tested their experiment workload while preparing for the installation of the Hewlett Packard Enterprise (HPE) Spaceborne Computer 2 (SBC2) aboard the ISS.


For context, here's a video summary of that experiment executed in August of 2021:

Video overview of the Azure and HPE Genomics experiment on the International Space Station

What it simulates

  1. Latency

    The Mock Groundstation is located in East US and the Mock Spacestation is located in Australia to simulate the speed of light latency and many international hops that communication with the ISS traverses.

  2. Bandwidth

    The Mock Spacestation is configured out of the box to synchronize with the Mock Groundstation at the actual bandwidth cap when communicating with the ISS: 2 megabits per second.

  3. Processing at The Edge and "Bursting Down" to The Cloud

    When the Azure Space team performed their genomics experiment, they used computing power of the HPE SBC2 on-board the ISS to perform intensive work at the edge to determine what is important enough to send back to Earth, then transmitted just those important bits through the narrow 2 megabit per second pipe, then scaled up analysis and compute on a global scale with Azure.
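To get a feel for what the 2 megabit per second cap means in practice, here is a minimal sketch that converts the link rate to bytes per second and estimates the transfer time for a payload. The helper name `estimate_seconds` and the 100 MB payload are hypothetical, chosen only for illustration. Real transfers are also affected by latency, compression, and protocol overhead.

```shell
# Link rate from the text above: 2 megabits per second.
link_mbps=2
# 1 megabit = 1,000,000 bits; 8 bits per byte.
bytes_per_sec=$(( link_mbps * 1000 * 1000 / 8 ))

# estimate_seconds is a hypothetical helper: seconds (rounded up)
# needed to move a payload of the given size in bytes.
estimate_seconds() {
  echo $(( ($1 + bytes_per_sec - 1) / bytes_per_sec ))
}

echo "link budget: ${bytes_per_sec} bytes/sec"
echo "100 MB payload: about $(estimate_seconds 100000000) seconds"
```

At 250,000 bytes per second, a 100 MB result set needs roughly 400 seconds, which is why filtering at the edge before transmitting matters.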

Get started with mock-spacestation

To get started developing your workload for space:

  1. First, you'll deploy the Mock Spacestation template

  2. Then, you'll run a small script to get the SSH commands for connecting to your Mock Spacestation and Mock Groundstation, where you can watch the /trials/ directory sync between the two under the bandwidth and latency constraints configured into the deployment

You'll need the Azure CLI and the ability to invoke a BASH script to retrieve the SSH key to connect to the Mock Spacestation and Mock Groundstation. If you're on a host that doesn't have those things, or you're not quite sure, you can pretty quickly and easily use our developer environment.

Deploy the Template

You have two options for deploying mock-spacestation:

  1. A command-line deployment via the Azure CLI

  2. A user-interface deployment via the Azure Portal

via Azure CLI

If you're comfortable with the command line, the Azure CLI provides the deployment command group to deploy the Mock Spacestation and Mock Groundstation.

  1. First, ensure you're logged into the Azure CLI and have set the subscription you want to deploy into:

    az login
    az account set --subscription <subscription name or ID> 
    

    Here's a link to the documentation if you need more help logging in: https://docs.microsoft.com/en-us/cli/azure/reference-index?view=azure-cli-latest#az_login

  2. Next, set two shell variables, resourceGroupName and deploymentName, to make the following commands easier to type:

    resourceGroupName="mock-spacestation"
    deploymentName="mock-spacestation-deploy"
  3. Then, create a resource group with az group create:

    az group create \
      --location eastus \
      --name $resourceGroupName
  4. And then you can deploy the Mock Spacestation and Mock Groundstation into that resource group with az deployment group create:

    az deployment group create \
      --resource-group $resourceGroupName \
      --name $deploymentName \
      --template-file ./mockSpacestation.json

    Note: The Azure Portal deployment lets you override the default Groundstation and Spacestation locations in the UI. With the CLI, pass --parameters arguments to override the groundstationLocation and/or spacestationLocation parameters. For example:

    groundstationLocation="usgovvirginia"
    spacestationLocation="usgovarizona"
    
    az deployment group create \
      --resource-group $resourceGroupName \
      --name $deploymentName \
      --parameters groundstationLocation=$groundstationLocation \
      --parameters spacestationLocation=$spacestationLocation \
      --template-file ./mockSpacestation.json
  5. Once that's complete, move on to Connect to the VMs
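If you switch between the default and overridden locations often, the two command variants above can be folded into one helper. The following is a minimal sketch, not part of the repository: `print_deploy_cmd` is a hypothetical function that only prints the command, and it adds each `--parameters` argument only when the matching variable is set. It uses the same flags shown in the steps above and assumes none of the values contain spaces.

```shell
# print_deploy_cmd is a hypothetical helper: it prints the deployment
# command instead of running it, so you can review it first.
print_deploy_cmd() {
  set -- az deployment group create \
    --resource-group "${resourceGroupName:-mock-spacestation}" \
    --name "${deploymentName:-mock-spacestation-deploy}" \
    --template-file ./mockSpacestation.json
  # Append a location override only when the variable is set and non-empty.
  if [ -n "${groundstationLocation:-}" ]; then
    set -- "$@" --parameters "groundstationLocation=${groundstationLocation}"
  fi
  if [ -n "${spacestationLocation:-}" ]; then
    set -- "$@" --parameters "spacestationLocation=${spacestationLocation}"
  fi
  echo "$*"
}

print_deploy_cmd
```

Review the printed command, then paste it into your terminal to run the deployment.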

via Azure Portal

We can deploy the Mock Spacestation and Mock Groundstation to Azure from the portal with just a few clicks.

When you deploy with the "Deploy to Azure" button below, create yourself a new resource group:

Deploying the mock-spacestation template from the Azure Portal

Make note of the name of the Resource Group you create and the name of the Deployment that gets generated for you. You'll need those to get your SSH credentials.

(The generated name is usually something similar to "Microsoft.Template-${timestamp}" like "Microsoft.Template-20210820123456")

The Deployment UI in the Azure Portal showing the Deployment Name

  1. Deploy mock-spacestation into a new resource group:

    Deploy to Azure

  2. Once that's complete, move on to Connect to the VMs

Connect to the VMs

After you've deployed the Mock Spacestation template, use ./getConnections.sh to get connected to the Mock Groundstation and Mock Spacestation.

  1. Invoke getConnections.sh and pass it the name of your resource group and the name of the deployment:

    ./getConnections.sh $resourceGroupName $deploymentName
  2. getConnections.sh will place the private key on your machine and present you with your SSH commands to the Groundstation and Spacestation:

    INFO: Success! Private key written to ./mockSpacestationPrivateKey. Run these commands to SSH into your machines...
    ssh -i mockSpacestationPrivateKey azureuser@mockgroundstation-abcd1234efgh5.eastus.cloudapp.azure.com
    ssh -i mockSpacestationPrivateKey azureuser@mockspacestation-abcd1234efgh5.australiaeast.cloudapp.azure.com
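If the script refuses to run or ssh rejects the key, file permissions are a common cause. The sketch below shows two precautionary checks in a scratch directory, using stand-in files. `ensure_executable` and `restrict_key` are hypothetical helper names, and getConnections.sh may already handle the key permissions itself.

```shell
# Demo in a scratch directory; in the repository you would apply the same
# two checks to ./getConnections.sh and ./mockSpacestationPrivateKey.
workdir=$(mktemp -d)
printf '#!/bin/sh\necho connected\n' > "$workdir/getConnections.sh"
printf 'not-a-real-key\n' > "$workdir/mockSpacestationPrivateKey"

# Add the execute bit only if it is missing.
ensure_executable() {
  [ -x "$1" ] || chmod +x "$1"
}

# ssh ignores private keys that other users can read,
# so restrict the key to its owner.
restrict_key() {
  chmod 600 "$1"
}

ensure_executable "$workdir/getConnections.sh"
restrict_key "$workdir/mockSpacestationPrivateKey"
ls -l "$workdir"
```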
    

Synchronize Directories

Once you're connected to the Spacestation, any files or directories placed in /home/azureuser/trials are synced to the same directory on the Groundstation once every minute, with transfers capped at 2 megabits per second.

This scheduled synchronization recreates the time delay and limited bandwidth environment of a real-world experiment executed on the ISS.
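For readers curious what a once-a-minute, bandwidth-capped sync can look like, here is a minimal sketch that prints a crontab entry. It is an illustration under stated assumptions, not the repository's actual configuration: `print_sync_cron_line` is a hypothetical helper, the host name is the placeholder used elsewhere in this README, and the rsync flags (`-avz`, `--bwlimit`, `-e`) are a plausible choice rather than the exact ones the deployment uses.

```shell
# Assumptions: host, key path, and flags are illustrative only;
# the repository's own scripts may differ.
sync_host="mockgroundstation-abcd1234efgh5.eastus.cloudapp.azure.com"
sync_dir="/home/azureuser/trials"
key_file="/home/azureuser/.ssh/mockSpacestationPrivateKey"
bwlimit=250   # rsync rate limit, roughly 2 megabits per second

# print_sync_cron_line is a hypothetical helper that prints a crontab entry
# running rsync every minute and appending its output to a log file.
print_sync_cron_line() {
  echo "* * * * * rsync -avz --bwlimit=${bwlimit} -e 'ssh -i ${key_file}' ${sync_dir}/ azureuser@${sync_host}:${sync_dir} >> /home/azureuser/azure-sync.log 2>&1"
}

print_sync_cron_line
```

The five asterisks are the cron schedule for "every minute", and `--bwlimit` is what keeps each transfer inside the simulated downlink budget.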

The Mock Groundstation and Mock Spacestation /trials directories in sync

  1. Place a file or directory in /home/azureuser/trials:

    # on the Mock Spacestation
    echo "Hello! It is currently $(date) on the mockSpacestation! Happy Hacking!" >> /home/azureuser/trials/hello.txt
  2. And within a minute or so, on the Mock Groundstation, you should see that file in the same directory:

    # now on the Mock Groundstation
    cd /home/azureuser/trials
    cat hello.txt
    Hello! It is currently Fri Aug 20 21:10:10 UTC 2021 on the mockSpacestation! Happy Hacking!
  3. On the Mock Spacestation, you can inspect the contents of azure-sync.log to see file and directory transmission history and transfer speeds:

    # back on the Mock Spacestation
    cat /home/azureuser/azure-sync.log
  4. Which yields output from rsync operations like:

    sent 177 bytes  received 66 bytes  44.18 bytes/sec
    total size is 92  speedup is 0.38
    opening connection using: ssh -i /home/azureuser/.ssh/mockSpacestationPrivateKey -l azureuser mockgroundstation-abcd1234efgh5.eastus.cloudapp.azure.com rsync --server -vvlogDtprze.iLsfxC --bwlimit=250 . /home/azureuser/trials  (12 args)
    sending incremental file list
    delta-transmission enabled
    hello.txt is uptodate
    total: matches=0  hash_hits=0  false_alarms=0 data=0
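As the log grows, a short summary can be easier to read than the raw rsync output. The sketch below counts the transfer summary lines and totals the bytes sent and received. `summarize_sync_log` is a hypothetical helper, and the sample file simply reuses lines from the excerpt above. On the Mock Spacestation you would point it at /home/azureuser/azure-sync.log instead.

```shell
# Sample log built from lines in the excerpt above.
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
sent 177 bytes  received 66 bytes  44.18 bytes/sec
total size is 92  speedup is 0.38
sending incremental file list
delta-transmission enabled
hello.txt is uptodate
EOF

# summarize_sync_log is a hypothetical helper: it totals the "sent" and
# "received" byte counts across all rsync summary lines in a log file.
summarize_sync_log() {
  awk '/bytes\/sec/ {
         gsub(/,/, "")            # rsync may print thousands separators
         n++; sent += $2; recv += $5
       }
       END { printf "transfers=%d sent=%d received=%d\n", n, sent, recv }' "$1"
}

summarize_sync_log "$log_file"
```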
    

Happy hacking! Read on for more information about how we built the Genomics experiment on Azure using the HPE SBC2 and the ISS, or how we set up our developer machines with containers to collaborate.

An Example "Burst Down" Workload

The Azure Space team's genomics experiment is an example of a solution you could build with these mock-spacestation components:

The Azure Space and HPE Spaceborne Computer 2 Genomics Experiment Architecture

More technical information on the experiment can be found at this blog post: https://azure.microsoft.com/en-us/blog/genomics-testing-on-the-iss-with-hpe-spaceborne-computer2-and-azure/

On the Spacestation

  • A Linux container hosts a Python workload, which is packaged with data representing mutated DNA fragments and wild-type (meaning normal or non-mutated) human DNA segments. There are 80 lines of Python code, with a 30-line bash script to execute the experiment.

  • The Python workload generates a configurable amount of DNA sequences (mimicking gene sequencer reads, about 70 nucleotides long) from the mutated DNA fragment.

  • The Python workload uses awk and grep to compare generated reads against the wild-type human genome segments.

  • If a perfect match cannot be found for a read, it’s assumed to be a potential mutation and is compressed into an output folder on the Spaceborne Computer-2 network-attached storage device. After the Python workload completes, the compressed output folder is sent to the HPE ground station on Earth via rsync.
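The matching step above can be illustrated with a toy example. This is a minimal sketch of the idea only, not the team's actual workload: the sequences are made up and far shorter than real reads, and `find_unmatched_reads` is a hypothetical helper. It uses grep to look for an exact match of each read in the reference and keeps only the reads that do not match.

```shell
# Toy data; real reads are about 70 nucleotides long.
workdir=$(mktemp -d)
printf 'ACGTACGTTAGCCGATAGCTAGGCTA\n' > "$workdir/wildtype.txt"
printf 'ACGTTAGC\nCGATAGCT\nACGTTTGC\n' > "$workdir/reads.txt"

# find_unmatched_reads is a hypothetical helper.
# $1 = reference file, $2 = reads file (one read per line).
# Prints every read that has no exact match in the reference.
find_unmatched_reads() {
  while IFS= read -r seq; do
    [ -n "$seq" ] || continue
    grep -qF -- "$seq" "$1" || printf '%s\n' "$seq"
  done < "$2"
}

# Keep only the potential mutations, then compress them for the downlink.
find_unmatched_reads "$workdir/wildtype.txt" "$workdir/reads.txt" > "$workdir/candidates.txt"
gzip -f "$workdir/candidates.txt"
ls "$workdir"
```

Only the third read fails to match, so only that one read ends up in the compressed output. Filtering this way is what keeps the payload small enough for the 2 megabit per second link.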

On Earth

  • The HPE ground station uploads the data it receives to Azure, writing it to Azure Blob Storage through azcopy.

  • An event-driven, serverless function written in Python and hosted in Azure Functions monitors Blob Storage, retrieving newly received data and sending it to the Microsoft Genomics service via its REST API.

  • The Microsoft Genomics service, hosted on Azure, invokes a gene sequencing pipeline to “align” each read and determine where, how well, and how unambiguously it matches the full reference human genome. (The Microsoft Genomics service is a cloud implementation of the open-source Burrows-Wheeler Aligner and Genome Analysis Toolkit, which Microsoft tuned for the cloud.)

  • Aligned reads are written back to Blob Storage in Variant Call Format (VCF), a standard for describing variations from a reference genome.

  • A second serverless function hosted in Azure Functions retrieves the VCF records and uses the determined location of each mutation to query the dbSNP database hosted by the National Institutes of Health, as needed to determine the clinical significance of the mutation, then writes that information to a JSON file in Blob Storage.

  • Power BI retrieves the data containing clinical significance of the mutated genes from Blob Storage and displays it in an easily explorable format.

Using our Development Environment

Whether you're on Windows or Linux or otherwise, it's pretty handy to use a container described in the repository as your development environment.

Our environment comes with all the tools we used to author this repo so it's where we can best ensure compatibility (plus, we just think it's pretty cool to have our developer machines ready to go with all the tools we need in seconds).

It's really easy to get started with GitHub Codespaces and/or Visual Studio Code.

GitHub Codespaces

If you or your organization uses Codespaces, it's remarkably easy to use our development environment. Just click the green Code icon on the main page of this repository and select New Codespace:

Launching the dev environment with GitHub Codespaces

What are Codespaces? Get more information here: https://docs.github.com/en/codespaces

Visual Studio Code Remote - Containers

It's also easy to use our development environment with Visual Studio Code and the Remote - Containers extension:

Launching the dev environment with Visual Studio Code

What is the Visual Studio Code Remote - Containers extension? Get installation steps and more information here: https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers

Manually connecting to the Spacestation and Groundstation

You can also get configured to SSH into the Spacestation and Groundstation manually: docs/manually-get-ssh-key.md

mock-spacestation's People

Contributors

bigtallcampbell, dave-read, glennmusa, microsoft-github-operations[bot], microsoftopensource, pgc1a


mock-spacestation's Issues

Permission denied experienced when following the docs

Description
Permissions issue when following docs to create the sync file

Steps to Reproduce
2) Creating the hello.txt led to permission denied. It was already there and syncing properly as far as I can tell:

    azureuser@mockSpacestation:~$ echo "Hello! It is currently $(date) on the mockSpacestation! Happy Hacking!" >> /home/azureuser/trials/hello.txt
    -bash: /home/azureuser/trials/hello.txt: Permission denied

Expected behavior
Follow docs without encountering errors

Actual behavior
N/A

Additional context
Let's add a recursive flag to the chown (so it'd be chown -R azureuser /home/azureuser/trials) and that should take care of both the permission denied and allow the user to overwrite the value if they follow the docs verbatim. Let's keep something in there by default so that we know the crontab registered. Make that change here: https://github.com/Azure/mock-spacestation/blob/main/scripts/configureSource.sh#L17

Update the source and destination directory naming convention to "toX" and "fromX"

Is your feature request related to a problem? Please describe.

Users report getting confused about synching directories like /groundstation and /spacestation and not knowing which directory they're accessing on their respective groundstation or spacestation.

Describe the solution you'd like

Instead, they'd like to have:

  • /toSpacestation on the groundstation synchronize with /fromGroundstation on the spacestation
  • and /toGroundstation on the spacestation synchronize with /fromSpacestation on the groundstation

Describe alternatives you've considered

@bigtallcampbell has a version of the rsync command working in this fashion here: https://github.com/bigtallcampbell/mock-spacestation/blob/main/.devcontainer/setupScripts/deployGroundStation.sh#L233-L243

Additional context
n/a

Fix shellcheck errors and warnings

Description

The scripts at /library-scripts are functional but do throw shellcheck warnings and errors.

We should fix these as far as is reasonable.

Steps to Reproduce
Steps to reproduce the behavior:

  1. Open the repository with the devcontainer
  2. The .sh files should light up yellow with warnings

Expected behavior

A clean repository that follows best practices as much as possible.

Actual behavior

The scripts at /library-scripts are functional but do throw shellcheck warnings and errors.

Add documentation to login for Azure CLI

Is your feature request related to a problem? Please describe.
New users may need help logging in with Azure CLI

Describe the solution you'd like
Update the Readme.md with instructions on login

Describe alternatives you've considered
N/A

Additional context
N/A

Always getting Broken pipe (32) on rsync

Description

Every cron'd rsync invocation results in this message:

rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1668) [Receiver=3.1.2]
rsync: [Receiver] write error: Broken pipe (32)

Files will synchronize successfully, but users could definitely be led to thinking something is wrong.

Steps to Reproduce
Steps to reproduce the behavior:

  1. put a file in /groundstation
  2. wait a minute
  3. check the contents of logs/spacestation-sync.log

Expected behavior

groundstation to spacestation synchronization without errors

Actual behavior

the error always occurs

Additional context

n/a

Run the groundstation and spacestation as local containers without VS Code

Is your feature request related to a problem? Please describe.

For users familiar with Visual Studio Code and Remote Dev Containers, the local environment works well. If I don't have all the tools I need to do that (WSL2, Docker, VSC), however, I'd like to still run experiments.

Describe the solution you'd like

A single BASH script I can execute that:

  • provisions the groundstation and spacestation as Docker containers
  • peers them on a virtual network
  • configures and starts rsync
  • drops me into the groundstation
  • and provides me with a SSH connection to the spacestation

Describe alternatives you've considered

@bigtallcampbell has a version of this working here https://github.com/bigtallcampbell/mock-spacestation/blob/main/.devcontainer/setupScripts/deployGroundStation.sh

Additional context

n/a

[Doc Improvement] Include properties for groundstationLocation and spacestationLocation in CLI setup instructions

Is your feature request related to a problem? Please describe.

I needed to deploy the sample to clouds/regions other than the defaults. I could see in the portal screenshots that those values are parameters, but I didn't see them in the CLI instructions/example. It was not difficult to find the parameters in the Bicep template, but I think it would not be uncommon to need to deploy to other clouds/regions, so listing those in the setup steps for the CLI would be helpful.

Describe the solution you'd like

Include the two location properties in the CLI example for cases where the default locations are not available/appropriate, or just note the names.

export groundstationLocation=usgovvirginia
export spacestationLocation=usgovarizona

az group create \
  --location $groundstationLocation \
  --name $resourceGroupName

az deployment group create \
--resource-group $resourceGroupName \
--name $deploymentName \
--parameters groundstationLocation=$groundstationLocation \
--parameters spacestationLocation=$spacestationLocation \
--template-file ./mockSpacestation.bicep

Describe alternatives you've considered
I created a local deployment script as noted above.

Additional context

The existing docs/examples are a very good start. You might want to add a specific issue category for doc related issues/suggestions.

If you agree this is a reasonable change I'm happy to create a PR for suggested change.

Deployment error when directly deploying the bicep file (mockSpacestation.bicep)

Description

My understanding is that with current versions of the CLI you can directly deploy bicep templates. So I expected I could use the CLI to directly deploy mockSpacestation.bicep

Steps to Reproduce
Steps to reproduce the behavior:

  1. Run CLI with:
az deployment group create \
--resource-group $resourceGroupName \
--name $deploymentName \
--parameters groundstationLocation=$groundstationLocation \
--parameters spacestationLocation=$spacestationLocation \
--template-file ./mockSpacestation.bicep

Expected behavior
Same resources would be deployed as when referencing the ARM template:

az deployment group create \
--resource-group $resourceGroupName \
--name $deploymentName \
--parameters groundstationLocation=$groundstationLocation \
--parameters spacestationLocation=$spacestationLocation \
--template-file ./mockSpacestation.json

Actual behavior
I see errors that seem to be related to parsing the script files for delivery to the VM agent.

{"status":"Failed","error":{"code":"DeploymentFailed","message":"At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/DeployOperations for usage details.","details":[{"code":"Conflict","message":"{\r\n  \"status\": \"Failed\",\r\n  \"error\": {\r\n    \"code\": \"ResourceDeploymentFailure\",\r\n    \"message\": \"The resource operation completed with terminal provisioning state 'Failed'.\",\r\n    \"details\": [\r\n      {\r\n        \"code\": \"DeploymentFailed\",\r\n        \"message\": \"At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/DeployOperations for usage details.\",\r\n        \"details\": [\r\n          {\r\n            \"code\": \"Conflict\",\r\n            \"message\": \"{\\r\\n  \\\"status\\\": \\\"Failed\\\",\\r\\n  \\\"error\\\": {\\r\\n    \\\"code\\\": \\\"ResourceDeploymentFailure\\\",\\r\\n    \\\"message\\\": \\\"The resource operation completed with terminal provisioning state 'Failed'.\\\",\\r\\n    \\\"details\\\": [\\r\\n      {\\r\\n        \\\"code\\\": \\\"VMExtensionProvisioningError\\\",\\r\\n        \\\"message\\\": \\\"VM has reported a failure when processing extension 'configureDestination'. Error message: \\\\\\\"Enable failed: failed to execute command: command terminated with exit status=2\\\\n[stdout]\\\\n\\\\n[stderr]\\\\ncript/download/0/script.sh: line 19: z9YXmCiWDghUlhb08pwNak0sOi2KZ8DbZ/T09hBb3LSH4C30XVPasCxawM9cFIVP7dvkRN: No such file or directory\\\\n/var/lib/waagent/custom-script/download/0/script.sh: 

Then, at the end, it reports an unexpected EOF, so maybe there's something about line endings?

line 44: unexpected EOF while looking for matching `''\\\\n/var/lib/waagent/custom-script/download/0/script.sh: line 91: syntax error: unexpected end of file\\\\n\\\\\\\"\\\\r\\\\n\\\\r\\\\nMore information on troubleshooting is available at https://aka.ms/VMExtensionCSELinuxTroubleshoot \\\"\\r\\n      }\\r\\n    ]\\r\\n  }\\r\\n}\"\r\n 

Additional context
Deploying the ARM template worked fine

Enable 2-way groundstation <=> spacestation sync

Is your feature request related to a problem? Please describe.
Right now, only the spacestation syncs down to the groundstation. It'd be nice if I could sync in both directions.

Describe the solution you'd like
Add another rsync cron job that syncs groundstation to spacestation and break up "trials" into "to-spacestation" and "from-spacestation" and "to-groundstation" and "from-groundstation"

PowerShell support for getConnections

Is your feature request related to a problem? Please describe.

It'd be nice if there were a PowerShell complement to the shell script getConnections.sh so that I can get the private key and SSH connection commands from a Windows machine without any additional configuration.

Describe the solution you'd like

getConnections.ps1 that takes the same arguments and performs the same tasks as getConnections.sh and documentation in the root README that I can one-click copy/paste into a terminal to execute it.

Describe alternatives you've considered

Document in the repository how to use Git BASH from a Windows machine until this is ready

Additional context
n/a

Deployment to Private Network

The mock-spacestation deployment script creates two VM instances with Azure public IPs. We are currently using a subscription behind an ExpressRoute and would like to use customer-owned networking in the deployment.

Can the script be updated to include deployment using a custom private IP address from existing vnet?

keep ARM json up-to-date by building Bicep on successful merges to main

Is your feature request related to a problem? Please describe.

We have to remember to manually build the ./mockSpacestation.bicep file whenever we change the implementation so that the compiled ./mockSpacestation.json file is available for Portal deployment and for users without Bicep.

That's silly. We should automate that.

Describe the solution you'd like

Whenever something is approved to merge to main, we should build ./mockSpacestation.bicep so that ./mockSpacestation.json always reflects what's in the repo.

Describe alternatives you've considered

Remembering to manually build and inspect ./mockSpacestation.json for validity before pushing up a PR.

Additional context

n/a

Remove the SSH Key generation deployment script once mock-spacestation deployment no longer needs it

Is your feature request related to a problem? Please describe.
Today, after I deploy mock-spacestation, I have to execute getConnections.sh to remove the SSH keygen deployment script and its results from the resource group.

Describe the solution you'd like
It'd be swell if the deployment were smart enough to remove this script and its results after the KeyVault has been seeded and the Virtual Machines have been created with the key values written into them.

Describe alternatives you've considered
We could use a Managed Identity with the appropriate Resource Group scope role assignment to execute a Deployment Script that deletes the SSH keygen script resource.

Update the README with verbose steps

Is your feature request related to a problem? Please describe.

An experienced Visual Studio Code or Azure user can deploy mock-spacestation, but it's more complicated than following the README verbatim.

We should expect developers of all skill levels to open the repository in a browser, follow some instructions, copy and paste some commands, and see the spacestation and groundstation synchronizing.

Describe the solution you'd like

Updated docs with more clear examples

Describe alternatives you've considered

n/a

Additional context

n/a

Execution perms for getConnections.sh

Trying this out in Codespaces, I can't execute getConnections.sh immediately because the file doesn't have exec permissions, and sudo doesn't exist in Codespaces. We should fix this either in permissions (if possible), or document a chmod step for Codespaces.


Not a blocker for go-live

Deploy the groundstation into an existing Azure Virtual Network

Is your feature request related to a problem? Please describe.

Today, the Bicep template deploys a new virtual network, subnet, and network security group.

Describe the solution you'd like

If I have an existing network, I'd like to reuse it by providing it as parameters to the template.

Describe alternatives you've considered

There's a version of this working at glenn/useExistingNetwork by use of a useExistingNetwork parameter: https://github.com/Azure/mock-spacestation/blob/glenn/useExistingNetwork/AzureVM.bicep

Additional context

n/a

configureDestination.sh Custom Script Extension fails due to invalid string replacement

Description
When deploying the mockGroundstation, the custom script extension will fail.

Steps to Reproduce
Steps to reproduce the behavior:

  1. Deploy via the CLI
  2. The mockGroundstation VM deployment will fail

Expected behavior
mockGroundstation VM to be provisioned

Actual behavior
An error occurs because we're overzealous in our docs on configureDestination.sh:

/var/lib/waagent/custom-script/download/0/script.sh: line 43: unexpected EOF while looking for matching `''\n/var/lib/waagent/custom-script/download/0/script.sh: line 89: syntax error: unexpected end of file

Additional context
This can be fixed by removing the single quotes from scripts/configureDestination.sh L6 and scripts/configureSource.sh L11-13
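One way to catch this class of error before deploying is a syntax-only check of each script. The sketch below is an illustration, not part of the repository: it writes two throwaway scripts, one with an unmatched single quote like the failure above, and checks both with the shell's `-n` option, which parses a script without executing it. `syntax_ok` is a hypothetical helper name. Note that this only validates the script file as written, so a quote introduced later by template string replacement would need the same check on the rendered script.

```shell
# Two throwaway scripts: one with an unmatched single quote, one valid.
workdir=$(mktemp -d)
printf "echo 'unterminated\n" > "$workdir/broken.sh"
printf "echo 'ok'\n" > "$workdir/good.sh"

# syntax_ok is a hypothetical helper: parse the script without running it.
syntax_ok() {
  sh -n "$1" 2>/dev/null
}

for f in "$workdir"/*.sh; do
  if syntax_ok "$f"; then
    echo "OK   $f"
  else
    echo "FAIL $f"
  fi
done
```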
