bug-free-fortnight's Introduction

Running tests for GPDB

This repository is a simple on-ramp to help contributors run tests (installcheck) for GPDB

Prerequisites

  1. Hack on your code and commit your changes locally

  2. Check out all your code repositories in the same directory locally (e.g. ~/workspace). Specifically, the following repositories should be checked out locally (how else would you hack on them?)

Just tell me how

  1. ~/workspace/bug-free-fortnight/streamline-master/uber.bash

FAQ

  1. Where's my container?

    We label the images and hence the containers. Try filtering like this:

    docker ps --filter label=io.github.d.uber-script
    
  2. It's too noisy!

    We've fixed that by turning off most of the diagnostic output from Bash

  3. It's too quiet!

    Set the DEBUG environment variable to reinstate debug output, e.g. env DEBUG=1 streamline-master/uber.bash

  4. How do I set a GUC when running installcheck?

    Run with the --interactive flag first, e.g.

    streamline-43/uber.bash --interactive
    

    It will stop after starting the cluster, and you can follow the prompt to set any GUC before running make installcheck

  5. ICG failed, but uber script deletes the container! How do I look at the diff against expected output?

    regression.diffs is always copied out when tests fail; try to get the most out of that.

  6. uber script deletes the container when my tests fail! How do I attach to shit and debug?

    If you need to debug after the tests fail (so you know which regress test to re-run), run with the --interactive-after-icg flag

    streamline-43/uber.bash --interactive-after-icg
    

    It will run ICG, then stop in an interactive Bash prompt.

  7. Shit's SLOW

    If you are using Docker for Mac, don't.

  8. Shit don't work

    Please turn on debug output and attach the debug output when you ask for help

License

See the LICENSE file for license rights and your freedom (GPL v3)

bug-free-fortnight's People

Contributors

cramja, d

bug-free-fortnight's Issues

A Contributor should be able to run `installcheck` interactively

User Story

As a contributor to Greenplum Database
I'd like to have ad-hoc interaction with my compiled database
So that I can have confidence in my changes
And I can stop making Jesse sad by forking pipelines

Scenario

Given I have my changes to zero or more of:

  1. GPOS
  2. ORCA
  3. GPDB

When I run streamline-master/uber.bash,
Then it will build the database and set up the cluster
And it will prompt me how to get an interactive shell into the container, where I can run psql or make installcheck-good

Background

Currently uber.bash does all of these 4 things:

  1. it builds the universe
  2. it initializes a cluster
  3. it starts the cluster
  4. it runs installcheck with ORCA

See if we can split that last guy off

label the containers

This comes up whenever I think about #37. The nice thing about our current approach seems to be that containers have unique names. The problem with that approach is that it becomes harder to refer to the container "from the outside", e.g. when you need to docker exec or when you want to start a privileged container sharing some namespace (pid, ipc, or network) with the Greenplum container.

This does not entirely supplant #37 but should enable a couple use cases that #37 was meant to address.
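
With a label in place, referring to a container "from the outside" could look roughly like the sketch below. The label io.github.d.uber-script is the one the README's FAQ filters on; the helper names uber_containers and uber_shell are made up for illustration, and the sketch simply takes the first ID that docker ps prints.

```shell
# Label taken from the README's FAQ; everything else here is illustrative.
UBER_LABEL=io.github.d.uber-script

# Print the IDs of running containers that carry the label.
uber_containers() {
    docker ps --quiet --filter "label=${UBER_LABEL}"
}

# Open an interactive Bash shell in the first labeled container listed.
# Runs in a subshell so `id` does not leak into the caller.
uber_shell() (
    id=$(uber_containers | head -n 1)
    if [ -z "$id" ]; then
        echo "no running container with label ${UBER_LABEL}" >&2
        exit 1
    fi
    docker exec -ti "$id" /bin/bash
)
```

Calling uber_shell with no arguments would then drop you into the container without having to look up its unique name first.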

name the container?

This might enable two use cases:

  1. Getting another shell in the container by docker exec -ti <name> /bin/bash
  2. Running psql directly against the database with something like
    docker run --rm -ti --link <name>:gpdb postgres psql -h gpdb -p 15432 -U gpadmin -d postgres
    or
    docker run --rm -ti --net container:<name> postgres psql -h localhost -p 15432 -U gpadmin -d postgres

Write tests for the images

Given the declarative nature of Dockerfiles, it may be tempting to just not test them. However, in times of big changes, I still sorely miss having some tests (with clear intents of course).

`make installcheck` doesn't work due to python spookiness

env PGOPTIONS='-c optimizer=on' make installcheck-good
make -C src/test installcheck-good
make[1]: Entering directory `/build/gpdb/src/test'
make -C regress installcheck-good
make[2]: Entering directory `/build/gpdb/src/test/regress'
make -C ../../../src/port all
make[3]: Entering directory `/build/gpdb/src/port'
make[3]: Nothing to be done for `all'.
make[3]: Leaving directory `/build/gpdb/src/port'
make -C ../../../contrib/spi refint.so autoinc.so
make[3]: Entering directory `/build/gpdb/contrib/spi'
make[3]: `refint.so' is up to date.
make[3]: `autoinc.so' is up to date.
make[3]: Leaving directory `/build/gpdb/contrib/spi'
rm -rf ./testtablespace
mkdir ./testtablespace
./checkinc.py
'import site' failed; use -v for traceback
Traceback (most recent call last):
  File "./checkinc.py", line 3, in <module>
    import sys, os, re, subprocess
ImportError: No module named os
make[2]: *** [includecheck] Error 1
make[2]: Leaving directory `/build/gpdb/src/test/regress'
make[1]: *** [installcheck-good] Error 2
make[1]: Leaving directory `/build/gpdb/src/test'
make: *** [installcheck-good] Error 2

@oarap and @hsyuan found this out

I feel anxious when text is not scrolling on my screen

My buddy @xinzweb, who's an existential thinker, feels the enormous emptiness of life when the deep dark terminal screen is not animating. All of a sudden, every second feels like 10 years. He turned on env DEBUG=1, but there are still a few places that are not noisy enough:

  1. docker building an image
  2. compiling ORCA
  3. any other place that's not producing pointless output

Let's make him happy

Silent make output

It didn't feel like the output of make commands provided much value. Consider silencing it with make --silent, at least when DEBUG is unset?
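
One way this could be wired up is sketched below. The function names make_verbosity_flags and quiet_make are hypothetical and not part of uber.bash; the only things taken from this repository are the DEBUG variable and the idea of passing --silent.

```shell
# Hypothetical helper: print the extra make flag implied by DEBUG.
make_verbosity_flags() {
    if [ -n "${DEBUG:-}" ]; then
        # DEBUG is set: leave make's normal command echoing alone.
        :
    else
        printf '%s\n' "--silent"
    fi
}

# Hypothetical wrapper that forwards its arguments to make.
quiet_make() {
    # Left unquoted on purpose so that an empty result adds no argument.
    # shellcheck disable=SC2046
    make $(make_verbosity_flags) "$@"
}
```

With this in place, quiet_make -C /build/gpdb install would be silent by default and verbose again under env DEBUG=1.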

Unit test compilation doesn't parallelize well

On a fast (like 2016) system, running unit tests with make -s -j8 -C /build/gpdb/src/backend unittest-check has a high chance of failing compilation because the test directories stampede on each other when trying to build the mocks.
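
Until the race between test directories is fixed properly, a stopgap could be to retry serially when the parallel run fails. This is only a workaround sketch, not a fix: the directory and target are copied from the command above, while run_unittests and the retry policy are assumptions.

```shell
# Directory and target come from the issue text; the retry policy is an assumption.
UNITTEST_DIR=/build/gpdb/src/backend

# Try the fast parallel build first, then fall back to a serial run.
run_unittests() {
    if make -s -j8 -C "$UNITTEST_DIR" unittest-check; then
        return 0
    fi
    echo "parallel unittest-check failed; retrying serially" >&2
    make -s -j1 -C "$UNITTEST_DIR" unittest-check
}
```

A genuine test or compile failure still fails the serial run, so the fallback only costs time; it does not hide real breakage.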

Debug build support?

User story

As a Greenplum database core maintainer,
I want a consistent way to verify pull requests,
which gives me the same level of confidence as a forked CI pipeline would have given me

[Request for Comments] Prefer Git-Clone over Direct Source Sharing

Hello the ORCA and Greenplum community,
This is my first try at doing an RFC.

Background

I don't know about you but besides Linux, I also use a Mac, and boy it is a slow one (2012 MacBook Air, 2 physical cores with 4 "hyperthreading" cores). The uber script currently works in the following way:

  1. The uber script mounts the orca source code at /workspace/gporca as a read-only volume. Similarly, other repositories needed are available readonly under /workspace.

  2. For orca, gp-xerces, and (optionally) gpos, it uses out-of-tree build, generating a build tree outside the source tree, but referring to source tree from /workspace/gporca. The container path /workspace/gporca is a read-only mount from our Macs.

  3. For the Postgres-based server of Greenplum, it clones the Greenplum database source code from /workspace/gpdb to inside the container at /build/gpdb, and generates the build tree in-tree at /build/gpdb.

The original intent was to use out-of-tree build as much as possible because we envisioned a workflow where a developer could make a 3-line change, and build the entire product in a few seconds. And the clone-followed-by-in-tree build method for Greenplum code base was purely incidental, mostly motivated by the fact that until Postgres 9.0, the autoconf build system in Postgres wasn't robust enough to support out-of-tree build.

The only pain here is that when we are not working on a Linux workstation, the file system used for /workspace/orca and friends is one of the slowest (FUSE-based osxfs, vboxsf for VirtualBox, or hgfs for VMware). Empirically, the uber script builds far faster on an underpowered 2-core Linux host than on an 8-core Mac host:

Host OS          Time
Linux            57s
Darwin (macOS)   22m10s

Proposal

I propose we switch to cloning ORCA into the container before building, just like how we build the Greenplum server. This is not strictly an improvement, but a trade-off:

Pros

Vastly faster builds on Macs, and really in any situation that uses a desktop-grade VM to host Docker. Here's a new piece of anecdotal evidence:

Method                  Orca build time   Warm cache?
Don't clone (current)   22m34.7s          warm
Clone (proposed)        6m47.181s         cold
Clone (proposed)        1m51.933s         warm

Cons

The following workflow that we originally envisioned will stop working:

  1. Make some small changes to Orca on your Mac, don't commit
  2. Run Uber script to spin up a cluster running with your changes

Instead, we need to commit our changes before running the uber script. Is this acceptable?
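
If we go this way, the script could at least warn when a checkout has work that a clone would silently leave behind. The sketch below is an assumption about how that might look; has_uncommitted_changes and warn_if_dirty are hypothetical names, and git status --porcelain also counts untracked files as changes.

```shell
# Succeeds when the repository at $1 has uncommitted (or untracked) changes.
has_uncommitted_changes() {
    [ -n "$(git -C "$1" status --porcelain)" ]
}

# Warn about work that a clone would not pick up.
warn_if_dirty() {
    if has_uncommitted_changes "$1"; then
        echo "warning: $1 has uncommitted changes; a clone will not include them" >&2
    fi
}
```

It would be called once per source repository before cloning, for example warn_if_dirty "$HOME/workspace/gporca", assuming that is where the ORCA checkout lives on the host.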

Other concerns

  • Can everybody take it for a spin and report back the timing?
  • Documentation update

ORCA compilation does not properly clean up

To reproduce, run uber.bash, and press Ctrl-C during ORCA compilation. This only interrupts the docker client that is waiting for the container that's building ORCA, but it does not really stop or remove the build container.
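
A possible direction, sketched under the assumption that the build container is started with a known name: install a trap so that an interrupt also force-removes the container. The function names are hypothetical, and how uber.bash actually launches the build is not shown here.

```shell
# Remove a container by name, ignoring the error if it is already gone.
remove_container() {
    docker rm --force "$1" >/dev/null 2>&1 || true
}

# Run a named build container; on Ctrl-C or TERM, remove it before exiting.
# The name and the remaining `docker run` arguments are supplied by the caller.
# Runs in a subshell so the trap does not outlive the call.
run_build_container() (
    name=$1
    shift
    trap 'remove_container "$name"; exit 130' INT TERM
    docker run --name "$name" "$@"
)
```

On a normal exit the container is left alone, so the caller stays in charge of removing it after collecting build results.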

Run ICG without optimizer?

Given that Omer, an ORCA contributor, is fixing a planner issue
When Omer runs uber.bash --no-optimizer
Then uber script should run installcheck without turning on optimizer.
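
A minimal sketch of the flag handling, assuming the script keeps passing the optimizer setting through PGOPTIONS as in env PGOPTIONS='-c optimizer=on' make installcheck-good. The --no-optimizer flag is the one proposed in this issue; pgoptions_for is a hypothetical helper name.

```shell
# Print the PGOPTIONS value implied by the command-line arguments.
# Defaults to optimizer=on; --no-optimizer switches it off.
# Runs in a subshell so the loop variables stay local.
pgoptions_for() (
    optimizer=on
    for arg in "$@"; do
        case $arg in
            --no-optimizer) optimizer=off ;;
        esac
    done
    printf '%s\n' "-c optimizer=${optimizer}"
)
```

The test step could then become env PGOPTIONS="$(pgoptions_for "$@")" make installcheck-good.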

Document running perf against the database

There are a few scenarios where we found it really valuable to run the linux perf tools to profile the databases:

  1. When we recently found out that mirrors go nuts in the middle of installcheck-good, and we want to be good citizens and collect perf data for the storage team
  2. When we are thinking about hot spots and bottlenecks in ORCA, and want a scientific way to profile it
  3. In the context of some C++ refactoring, we want to look into using perf probe to accurately count the number of invocations of certain functions that we aim to hide away using resource wrappers, but were afraid of introducing performance regressions (the plan B being using a global volatile int64_t to just count ...)

This is not impossible but might be painful without either #41 or #37

A contributor should not need to copy-pasta a separate command to interact with the database

User Story

As a lazy person who contributes to Greenplum Database,
I'd like to run a single script to have all my code built,
and my database cluster initialized,
AND I'd like to be able to proceed to testing after that.

Scenario

Given my code changes to zero or more of

  1. ORCA
  2. GPOS
  3. GPDB

When I run streamline-master/uber.bash --interactive
After it builds the universe and it initializes the database,
Then it presents me with an interactive shell, which is

  1. primed with the right environment (PGPORT, and possibly PATH)
  2. cd'd into src/test/regress

document the hack to pull new code from host

When I'm using a Mac I find myself doing a trick inside the container to incrementally build:

git fetch origin HEAD && git reset --hard FETCH_HEAD && make install -s -j8 -l12 -C /build/gpdb && gpstop -ari

I heard @hardikar complaining about this today, might as well document it

issue with apt-get update on pg branch

When I run ~/bug-free-fortnight/pg/uber.bash --interactive on the pg branch, I get the following error

Get:91 http://archive.ubuntu.com/ubuntu/ trusty-updates/main libc6-dbg amd64 2.19-0ubuntu6.13 [3462 kB]
Fetched 46.2 MB in 24s (1887 kB/s)
E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/c/curl/libcurl3-gnutls_7.35.0-1ubuntu2.11_amd64.deb  404  Not Found [IP: 91.189.88.161 80]

E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/l/linux/linux-libc-dev_3.13.0-133.182_amd64.deb  404  Not Found [IP: 91.189.91.26 80]

E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
The command '/bin/sh -c apt-get install -y 	gdb 	ccache 	make 	gcc 	bison 	flex 	vim 	zlib1g-dev 	libreadline-dev 	runit 	byobu 	git 	rsync' returned a non-zero code: 100

Switch to Ubuntu rolling releases (Currently 17.10 artful)

17.04 zesty was an awkward short-lived release (like all non-LTS rolling releases). It was already out of the "rolling" designation by the time we chose it. And we only chose it because of new compiler warnings from GCC 7. Now that these have all been fixed, let's keep rolling forward!

This should enable us to automatically upgrade to future rolling releases should they become available.

Show Xerces, GPOS, or ORCA compilation output only when it fails

I am starting to feel there's too much noise in the output, especially considering that I get diminishing value out of the output: what do I get out of the quickly scrolling screen? The feeling that time is elapsing?

I have a few options to make this more quiet, but before even shaving this yak, I want to gather some feedback from you guys, @oarap @xinzweb and @hsyuan:

  1. Do you feel strongly about this? Choose one out of three:
    • I hate it, it's so noisy!!!
    • I love it, noise gives me a good existential feeling!
    • Noise. Meh. C'est la vie
  2. Am I the only one annoyed by the ORCA compilation output at the beginning of the build process?
  3. If you find it valuable, can you leave a comment below and let me know what value you get out of the output
  4. If you feel "meh", can you help me decide which improvement item gives you more value:
    1. debug build
    2. suppress ORCA output unless it fails
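
For reference, the "suppress unless it fails" option could be as small as the sketch below: capture everything a command prints and replay it only when the command exits non-zero. The name quiet_unless_failed is made up for illustration and is not part of uber.bash.

```shell
# Run a command with its output captured; replay the output on stderr
# only if the command fails, and preserve its exit status.
# Runs in a subshell so `log` and `status` do not leak into the caller.
quiet_unless_failed() (
    log=$(mktemp)
    if "$@" >"$log" 2>&1; then
        status=0
    else
        status=$?
        cat "$log" >&2
    fi
    rm -f "$log"
    exit "$status"
)
```

Wrapping a build step would then look like quiet_unless_failed make -j8 install, which stays silent on success and dumps the full log on failure.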

Orca should be built matching CI

User Story

We should build GPOS and ORCA with exactly the same toolchain and image as in CI
Before we experiment with bumping the compiler version independently
Because this allows us to draw confidence in the installcheck test run that artifacts from different compilers actually interoperate correctly

Run greenplum unit tests

We've all been there, breaking 'em unintentionally. Wouldn't it be nice if we'd known better?

Rewrite this in C++

Or something that can be better tested ...

I'm gonna sound very pretentious and claim that I'm a Bash guru. However, I'm also very fed up with every surprise that Bash (and POSIX sh overall) gives to everybody else. The cost of doing the right thing in Bash is so high that I'm willing to write this (maybe) in C++...
