google / gtest-parallel
Run Google Test suites in parallel.
License: Apache License 2.0
Hi,
Some software behaves differently in a TTY than outside one. gtest-parallel makes the tests think they are not running interactively. It would be neat to add a mode that spawns a pty for each test.
I have a patch that I wrote, but it is probably not in a shape to be merged. When I run it, my terminal is sometimes left in a bad state afterwards. I'm not sure if I need to change modes with the tty module or something.
Here is the patch for your consideration. Feel free to take it as-is or modify it to your needs or completely ignore it.
commit 35514e9cd340153fcba481d66b6d073da42fd64f (HEAD -> master)
Author: Max Bernstein <[email protected]>
Date: Tue May 31 01:43:18 2022 -0700
Add --pty option
If --pty is given, run each test with a pty.
diff --git a/gtest_parallel.py b/gtest_parallel.py
index d682dbe..185d8c7 100755
--- a/gtest_parallel.py
+++ b/gtest_parallel.py
@@ -167,6 +167,11 @@ def get_save_file_path():
return os.path.join(os.path.expanduser('~'), '.gtest-parallel-times')
+def with_pty(command):
+ return [sys.executable, '-c', 'import pty, sys; pty.spawn(sys.argv[1:])',
+ *command]
+
+
@total_ordering
class Task(object):
"""Stores information about a task (single execution of a test).
@@ -180,13 +185,17 @@ class Task(object):
"""
def __init__(self, test_binary, test_name, test_command, execution_number,
- last_execution_time, output_dir):
+ last_execution_time, output_dir, pty):
self.test_name = test_name
self.output_dir = output_dir
self.test_binary = test_binary
- self.test_command = test_command
+ if pty:
+ self.test_command = with_pty(test_command)
+ else:
+ self.test_command = test_command
self.execution_number = execution_number
self.last_execution_time = last_execution_time
+ self.pty = pty
self.exit_code = None
self.runtime_ms = None
@@ -643,7 +652,7 @@ def find_tests(binaries, additional_args, options, times):
for execution_number in range(options.repeat):
tasks.append(
Task(test_binary, test_name, test_command, execution_number + 1,
- last_execution_time, options.output_dir))
+ last_execution_time, options.output_dir, options.pty))
test_count += 1
@@ -780,6 +789,10 @@ def default_options_parser():
default=False,
help='Do not run tests from the same test '
'case in parallel.')
+ parser.add_option('--pty',
+ action='store_true',
+ default=False,
+ help='Give each test a pty.')
return parser
Cheers,
Max
Hi,
When running the script with this patch, each test command continues only after pressing a key.
There are test cases that fail in parallel, and it gets difficult to debug if output is suppressed on the console. We need that functionality in order to use the tool. I agree it adds value, since it enabled me to run tests in parallel, but showing output on screen would help with debugging, even if lines are printed at different times due to execution speed. It would be the icing on the cake.
I've been hit by this at least 2 times.
On Windows, there is a MAX_PATH restriction (260 characters) on filesystem path length.
On an encrypted partition in Ubuntu (using eCryptfs). See this comment. Since eCryptfs protects not only a file's contents but also its name, there is an approximate restriction of 140 characters on file name length.
There are probably other cases where gtest-parallel can fail to write a log file due to file/path length restrictions.
Now I'm considering options for fixing these issues all at once:
1: Make writing log files configurable, and if it's disabled, write everything to the console. Drawback: this will clutter the output.
2: Restrict the file/path length. It seems we discussed this previously. For example, use a hash of the test name as the log file name. Drawback: it becomes difficult to find the log file for a specific test.
3: Try to use system extensions for writing longer file names. While this can be done on Windows, it will not work for eCryptfs or any other FS without such extensions.
Or propose your own.
I'm in favor of option 1. Which option is acceptable for gtest-parallel?
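For the hash-based log file name idea, a minimal sketch (the length limit and helper name are illustrative, not gtest-parallel's actual API) that keeps short names readable and only hashes when the name would exceed the limit:

```python
import hashlib

def safe_log_name(test_name, max_len=100):
    # Short names stay human-readable; long names keep a readable prefix
    # plus a short hash so the result is unique and under the limit.
    name = test_name.replace('/', '_')
    if len(name) <= max_len:
        return name
    digest = hashlib.sha1(test_name.encode('utf-8')).hexdigest()[:12]
    return name[:max_len - len(digest) - 1] + '-' + digest
```

This softens the stated drawback of the hash option: only the tests that actually hit the limit get mangled names.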
Using gtest-parallel in projects with long test names (e.g. https://github.com/openvinotoolkit/openvino), I found that it does not create log files in --output_dir for tests whose names exceed the OS filename length limit. This blocks using gtest-parallel in Continuous Integration, because logs are not available for issue analysis.
There was an issue (#57) tackling a similar problem, but it still did not make the logs available.
Suppose we have only one test to run, and it gets stuck forever, for whatever reason.
Running gtest-parallel with --timeout=x will then also hang forever, instead of finishing in time and reporting that test as timed out.
The problem occurs because we wait for the process to finish without a timeout:
gtest-parallel/gtest_parallel.py
Line 92 in f4d65b5
A solution might be https://stackoverflow.com/a/10012262/4597218
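The linked answer's approach, sketched below (the function name and return shape are illustrative, not gtest-parallel's actual API): arm a threading.Timer that kills the child when the deadline passes, so the wait can never block forever.

```python
import subprocess
import threading

def wait_with_timeout(command, timeout_s):
    # Start the child, arm a timer that kills it at the deadline, and
    # report whether the kill actually fired.
    proc = subprocess.Popen(command)
    timed_out = []

    def _kill():
        timed_out.append(True)
        proc.kill()

    timer = threading.Timer(timeout_s, _kill)
    timer.start()
    try:
        exit_code = proc.wait()
    finally:
        timer.cancel()  # no-op if the timer already fired
    return exit_code, bool(timed_out)
```

On Python 3, proc.wait(timeout=...) plus proc.kill() on subprocess.TimeoutExpired achieves the same thing, but the timer version also works on the Python 2 interpreters the script still supports.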
Suppose we have timeout = 5s, and the following timeline:
0s: Started test1, test2, test3
1s: Running
2s: Running
3s: test1, test2, test3 finished; test4, test5 started
4s: Running
5s: Timed out
6s: test4 and test5 finished
In this case, gtest-parallel incorrectly reports test4 and test5 as having run for 5s. Their correct running time is 3s.
We noticed in our build system, when upgrading to a new version of gtest-parallel, that the --output_dir handling results in this kind of error:
Traceback (most recent call last):
File "/opt/bb/bin/gtest-parallel.py", line 18, in <module>
sys.exit(gtest_parallel.main())
File "/opt/bb/bin/gtest_parallel.py", line 745, in main
logger.move_to('passed', task_manager.passed)
File "/opt/bb/bin/gtest_parallel.py", line 294, in move_to
os.makedirs(destination_dir)
File "/opt/bb/lib/python2.7/os.py", line 157, in makedirs
mkdir(name, mode)
OSError: [Errno 17] File exists: '/tmp/gtest-parallel-logs/passed'
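The traceback above is os.makedirs raising EEXIST because the directory is already there (e.g. left over from a previous run, or created concurrently). A Python 2-compatible guard, as a sketch (the helper name is made up):

```python
import errno
import os

def makedirs_exist_ok(path):
    # Equivalent to os.makedirs(path, exist_ok=True) on Python 3, but
    # also works on the Python 2.7 shown in the traceback above.
    try:
        os.makedirs(path)
    except OSError as e:
        if e.errno != errno.EEXIST or not os.path.isdir(path):
            raise
```

The isdir check preserves the error when the path exists but is a regular file, which really should fail.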
Regardless of this particular bug, invoking gtest-parallel now leaves files around in the temporary directory forever (until that directory is cleaned). This didn't seem to be an issue in older versions of gtest-parallel.
I took a look at the source and couldn't see a straightforward way to disable the output-dir logging. Is it critical to the functionality of the latest version? If so, perhaps we should consider making --output_dir default to None, and in that case use a properly generated temporary directory that is cleaned up on exit.
Thoughts?
Thanks
Each time my particular test executable initializes, it scans a directory tree to auto-register tests based on filenames found in that directory. This adds a non-trivial startup cost to running just a single test, and when running through gtest-parallel, this startup cost is paid thousands of times. It would be great if gtest-parallel could run tests in groups so that this startup cost could be paid only once per group.
Implementation-wise, I can see two grades of implementation:
Thanks for any feedback you have on this suggestion, including pointers on how to go about implementing this.
Edit: realized that individual test runtimes are still available after the second execution.
Apache License v2.0 requires the LICENSE to carry the year and name of the copyright owner. The placeholder values are present in the LICENSE file and have to be edited when the license is applied to a project, as stated in the LICENSE itself. I think this was missed for this project:
Copyright [yyyy] [name of copyright owner]
has to be changed to Copyright 2013 Google Inc.
This looks like a very useful utility, but unfortunately I could not get it to work. I cloned the code and tried invoking the script both via `python gtest_parallel.py` and via the wrapper script, and nothing happens: no output, no error, nothing. I tried with and without arguments, same result. I tried this on a Mac (10.12.6) with Python 2.7.10 and on Windows 10 with Python 2.7.8.
Follow-up task to #38. This path should be tested.
Currently the timeout is global. When the timeout is reached, all running test cases are marked as "interrupted" and reported as FAILED TESTS.
Reporting these cases as failed may not make sense, because the tests themselves could finish quickly; they were simply not started early enough to have any time quota left before the global limit.
It may be better to implement a per-test-case timeout instead.
There are some multi-threaded test cases in my executable. Can I use or benefit from gtest-parallel?
thanks
I am running on Ubuntu 20.04. The main Python process seems to hang, waiting for some signal. See my processes below: I can see that all 7 tests finished correctly, and I can see their logs in the log folder. However:
0 1000 128529 119191 20 0 614472 13988 futex_ Sl+ pts/1 0:00 python ../ext/gtest-parallel/gtest_parallel.py --output_dir=. ./distr/syb-test
0 1000 128539 128529 20 0 0 0 - Zl+ pts/1 0:02 [test] <defunct>
0 1000 128540 128529 20 0 0 0 - Zl+ pts/1 0:03 [test] <defunct>
0 1000 128541 128529 20 0 0 0 - Zl+ pts/1 0:04 [test] <defunct>
0 1000 128543 128529 20 0 0 0 - Zl+ pts/1 0:04 [test] <defunct>
0 1000 128544 128529 20 0 0 0 - Zl+ pts/1 0:00 [test] <defunct>
0 1000 128545 128529 20 0 0 0 - Zl+ pts/1 0:02 [test] <defunct>
0 1000 128546 128529 20 0 0 0 - Zl+ pts/1 0:00 [test] <defunct>
128529 is the main process ID, and the associated child processes hang there as <defunct>.
How can I debug this?
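A <defunct> (zombie) entry means the child has already exited but the parent has not yet reaped it with wait()/waitpid(), so the ps output above suggests the main script is blocked somewhere before reaping its children. A small demonstration of the mechanism, assuming Linux's /proc:

```python
import subprocess
import sys
import time

# A child that has exited but not been reaped shows state 'Z' (zombie,
# shown as <defunct> by ps); reaping it with wait() clears the entry.
proc = subprocess.Popen([sys.executable, '-c', 'pass'])
time.sleep(1)  # child has exited by now; parent has not reaped it yet

with open('/proc/%d/stat' % proc.pid) as f:
    # /proc/<pid>/stat is "pid (comm) state ..."; take the state field.
    state = f.read().rsplit(')', 1)[1].split()[0]

proc.wait()  # reap: the zombie disappears from the process table
```

So the zombies themselves are harmless evidence; the real question is what the parent's main thread is blocked on (the futex_ wait in the ps output), which py-spy or gdb attached to PID 128529 could show.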
When tests have a large amount of per-process overhead and the tests themselves are fast, it would be nice to use batching to run multiple tests per task.
In my use case it takes about 2+ seconds to start testing, and we have ~400 tests, each taking 0.4 seconds to run.
Running with 1 test per task results in only a 2x speedup. Being able to batch 8 tests per task should give an approximately 10x speedup.
I would implement it as a parameter like --batch_size=8
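A back-of-envelope check of those numbers (2 s startup, 0.4 s per test, batches of 8), just to show how batching amortizes the startup cost:

```python
startup_s, per_test_s, batch = 2.0, 0.4, 8

# Wall time attributed to a single test, with and without batching.
unbatched = startup_s + per_test_s                  # 2.4 s per test
batched = (startup_s + batch * per_test_s) / batch  # 0.65 s per test

print(round(unbatched / batched, 1))  # prints 3.7
```

That ~3.7x per-worker saving, multiplied by running tasks in parallel, is what puts the overall estimate in the ~10x ballpark.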
Rationale: when tracking flaky tests, --repeat is useful but may be wasteful (especially in continuous integration). In most cases it would be enough to stop at the first failure.
Could we add a flag for this behavior, e.g. --stop-repeat-after-failure?
Currently, if a user sets --output_dir=/path/to/important/data, gtest-parallel will delete files from that directory to clear out logs from old test runs.
Some different solutions would be:
Some different solutions would be:
1: Warn if the directory exists but does not contain a .gtest-parallel-logs marker file (which we would start saving). This would be a bad upgrade path for existing directories, since it would require user intervention.
2: Log to a gtest-parallel/ subdirectory, and only remove logs from in there. This would change the --output_dir default to /tmp instead of /tmp/gtest-parallel, which would then be joined into /tmp/gtest-parallel in the end. This heuristic would only fail if a user passes --output_dir=path/to/git-dir when it contains gtest-parallel, which I think we can ignore.
3: We could also name the subdirectory from option 2 gtest-parallel-logs, which should never overlap with anything.
4: Only remove *.log files, but eew.
5: Refuse to log to any directory that contains non-*.log files. Also eew.
I don't like 1, because if you run it once with /path/to/important/data, select "ok", and then run it again, we nuke the important data anyway. I think I like 2 or 3 best, not sure which. Both of these solutions require that users know we log to a subdirectory. Given that we already use passed/, failed/ and interrupted/ subdirectories, this might be OK, but it depends on whether users script against known subdirectories. @ehlemur, do you make use of how subdirectories under --output_dir are organized?
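Options 2/3 could look roughly like this (a sketch; the helper name is made up): always resolve the log directory to a fixed gtest-parallel-logs subdirectory, so cleanup only ever touches a directory we own.

```python
import os

def resolve_log_dir(output_dir):
    # Append the fixed subdirectory unless the user already pointed at it;
    # cleanup is then limited to this directory, never the user's own files.
    if os.path.basename(os.path.normpath(output_dir)) != 'gtest-parallel-logs':
        output_dir = os.path.join(output_dir, 'gtest-parallel-logs')
    return output_dir
```

The basename check keeps the function idempotent, so passing an already-resolved path (e.g. from a saved config) doesn't nest directories.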