Comments (2)

riyajatar37003 commented on June 2, 2024

16:36:34 [Model Analyzer] DEBUG:
{'always_report_gpu_metrics': False,
'batch_sizes': [1],
'bls_composing_models': [],
'checkpoint_directory': '/app/snow.atg_arch_only.home/users/ariyaz/ml_repos/model_repositories/checkpoints',
'client_max_retries': 50,
'client_protocol': 'grpc',
'collect_cpu_metrics': False,
'concurrency': [],
'config_file': 'config.yaml',
'constraints': {},
'cpu_only_composing_models': [],
'duration_seconds': 3,
'early_exit_enable': False,
'export_path': './profile_results_reranker1',
'filename_model_gpu': 'metrics-model-gpu.csv',
'filename_model_inference': 'metrics-model-inference.csv',
'filename_server_only': 'metrics-server-only.csv',
'genai_perf_flags': {},
'gpu_output_fields': ['model_name',
'gpus': ['all'],
'inference_output_fields': ['model_name',
'latency_budget': None,
'min_throughput': None,
'model_repository': '/app/snow.atg_arch_only.home/users/ariyaz/ml_repos/model_repositories/reranker',
'model_type': 'generic',
'monitoring_interval': 1.0,
'num_configs_per_model': 2,
'num_top_model_configs': 0,
'objectives': {'perf_throughput': 10},
'output_model_repository_path': './rerenker_output1',
'override_output_model_repository': True,
'perf_analyzer_cpu_util': 5120.0,
'perf_analyzer_flags': {},
'perf_analyzer_max_auto_adjusts': 10,
'perf_analyzer_path': 'perf_analyzer',
'perf_analyzer_timeout': 600,
'perf_output': False,
'perf_output_path': None,
'plots': [{'name': 'throughput_v_latency', 'title': 'Throughput vs. Latency', 'x_axis': 'perf_latency_p99', 'y_axis': 'perf_throughput', 'monotonic': True},
{'name': 'gpu_mem_v_latency', 'title': 'GPU Memory vs. Latency', 'x_axis': 'perf_latency_p99', 'y_axis': 'gpu_used_memory', 'monotonic': False}],
'profile_models': [{'model_name': 'bge_reranker_v2_onnx', 'cpu_only': False, 'objectives': {'perf_throughput': 10}, 'parameters': {'batch_sizes': [1], 'concurrency': [], 'request_rate': []}, 'weighting': 1},
{'model_name': 'reranker', 'cpu_only': False, 'objectives': {'perf_throughput': 10}, 'parameters': {'batch_sizes': [1], 'concurrency': [], 'request_rate': []}, 'weighting': 1}],
'reload_model_disable': False,
'request_rate': [],
'request_rate_search_enable': False,
'run_config_profile_models_concurrently_enable': True,
'run_config_search_disable': False,
'run_config_search_max_binary_search_steps': 5,
'run_config_search_max_concurrency': 2,
'run_config_search_max_instance_count': 4,
'run_config_search_max_model_batch_size': 4,
'run_config_search_max_request_rate': 8192,
'run_config_search_min_concurrency': 1,
'run_config_search_min_instance_count': 1,
'run_config_search_min_model_batch_size': 1,
'run_config_search_min_request_rate': 16,
'run_config_search_mode': 'quick',
'server_output_fields': ['model_name',
'skip_detailed_reports': False,
'skip_summary_reports': False,
'triton_docker_args': {},
'triton_docker_image': '',
'triton_docker_labels': {},
'triton_docker_mounts': [],
'triton_docker_shm_size': None,
'triton_grpc_endpoint': 'localhost:8001',
'triton_http_endpoint': 'localhost:8000',
'triton_install_path': '/opt/tritonserver',
'triton_launch_mode': 'local',
'triton_metrics_url': 'http://localhost:8002/metrics',
'triton_output_path': None,
'triton_server_environment': {},
'triton_server_flags': {},
'triton_server_path': 'tritonserver',
'weighting': None}
16:36:34 [Model Analyzer] Initializing GPUDevice handles
16:36:35 [Model Analyzer] Using GPU 0 Tesla V100-SXM2-32GB with UUID GPU-c898354c-1e75-3b40-3c84-2a272ee206c2
16:36:36 [Model Analyzer] WARNING: Overriding the output model repo path "./rerenker_output1"
16:36:36 [Model Analyzer] Starting a local Triton Server
16:36:36 [Model Analyzer] No checkpoint file found, starting a fresh run.
16:36:36 [Model Analyzer] Profiling server only metrics...
16:36:36 [Model Analyzer] DEBUG: Triton Server started.
16:36:46 [Model Analyzer] DEBUG: Stopped Triton Server.
16:36:46 [Model Analyzer]
16:36:46 [Model Analyzer] Starting quick mode search to find optimal configs
16:36:46 [Model Analyzer]
16:36:46 [Model Analyzer] Creating model config: bge_reranker_v2_onnx_config_default
16:36:46 [Model Analyzer]
16:36:46 [Model Analyzer] Creating model config: reranker_config_default
16:36:46 [Model Analyzer]
16:36:58 [Model Analyzer] DEBUG: Triton Server started.
16:37:07 [Model Analyzer] DEBUG: Model bge_reranker_v2_onnx_config_default loaded
16:37:22 [Model Analyzer] DEBUG: Model reranker_config_default loaded
16:37:22 [Model Analyzer] Profiling bge_reranker_v2_onnx_config_default: client batch size=1, concurrency=8
16:37:22 [Model Analyzer] Profiling reranker_config_default: client batch size=1, concurrency=16
16:37:22 [Model Analyzer]
16:37:22 [Model Analyzer] DEBUG: Running ['mpiexec', '--allow-run-as-root', '--tag-output', '-n', '1', 'perf_analyzer', '--enable-mpi', '-m', 'bge_reranker_v2_onnx', '-b', '1', '-u', 'localhost:8001', '-i', 'grpc', '-f', 'bge_reranker_v2_onnx-results.csv', '--verbose-csv', '--concurrency-range', '8', '--measurement-mode', 'count_windows', '--collect-metrics', '--metrics-url', 'http://localhost:8002/metrics', '--metrics-interval', '1000', ':', '-n', '1', 'perf_analyzer', '--enable-mpi', '-m', 'reranker', '-b', '1', '-u', 'localhost:8001', '-i', 'grpc', '-f', 'reranker-results.csv', '--verbose-csv', '--concurrency-range', '16', '--measurement-mode', 'count_windows', '--collect-metrics', '--metrics-url', 'http://localhost:8002/metrics', '--metrics-interval', '1000']
16:37:26 [Model Analyzer] Running perf_analyzer failed with exit status 99:
[1,1]:*** Measurement Settings ***
[1,1]: Batch size: 1
[1,1]: Service Kind: Triton
[1,1]: Using "count_windows" mode for stabilization
[1,1]: Minimum number of samples in each window: 50
[1,1]: Using synchronous calls for inference
[1,1]: Stabilizing using average latency
[1,0]:*** Measurement Settings ***
[1,0]: Batch size: 1
[1,0]: Service Kind: Triton
[1,0]: Using "count_windows" mode for stabilization
[1,0]: Minimum number of samples in each window: 50
[1,0]: Using synchronous calls for inference
[1,0]: Stabilizing using average latency
[1,0]:Request concurrency: 8
[1,1]:Request concurrency: 16
[1,1]:Failed to maintain requested inference load. Worker thread(s) failed to generate concurrent requests.
[1,1]:Thread [0] had error: Failed to process the request(s) for model instance 'reranker_0_4', mes
16:37:26 [Model Analyzer] DEBUG: Measurement for [0, 0, 0, 0]: None.
16:37:26 [Model Analyzer] Saved checkpoint to /app/snow.atg_arch_only.home/users/ariyaz/ml_repos/model_repositories/checkpoints/0.ckpt
16:37:26 [Model Analyzer] Creating model config: bge_reranker_v2_onnx_config_0
16:37:26 [Model Analyzer] Setting instance_group to [{'count': 1, 'kind': 'KIND_GPU'}]
16:37:26 [Model Analyzer] Setting max_batch_size to 1
16:37:26 [Model Analyzer] Enabling dynamic_batching
16:37:26 [Model Analyzer]
16:37:26 [Model Analyzer] Creating model config: reranker_config_0
16:37:26 [Model Analyzer] Setting instance_group to [{'count': 1, 'kind': 'KIND_GPU'}]
16:37:26 [Model Analyzer] Setting max_batch_size to 1
16:37:26 [Model Analyzer] Enabling dynamic_batching
16:37:26 [Model Analyzer]
16:37:31 [Model Analyzer] DEBUG: Stopped Triton Server.
16:37:31 [Model Analyzer] DEBUG: Triton Server started.
16:37:34 [Model Analyzer] DEBUG: Model bge_reranker_v2_onnx_config_0 loaded
16:37:47 [Model Analyzer] DEBUG: Model reranker_config_0 loaded
16:37:47 [Model Analyzer] Profiling bge_reranker_v2_onnx_config_0: client batch size=1, concurrency=2
16:37:47 [Model Analyzer] Profiling reranker_config_0: client batch size=1, concurrency=2
16:37:47 [Model Analyzer]
16:37:47 [Model Analyzer] DEBUG: Running ['mpiexec', '--allow-run-as-root', '--tag-output', '-n', '1', 'perf_analyzer', '--enable-mpi', '-m', 'bge_reranker_v2_onnx', '-b', '1', '-u', 'localhost:8001', '-i', 'grpc', '-f', 'bge_reranker_v2_onnx-results.csv', '--verbose-csv', '--concurrency-range', '2', '--measurement-mode', 'count_windows', '--collect-metrics', '--metrics-url', 'http://localhost:8002/metrics', '--metrics-interval', '1000', ':', '-n', '1', 'perf_analyzer', '--enable-mpi', '-m', 'reranker', '-b', '1', '-u', 'localhost:8001', '-i', 'grpc', '-f', 'reranker-results.csv', '--verbose-csv', '--concurrency-range', '2', '--measurement-mode', 'count_windows', '--collect-metrics', '--metrics-url', 'http://localhost:8002/metrics', '--metrics-interval', '1000']
16:37:51 [Model Analyzer] Running perf_analyzer failed with exit status 99:
[1,0]:*** Measurement Settings ***
[1,0]: Batch size: 1
[1,0]: Service Kind: Triton
[1,0]: Using "count_windows" mode for stabilization
[1,0]: Minimum number of samples in each window: 50
[1,0]: Using synchronous calls for inference
[1,0]: Stabilizing using average latency
[1,1]:*** Measurement Settings ***
[1,1]: Batch size: 1
[1,1]: Service Kind: Triton
[1,1]: Using "count_windows" mode for stabilization
[1,1]: Minimum number of samples in each window: 50
[1,1]: Using synchronous calls for inference
[1,1]: Stabilizing using average latency
[1,0]:Request concurrency: 2
[1,1]:Request concurrency: 2
[1,1]:Failed to maintain requested inference load. Worker thread(s) failed to generate concurrent requests.
[1,1]:Thread [0] had error: [request id: <id_unknown>] Exceeds maximum queue size
16:37:51 [Model Analyzer] No changes made to analyzer data, no checkpoint saved.
16:37:56 [Model Analyzer] DEBUG: Stopped Triton Server.
Traceback (most recent call last):
File "/opt/app_venv/bin/model-analyzer", line 8, in
File "/opt/app_venv/lib/python3.10/site-packages/model_analyzer/", line 278, in main
File "/opt/app_venv/lib/python3.10/site-packages/model_analyzer/", line 124, in profile
File "/opt/app_venv/lib/python3.10/site-packages/model_analyzer/", line 233, in _profile_models
File "/opt/app_venv/lib/python3.10/site-packages/model_analyzer/", line 145, in run_models
File "/opt/app_venv/lib/python3.10/site-packages/model_analyzer/", line 239, in _stop_ma_if_no_valid_measurement_threshold_reached
raise TritonModelAnalyzerException(
model_analyzer.model_analyzer_exceptions.TritonModelAnalyzerException: The first 2 attempts to acquire measurements have failed. Please examine the Tritonserver/PA error logs to determine what has gone wrong.
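
For triage, the perf_analyzer thread errors can be pulled out of a log like the one above with a short script. This is only a minimal sketch: the helper name `perf_analyzer_errors` is hypothetical, and it assumes the `[1,N]:` prefixes are the `mpiexec --tag-output` tags, with the second number being the MPI rank. Going by the `mpiexec` command in the log, rank 0 would be `bge_reranker_v2_onnx` and rank 1 would be `reranker`.

```python
import re

# Matches mpiexec --tag-output prefixes such as "[1,1]:" followed by the message.
# Assumption: the second number in the tag is the MPI rank.
TAGGED = re.compile(r"^\[(\d+),(\d+)\]:(.*)$")


def perf_analyzer_errors(log_text):
    """Return (rank, message) pairs for tagged lines that report a thread error."""
    errors = []
    for line in log_text.splitlines():
        match = TAGGED.match(line.strip())
        if match and "had error:" in match.group(3):
            rank = int(match.group(2))
            # Keep only the text after "had error:".
            message = match.group(3).split("had error:", 1)[1].strip()
            errors.append((rank, message))
    return errors


# Lines copied from the second failed run in the log above.
sample = """\
[1,0]:Request concurrency: 2
[1,1]:Request concurrency: 2
[1,1]:Failed to maintain requested inference load. Worker thread(s) failed to generate concurrent requests.
[1,1]:Thread [0] had error: [request id: <id_unknown>] Exceeds maximum queue size
"""

print(perf_analyzer_errors(sample))
```

Under that assumption, both failures in the log come from rank 1, that is, the `reranker` model, while `bge_reranker_v2_onnx` reports no thread error.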

ganeshku1 commented on June 2, 2024

Can you please provide the details of the bug using our bug report here:
