
Comments (2)

riyajatar37003 commented on June 2, 2024

16:36:34 [Model Analyzer] DEBUG:
{'always_report_gpu_metrics': False,
'batch_sizes': [1],
'bls_composing_models': [],
'checkpoint_directory': '/app/snow.atg_arch_only.home/users/ariyaz/ml_repos/model_repositories/checkpoints',
'client_max_retries': 50,
'client_protocol': 'grpc',
'collect_cpu_metrics': False,
'concurrency': [],
'config_file': 'config.yaml',
'constraints': {},
'cpu_only_composing_models': [],
'duration_seconds': 3,
'early_exit_enable': False,
'export_path': './profile_results_reranker1',
'filename_model_gpu': 'metrics-model-gpu.csv',
'filename_model_inference': 'metrics-model-inference.csv',
'filename_server_only': 'metrics-server-only.csv',
'genai_perf_flags': {},
'gpu_output_fields': ['model_name',
'gpu_uuid',
'batch_size',
'concurrency',
'model_config_path',
'instance_group',
'satisfies_constraints',
'gpu_used_memory',
'gpu_utilization',
'gpu_power_usage'],
'gpus': ['all'],
'inference_output_fields': ['model_name',
'batch_size',
'concurrency',
'model_config_path',
'instance_group',
'max_batch_size',
'satisfies_constraints',
'perf_throughput',
'perf_latency_p99'],
'latency_budget': None,
'min_throughput': None,
'model_repository': '/app/snow.atg_arch_only.home/users/ariyaz/ml_repos/model_repositories/reranker',
'model_type': 'generic',
'monitoring_interval': 1.0,
'num_configs_per_model': 2,
'num_top_model_configs': 0,
'objectives': {'perf_throughput': 10},
'output_model_repository_path': './rerenker_output1',
'override_output_model_repository': True,
'perf_analyzer_cpu_util': 5120.0,
'perf_analyzer_flags': {},
'perf_analyzer_max_auto_adjusts': 10,
'perf_analyzer_path': 'perf_analyzer',
'perf_analyzer_timeout': 600,
'perf_output': False,
'perf_output_path': None,
'plots': [{'name': 'throughput_v_latency', 'title': 'Throughput vs. Latency', 'x_axis': 'perf_latency_p99', 'y_axis': 'perf_throughput', 'monotonic': True},
{'name': 'gpu_mem_v_latency', 'title': 'GPU Memory vs. Latency', 'x_axis': 'perf_latency_p99', 'y_axis': 'gpu_used_memory', 'monotonic': False}],
'profile_models': [{'model_name': 'bge_reranker_v2_onnx', 'cpu_only': False, 'objectives': {'perf_throughput': 10}, 'parameters': {'batch_sizes': [1], 'concurrency': [], 'request_rate': []}, 'weighting': 1},
{'model_name': 'reranker', 'cpu_only': False, 'objectives': {'perf_throughput': 10}, 'parameters': {'batch_sizes': [1], 'concurrency': [], 'request_rate': []}, 'weighting': 1}],
'reload_model_disable': False,
'request_rate': [],
'request_rate_search_enable': False,
'run_config_profile_models_concurrently_enable': True,
'run_config_search_disable': False,
'run_config_search_max_binary_search_steps': 5,
'run_config_search_max_concurrency': 2,
'run_config_search_max_instance_count': 4,
'run_config_search_max_model_batch_size': 4,
'run_config_search_max_request_rate': 8192,
'run_config_search_min_concurrency': 1,
'run_config_search_min_instance_count': 1,
'run_config_search_min_model_batch_size': 1,
'run_config_search_min_request_rate': 16,
'run_config_search_mode': 'quick',
'server_output_fields': ['model_name',
'gpu_uuid',
'gpu_used_memory',
'gpu_utilization',
'gpu_power_usage'],
'skip_detailed_reports': False,
'skip_summary_reports': False,
'triton_docker_args': {},
'triton_docker_image': 'nvcr.io/nvidia/tritonserver:24.04-py3',
'triton_docker_labels': {},
'triton_docker_mounts': [],
'triton_docker_shm_size': None,
'triton_grpc_endpoint': 'localhost:8001',
'triton_http_endpoint': 'localhost:8000',
'triton_install_path': '/opt/tritonserver',
'triton_launch_mode': 'local',
'triton_metrics_url': 'http://localhost:8002/metrics',
'triton_output_path': None,
'triton_server_environment': {},
'triton_server_flags': {},
'triton_server_path': 'tritonserver',
'weighting': None}
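
For reference, the effective configuration dumped above maps directly onto Model Analyzer's YAML options. A `config.yaml` along these lines would reproduce it (a sketch reconstructed from the dumped values, not the user's actual file; paths and model names are copied verbatim from the log):

```yaml
# Sketch of a config.yaml matching the DEBUG dump above (reconstructed, not original).
model_repository: /app/snow.atg_arch_only.home/users/ariyaz/ml_repos/model_repositories/reranker
checkpoint_directory: /app/snow.atg_arch_only.home/users/ariyaz/ml_repos/model_repositories/checkpoints
output_model_repository_path: ./rerenker_output1
export_path: ./profile_results_reranker1
override_output_model_repository: true
run_config_profile_models_concurrently_enable: true
run_config_search_mode: quick
run_config_search_max_concurrency: 2
run_config_search_max_instance_count: 4
run_config_search_max_model_batch_size: 4
triton_launch_mode: local
triton_docker_image: nvcr.io/nvidia/tritonserver:24.04-py3
collect_cpu_metrics: false
duration_seconds: 3
profile_models:
  - bge_reranker_v2_onnx
  - reranker
```
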
16:36:34 [Model Analyzer] Initializing GPUDevice handles
16:36:35 [Model Analyzer] Using GPU 0 Tesla V100-SXM2-32GB with UUID GPU-c898354c-1e75-3b40-3c84-2a272ee206c2
16:36:36 [Model Analyzer] WARNING: Overriding the output model repo path "./rerenker_output1"
16:36:36 [Model Analyzer] Starting a local Triton Server
16:36:36 [Model Analyzer] No checkpoint file found, starting a fresh run.
16:36:36 [Model Analyzer] Profiling server only metrics...
16:36:36 [Model Analyzer] DEBUG: Triton Server started.
16:36:46 [Model Analyzer] DEBUG: Stopped Triton Server.
16:36:46 [Model Analyzer]
16:36:46 [Model Analyzer] Starting quick mode search to find optimal configs
16:36:46 [Model Analyzer]
16:36:46 [Model Analyzer] Creating model config: bge_reranker_v2_onnx_config_default
16:36:46 [Model Analyzer]
16:36:46 [Model Analyzer] Creating model config: reranker_config_default
16:36:46 [Model Analyzer]
16:36:58 [Model Analyzer] DEBUG: Triton Server started.
16:37:07 [Model Analyzer] DEBUG: Model bge_reranker_v2_onnx_config_default loaded
16:37:22 [Model Analyzer] DEBUG: Model reranker_config_default loaded
16:37:22 [Model Analyzer] Profiling bge_reranker_v2_onnx_config_default: client batch size=1, concurrency=8
16:37:22 [Model Analyzer] Profiling reranker_config_default: client batch size=1, concurrency=16
16:37:22 [Model Analyzer]
16:37:22 [Model Analyzer] DEBUG: Running ['mpiexec', '--allow-run-as-root', '--tag-output', '-n', '1', 'perf_analyzer', '--enable-mpi', '-m', 'bge_reranker_v2_onnx', '-b', '1', '-u', 'localhost:8001', '-i', 'grpc', '-f', 'bge_reranker_v2_onnx-results.csv', '--verbose-csv', '--concurrency-range', '8', '--measurement-mode', 'count_windows', '--collect-metrics', '--metrics-url', 'http://localhost:8002/metrics', '--metrics-interval', '1000', ':', '-n', '1', 'perf_analyzer', '--enable-mpi', '-m', 'reranker', '-b', '1', '-u', 'localhost:8001', '-i', 'grpc', '-f', 'reranker-results.csv', '--verbose-csv', '--concurrency-range', '16', '--measurement-mode', 'count_windows', '--collect-metrics', '--metrics-url', 'http://localhost:8002/metrics', '--metrics-interval', '1000']
16:37:26 [Model Analyzer] Running perf_analyzer failed with exit status 99:
[1,1]:*** Measurement Settings ***
[1,1]: Batch size: 1
[1,1]: Service Kind: Triton
[1,1]: Using "count_windows" mode for stabilization
[1,1]: Minimum number of samples in each window: 50
[1,1]: Using synchronous calls for inference
[1,1]: Stabilizing using average latency
[1,1]:
[1,0]:*** Measurement Settings ***
[1,0]: Batch size: 1
[1,0]: Service Kind: Triton
[1,0]: Using "count_windows" mode for stabilization
[1,0]: Minimum number of samples in each window: 50
[1,0]: Using synchronous calls for inference
[1,0]: Stabilizing using average latency
[1,0]:
[1,0]:Request concurrency: 8
[1,1]:Request concurrency: 16
[1,1]:Failed to maintain requested inference load. Worker thread(s) failed to generate concurrent requests.
[1,1]:Thread [0] had error: Failed to process the request(s) for model instance 'reranker_0_4', mes
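
One way to isolate this failure is to replay only the failing half of the mpiexec command above as a standalone run (flags copied from the log; the metrics-collection flags are omitted since they are not needed for a repro). Running it directly should surface the full error message that is truncated in the Model Analyzer output:

```
perf_analyzer -m reranker -b 1 -u localhost:8001 -i grpc \
    --concurrency-range 16 --measurement-mode count_windows
```
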
16:37:26 [Model Analyzer] DEBUG: Measurement for [0, 0, 0, 0]: None.
16:37:26 [Model Analyzer] Saved checkpoint to /app/snow.atg_arch_only.home/users/ariyaz/ml_repos/model_repositories/checkpoints/0.ckpt
16:37:26 [Model Analyzer] Creating model config: bge_reranker_v2_onnx_config_0
16:37:26 [Model Analyzer] Setting instance_group to [{'count': 1, 'kind': 'KIND_GPU'}]
16:37:26 [Model Analyzer] Setting max_batch_size to 1
16:37:26 [Model Analyzer] Enabling dynamic_batching
16:37:26 [Model Analyzer]
16:37:26 [Model Analyzer] Creating model config: reranker_config_0
16:37:26 [Model Analyzer] Setting instance_group to [{'count': 1, 'kind': 'KIND_GPU'}]
16:37:26 [Model Analyzer] Setting max_batch_size to 1
16:37:26 [Model Analyzer] Enabling dynamic_batching
16:37:26 [Model Analyzer]
16:37:31 [Model Analyzer] DEBUG: Stopped Triton Server.
16:37:31 [Model Analyzer] DEBUG: Triton Server started.
16:37:34 [Model Analyzer] DEBUG: Model bge_reranker_v2_onnx_config_0 loaded
16:37:47 [Model Analyzer] DEBUG: Model reranker_config_0 loaded
16:37:47 [Model Analyzer] Profiling bge_reranker_v2_onnx_config_0: client batch size=1, concurrency=2
16:37:47 [Model Analyzer] Profiling reranker_config_0: client batch size=1, concurrency=2
16:37:47 [Model Analyzer]
16:37:47 [Model Analyzer] DEBUG: Running ['mpiexec', '--allow-run-as-root', '--tag-output', '-n', '1', 'perf_analyzer', '--enable-mpi', '-m', 'bge_reranker_v2_onnx', '-b', '1', '-u', 'localhost:8001', '-i', 'grpc', '-f', 'bge_reranker_v2_onnx-results.csv', '--verbose-csv', '--concurrency-range', '2', '--measurement-mode', 'count_windows', '--collect-metrics', '--metrics-url', 'http://localhost:8002/metrics', '--metrics-interval', '1000', ':', '-n', '1', 'perf_analyzer', '--enable-mpi', '-m', 'reranker', '-b', '1', '-u', 'localhost:8001', '-i', 'grpc', '-f', 'reranker-results.csv', '--verbose-csv', '--concurrency-range', '2', '--measurement-mode', 'count_windows', '--collect-metrics', '--metrics-url', 'http://localhost:8002/metrics', '--metrics-interval', '1000']
16:37:51 [Model Analyzer] Running perf_analyzer failed with exit status 99:
[1,0]:*** Measurement Settings ***
[1,0]: Batch size: 1
[1,0]: Service Kind: Triton
[1,0]: Using "count_windows" mode for stabilization
[1,0]: Minimum number of samples in each window: 50
[1,0]: Using synchronous calls for inference
[1,0]: Stabilizing using average latency
[1,0]:
[1,1]:*** Measurement Settings ***
[1,1]: Batch size: 1
[1,1]: Service Kind: Triton
[1,1]: Using "count_windows" mode for stabilization
[1,1]: Minimum number of samples in each window: 50
[1,1]: Using synchronous calls for inference
[1,1]: Stabilizing using average latency
[1,1]:
[1,0]:Request concurrency: 2
[1,1]:Request concurrency: 2
[1,1]:Failed to maintain requested inference load. Worker thread(s) failed to generate concurrent requests.
[1,1]:Thread [0] had error: [request id: <id_unknown>] Exceeds maximum queue size
[1,1]:
[1,
16:37:51 [Model Analyzer] No changes made to analyzer data, no checkpoint saved.
16:37:56 [Model Analyzer] DEBUG: Stopped Triton Server.
Traceback (most recent call last):
  File "/opt/app_venv/bin/model-analyzer", line 8, in <module>
    sys.exit(main())
  File "/opt/app_venv/lib/python3.10/site-packages/model_analyzer/entrypoint.py", line 278, in main
    analyzer.profile(
  File "/opt/app_venv/lib/python3.10/site-packages/model_analyzer/analyzer.py", line 124, in profile
    self._profile_models()
  File "/opt/app_venv/lib/python3.10/site-packages/model_analyzer/analyzer.py", line 233, in _profile_models
    self._model_manager.run_models(models=models)
  File "/opt/app_venv/lib/python3.10/site-packages/model_analyzer/model_manager.py", line 145, in run_models
    self._stop_ma_if_no_valid_measurement_threshold_reached()
  File "/opt/app_venv/lib/python3.10/site-packages/model_analyzer/model_manager.py", line 239, in _stop_ma_if_no_valid_measurement_threshold_reached
    raise TritonModelAnalyzerException(
model_analyzer.model_analyzer_exceptions.TritonModelAnalyzerException: The first 2 attempts to acquire measurements have failed. Please examine the Tritonserver/PA error logs to determine what has gone wrong.
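
A note on the `Exceeds maximum queue size` error in the second attempt: Triton returns that message when a model's scheduler queue is capped by a `default_queue_policy` with a positive `max_queue_size` and more requests are queued than it allows. If the `reranker` model's `config.pbtxt` (not shown in the log) sets such a cap, even concurrency 2 can trip it once requests back up. A minimal sketch of the relevant section, assuming that is the cause here:

```
# Hypothetical dynamic_batching section for the reranker's config.pbtxt;
# field names are from Triton's model configuration schema.
dynamic_batching {
  default_queue_policy {
    # 0 (the default) means no limit; a positive value makes Triton reject
    # queued requests with "Exceeds maximum queue size" once it is reached.
    max_queue_size: 0
  }
}
```
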


ganeshku1 commented on June 2, 2024

@riyajatar37003
Can you please provide the details of the bug using our bug report template here: https://github.com/triton-inference-server/server/issues/new/choose

