
iotedge-logging-and-monitoring-solution's Issues

SSL Connection to Log Analytics Workflow: Authentication failed

The function works well locally, but after deploying it as a dotnet-isolated function app, the CollectMetrics function is unable to connect to the Log Analytics workspace and throws the exception below. I can confirm that the configuration, such as the workspace ID and key, is correct.

2022-12-11T08:07:15Z [Information] OMS endpoint Url : https://e5beb42e-bfec-48d2-b7f9-e262b2ddf088.oms.opinsights.azure.com/AgentService.svc/AgentTopologyRequest
2022-12-11T08:07:15Z [Error] exception occurred : System.AggregateException: One or more errors occurred. (The SSL connection could not be established, see inner exception.)
---> System.Net.Http.HttpRequestException: The SSL connection could not be established, see inner exception.
---> System.Security.Authentication.AuthenticationException: Authentication failed, see inner exception.
---> System.ComponentModel.Win32Exception (0x8009030D): The credentials supplied to the package were not recognized
at System.Net.SSPIWrapper.AcquireCredentialsHandle(ISSPIInterface secModule, String package, CredentialUse intent, SCHANNEL_CRED* scc)
at System.Net.Security.SslStreamPal.AcquireCredentialsHandle(CredentialUse credUsage, SCHANNEL_CRED* secureCredential)
at System.Net.Security.SslStreamPal.AcquireCredentialsHandleSchannelCred(SslStreamCertificateContext certificateContext, SslProtocols protocols, EncryptionPolicy policy, Boolean isServer)
at System.Net.Security.SslStreamPal.AcquireCredentialsHandle(SslStreamCertificateContext certificateContext, SslProtocols protocols, EncryptionPolicy policy, Boolean isServer)
--- End of inner exception stack trace ---
at System.Net.Security.SslStreamPal.AcquireCredentialsHandle(SslStreamCertificateContext certificateContext, SslProtocols protocols, EncryptionPolicy policy, Boolean isServer)
at System.Net.Security.SecureChannel.AcquireClientCredentials(Byte[]& thumbPrint)
at System.Net.Security.SecureChannel.GenerateToken(ReadOnlySpan`1 inputBuffer, Byte[]& output)
at System.Net.Security.SecureChannel.NextMessage(ReadOnlySpan`1 incomingBuffer)
at System.Net.Security.SslStream.ProcessBlob(Int32 frameSize)
at System.Net.Security.SslStream.ReceiveBlobAsync[TIOAdapter](TIOAdapter adapter)
at System.Net.Security.SslStream.ForceAuthenticationAsync[TIOAdapter](TIOAdapter adapter, Boolean receiveFirst, Byte[] reAuthenticationData, Boolean isApm)
at System.Net.Http.ConnectHelper.EstablishSslConnectionAsync(SslClientAuthenticationOptions sslOptions, HttpRequestMessage request, Boolean async, Stream stream, CancellationToken cancellationToken)
--- End of inner exception stack trace ---
at System.Net.Http.ConnectHelper.EstablishSslConnectionAsync(SslClientAuthenticationOptions sslOptions, HttpRequestMessage request, Boolean async, Stream stream, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.ConnectAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.CreateHttp11ConnectionAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.AddHttp11ConnectionAsync(HttpRequestMessage request)
at System.Threading.Tasks.TaskCompletionSourceWithCancellation`1.WaitWithCancellationAsync(CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.GetHttp11ConnectionAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
at System.Net.Http.HttpClient.<SendAsync>g__Core|83_0(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationTokenSource cts, Boolean disposeCts, CancellationTokenSource pendingRequestsCts, CancellationToken originalCancellationToken)
--- End of inner exception stack trace ---
at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
at System.Threading.Tasks.Task`1.get_Result()
at SecureParking.Voyager.IoT.Edge.Jobs.Logging.Services.CertGenerator.RegisterWithOms(X509Certificate2 cert, String AgentGuid, String logAnalyticsWorkspaceDomainPrefixOms) in C:\Repo\IoT\src\IoTEdge\SecureParking.Voyager.IoT.Edge\functions\SecureParking.Voyager.IoT.Edge.Jobs\Logging\Services\CertGenertor.cs:line 385
at SecureParking.Voyager.IoT.Edge.Jobs.Logging.Services.CertGenerator.RegisterWithOmsWithBasicRetryAsync(X509Certificate2 cert, String AgentGuid, String logAnalyticsWorkspaceDomainPrefixOms) in C:\Repo\IoT\src\IoTEdge\SecureParking.Voyager.IoT.Edge\functions\SecureParking.Voyager.IoT.Edge.Jobs\Logging\Services\CertGenertor.cs:line 408

Error during deployment of ELMS solution

Hi everyone,

I tried to deploy the ELMS solution as described in the readme document:
link: https://github.com/Azure-Samples/iotedge-logging-and-monitoring-solution
Steps to reproduce:
0- I opened PowerShell with elevated rights
1- I logged in with the az login command
2- I ran the deployment script
3- I chose a custom deployment
4- I chose an existing Azure subscription, an existing resource group, an existing IoT hub, existing storage, an existing Log Analytics workspace, etc.
5- etc.

But when the deployment starts, it throws the following error. I shortened the output, since it repeats the same thing throughout:

Creating resource group deployment.
ERROR: {"status":"Failed","error":{"code":"DeploymentFailed","message":"At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/DeployOperations for usage details.","details":[{"message": "{\r\n \"error\": {\r\n \"code\": \"RoleAssignmentUpdateNotPermitted\",\r\n \"message\": \"Tenant ID, application ID, principal ID, and scope are not allowed to be updated.\"\r\n }\r\n}"\r\n }\r\n ]\r\n }\r\n ]\r\n }\r\n}"}]}}
Something went wrong with the resource group deployment. Ending script.
At C:\Users\Administrateur\Documents\iotedge-logging-and-monitoring-solution\Scripts\deploy.ps1:1012 char:9

+     throw "Something went wrong with the resource group deploymen ...
+     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo : OperationStopped: (Something went ... Ending script.:String) [], RuntimeException
    + FullyQualifiedErrorId : Something went wrong with the resource group deployment. Ending script.

The message clearly says: Tenant ID, application ID, principal ID, and scope are not allowed to be updated.

Does anyone have a solution for this?
Link to the deployment script: https://github.com/Azure-Samples/iotedge-logging-and-monitoring-solution/blob/main/Scripts/deploy.ps1
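
When a deployment fails like this, the specific cause is usually buried in the nested `details` of the error, sometimes as a JSON document embedded in a `message` string. The following is a minimal sketch, not part of the ELMS scripts, of how the nested codes could be pulled out of such a payload. The `sample` payload is a hypothetical, well-formed version modelled on the output above, and `collect_error_codes` is an illustrative helper name.

```python
import json


def collect_error_codes(error):
    """Recursively gather error codes from an ARM-style error object."""
    codes = []
    code = error.get("code")
    if code:
        codes.append(code)

    # A detail's message is sometimes itself a JSON document with its own error.
    try:
        inner = json.loads(error.get("message", ""))
    except (TypeError, ValueError):
        inner = None
    if isinstance(inner, dict) and isinstance(inner.get("error"), dict):
        codes.extend(collect_error_codes(inner["error"]))

    # Walk any nested details the same way.
    for detail in error.get("details", []):
        codes.extend(collect_error_codes(detail))
    return codes


# Hypothetical payload modelled on the deployment output in this issue.
sample = {
    "status": "Failed",
    "error": {
        "code": "DeploymentFailed",
        "message": "At least one resource deployment operation failed.",
        "details": [
            {
                "code": "BadRequest",
                "message": json.dumps(
                    {
                        "error": {
                            "code": "RoleAssignmentUpdateNotPermitted",
                            "message": "Tenant ID, application ID, principal ID, "
                            "and scope are not allowed to be updated.",
                        }
                    }
                ),
            }
        ],
    },
}

codes = collect_error_codes(sample["error"])
print(codes)
```

Listing the innermost code first when reporting an issue makes it easier to see that the failure concerns a role assignment rather than the deployment as a whole.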

Collect Metrics function stops working from time to time

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

  • Deploy monitoring architecture. Metrics Collector module sends metrics as D2C messages, which are routed to Event Hub, then Collect Metrics function is triggered and it sends data to Log Analytics workspace.

From time to time, the Collect Metrics function is unable to finish executing. It keeps receiving events but never finishes processing them. The problem is not visible in the Azure Portal under Function -> Monitor -> Invocation, where the error count is 0, and the function appears to be stuck. This can last for hours; the last time it lasted more than 12 hours, which means that for 12 hours we had no Insights Metrics. Then suddenly, without any intervention from our side, it starts working again. This has happened multiple times and in multiple environments.

Any log messages given by the failure

In Application Insights traces we are able to see the following two errors:

 exception occurred : System.AggregateException: One or more errors occurred. (The SSL connection could not be established, see inner exception.)
 ---> System.Net.Http.HttpRequestException: The SSL connection could not be established, see inner exception.
 ---> System.ComponentModel.Win32Exception (0x8009030D): The credentials supplied to the package were not recognized
   at System.Net.SSPIWrapper.AcquireCredentialsHandle(SSPIInterface secModule, String package, CredentialUse intent, SCHANNEL_CRED scc)
   at System.Net.Security.SslStreamPal.AcquireCredentialsHandle(CredentialUse credUsage, SCHANNEL_CRED secureCredential)
   at System.Net.Security.SslStreamPal.AcquireCredentialsHandle(X509Certificate certificate, SslProtocols protocols, EncryptionPolicy policy, Boolean isServer)
   at System.Net.Security.SecureChannel.AcquireClientCredentials(Byte[]& thumbPrint)
   at System.Net.Security.SecureChannel.GenerateToken(Byte[] input, Int32 offset, Int32 count, Byte[]& output)
   at System.Net.Security.SecureChannel.NextMessage(Byte[] incoming, Int32 offset, Int32 count)
   at System.Net.Security.SslStream.StartSendBlob(Byte[] incoming, Int32 count, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslStream.ProcessReceivedBlob(Byte[] buffer, Int32 count, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslStream.StartReadFrame(Byte[] buffer, Int32 readBytes, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslStream.PartialFrameCallback(AsyncProtocolRequest asyncRequest)
--- End of stack trace from previous location where exception was thrown ---
   at System.Net.Security.SslStream.ThrowIfExceptional()
   at System.Net.Security.SslStream.InternalEndProcessAuthentication(LazyAsyncResult lazyResult)
   at System.Net.Security.SslStream.EndProcessAuthentication(IAsyncResult result)
   at System.Net.Security.SslStream.EndAuthenticateAsClient(IAsyncResult asyncResult)
   at System.Net.Security.SslStream.<>c.<AuthenticateAsClientAsync>b__65_1(IAsyncResult iar)
   at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
--- End of stack trace from previous location where exception was thrown ---
   at System.Net.Http.ConnectHelper.EstablishSslConnectionAsyncCore(Stream stream, SslClientAuthenticationOptions sslOptions, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at System.Net.Http.ConnectHelper.EstablishSslConnectionAsyncCore(Stream stream, SslClientAuthenticationOptions sslOptions, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.ConnectAsync(HttpRequestMessage request, Boolean allowHttp2, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.CreateHttp11ConnectionAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.GetHttpConnectionAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithRetryAsync(HttpRequestMessage request, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.DiagnosticsHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.HttpClient.FinishSendAsyncBuffered(Task`1 sendTask, HttpRequestMessage request, CancellationTokenSource cts, Boolean disposeCts)
   --- End of inner exception stack trace ---
   at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
   at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
   at System.Threading.Tasks.Task`1.get_Result()
   at FunctionApp.CertificateGenerator.CertGenerator.RegisterWithOms(X509Certificate2 cert, String AgentGuid, String logAnalyticsWorkspaceDomainPrefixOms) in /home/runner/work/iotedge-logging-and-monitoring-solution/iotedge-logging-and-monitoring-solution/FunctionApp/FunctionApp/CertificateGenerator/CertGenertor.cs:line 381
   at FunctionApp.CertificateGenerator.CertGenerator.RegisterWithOmsWithBasicRetryAsync(X509Certificate2 cert, String AgentGuid, String logAnalyticsWorkspaceDomainPrefixOms) in /home/runner/work/iotedge-logging-and-monitoring-solution/iotedge-logging-and-monitoring-solution/FunctionApp/FunctionApp/CertificateGenerator/CertGenertor.cs:line 404
Registering agent with OMS failed (are the Log Analytics Workspace ID and Key correct?) : System.AggregateException: One or more errors occurred. (The SSL connection could not be established, see inner exception.)
 ---> System.Net.Http.HttpRequestException: The SSL connection could not be established, see inner exception.
 ---> System.ComponentModel.Win32Exception (0x8009030D): The credentials supplied to the package were not recognized
   at System.Net.SSPIWrapper.AcquireCredentialsHandle(SSPIInterface secModule, String package, CredentialUse intent, SCHANNEL_CRED scc)
   at System.Net.Security.SslStreamPal.AcquireCredentialsHandle(CredentialUse credUsage, SCHANNEL_CRED secureCredential)
   at System.Net.Security.SslStreamPal.AcquireCredentialsHandle(X509Certificate certificate, SslProtocols protocols, EncryptionPolicy policy, Boolean isServer)
   at System.Net.Security.SecureChannel.AcquireClientCredentials(Byte[]& thumbPrint)
   at System.Net.Security.SecureChannel.GenerateToken(Byte[] input, Int32 offset, Int32 count, Byte[]& output)
   at System.Net.Security.SecureChannel.NextMessage(Byte[] incoming, Int32 offset, Int32 count)
   at System.Net.Security.SslStream.StartSendBlob(Byte[] incoming, Int32 count, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslStream.ProcessReceivedBlob(Byte[] buffer, Int32 count, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslStream.StartReadFrame(Byte[] buffer, Int32 readBytes, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslStream.PartialFrameCallback(AsyncProtocolRequest asyncRequest)
--- End of stack trace from previous location where exception was thrown ---
   at System.Net.Security.SslStream.ThrowIfExceptional()
   at System.Net.Security.SslStream.InternalEndProcessAuthentication(LazyAsyncResult lazyResult)
   at System.Net.Security.SslStream.EndProcessAuthentication(IAsyncResult result)
   at System.Net.Security.SslStream.EndAuthenticateAsClient(IAsyncResult asyncResult)
   at System.Net.Security.SslStream.<>c.<AuthenticateAsClientAsync>b__65_1(IAsyncResult iar)
   at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
--- End of stack trace from previous location where exception was thrown ---
   at System.Net.Http.ConnectHelper.EstablishSslConnectionAsyncCore(Stream stream, SslClientAuthenticationOptions sslOptions, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at System.Net.Http.ConnectHelper.EstablishSslConnectionAsyncCore(Stream stream, SslClientAuthenticationOptions sslOptions, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.ConnectAsync(HttpRequestMessage request, Boolean allowHttp2, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.CreateHttp11ConnectionAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.GetHttpConnectionAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithRetryAsync(HttpRequestMessage request, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.DiagnosticsHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.HttpClient.FinishSendAsyncBuffered(Task`1 sendTask, HttpRequestMessage request, CancellationTokenSource cts, Boolean disposeCts)
   --- End of inner exception stack trace ---
   at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
   at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
   at System.Threading.Tasks.Task`1.get_Result()
   at FunctionApp.CertificateGenerator.CertGenerator.RegisterWithOms(X509Certificate2 cert, String AgentGuid, String logAnalyticsWorkspaceDomainPrefixOms) in /home/runner/work/iotedge-logging-and-monitoring-solution/iotedge-logging-and-monitoring-solution/FunctionApp/FunctionApp/CertificateGenerator/CertGenertor.cs:line 381
   at FunctionApp.CertificateGenerator.CertGenerator.RegisterWithOmsWithBasicRetryAsync(X509Certificate2 cert, String AgentGuid, String logAnalyticsWorkspaceDomainPrefixOms) in /home/runner/work/iotedge-logging-and-monitoring-solution/iotedge-logging-and-monitoring-solution/FunctionApp/FunctionApp/CertificateGenerator/CertGenertor.cs:line 404
   at FunctionApp.CertificateGenerator.CertGenerator.RegisterAgentWithOMS(String logAnalyticsWorkspaceDomainPrefixOms) in /home/runner/work/iotedge-logging-and-monitoring-solution/iotedge-logging-and-monitoring-solution/FunctionApp/FunctionApp/CertificateGenerator/CertGenertor.cs:line 462

The second message is misleading. Log Analytics Workspace ID and Key are correct, because eventually the function starts to work again, without any change.
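
One way to avoid a misleading message of this kind is to inspect the innermost exception before deciding what to log. The sketch below illustrates the idea in Python rather than in the function's own C#; it is not the project's code. `TlsHandshakeError` and `UnauthorizedError` are hypothetical exception types standing in for the TLS failure in the traces above and for a genuine credential rejection by the service.

```python
class TlsHandshakeError(Exception):
    """Hypothetical: the TLS handshake failed before the request was sent."""


class UnauthorizedError(Exception):
    """Hypothetical: the service received the request and rejected it."""


def innermost_cause(exc):
    """Follow the chain of explicit causes down to the root exception."""
    while exc.__cause__ is not None:
        exc = exc.__cause__
    return exc


def describe_failure(exc):
    """Choose a log message based on the root cause, not the outer wrapper."""
    root = innermost_cause(exc)
    if isinstance(root, UnauthorizedError):
        return "Registration rejected: check the workspace ID and key."
    if isinstance(root, TlsHandshakeError):
        return "TLS handshake failed: the workspace ID and key were never evaluated."
    return f"Registration failed: {root}"


# Reproduce a wrapped failure similar in shape to the traces above.
try:
    try:
        raise TlsHandshakeError("The credentials supplied to the package were not recognized")
    except TlsHandshakeError as inner:
        raise RuntimeError("The SSL connection could not be established") from inner
except RuntimeError as outer:
    message = describe_failure(outer)

print(message)
```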

Expected/desired behavior

  • CollectMetrics function is able to send Insights Metrics to Log Analytics workspace.

OS and Version?

  • Function App with Operating System: Windows, Runtime version: 3.2.0.0
  • Latest version of the CollectMetrics function

Mention any other details that might be useful

Unable to Deploy Custom

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Clone the repository, run the script, select option 2 (custom deployment), and choose my own IoT Hub.

Any log messages given by the failure

Using location 'northeurope' based on your IoT hub location

Creating resource group deployment.
ERROR: 'bytes' object has no attribute 'get'
Exception: /Users/lucarv/Documents/Code/repos/iotedge-logging-and-monitoring-solution/Scripts/deploy.ps1:1012
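
The message `'bytes' object has no attribute 'get'` is a Python error, which suggests it comes from the Azure CLI itself rather than from deploy.ps1. Where exactly the CLI raises it is not known from this report. The sketch below only illustrates the general pattern behind such a message, where a raw response body is treated as an already parsed dictionary, and one defensive way to handle it. The payload and the `read_error_code` helper are hypothetical.

```python
import json


def read_error_code(body):
    """Return the error code from a response body given as bytes, str, or dict."""
    if isinstance(body, (bytes, bytearray)):
        body = body.decode("utf-8")
    if isinstance(body, str):
        body = json.loads(body)
    return body.get("error", {}).get("code")


# Hypothetical raw response body.
raw = b'{"error": {"code": "InvalidTemplate"}}'

# Calling .get() directly on the raw bytes reproduces the message from the log.
try:
    raw.get("error")
except AttributeError as exc:
    reproduced = str(exc)

code = read_error_code(raw)
print(reproduced)
print(code)
```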

Expected/desired behavior

OS and Version?

macOS Monterey

Versions

Mention any other details that might be useful


Thanks! We'll be in touch soon.

ERROR while deploying ELMS


This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

I am trying to execute the deploy script as shown in the readme.

Any log messages given by the failure

PS C:\Users\sbhasale\iotedge-logging-and-monitoring-solution> .\Scripts\deploy.ps1

################################################
################################################

IoT Edge Logging & Monitoring Solution

################################################
################################################

Welcome to IoT ELMS (Edge Logging & Monitoring Solution). This deployment script will help you deploy IoT ELMS in your Azure subscription. It can be deployed as a sandbox environment, with a new IoT hub and a test IoT Edge device generating sample logs and collecting monitoring metrics, or it can connect to your existing IoT Hub and Log analytics workspace.

Press Enter to continue.

Verifying your Azure CLI installation version...

Great! You are using a supported Azure CLI version.

Retrieving your current Azure subscription...

You are currently using the Azure subscription 'AED E2E Experiences'. Do you want to keep using it?
1: Yes
2: No. I want to use a different subscription

1: 1

Choose a deployment option from the list (using its Index):
1: Create a sandbox environment for testing (fastest)
2: Custom deployment (most flexible)
3: Deploy Monitoring alerts (requires an existing ELMS deployment with metrics collection enabled)

: 1

Provide a name for the resource group to host all the new resources that will be deployed as part of your solution.

: elms-swapnil

Resource group 'elms-swapnil' does not exist. It will be created later in the deployment.

Choose a location for your deployment from this list (using its Index):
1: East US
2: West US
3: West Europe
4: North Europe
5: Southeast Asia
6: East Asia
7: Australia East
8: Australia Southeast
9: Japan West
10: Japan East
11: UK West
12: UK South
13: East US 2
14: Central US
15: West US 2
16: South Central US
17: Australia Central
18: Australia Central 2
19: France Central
20: France South
21: Canada East
22: Canada Central
23: Korea South
24: Korea Central
25: Central India
26: South India
27: Brazil South
28: West Central US
29: North Central US
30: East US 2 EUAP
31: Central US EUAP

: 1

Using location 'eastus'

Created new resource group elms-swapnil in eastus.

Creating resource group deployment.
ERROR: {"error":{"code":"InvalidDeploymentParameterValue","message":"The value of deployment parameter 'edgeVmSize' is null. Please specify the value or use the parameter reference. See https://aka.ms/resource-manager-parameter-files for details."}}
Something went wrong with the resource group deployment. Ending script.
At C:\Users\sbhasale\iotedge-logging-and-monitoring-solution\Scripts\deploy.ps1:1012 char:9

+     throw "Something went wrong with the resource group deploymen ...
+     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo : OperationStopped: (Something went ... Ending script.:String) [], RuntimeException
    + FullyQualifiedErrorId : Something went wrong with the resource group deployment. Ending script.
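
A null parameter such as `edgeVmSize` could be caught before the deployment is submitted. The following is a minimal sketch of such a pre-flight check, assuming the parameters are available in the usual `{"name": {"value": ...}}` shape. It is not taken from deploy.ps1, and every parameter other than `edgeVmSize` is a hypothetical placeholder.

```python
def find_null_parameters(parameters):
    """Return the names of deployment parameters whose value is missing."""
    missing = []
    for name, entry in parameters.items():
        # Accept both {"value": ...} entries and bare values.
        value = entry.get("value") if isinstance(entry, dict) else entry
        if value is None or value == "":
            missing.append(name)
    return sorted(missing)


# Hypothetical parameter set; only edgeVmSize appears in the error above.
params = {
    "location": {"value": "eastus"},
    "edgeVmSize": {"value": None},
    "edgeVmName": {"value": "edge-vm"},
}

missing = find_null_parameters(params)
print(missing)
```

Failing early with the list of missing names would give a clearer message than the generic "Something went wrong with the resource group deployment."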

Expected/desired behavior

The deployment script should run successfully

OS and Version?

Windows 10

Versions

Mention any other details that might be useful



Support uploading logs from edge devices behind gateway


This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Any log messages given by the failure

Expected/desired behavior

It seems that the current version of ELMS only supports retrieving logs from top-layer edge devices. The InvokeUploadModuleLogs function invokes UploadModuleLogs with a SasUrl pointing to the Azure Blob Storage public endpoint. According to this link, to retrieve logs from devices behind a gateway, it should use something like "https://$upstream:8000/myBlobStorageName/myContainerName?SAS_key" instead.

It would be great if InvokeUploadModuleLogs could identify edge devices behind a gateway (e.g., those having a parent device) and generate an appropriate sasUrl for them.
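
The URL rewrite described above could look roughly like the sketch below. It assumes the public SAS URL has the form `https://<account>.blob.core.windows.net/<container>?<SAS>` and that the gateway form is exactly the one quoted in this issue. The helper name, the sample account, container, and SAS query are hypothetical, and detecting whether a device has a parent is left out.

```python
from urllib.parse import urlsplit, urlunsplit


def to_gateway_sas_url(public_sas_url, upstream_host="$upstream", port=8000):
    """Rewrite a public blob SAS URL into the form used behind a gateway."""
    parts = urlsplit(public_sas_url)
    # The storage account name is the first label of the public host name.
    account = parts.hostname.split(".")[0]
    netloc = f"{upstream_host}:{port}"
    # The account name moves from the host name into the path.
    path = f"/{account}{parts.path}"
    return urlunsplit((parts.scheme, netloc, path, parts.query, ""))


# Hypothetical public SAS URL.
public_url = "https://mystorageaccount.blob.core.windows.net/mycontainer?sv=2020-08-04&sig=abc"
gateway_url = to_gateway_sas_url(public_url)
print(gateway_url)
```

The SAS query string is passed through unchanged; only the host and the path prefix differ between the two forms.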

OS and Version?


Versions

Mention any other details that might be useful



Add Resilience to Log Uploads


This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Break edgeAgent connectivity to the blob storage endpoint used for storing log files.
Trigger the Schedule Module Logs function, which triggers the Invoke Module Logs Uploads function.
Restore edgeAgent connectivity to the blob storage endpoint.
Let Schedule Module Logs trigger again to upload logs.
Observe a gap in the uploaded logs, because the first run failed due to the induced connectivity issue (which can happen in real life).

Any log messages given by the failure

Errors are not surfaced to the solution.

Expected/desired behavior

The Schedule Module Logs function should take upload errors into account (by checking the status via the GetTaskStatus direct method) before issuing subsequent log uploads, so that no gaps occur in the uploaded logs and any upload errors are surfaced to the user/app.
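
The core of this proposal can be sketched as a small decision function: if the previous upload did not complete, the next request starts from the beginning of the failed window instead of from "now minus interval". This is only an illustration of the idea, not a proposed implementation. The status string `"Completed"` is an assumption about what GetTaskStatus reports, and how the pending window is persisted between runs is left open.

```python
from datetime import datetime, timedelta, timezone


def next_upload_since(now, interval, pending_since, last_status):
    """Pick the start of the next upload window.

    If the previous upload did not complete, reuse the start of its window
    so that the logs it covered are requested again.
    """
    if pending_since is not None and last_status != "Completed":
        return pending_since
    return now - interval


now = datetime(2021, 1, 1, 12, 30, tzinfo=timezone.utc)
interval = timedelta(minutes=15)
previous_window_start = datetime(2021, 1, 1, 12, 0, tzinfo=timezone.utc)

# Previous upload failed: go back to the start of the failed window.
after_failure = next_upload_since(now, interval, previous_window_start, "Failed")
# Previous upload completed: use the regular window.
after_success = next_upload_since(now, interval, previous_window_start, "Completed")
print(after_failure, after_success)
```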

Mention any other details that might be useful

If we agree this is a useful feature, I'll look into implementing it.

re-deployment failed


This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

I deployed the solution as per the instructions for my nested IoT Edge environment, but I was not able to get the workbook view working. I deleted the resources and tried to redeploy the solution (using the custom option), but now I'm unable to complete the deployment: it fails with a role assignment update not permitted error.

Any log messages given by the failure

Creating resource group deployment.
ERROR: {"status":"Failed","error":{"code":"DeploymentFailed","message":"At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/DeployOperations for usage details.","details":[{"code":"BadRequest","message":"{\r\n "error": {\r\n "code": "RoleAssignmentUpdateNotPermitted",\r\n "message": "Tenant ID, application ID, principal ID, and scope are not allowed to be updated."\r\n }\r\n}"},{"code":"BadRequest","message":"{\r\n "error": {\r\n "code": "RoleAssignmentUpdateNotPermitted",\r\n "message": "Tenant ID, application ID, principal ID, and scope are not allowed to be updated."\r\n }\r\n}"},{"code":"BadRequest","message":"{\r\n "error": {\r\n "code": "RoleAssignmentUpdateNotPermitted",\r\n "message": "Tenant ID, application ID, principal ID, and scope are not allowed to be updated."\r\n }\r\n}"},{"code":"BadRequest","message":"{\r\n "error": {\r\n "code": "RoleAssignmentUpdateNotPermitted",\r\n "message": "Tenant ID, application ID, principal ID, and scope are not allowed to be updated."\r\n }\r\n}"}]}}
Something went wrong with the resource group deployment. Ending script.
At C:\iotPlatform\IoT-Platform\logging-monitoring\iotedge-logging-and-monitoring-solution\Scripts\deploy.ps1:1012 char:9

+     throw "Something went wrong with the resource group deploymen ...
+     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo : OperationStopped: (Something went ... Ending script.:String) [], RuntimeException
    + FullyQualifiedErrorId : Something went wrong with the resource group deployment. Ending script.

Expected/desired behavior

OS and Version?

PowerShell

Versions

Mention any other details that might be useful



Uploading Messages via IoTMessage fails sometimes

This issue is for a: (mark with an x)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

When deploying the IoTMetricsCollector via IoT Hub, you can set the environment variable UploadTarget = IoTMessage.

It worked for 12 hours after deployment, then stopped. It worked again for 10 minutes during the night and then stopped again. I changed the deployment to UploadTarget = AzureMetrics (which works) and then reverted to UploadTarget = IoTMessage. I did this a few times. Reverting to UploadTarget = IoTMessage works only sometimes, not always.

Any log messages given by the failure

N/A

OS and Version?

Linux Ubuntu Server 18

Versions

edgeHub and edgeAgent 1.2.3
IoTEdgeMetricsCollector 1.0

I have no specific logs that might help. It seems that the device, Log Analytics, and the function app work, but it is then not possible to visualize the data in the IoT Hub workbook.

Log message from modules getting split into multiple line items, when submitted to log analytics


This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Deploy the solution using the PowerShell script.
After the deployment succeeds, analyze the logs in Log Analytics.
Every line of a single message arrives as a separate log entry in Log Analytics. For example, if the log message is:
{
"id": "someId",
"uri": "rtsp://rtspsim:554/media/sample.mkv",
....
}

The above log message is split into 4 entries in the Log Analytics iotedgemodulelogs_CL table. The first entry contains the opening curly brace, the second contains the id, and so on.

Any log messages given by the failure

No

Expected/desired behavior

Expected behavior: a log message whose "Message" spans multiple lines should produce a single log entry in Log Analytics.
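
One possible way to get this behavior is to treat a line as a continuation of the previous entry unless it looks like the start of a new one. The sketch below assumes that every new entry begins with a timestamp (optionally preceded by a syslog-style `<n>` level marker). That is an assumption about the log format, not something confirmed for every module, and the sample lines are hypothetical.

```python
import re

# Assumed marker of a new entry: optional "<n>" level prefix, then a timestamp.
ENTRY_START = re.compile(r"^(<\d>\s*)?\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}")


def group_log_lines(lines):
    """Merge continuation lines into the entry that precedes them."""
    entries = []
    for line in lines:
        if ENTRY_START.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += "\n" + line
    return entries


# Hypothetical module output containing the multi-line message from this issue.
lines = [
    "2021-06-01 10:00:00 [INF] Payload:",
    "{",
    '"id": "someId",',
    '"uri": "rtsp://rtspsim:554/media/sample.mkv",',
    "}",
    "2021-06-01 10:00:01 [INF] Done",
]

entries = group_log_lines(lines)
print(len(entries))
```

Modules that do not prefix entries with a timestamp would need a different start-of-entry rule, so the pattern would have to be configurable.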

OS and Version?

Linux (Ubuntu 18.04). The modules are deployed on a simulated Stack Edge device.

Versions

Mention any other details that might be useful



Allow configuration of the log pulling interval

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Log pulling is only partially configurable. The LogsSince parameter can be configured in the Function App settings, but the ScheduleUploadModuleLogs function has a timer trigger that fires every 15 minutes, and this value is hardcoded. So if someone changes LogsSince without updating the ScheduleUploadModuleLogs timer trigger, the result is gaps or duplication in the logs. Having this value hardcoded is risky.

Expected/desired behavior

Allow configuring the ScheduleUploadModuleLogs timer trigger (for example during deployment, or through a Function App setting, if possible).

Mention any other details that might be useful

It would be good to make sure that LogsSince and the ScheduleUploadModuleLogs timer trigger use the same value (if I understand correctly, there would then be no log gaps or duplication).
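
One way to keep the two values in sync is to derive both from a single interval. The sketch below produces a six-field NCRONTAB expression of the kind used by Azure Functions timer triggers, plus a matching duration string. The setting names `timer_schedule` and `logs_since` are hypothetical, and the sketch assumes LogsSince accepts a duration such as `15m`.

```python
def schedule_settings(interval_minutes):
    """Derive the timer schedule and the log window from one interval."""
    if not 1 <= interval_minutes <= 60 or 60 % interval_minutes != 0:
        raise ValueError("interval must be between 1 and 60 and divide 60 evenly")
    if interval_minutes == 60:
        cron = "0 0 * * * *"  # at the top of every hour
    else:
        cron = f"0 */{interval_minutes} * * * *"  # every N minutes
    return {"timer_schedule": cron, "logs_since": f"{interval_minutes}m"}


settings = schedule_settings(15)
print(settings)
```

Restricting the interval to divisors of 60 keeps the firing times evenly spaced, which a `*/N` minute field would not guarantee otherwise.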

Provide alerts for monitoring the ELMS solution


This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

  • If there is any problem with ELMS, for example an edge device gets disconnected, metrics and logs stop flowing, and there is no way to be notified about it automatically.

Expected/desired behavior

  • Alerts should be created to detect that the ELMS is not reporting logs or metrics.

Mention any other details that might be useful

  • The alerts could be based on the last time InsightsMetrics or logs were reported. If the time elapsed since then is significantly greater than the pulling interval, the alert should be triggered.
  • It would also be good to have a dashboard to help understand potential problems with the ELMS.
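The staleness rule described above can be sketched as a simple comparison. This is a minimal illustration under stated assumptions, not part of the solution: `is_stale` and the `tolerance` factor are hypothetical names, and how the last-reported timestamp is obtained (for example from a Log Analytics query) is left out.

```python
from datetime import datetime, timedelta, timezone


def is_stale(last_reported, pull_interval, now=None, tolerance=2.0):
    """Return True when the last report is older than tolerance * pull_interval.

    last_reported and now are timezone-aware datetimes; pull_interval is a timedelta.
    tolerance is an assumed safety factor so that one slightly late upload
    does not trigger an alert.
    """
    now = now or datetime.now(timezone.utc)
    return (now - last_reported) > pull_interval * tolerance


now = datetime(2022, 1, 1, 12, 0, tzinfo=timezone.utc)
print(is_stale(now - timedelta(minutes=45), timedelta(minutes=15), now=now))
```

With a 15-minute pulling interval and the default factor of 2, a device whose last report is 45 minutes old would be flagged, while one that reported 20 minutes ago would not.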

Script fails on mixing cli with powershell commands

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Attempt to run the script. It fails under PowerShell 7 but runs under PowerShell 5.1.22000.65.

Any log messages given by the failure

ERROR: AADSTS700082: The refresh token has expired due to inactivity. The token was issued on 2021-03-02T18:40:35.2776433Z and was inactive for 90.00:00:00.
Trace ID: 8bc2c1db-3176-485e-961b-de2800cb3d01
Correlation ID: cedce3c5-a206-4ea4-882c-13e817f24086
Timestamp: 2021-10-11 23:47:30Z
To re-authenticate, please run az login. If the problem persists, please contact your tenant administrator.

Expected/desired behavior

I expected to be able to select from the listed resources, but every time the script calls the az CLI it fails with errors:
Choose a location for your deployment from this list (using its Index):

:

Choose from the list using an index between 1 and 0.

: 1

Choose from the list using an index between 1 and 0.

:

If I modify the script as follows to use PowerShell cmdlets, it works properly, but the CLI piping is used all over this script:

    $script:resource_group_name = Read-Host -Prompt ">"
    $resourceGroup=Get-AzResourceGroup -Name $script:resource_group_name -ErrorAction SilentlyContinue
    # Fails to follow authz context
    # $resourceGroup = az group list | ConvertFrom-Json | Where-Object { $_.name -eq $script:resource_group_name }
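    # Create the group when no existing group matches the requested name.
    # (-ne avoids the precedence trap of "!" binding tighter than -eq.)
    if ($resourceGroup.ResourceGroupName -ne $script:resource_group_name) {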
        $script:create_resource_group = $true
    }
    else {
        $script:create_resource_group = $false
    }
}

}

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)
Windows 11 - running under Terminal with 2 versions of powershell

Versions

Mention any other details that might be useful

I tried to use "az account set --subscription mysubscriptionname" to see if I could focus it on my target subscription while debugging in VS Code, but that didn't work.


Thanks! We'll be in touch soon.

Deployment script fails in sandbox option after resources are created

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Run .\Scripts\deploy.ps1
Select sandbox deployment option

Any log messages given by the failure

##############################################
##############################################

Deployment Succeeded

##############################################
##############################################

Set-Variable: /home/marvin/iotedge-logging-and-monitoring-solution/Scripts/deploy.ps1:1308
Line |
1308 | Set-Variable -Name $deployment_parameters -Value $output_parameters - …
| ~~~~~~~~~~~~~~~~~~~~~~
| Cannot bind argument to parameter 'Name' because it is null.

Expected/desired behavior

No error message should appear

OS and Version?

Azure Cloud Shell

Error during Azure Function deployment

Please provide us with the following information:

This issue is for a: (mark with an x)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

I created a nested IoT Edge hierarchy following
https://github.com/Azure-Samples/iot-edge-for-iiot/blob/master/1-SimulatePurdueNetwork.md
and
https://github.com/Azure-Samples/iot-edge-for-iiot/blob/master/2-DeployOfflineDashboard.md
then run the script
.\Scripts\deploy.ps1
choosing the CloudWorkflow option
and
Custom deployment as a deployment option
An existing resource group (the one used for the nested edges)
An existing IoT Hub (the one used for the nested edges)
Enable IoT Edge monitoring
Upload metrics as IoT messages

I had this error during the Azure Function deployment

Deploying code to Function App iotedgelogsapp-91b304a7
WARNING: Getting scm site credentials for zip deployment
WARNING: Starting zip deployment. This operation can take a while to complete ...
WARNING: Deployment endpoint responded with status code 202
WARNING: Configuring default logging for the app, if not already enabled
ERROR: Zip deployment failed. {'id': '9f97537381e54130bbb9fd4f57def1cc', 'status': 3, 'status_text': '', 'author_email': 'N/A', 'author': 'N/A', 'deployer': 'ZipDeploy', 'message': 'Created via a push deployment', 'progress': '', 'received_time': '2022-06-01T14:21:27.3115273Z', 'start_time': '2022-06-01T14:21:27.3115273Z', 'end_time': '2022-06-01T14:21:29.0770729Z', 'last_success_end_time': None, 'complete': True, 'active': False, 'is_temp': False, 'is_readonly': True, 'url': 'https://iotedgelogsapp-91b304a7.scm.azurewebsites.net/api/deployments/latest', 'log_url': 'https://iotedgelogsapp-91b304a7.scm.azurewebsites.net/api/deployments/latest/log', 'site_name': 'iotedgelogsapp-91b304a7', 'provisioningState': 'Failed'}. Please run the command az webapp log deployment show -n iotedgelogsapp-91b304a7 -g testNestedEdge

Any log messages given by the failure

From 'https://iotedgelogsapp-91b304a7.scm.azurewebsites.net/api/deployments/latest/log'
{"Message":"An error has occurred.","ExceptionMessage":"No log found for 'latest'.","ExceptionType":"System.IO.FileNotFoundException","StackTrace":" at Kudu.Core.Deployment.DeploymentManager.GetLogEntries(String id) in C:\Kudu Files\Private\src\master\Kudu.Core\Deployment\DeploymentManager.cs:line 98\r\n at Kudu.Services.Deployment.DeploymentController.GetLogEntry(String id) in C:\Kudu Files\Private\src\master\Kudu.Services\Deployment\DeploymentController.cs:line 376"}

Output of running "az webapp log deployment show -n iotedgelogsapp-91b304a7 -g testNestedEdge":
[
{
"details_url": null,
"id": "ba1db209-176b-48b3-9073-b672026c3ce7",
"log_time": "2022-06-01T14:21:27.7215351Z",
"message": "Updating submodules.",
"type": 0
},
{
"details_url": null,
"id": "ef034d16-b2a8-4338-9e24-0c35dace3878",
"log_time": "2022-06-01T14:21:28.0146338Z",
"message": "Preparing deployment for commit id '9f97537381'.",
"type": 0
},
{
"details_url": null,
"id": "972d4b56-9ea3-487e-b852-9888b6db3973",
"log_time": "2022-06-01T14:21:28.9226745Z",
"message": "Deployment Failed.",
"type": 0
}
]

Expected/desired behavior

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

Mention any other details that might be useful


Thanks! We'll be in touch soon.

Improve log levels filtering

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Log level filtering could be improved. The LogsLogLevel parameter currently retrieves only logs that exactly match the specified log level.

Expected/desired behavior

It would be better to retrieve the specified level and all lower-numbered levels. For example, LogsLogLevel = 4 would return logs with levels 4, 3, 2, 1 and 0. Retrieving logs of only a single level is a rather unusual use case.
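The requested behavior amounts to a threshold comparison instead of an equality check. The sketch below illustrates the difference on in-memory records. It is not how the solution filters logs today: the record shape (a dict with an integer `level` key) and both function names are assumptions made for the example.

```python
def filter_exact(records, level):
    """Current behavior as described: keep only records at exactly this level."""
    return [r for r in records if r["level"] == level]


def filter_up_to(records, max_level):
    """Requested behavior: keep records at max_level and every lower-numbered level."""
    return [r for r in records if r["level"] <= max_level]


# Hypothetical records; the "level" field name is an assumption.
records = [
    {"level": 3, "text": "error"},
    {"level": 4, "text": "warning"},
    {"level": 6, "text": "info"},
]

print([r["text"] for r in filter_exact(records, 4)])
print([r["text"] for r in filter_up_to(records, 4)])
```

With the threshold form, setting the level to 4 would return both the warning and the error while still dropping the informational record.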
