Comments (16)

yantang-msft commented on June 14, 2024

@AlkisFortuneFish did you add ServiceRemotingRequestTrackingTelemetryModule and ServiceRemotingDependencyTrackingTelemetryModule to all services, i.e., the API Service, Service A, and Service B?
Here is an example project that might help with the initialization: https://github.com/yantang-msft/service-fabric-application-insights-example

from applicationinsights-servicefabric.

AlkisFortuneFish commented on June 14, 2024

Yes, as is visible in the initialisation code I pasted above. The example cannot be copied directly because it uses TelemetryConfiguration.Active, which is deprecated, hence the explicit initialisation of the telemetry configuration.

AlkisFortuneFish commented on June 14, 2024

Originally I had the OperationCorrelationTelemetryInitializer there too, and it does appear to have been lost in the "throwing things at the wall to see what sticks" phase of trying to fix this. I've just re-added it to see if it makes any difference to the outcome; from memory, the issue was there with that too.

yantang-msft commented on June 14, 2024

If the 2 modules have been added to all services, then I can't think of why it doesn't work for you.
You mean the Service Remoting call is detected as dependency telemetry, it's just that it can't be correlated correctly, right? This usually means there is only the dependency telemetry but no matching request telemetry. You can use the transaction view to see if that's the case.

AlkisFortuneFish commented on June 14, 2024

Looking at the data here, I'm a bit confused:
Request dd2ed206271c0a49 POST GetPendingCompetitionRewards/Post operationId == operation_ParentId == f5b34abd513e174b8fedeb7fde937574
Dependency 87495a9ca4af5441 GetPendingCompetitionRewards operationId == f5b34abd513e174b8fedeb7fde937574 && operation_ParentId == dd2ed206271c0a49 - as expected
Backend: 17be469bd121cf4a GetPendingCompetitionRewards - operation_ParentId == 87495a9ca4af5441 && operationId == aeb475dae1369a4ba0538b3386dc70b4

Request:
timestamp [UTC] 2021-06-08T10:26:08.4407244Z
id dd2ed206271c0a49
name POST GetPendingCompetitionRewards/Post
url https://devcluster.fortunefish.co.uk:8530/api/GetPendingCompetitionRewards
success True
resultCode 200
duration 6.5316
performanceBucket <250ms
itemType request
customDimensions {"_MS.ProcessedByMetricExtractors":"(Name:'Requests', Ver:'1.1')","AspNetCoreEnvironment":"Production","ServiceFabric.ApplicationTypeName":"FashionARServerType","ServiceFabric.ServiceTypeName":"APIFrontendType","ServiceFabric.ApplicationName":"fabric:/FashionARServer","ServiceFabric.ServiceName":"fabric:/FashionARServer/APIFrontend","ServiceFabric.PartitionId":"ef37365d-3be4-4d47-b0bc-3b6150f49d68","ServiceFabric.NodeName":"_nt1vm_1","ServiceFabric.InstanceId":"132676178159853282"}
operation_Name POST GetPendingCompetitionRewards/Post
operation_Id f5b34abd513e174b8fedeb7fde937574 
operation_ParentId f5b34abd513e174b8fedeb7fde937574
application_Version 9.9.23
client_Type PC
client_IP 0.0.0.0
client_CountryOrRegion United Kingdom
cloud_RoleName fabric:/FashionARServer/APIFrontend
cloud_RoleInstance 132676178159853282
appId d08126b9-7b82-4a9a-a4c0-1322a5105100
appName DevClusterAppInsights
iKey ef4d71c4-301a-440c-ad14-a08b7a2a2965
sdkVersion aspnet5c:2.17.0+c9d95e701e2474b7eb3b46ae7953b6c7570356ab
itemId fc00b189-c843-11eb-8848-cdd4efeaa105
itemCount 4

Dependency:
timestamp [UTC] 2021-06-08T10:26:08.4433374Z
id 87495a9ca4af5441
target fabric:/FashionARServer/UserActorService
type ServiceFabricServiceRemoting
name GetPendingCompetitionRewards
data fabric:/FashionARServer/UserActorService/GetPendingCompetitionRewards
success True
duration 2
performanceBucket <250ms
itemType dependency
operation_Name GetPendingCompetitionRewards
operation_Id f5b34abd513e174b8fedeb7fde937574
operation_ParentId dd2ed206271c0a49
client_Type PC
client_IP 0.0.0.0
client_City Cardiff
client_StateOrProvince Cardiff
client_CountryOrRegion United Kingdom
cloud_RoleInstance nt1vm000001
appId d08126b9-7b82-4a9a-a4c0-1322a5105100
appName DevClusterAppInsights
iKey ef4d71c4-301a-440c-ad14-a08b7a2a2965
sdkVersion rddsr:2.3.1-140
itemId fc9e0301-c843-11eb-8663-f7253130639b
itemCount 1

Backend Call:
timestamp [UTC] 2021-06-08T10:26:08.4446195Z
id 17be469bd121cf4a
name GetPendingCompetitionRewards
success True
resultCode Not Applicable
duration 0.1078
performanceBucket <250ms
itemType request
customDimensions {"ServiceFabric.ApplicationTypeName":"FashionARServerType","ServiceFabric.ServiceTypeName":"UserActorServiceType","ServiceFabric.ApplicationName":"fabric:/FashionARServer","ServiceFabric.ServiceName":"fabric:/FashionARServer/UserActorService","ServiceFabric.PartitionId":"2796a814-b606-4381-ab1c-405152522cd7","ai_legacyRootId":"87495a9ca4af5441","ServiceFabric.NodeName":"_nt1vm_0","ServiceFabric.ReplicaId":"132645366371879734"}
operation_Name GetPendingCompetitionRewards
operation_Id aeb475dae1369a4ba0538b3386dc70b4
operation_ParentId 87495a9ca4af5441
application_Version 9.9.23
client_Type PC
client_IP 0.0.0.0
client_City Cardiff
client_StateOrProvince Cardiff
client_CountryOrRegion United Kingdom
cloud_RoleName fabric:/FashionARServer/UserActorService
cloud_RoleInstance 132645366371879734
appId d08126b9-7b82-4a9a-a4c0-1322a5105100
appName DevClusterAppInsights
iKey ef4d71c4-301a-440c-ad14-a08b7a2a2965
sdkVersion serviceremoting:2.3.1-140
itemId f6de2618-c843-11eb-b802-492c4f3bff35
itemCount 1

So it would appear that the backend call is correctly annotated with an operation_ParentId that matches the remoting call that triggered it, but it has an entirely separate operation_Id.

What could cause that?
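
The mismatch described above can be checked mechanically. The following is a minimal Python sketch, assuming the field values are copied by hand from the three items quoted in this comment. The helper name check_link is made up for illustration and is not part of any Application Insights tooling.

```python
# Field values copied from the Request, Dependency and Backend Call items above.
request = {
    "id": "dd2ed206271c0a49",
    "operation_Id": "f5b34abd513e174b8fedeb7fde937574",
    "operation_ParentId": "f5b34abd513e174b8fedeb7fde937574",
}
dependency = {
    "id": "87495a9ca4af5441",
    "operation_Id": "f5b34abd513e174b8fedeb7fde937574",
    "operation_ParentId": "dd2ed206271c0a49",
}
backend = {
    "id": "17be469bd121cf4a",
    "operation_Id": "aeb475dae1369a4ba0538b3386dc70b4",
    "operation_ParentId": "87495a9ca4af5441",
}


def check_link(parent, child):
    """Hypothetical helper: return (parent_link_ok, same_operation) for a pair."""
    parent_link_ok = child["operation_ParentId"] == parent["id"]
    same_operation = child["operation_Id"] == parent["operation_Id"]
    return parent_link_ok, same_operation


print("request -> dependency:", check_link(request, dependency))
print("dependency -> backend:", check_link(dependency, backend))
```

The first pair passes both checks; the second keeps the parent link but fails the operation_Id comparison. The backend item's customDimensions also carry ai_legacyRootId 87495a9ca4af5441, which is the dependency's id.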

yantang-msft commented on June 14, 2024

The telemetry operationId definitely needs to be the same for correlation.
Make sure you didn't touch anything regarding System.Diagnostics.Activity (you can, but if it's not done correctly the correlation will be broken).
Also, I noticed the "Backend call" is from an actor service. The initialization of Actor Service has some trickiness, please check out this example project, hopefully that helps: https://github.com/yantang-msft/service-fabric-application-insights-example

AlkisFortuneFish commented on June 14, 2024

The initialisation I pasted above is from one of the ActorServices. All services have the initialisation I pasted above (which I updated just now). I am not sure what trickiness you are referring to; as far as I can see, it is the same as the initialisation of any other non-ASP.NET SF service. We already have explicitly derived ActorService classes for all our actor services for other reasons anyway.

I am not sure why you keep pasting the example. I have seen the example; it is what I based my integration on, which you can clearly see if you read the code I pasted above.

yantang-msft commented on June 14, 2024

The trickiness for Actor is that you need to initialize the code in ActorBackendService, not in Program.cs or ActorBackend.cs. If you are already doing this and didn't touch any System.Diagnostics.Activity, then I have no idea why the operationId changed.

AlkisFortuneFish commented on June 14, 2024

Yeah, we're not touching System.Diagnostics.Activity anywhere as far as I can see. We need this sorted, however, so what are the next steps to debug it? I have been directed here from a support ticket, so I'm not sure where I could take it from here otherwise.

yantang-msft commented on June 14, 2024

You can generate the build yourself and try debugging it.
This is where the "dependency" telemetry is generated, and this is where the request telemetry is captured by "Backend call".
You can check System.Diagnostics.Activity and see whether the operationId is consistent/altered.

AlkisFortuneFish commented on June 14, 2024

Now that I've had a bit of time allocated to look into this and managed to make the remote debugger work for the first time ever (why on earth is it considered acceptable not to be able to debug if you have multiple instances of the same SF application on the cluster and to silently fail if you do try!?), I have some findings.

The activity Id is correctly passed from caller to receiver; the ParentId is deserialised and set.
The Request-Context is set.
I don't see a correlation context being set on the headers, and there is no Baggage on the activity; I don't know if there should be.

I am not hugely familiar with the overall pipeline, so is there any guidance as to what to look at in particular?

AlkisFortuneFish commented on June 14, 2024

Poke. If you do not have the time to look into this, let me know so I can re-engage with the support request and get some Microsoft resources that way. I just need to know.

yantang-msft commented on June 14, 2024

@AlkisFortuneFish so the operation_ParentId is set, but is it set correctly to match the id of the parent? If not, you can re-engage support or file an issue under this repo: https://github.com/microsoft/ApplicationInsights-dotnet
The OperationContext is set when StartOperation gets called here, and StartOperation is implemented in this repo: https://github.com/microsoft/ApplicationInsights-dotnet

AlkisFortuneFish commented on June 14, 2024

Ok, I think I've got to the bottom of this. There are a few different issues here, which is what made this difficult to track down.

  1. My initialization on the ASP.NET Core side is not correct. Creating a TelemetryConfiguration, feeding it into the tracking module initialization and then calling AddApplicationInsightsTelemetry() causes a disconnect between request tracking and dependency tracking. I had to change the initialisation on that end to look more like the original example (albeit with the updated APIs).
  2. The Ids generated are not in the W3C format. The change of the default Id format to W3C makes the current (2.17.0) version of StartOperation() create a new Id and assign the old one to a legacy property. This does not work. I had to force System.Diagnostics.Activity.DefaultIdFormat back to Hierarchical for the Ids not to change.
  3. TelemetryConfiguration.CreateDefault() now already adds the OperationCorrelationTelemetryInitializer, so creating a new one (as per the example) actually breaks correlation altogether.

So the fundamental issue is that this package, its documentation, and its examples are due an update to actually work with the latest ApplicationInsights package and .NET Core. I will update our integration to force the old format, but I do not know if there are any plans to drop support for Hierarchical at some point, breaking the integration again in future.
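
For point 2, it can help to check which format an id seen in the debugger or on the wire is actually in. The following is a rough Python sketch, assuming the W3C form follows the traceparent layout (2, 32, 16 and 2 lowercase hex digits separated by dashes) and the Hierarchical form starts with a "|" character. The name classify_id is invented for illustration, and the first two sample values are constructed from ids quoted earlier in the thread rather than captured from the cluster.

```python
import re

# W3C traceparent layout: version-traceid-spanid-flags, all lowercase hex.
W3C_ID = re.compile(r"[0-9a-f]{2}-[0-9a-f]{32}-[0-9a-f]{16}-[0-9a-f]{2}")


def classify_id(value):
    """Hypothetical helper: label an id string 'w3c', 'hierarchical' or 'other'."""
    if W3C_ID.fullmatch(value):
        return "w3c"
    if value.startswith("|"):
        return "hierarchical"
    return "other"


# Illustrative samples only; the first two are constructed, not captured.
samples = [
    "00-f5b34abd513e174b8fedeb7fde937574-dd2ed206271c0a49-00",
    "|f5b34abd513e174b8fedeb7fde937574.dd2ed206271c0a49.",
    "87495a9ca4af5441",
]
for sample in samples:
    print(sample, "->", classify_id(sample))
```

A bare 16-digit id such as the ones shown in the portal falls into "other" here, since on its own it is neither a full traceparent nor a Hierarchical id.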

yantang-msft commented on June 14, 2024

I'm glad that you found where it went wrong and got it fixed.
BTW, for #2, I would imagine the id format is gracefully handled, i.e., the operationId/operation_ParentId are all in the new or the old format and can be correlated correctly.
For #3, I think the TelemetryInitializer precedence strategy is first one wins, so #3 might not be a real problem.

Anyway, this project is in maintenance mode. If the old format is a real problem and will be dropped someday, then this library won't work.

AlkisFortuneFish commented on June 14, 2024

I'd have hoped that it's gracefully handled, as the documentation would suggest, but that does not appear to be entirely the case. If the Id format is W3C and the incoming Id is not, StartOperation in 2.17.0 generates a new Id instead of propagating what was passed in.
Number 3 doesn't sound like it should be problematic, but I did test it and it is. I explicitly tested with and without the extra initialiser, and with it, correlation was broken. I did not have the time to check exactly what happened to the telemetry as it was nearly midnight at that point.
