ghorsey / opengraph-net
.Net Open Graph Parser written in C#
Home Page: https://ghorsey.github.io/OpenGraph-Net/
License: MIT License
OpenGraph-Net/src/OpenGraphNet/OpenGraph.cs
Line 237 in 6518559
Every time the method is called, a new HttpDownloader is instantiated and GetPageAsync() is called.
This creates a new HttpClient per call, which, according to the Microsoft docs, is not best practice.
I think a new signature for ParseUrlAsync() is needed, one that takes an HttpClient or IHttpClientFactory as a parameter.
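A minimal sketch of the idea, assuming the caller owns the HttpClient and passes it in. `PageFetcher` and its `GetPageAsync` overload are hypothetical names used for illustration; they are not part of OpenGraph-Net, and this only covers downloading the page, not the parsing.

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not the library's API: the HttpClient is supplied by the
// caller, so a single long-lived instance (or one created by IHttpClientFactory)
// can be reused across calls instead of creating a new client per request.
public static class PageFetcher
{
    public static async Task<string> GetPageAsync(
        HttpClient client, Uri url, CancellationToken cancellationToken = default)
    {
        if (client == null) throw new ArgumentNullException(nameof(client));
        if (url == null) throw new ArgumentNullException(nameof(url));

        using (var response = await client.GetAsync(url, cancellationToken).ConfigureAwait(false))
        {
            // Throws HttpRequestException for non-success status codes.
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync().ConfigureAwait(false);
        }
    }
}
```

With a signature like this, an application could register one client at startup and hand it to every call, which is the usage pattern the Microsoft guidance recommends.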
Edge case issue:
Maybe this already exists but I didn't see it in the code.
Are you doing any security checking of the fields being parsed? Should developers be doing their own security checks of parsed data?
For example, if a "title", "url", or any other field is pulled from a third-party website with an XSS attack payload in it, and that field is then injected into the HTML, is this going to cause an XSS exploit?
Thanks for the library. It's very useful.
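For anyone reading along, here is a sketch of what consumer-side checks could look like. It assumes the parsed values come back exactly as they appear in the remote page, which this thread does not confirm either way. `OpenGraphOutput` is a hypothetical helper name; `WebUtility.HtmlEncode` and `Uri.TryCreate` are standard .NET calls.

```csharp
using System;
using System.Net;

// Hypothetical consumer-side helpers, not part of OpenGraph-Net.
public static class OpenGraphOutput
{
    // Encode a parsed value before writing it into HTML text or a quoted attribute.
    public static string ForHtml(string value)
    {
        return WebUtility.HtmlEncode(value ?? string.Empty);
    }

    // Accept only absolute http/https URLs before using a value in href or src,
    // which rejects schemes such as "javascript:".
    public static bool IsSafeUrl(string value)
    {
        Uri uri;
        return Uri.TryCreate(value, UriKind.Absolute, out uri)
            && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);
    }
}
```

Encoding at the point of output, rather than at parse time, keeps the stored values usable in non-HTML contexts such as JSON responses.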
Hello!
I'm trying to get title, description and image using OpenGraph-Net, but it fails for a reason I don't understand. Could you help me?
For example I tried to get data for https://google.com. You can see the code below.
Below you can see what the graph object looks like after the invocation.
As you can see, I'm missing all the data I need. And as I understand it, I should get something.
If I try to get data for https://youtube.com instead, I only get the image, whereas title and description remain the same.
For https://yandex.com it is different: I can get all the information I need.
I am using OpenGraph-Net v3.2.4 in an ASP.NET Core WebApi application (.NET Core 2.1). Locally I run the application on Win 10. The problem persists in the staging environment, which is on Windows Server 2019.
Any help is much appreciated! Thanks in advance!
Is your feature request related to a problem? Please describe.
I have a problem where I'm retrieving a list of open-graph responses in an API. If one site is down it breaks my whole API; I'd like to be able to provide a timeout so that if it takes more than a few seconds to provide a response I can move on to the next one in the list.
Describe the solution you'd like
An overload, e.g. OpenGraph.ParseUrlAsync(url, timeout), might be nice.
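Until such an overload exists, a caller-side workaround could look roughly like the sketch below. `TaskTimeout.OrDefaultAfter` is a hypothetical helper, not something the library provides. Note the limitation: it stops waiting after the timeout but does not cancel the underlying request, which keeps running in the background.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical caller-side helper, not part of OpenGraph-Net.
public static class TaskTimeout
{
    // Returns the task's result, or `fallback` if the task has not finished
    // within `timeout`. The original task is NOT cancelled.
    public static async Task<T> OrDefaultAfter<T>(Task<T> task, TimeSpan timeout, T fallback)
    {
        Task finished = await Task.WhenAny(task, Task.Delay(timeout)).ConfigureAwait(false);
        if (finished != task)
        {
            return fallback;
        }

        // The task completed in time; awaiting it rethrows any exception it holds.
        return await task.ConfigureAwait(false);
    }
}
```

In the list scenario described above, each call could be wrapped as `TaskTimeout.OrDefaultAfter(OpenGraph.ParseUrlAsync(url), TimeSpan.FromSeconds(3), null)` and null results skipped, assuming a null fallback is acceptable to the calling code.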
Please add 2.0 support, I can't use it in older projects. 2.1 is now redundant.
When I try to get the metadata from a YouTube video, it is empty. It worked before, but suddenly stopped working. Other URLs work as promised.
If I render the HTML from "OriginalHtml" it returns this:
As I said, YouTube used to work, but now it doesn't, neither on localhost nor on the live server. Have both my IPs been blocked? Have you ever experienced that, and do you have any suggestions on how to fix it?
Thanks!
Just had an issue with character encoding for og:title and og:description with this URL: http://www.telerama.fr/cinema/realite-virtuelle-360-de-bonheur-a-ameliorer,144339.php?utm_medium=Social&utm_source=Twitter&utm_campaign=Echobox&utm_term=Autofeed#link_time=1466595239
If it helps, I have fixed it on my fork here:
sociabble@321ce50
Is there a roadmap date for adding .NET Core support in the near future?
Describe the bug
Not a bug, technically speaking, but I'm using OpenGraph 3.2.3 in a .NET Core 2.2 project. It works well on the dev PCs (Win2012, VisualStudio 20147). However, when published on our web server, it fails to find the DLL.
2020-06-17 17:08:43.904 +02:00 [ERR] b8d479ce-f635-4fd9-a4e3-87892fdaf998 Something went wrong: System.IO.FileNotFoundException: Could not load file or assembly 'OpenGraphNet, Version=3.0.0.0, Culture=neutral, PublicKeyToken=null'. The system cannot find the file specified.
File name: 'OpenGraphNet, Version=3.0.0.0, Culture=neutral, PublicKeyToken=null'
at FlairAPI.Controllers.v1.BlackListController.GetDataFromUrl(String base64Url)
at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
at FlairAPI.Controllers.v1.BlackListController.GetDataFromUrl(String base64Url)
at lambda_method(Closure , Object , Object[] )
at Microsoft.Extensions.Internal.ObjectMethodExecutor.Execute(Object target, Object[] parameters)
at Microsoft.AspNetCore.Mvc.Internal.ActionMethodExecutor.TaskOfIActionResultExecutor.Execute(IActionResultTypeMapper mapper, ObjectMethodExecutor executor, Object controller, Object[] arguments)
at System.Threading.Tasks.ValueTask`1.get_Result()
at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.InvokeActionMethodAsync()
at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.InvokeNextActionFilterAsync()
at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.Rethrow(ActionExecutedContext context)
at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.Next(State& next, Scope& scope, Object& state, Boolean& isCompleted)
at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.InvokeInnerFilterAsync()
at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.InvokeNextResourceFilter()
at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.Rethrow(ResourceExecutedContext context)
at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.Next(State& next, Scope& scope, Object& state, Boolean& isCompleted)
at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.InvokeFilterPipelineAsync()
at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.InvokeAsync()
at Microsoft.AspNetCore.Builder.RouterMiddleware.Invoke(HttpContext httpContext)
at Microsoft.AspNetCore.StaticFiles.StaticFileMiddleware.Invoke(HttpContext context)
at Swashbuckle.AspNetCore.SwaggerUI.SwaggerUIMiddleware.Invoke(HttpContext httpContext)
at Swashbuckle.AspNetCore.Swagger.SwaggerMiddleware.Invoke(HttpContext httpContext, ISwaggerProvider swaggerProvider)
at Microsoft.AspNetCore.Cors.Infrastructure.CorsMiddleware.InvokeCore(HttpContext context)
at Microsoft.AspNetCore.Authentication.AuthenticationMiddleware.Invoke(HttpContext context)
at Microsoft.AspNetCore.StaticFiles.StaticFileMiddleware.Invoke(HttpContext context)
at FlairAPI.Middlewares.GlobalExceptionMiddleware.InvokeAsync(HttpContext httpContext)
I checked the deployment folder and the DLL is there, as well as HtmlAgilityPack.dll.
Any idea? If this is the wrong place for this kind of question, feel free to close this.
Hello,
We are using your package (v3.2.6) to successfully get Open Graph data from user-entered URLs.
With the following URL, it doesn't work.
With Curl, I make it work by adding a user agent from Chrome.
curl --user-agent 'Mozilla/5.0 (Linux; Android 10) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.88 Mobile Safari/537.36' 'https://www.nasdaq.com/boardvantage/board-portal?utm_medium=ppc@utm_source=google&utm_term=boardroom%20software&gclid=Cj0KCQjwh_eFBhDZARIsALHjlKf3nzPwT2d6QCIYjHbN5UVrpwJvD0gixyDMuUI66RdSpWjfCzEqoT8aAkXfEALw_wcB' | head -20
Even if I add a user agent in the call to
graph = await OpenGraph.ParseUrlAsync(url,userAgent);
it still doesn't work.
The user agent is the one from the calling browser, or "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36" by default.
Can you help me?
Regards,
Michael
Describe the bug
var graph = await OpenGraph.ParseUrlAsync(myUrl);
var description = graph.Description;
The graph object does not have a property for Description.
...
OpenGraph-Net v3.2.4 from Nuget
Is your feature request related to a problem? Please describe.
When a request returns an HTTP redirect, such as a 301 status code, ParseUrlAsync throws an exception.
Describe the solution you'd like
Set AllowAutoRedirect to true on HttpWebRequest.
Describe the bug
OpenGraph fetching throws when the target URL returns a 301 status code.
To Reproduce
Steps to reproduce the behavior:
await OpenGraph.ParseUrlAsync("https://news.google.com/__i/rss/rd/articles/CBMigwFodHRwczovL3d3dy5zdWRpbmZvLmJlL2lkMzEzNzM4L2FydGljbGUvMjAyMS0wMS0yMi9sZS1jb21pdGUtZGUtY29uY2VydGF0aW9uLWRlYnV0ZS1kZS1ub3V2ZWxsZXMtbWVzdXJlcy1wbHVzLXJlc3RyaWN0aXZlcy1xdWlkLWRlc9IBAA?oc=5")
If this URL does not work in the future, this kind of URL comes from: https://news.google.com/news/rss/?ned=fr_be&gl=BE&hl=fr
Expected behavior
I would have expected it to fetch the OpenGraph data located at:
Desktop (please complete the following information):
<PackageReference Include="OpenGraph-Net" Version="3.2.4" />
Describe the bug
The call to GetPageAsync in HttpDownLoader throws an exception when I try to retrieve https://marketplace.visualstudio.com/items?itemName=sdras.vue-vscode-extensionpack. It throws a NullReferenceException on line 146 when executing if (response.ContentEncoding.ToLower().Contains("gzip")).
To Reproduce
HttpDownloader downloader = new HttpDownloader("https://marketplace.visualstudio.com/items?itemName=sdras.vue-vscode-extensionpack", "test", "test");
string html = await downloader.GetPageAsync();
Expected behavior
Not to throw.
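A null-safe version of that check could be sketched as follows. `EncodingCheck.IsGzip` is a hypothetical helper name for illustration; the only assumption is that the Content-Encoding value is null when the server does not send the header.

```csharp
using System;

// Hypothetical helper, not part of OpenGraph-Net.
public static class EncodingCheck
{
    // True only when a Content-Encoding value is present and mentions gzip.
    // A missing header (null) is treated as "not gzip" instead of throwing.
    public static bool IsGzip(string contentEncoding)
    {
        return contentEncoding != null
            && contentEncoding.IndexOf("gzip", StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

The case-insensitive `IndexOf` also removes the need for the `ToLower()` call.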
Desktop (please complete the following information):
Would you mind providing a property that exposes the HTML response string or the HtmlDocument object? There is other metadata I'd like to pull from the response, and it would be nice not to have to make another request to get it. It would also be convenient to be able to inspect the original string.
Would you be open to adding a scheme check at the beginning of your ParseUrl methods that accept strings? This is more of a convenience/usability enhancement so we don't have to keep checking for schemes in our code:
if (!Regex.IsMatch(url, @"^https?:\/\/", RegexOptions.IgnoreCase))
url = "http://" + url;
Hello!
I've encountered a problem with the ParseUrl method. When I provide a URL that points to a non-existent server, for which a web browser would show "This site can't be reached", the ParseUrl method throws a NullReferenceException with the message "Object reference not set to an instance of an object".
Expected behavior
In my opinion, the expected behavior would be to throw an HttpRequestException with a message such as "Response status code does not indicate success: 400 (Bad Request).", because a NullReferenceException gives no information about what went wrong while parsing the URL.
Desktop (please complete the following information)
OS: macOS
.Net 5.0
Version : <PackageReference Include="OpenGraph-Net" Version="3.2.6" />
I ran into an issue with some websites having an "HTML-encoded" og:image.
For instance:
https://www.periscope.tv/w/1DXxyZZZVykKM
=> image when parsing is not visible...
=> just did this trick to avoid that issue (OpenGraph.cs) :
var theVal = (property ?? "").Equals("image", StringComparison.InvariantCultureIgnoreCase)
? System.Web.HttpUtility.HtmlDecode((value ?? ""))
: value;
result.openGraphData.Add(property, theVal);
Let me know if this is a good thing and if you think this could help for other cases (other websites doing that).
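To illustrate what the decoding step does to such a value, here is a small standalone sketch. It uses `System.Net.WebUtility.HtmlDecode` rather than `System.Web.HttpUtility`, since the former is available without a System.Web reference; `ImageUrl.Decode` and the example URL are made up for illustration.

```csharp
using System.Net;

// Hypothetical helper, not part of OpenGraph-Net.
public static class ImageUrl
{
    // Turns HTML entities such as "&amp;" in an attribute value back into the
    // literal characters, so the URL can be requested as-is.
    public static string Decode(string rawValue)
    {
        return WebUtility.HtmlDecode(rawValue ?? string.Empty);
    }
}
```

For example, `ImageUrl.Decode("https://example.com/a.jpg?w=1&amp;h=2")` yields `https://example.com/a.jpg?w=1&h=2`. The same decoding would likely apply to any URL-valued property, not only image, though that is a guess about other sites' markup.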
When parsing page tags I get an exception "An item with the same key has already been added". Is there a way to handle this gracefully or determine the key that is being inserted?
using nuget 1.2.0.1
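One way a parser can avoid this exception is to keep a list of values per property instead of calling `Dictionary.Add` once per tag. The sketch below is not the library's implementation; `TagCollector` is a hypothetical name, and the input is assumed to be the (property, content) pairs already extracted from the page's meta tags.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch, not part of OpenGraph-Net.
public static class TagCollector
{
    // Groups repeated properties (e.g. several og:image tags) into one list per
    // key, so a second occurrence of a key appends instead of throwing.
    public static Dictionary<string, List<string>> Collect(
        IEnumerable<KeyValuePair<string, string>> tags)
    {
        var result = new Dictionary<string, List<string>>(StringComparer.OrdinalIgnoreCase);
        foreach (var tag in tags)
        {
            List<string> values;
            if (!result.TryGetValue(tag.Key, out values))
            {
                values = new List<string>();
                result[tag.Key] = values;
            }

            values.Add(tag.Value);
        }

        return result;
    }
}
```

Because every key is looked up with `TryGetValue` first, the duplicated key is also easy to log at that point if you need to know which one it was.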
-- For instance :
var ogTags = OpenGraph.ParseUrl("https://vk.com/wall-41600377_66756");
Console.WriteLine(ogTags["description"]);
=> encoding is not correct.
It seems the GitHub source code is correct. Is it that nuget.org is not up to date? Can you please push a newer version to NuGet?
Thanks.
When importing a URL from GitHub (example), the title and other data of the page are not recognized.
It always results in "The request was aborted: Could not create SSL/TLS secure channel."
As described in pull request #2 from @DamianMac: replace the HTML parsing code with a strategy that can be replaced by consumers of OpenGraph.net.
Hi,
Love the library, thank you. This is a minor request: it'd be nice if the package Microsoft.CodeAnalysis.FxCopAnalyzers were set to private, like StyleCop.Analyzers is.
Otherwise this is a great library 👍
Describe the bug
I was trying to add OpenGraph-NET to my Xamarin.Forms app to do some parsing of OG data. I added the latest NuGet to my cross platform project, which is a .NET Standard 2.0 project. When calling it like so:
var graph = OpenGraph.ParseUrl(item.ReferenceUrl);
I get the following error:
I can trace this to the conditional compilation part here: https://github.com/ghorsey/OpenGraph-Net/blob/develop/src/OpenGraphNet/OpenGraph.cs#L361 I have tried to upgrade my project to .NET Standard 2.1, but this did not change the behavior. I also verified that the compiler directive is there. Feels like I'm missing something here, but I'm not quite sure what and was hoping you might know more.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
A successfully parsed OpenGraph object.