atifaziz / hazz
CSS Selectors (via Fizzler) for HtmlAgilityPack (HAP)
License: Other
At the same time, the .NET Framework 3.5 target will be dropped; otherwise Fizzler 1.2.0, which targets .NET Standard, cannot be used as a dependency.
HtmlDocument from http://shoryuken.com/forum/index.php?events/monthly
document.CssSelect("td.primaryContent.weekends.nowWeek.nowToday")
I expect one TD element to be returned. However, there are tabs, carriage returns, and linefeeds in the class attribute on the tag, and only the first class selector (td.primaryContent) works.
1.0.0.0 - Windows 7
Originally reported on Google Code with ID 51. Reported by [email protected] on 2012-05-20 07:48:08
I have to parse and query HTML that is badly formatted, with line breaks inside class attributes. However, this library does not appear to support them:
string html = @"<html><body><div class=""class_1""><span class=""class_2
class_3"">Text</span></body></html>";
HtmlDocument htmldom = new HtmlDocument();
htmldom.LoadHtml(html);
Console.WriteLine(JsonConvert.SerializeObject(htmldom.DocumentNode.QuerySelector(".class_1").FirstChild.GetClasses())); // Prints classes correctly
Console.WriteLine(htmldom.DocumentNode.QuerySelector(".class_1 > .class_2")); // Prints null
It works fine when the line break is removed.
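Both reports above come down to how a class attribute is tokenized. The HTML standard treats the attribute as a set of tokens separated by ASCII whitespace, which covers tab, line feed, form feed, and carriage return as well as the space character. The following is a minimal sketch of that splitting rule, not the code Fizzler or HtmlAgilityPack actually uses; `ClassTokens`, `Split`, and `HasClass` are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of Fizzler or HtmlAgilityPack.
static class ClassTokens
{
    // ASCII whitespace as defined by the HTML standard:
    // tab, line feed, form feed, carriage return, and space.
    static readonly char[] HtmlWhitespace = { '\t', '\n', '\f', '\r', ' ' };

    // Splits a raw class attribute value into its individual class names.
    public static string[] Split(string classAttribute) =>
        (classAttribute ?? string.Empty)
            .Split(HtmlWhitespace, StringSplitOptions.RemoveEmptyEntries);

    // True when the attribute value contains the given class name (case-sensitive).
    public static bool HasClass(string classAttribute, string name) =>
        Split(classAttribute).Contains(name, StringComparer.Ordinal);
}

static class Program
{
    static void Main()
    {
        // Same shape as the attribute in the report: a line break between class names.
        var value = "class_2\r\nclass_3";
        Console.WriteLine(string.Join(",", ClassTokens.Split(value))); // prints class_2,class_3
        Console.WriteLine(ClassTokens.HasClass(value, "class_2"));     // prints True
    }
}
```

A matcher that splits only on the space character would see `class_2\r\nclass_3` as a single token, which is consistent with the behavior described in both reports.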
Code to reproduce (using version 1.2):
using var client = new HttpClient();
var doc = new HtmlDocument();
doc.LoadHtml(await client.GetStringAsync("https://www.example.com/"));
foreach (var e in doc.DocumentNode.QuerySelectorAll(":only-child"))
    Console.WriteLine(e.Name);
It finds only 2 elements:
div
a
whereas using document.querySelectorAll(':only-child') in Chrome finds html in addition to the above two.
Running sourcelink test on Fizzler.Systems.HtmlAgilityPack.1.2.1-ci-20200407t1951.symbols.nupkg (also attached) for 2ca0a2c produces errors:
1 Documents without URLs:
8edef9391a47680caec71f4ded6c435fd60d04067808138edf6d8fb16a81b307 sha256 csharp C:\Users\appveyor\AppData\Local\Temp\1\.NETStandard,Version=v1.3.AssemblyAttributes.cs
1 Documents with errors:
b76e2bda9a5853a1c1e97a6badca21a35a6039bb21e06cf89156470b5a02ee1d sha256 csharp C:\projects\hazz\src\obj\Release\netstandard1.3\Fizzler.Systems.HtmlAgilityPack.AssemblyInfo.cs
https://raw.githubusercontent.com/atifaziz/Hazz/2ca0a2ceed2146d0049d81654cd66ea75da2e607/src/obj/Release/netstandard1.3/Fizzler.Systems.HtmlAgilityPack.AssemblyInfo.cs
error: url failed NotFound: Not Found
sourcelink test failed
failed for lib/netstandard1.3/Fizzler.Systems.HtmlAgilityPack.pdb
1 Documents without URLs:
a6e03ae4df13fe05345e9022d1f1cd24ecae4bfd66db4843697c855d9f9335f4 sha256 csharp C:\Users\appveyor\AppData\Local\Temp\1\.NETStandard,Version=v2.0.AssemblyAttributes.cs
1 Documents with errors:
b76e2bda9a5853a1c1e97a6badca21a35a6039bb21e06cf89156470b5a02ee1d sha256 csharp C:\projects\hazz\src\obj\Release\netstandard2.0\Fizzler.Systems.HtmlAgilityPack.AssemblyInfo.cs
https://raw.githubusercontent.com/atifaziz/Hazz/2ca0a2ceed2146d0049d81654cd66ea75da2e607/src/obj/Release/netstandard2.0/Fizzler.Systems.HtmlAgilityPack.AssemblyInfo.cs
error: url failed NotFound: Not Found
sourcelink test failed
failed for lib/netstandard2.0/Fizzler.Systems.HtmlAgilityPack.pdb
2 files did not pass in dist\Fizzler.Systems.HtmlAgilityPack.1.2.1-ci-20200407t1951.symbols.nupkg
Would it be possible to make the requirement AgilityPack >=1.5.1 instead of AgilityPack = 1.5.1?
Hello there.
I am creating a WYSIWYG editor in Blazor and am using CSS nth-of-type to create a unique selector.
When I run my code I get the following error:
System.FormatException: Unknown functional pseudo 'nth-of-type'. Only nth-child and nth-last-child are supported.
For simple, non-nested elements, like SECTION H1:nth-of-type(1), I can use a workaround:
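// Requires: using System.Linq; using System.Text.RegularExpressions;
HtmlNode element = null;
if (sel.Contains("nth-of-type"))
{
    // Split "SECTION H1:nth-of-type(1)" into the base selector and the pseudo-class.
    var splitty = sel.Split(':');
    var elements = rootNode.QuerySelectorAll(splitty[0]);
    // \d+ rather than \d so that multi-digit indexes are parsed in full.
    var n = int.Parse(Regex.Match(splitty[1], @"\d+").Value);
    // QuerySelectorAll returns IEnumerable<HtmlNode>, which has no indexer.
    element = elements.ElementAt(n - 1);
}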
But I hit a snag with more complex structures.
Any chance of implementing nth-of-type?
You already have nth-child; how much work would it be to add nth-of-type?
Thanks
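For reference, nth-of-type counts an element's position among the siblings that share its element name, and the count restarts under every parent. The workaround above instead indexes into all matches under the root, which is probably why it breaks on nested structures. Below is a minimal sketch of the per-parent rule for a plain integer argument. `Node` is a hypothetical stand-in for an element node, and `NthOfType` is not part of Fizzler; with HtmlAgilityPack the same idea would presumably be applied to the element children of a node's parent.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal stand-in for an element node; hypothetical, not HtmlAgilityPack's HtmlNode.
sealed class Node
{
    public string Name { get; }
    public Node Parent { get; private set; }
    public List<Node> Children { get; } = new List<Node>();

    public Node(string name, params Node[] children)
    {
        Name = name;
        foreach (var child in children)
        {
            child.Parent = this;
            Children.Add(child);
        }
    }
}

static class NthOfType
{
    // 1-based position of the node among its siblings with the same element name.
    public static int IndexOf(Node node)
    {
        if (node.Parent == null) return 1; // a root has no siblings
        return node.Parent.Children
                   .Where(c => c.Name == node.Name)
                   .TakeWhile(c => !ReferenceEquals(c, node))
                   .Count() + 1;
    }

    // True when the node would match name:nth-of-type(n) for a plain integer n.
    public static bool Matches(Node node, string name, int n) =>
        string.Equals(node.Name, name, StringComparison.OrdinalIgnoreCase)
        && IndexOf(node) == n;
}

static class Program
{
    static void Main()
    {
        var first = new Node("h1");
        var second = new Node("h1");
        var section = new Node("section", first, new Node("p"), second);

        Console.WriteLine(NthOfType.IndexOf(second));         // prints 2
        Console.WriteLine(NthOfType.Matches(first, "H1", 1)); // prints True
    }
}
```

The sketch handles only integer arguments; the an+b forms that the selector grammar also allows are left out.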
It would be nice if the NuGet package also targeted .NET Standard 2.0. Otherwise a lot of extra dependencies are added if you are using this package to target a platform that is .NET Standard 2.0 compliant.
Add support for the negation pseudo-class selector, which Fizzler has supported since 1.3.0 beta 1.
HtmlNodeSelection.CachableCompile throws an instance of NullReferenceException when called from any secondary thread, as demonstrated below:
static void Main()
{
    void Test()
    {
        Console.WriteLine("Running on thread #" + Thread.CurrentThread.ManagedThreadId);
        HtmlNodeSelection.CachableCompile("p");
    }

    Test(); // succeeds
    var t = new Thread(Test);
    t.Start(); // fails
    t.Join();
}
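One well-known way to get exactly this symptom in .NET is a `[ThreadStatic]` field with an inline initializer: the initializer runs once, on the thread that first touches the type, and every other thread sees `null`. Whether that is the cause in this library is an assumption; the sketch below only demonstrates the pitfall and a per-thread lazy initialization that avoids it. `Cache`, `_eager`, and `_lazy` are hypothetical names.

```csharp
using System;
using System.Threading;

static class Cache
{
    // Pitfall: the initializer runs once, on the thread that first touches the type.
    // Every other thread sees the default value, which is null for a reference type.
    [ThreadStatic]
    static object _eager = new object();

    [ThreadStatic]
    static object _lazy;

    public static bool EagerIsNull => _eager == null;

    // Safer pattern: initialize on first use, separately on each thread.
    public static object Lazy => _lazy ?? (_lazy = new object());
}

static class Program
{
    static void Main()
    {
        Console.WriteLine(Cache.EagerIsNull); // prints False on the first thread

        var eagerWasNull = false;
        var lazyWasSet = false;
        var t = new Thread(() =>
        {
            eagerWasNull = Cache.EagerIsNull;
            lazyWasSet = Cache.Lazy != null;
        });
        t.Start();
        t.Join();

        Console.WriteLine(eagerWasNull); // prints True: the field is null on the second thread
        Console.WriteLine(lazyWasSet);   // prints True: lazy initialization works on any thread
    }
}
```

`System.Threading.ThreadLocal<T>` with a value factory is another way to get one initialized instance per thread.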
Given a selector that includes an element name (i.e. tagName), the method IEnumerable<HtmlNode> HtmlNode.QuerySelectorAll(string selector) will perform a case-sensitive search for the name.
In other words:
HtmlDocument document = new HtmlDocument();
document.LoadHtml("<html><body><a></a></body></html>");
document.DocumentNode.QuerySelectorAll("A"); // should select the <a> but will select nothing.
document.DocumentNode.QuerySelectorAll("a"); // works fine.
The W3C docs specify that CSS selectors are not case-sensitive, except for specific attributes such as class or ID.
This is also how browsers' (at least my Chrome) querySelectorAll() behaves.
See also: https://stackoverflow.com/questions/7559205/are-css-selectors-case-sensitive/7559251
From @raiytu4 on June 25, 2018 7:41
I demonstrate this problem here:
https://github.com/raiytu4/GetEnumeratorOnFizzler
In short:
static void Main(string[] args)
{
    var doc = Browser.GetDoc("http://truyenfull.vn/dai-dao-trieu-thien/").DocumentNode;
    var categoryTextNodes = FindLiStoryDes(doc, "thể loại").QuerySelectorAll("div.cp2 > a");
    if (categoryTextNodes != null)
    {
        // categoryTextNodes is not null, and calling .GetEnumerator() is fine,
        // but calling categoryTextNodes.GetEnumerator().MoveNext() throws a NullReferenceException.
        categoryTextNodes.GetEnumerator();
        // This throws a NullReferenceException too!
        categoryTextNodes.Count();
    }
    Console.ReadLine();
}
Copied from original issue: atifaziz/Fizzler#66
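The observations in this report (a non-null sequence, a harmless GetEnumerator() call, and an exception only on MoveNext() or Count()) are what deferred execution looks like when a lazy query is started from a null receiver. Whether FindLiStoryDes actually returned null here is an assumption. The sketch below reproduces the pattern with a hypothetical iterator extension method, `LazyLengths`, standing in for a lazy query method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class Extensions
{
    // Iterator method: the body does not run until the result is enumerated.
    // Hypothetical stand-in for a lazily evaluated query method.
    public static IEnumerable<int> LazyLengths(this string source)
    {
        yield return source.Length; // dereferences source only when MoveNext() runs
    }
}

static class Program
{
    static void Main()
    {
        string missing = null;

        // No exception here: an extension method accepts a null receiver,
        // and the iterator body has not started yet.
        var lengths = missing.LazyLengths();
        Console.WriteLine(lengths != null); // prints True

        try
        {
            lengths.Count(); // enumeration starts, so the body runs and dereferences null
        }
        catch (NullReferenceException)
        {
            Console.WriteLine("NullReferenceException on enumeration");
        }
    }
}
```

Under that assumption, a null check on the receiver before the query, rather than on the query result, is what would catch the problem.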