xquery's People

Contributors

highway900, mildred, zhengchun

xquery's Issues

Matching error

(Originally written in Chinese.)

<span class="fr date">
    <span style="color:#999;font-size:8px;">
        <script type="text/javascript">
            //something
        </script>
    </span>
&nbsp;&nbsp;2017-05-11
</span>

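First problem:
Expression: //span[@class='date']
This panics immediately with a matching error. Shouldn't it return an err instead?

Second problem:
Expression: //span[@class]
This returns normally, but the value returned by InnerText also contains the script code rather than plain text only. Is that intended?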

An invalid expression triggers a panic

An invalid expression triggers a panic:
node := htmlquery.FindOne(root, "span[@Class='test')") // this line
if node == nil {
//something
}
span[@class='test') should be span[@class='test'].
Can the panic be suppressed?
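
Until the library returns an error for a malformed expression, a caller can contain the panic with recover. The following is a minimal sketch of that pattern using only the standard library. The names safeCall and mustCompile are hypothetical: mustCompile is a stand-in for any call that panics on a bad expression, not part of htmlquery.

```go
package main

import (
	"fmt"
	"strings"
)

// safeCall runs fn and converts a panic raised inside it into an error.
func safeCall(fn func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered from panic: %v", r)
		}
	}()
	fn()
	return nil
}

// mustCompile is a hypothetical stand-in for a query call that panics on a
// malformed expression. It only checks that square brackets are balanced.
func mustCompile(expr string) {
	if strings.Count(expr, "[") != strings.Count(expr, "]") {
		panic("unbalanced brackets in " + expr)
	}
}

func main() {
	// The malformed expression from the report: ')' where ']' was intended.
	err := safeCall(func() { mustCompile("span[@class='test')") })
	fmt.Println("error:", err)
}
```

In real code the closure passed to safeCall would wrap the actual query call, and the caller would check the returned error before using the result.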

TextNodes between two other Nodes are ignored

I have the following XML document:

<?xml version="1.0" encoding="UTF-8"?>
<info>
This is the first sentence in the info tag.
<ext>http://example.org/1</ext>
This sentence is between two ext tags.
<ext>http://test.org/2</ext>
This sentence is at the end of the info tag.
</info>

The TextNode between the two <ext> tags ("This sentence is between two ext tags.") is ignored when the XML is parsed with xmlquery.Parse().

Example code:

package main

import (
	"bytes"
	"log"

	"github.com/antchfx/xquery/xml"
)

func main() {
	b := bytes.NewBuffer([]byte(`<?xml version="1.0" encoding="UTF-8"?>
<info>
This is the first sentence in the info tag.
<ext>http://example.org/1</ext>
This sentence is between two ext tags.
<ext>http://test.org/2</ext>
This sentence is at the end of the info tag.
</info>`))

	root, err := xmlquery.Parse(b)
	if err != nil {
		log.Fatalf("Error parsing XML: %v", err)
	}
	log.Print(root.OutputXML(false))
}

Output:

<xml version="1.0" encoding="UTF-8"></xml>
<info>This is the first sentence in the info tag.
<ext>http://example.org/1</ext>
<ext>http://test.org/2</ext>
This sentence is at the end of the info tag.
</info>

As you can see, the text between the two <ext> tags is missing.
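
For comparison, the standard library's encoding/xml decoder reports every character-data run in this document as its own token. The sketch below is not how xmlquery builds its tree; it only assumes that walking the token stream is a fair way to see which text segments a parser should preserve. The helper name textSegments is hypothetical.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

const infoDoc = `<?xml version="1.0" encoding="UTF-8"?>
<info>
This is the first sentence in the info tag.
<ext>http://example.org/1</ext>
This sentence is between two ext tags.
<ext>http://test.org/2</ext>
This sentence is at the end of the info tag.
</info>`

// textSegments returns every non-blank character-data token in doc,
// in document order, with surrounding whitespace removed.
func textSegments(doc string) ([]string, error) {
	dec := xml.NewDecoder(strings.NewReader(doc))
	var out []string
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		if cd, ok := tok.(xml.CharData); ok {
			if s := strings.TrimSpace(string(cd)); s != "" {
				out = append(out, s)
			}
		}
	}
}

func main() {
	segs, err := textSegments(infoDoc)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for i, s := range segs {
		fmt.Printf("%d: %s\n", i, s)
	}
}
```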

XPath `//dc:creator` does not find the element

package main

import (
	"fmt"
	"github.com/antchfx/xquery/xml"
)

func main() {
	root, err := xmlquery.LoadURL("http://cn.engadget.com/rss.xml")
	if err != nil {
		panic(err)
	}
	item := xmlquery.FindOne(root, "//channel/item[1]/dc:creator")
	fmt.Println(item == nil) // reported result: item is nil
}

item is nil
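
One possible angle on this report: dc is only a prefix, and an XML parser normally resolves it to the namespace URI it is bound to. The sketch below uses the standard library's encoding/xml, which exposes the resolved URI in Name.Space and the unprefixed name in Name.Local. It is not xmlquery's matching logic. The feed is a hypothetical, trimmed-down sample, and firstText is a hypothetical helper.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// dcNamespace is the Dublin Core namespace URI that RSS feeds commonly bind
// to the dc prefix.
const dcNamespace = "http://purl.org/dc/elements/1.1/"

// sampleFeed is a hypothetical feed used only for illustration.
const sampleFeed = `<rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
<channel><item><title>Post</title><dc:creator>Jane Doe</dc:creator></item></channel>
</rss>`

// firstText returns the text of the first element whose namespace URI and
// local name match. The boolean is false if no such element is found.
func firstText(doc, space, local string) (string, bool) {
	dec := xml.NewDecoder(strings.NewReader(doc))
	inside := false
	var sb strings.Builder
	for {
		tok, err := dec.Token()
		if err != nil { // io.EOF or a syntax error: nothing was found
			return "", false
		}
		switch t := tok.(type) {
		case xml.StartElement:
			if t.Name.Space == space && t.Name.Local == local {
				inside = true
			}
		case xml.CharData:
			if inside {
				sb.Write(t)
			}
		case xml.EndElement:
			if inside && t.Name.Space == space && t.Name.Local == local {
				return sb.String(), true
			}
		}
	}
}

func main() {
	text, ok := firstText(sampleFeed, dcNamespace, "creator")
	fmt.Println(text, ok)
}
```

Matching on the literal string "dc:creator" would only work if the query engine keeps the prefix alongside the local name, or maps the prefix back to its URI before comparing.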

Date string is not retrieved accurately

<span>2017/05/08</span>
//...some code
node := htmlquery.FindOne(root, "//span")
text := htmlquery.InnerText(node)

The resulting text is not 2017/05/08 but 2017-05-08 00:00:00.0. Is this a bug?

Multiple processing instructions in the XML file

For example:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?>
<?xml-stylesheet type="text/css" media="screen" href="http://feeds.reuters.com/~d/styles/itemcontent.css"?>
<rss xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">
....
</rss>

When xquery parses an XML file that contains multiple processing instructions, it panics with the error: xml: document is invalid.
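
As a point of reference, the standard library's encoding/xml decoder reports each processing instruction as a separate xml.ProcInst token, so a document like this one can be walked without error. The sketch below only illustrates that behavior under the assumption that a tree builder could treat every ProcInst the same way. The helper procInstTargets is hypothetical, and the rss element is left empty for brevity.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

const feedDoc = `<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?>
<?xml-stylesheet type="text/css" media="screen" href="http://feeds.reuters.com/~d/styles/itemcontent.css"?>
<rss xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0"></rss>`

// procInstTargets returns the target of every processing instruction in doc,
// in document order.
func procInstTargets(doc string) ([]string, error) {
	dec := xml.NewDecoder(strings.NewReader(doc))
	var targets []string
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return targets, nil
		}
		if err != nil {
			return nil, err
		}
		if pi, ok := tok.(xml.ProcInst); ok {
			targets = append(targets, pi.Target)
		}
	}
}

func main() {
	targets, err := procInstTargets(feedDoc)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(targets)
}
```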

Suggestion: have OutputHTML use the html package's native Render method

// OutputHTML returns the text including tags name.
func OutputHTML(n *html.Node) string {
	var buf bytes.Buffer
	if err := html.Render(&buf, n); err != nil {
		return ""
	}
	// outputXML(&buf, n)
	return buf.String()
}

Typo in xml leads to error

My XML (with a typo in the closing tag of three):

<?xml version="1.0"?>
<Scenarios>
    <scenario>
        <one>1</one>
        <two>2</two>
        <three>3</tree>
    </scenario>
</Scenarios>

when I do this:

	allScenarios := xmlquery.Find(memXML, "//scenario")
	fmt.Println(len(allScenarios))

Execution never reaches the line that prints the length; an error is raised instead.
(And I have no way of handling it.)

Attributes nodes probably not handled correctly

Hi,

First, thank you for your work on xpath and xquery for Go. I'm adapting your xpath library to another DOM structure I have, and I looked at the code here to understand how the NodeNavigator works.

I think I found inconsistencies in your NodeNavigator around attribute nodes, due to the fact that your DOM structure doesn't have first-class attribute nodes.

In particular, the XPath expression /a/b/@attr/.. on the document <a><b attr="1"/></a> should select <b attr="1"/>. It seems your implementation selects <a>...</a> instead.

I think this is because calling MoveToParent() while an attribute is selected moves to the grandparent of the attribute instead of its parent: it moves to the parent of the element containing the attribute.

When I try to test this in xml/query_test.go with

	if node := FindOne(doc, "//book/@id/..[1]"); node.SelectAttr("id") != "bk101" {
		t.Fatal("//book/@id/..[1]/@id != bk101")
	}

it seems to select no node: node is nil and I get a stack trace.
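
The behavior described above can be sketched with a tiny navigator that tracks the selected attribute as an index on the current element. All types here (node, attr, navigator) are hypothetical and much simpler than the library's; the point is only that moving to the parent from an attribute should clear the attribute index and stay on the owning element.

```go
package main

import "fmt"

type attr struct{ name, value string }

type node struct {
	name   string
	attrs  []attr
	parent *node
}

// navigator points at an element, or at one of its attributes when
// attrIdx is not -1.
type navigator struct {
	curr    *node
	attrIdx int
}

// MoveToParent moves from an attribute to its owning element, or from an
// element to its parent element. It reports whether the move happened.
func (n *navigator) MoveToParent() bool {
	if n.attrIdx != -1 {
		// The parent of an attribute is the element that carries it.
		n.attrIdx = -1
		return true
	}
	if n.curr.parent == nil {
		return false
	}
	n.curr = n.curr.parent
	return true
}

// buildDoc builds <a><b attr="1"/></a> and returns the b element.
func buildDoc() *node {
	a := &node{name: "a"}
	return &node{name: "b", attrs: []attr{{"attr", "1"}}, parent: a}
}

func main() {
	nav := &navigator{curr: buildDoc(), attrIdx: 0} // positioned on /a/b/@attr
	nav.MoveToParent()
	fmt.Println(nav.curr.name) // b
}
```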

Problem with installation

Hi,
When I try to install it on Ubuntu 16.04, I get the following error:
can't load package: package github.com/antchfx/xquery: no buildable Go source files in /home/ps06756/go/src/github.com/antchfx/xquery, when I run the following command:

go get github.com/antchfx/xquery

can't return attribute

<ul>
  <li><p><a href="test.html"></a></p></li>
</ul>

expr: //ul/li/p/a/@href
htmlquery.FindOne returns a Node, not the attribute. I want the attribute string. To use htmlquery.SelectAttr, I also need to split the expression:

split := strings.Split("//ul/li/p/a@href", "@")
node := htmlquery.FindOne(root, split[0])
attribute := htmlquery.SelectAttr(node, split[1])

That is too many steps.

Can I do this with a single expression? XPath itself can express it, so htmlquery.SelectAttr should not be needed.
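
As a stopgap, the split can at least be done on the real expression instead of a hand-edited one. The sketch below is a hypothetical helper, splitAttrExpr, that peels a trailing /@name step off a simple path. It is deliberately conservative and is not an XPath parser: it rejects forms such as //@href, where the attribute step does not follow a plain element path.

```go
package main

import (
	"fmt"
	"strings"
)

// splitAttrExpr splits "path/@name" into the element path and the attribute
// name. ok is false when expr does not end in a simple /@name step.
func splitAttrExpr(expr string) (path, attr string, ok bool) {
	i := strings.LastIndex(expr, "/@")
	if i <= 0 {
		return "", "", false
	}
	path, attr = expr[:i], expr[i+2:]
	// Reject matches inside predicates and descendant forms like a//@href.
	if attr == "" || strings.ContainsAny(attr, "[]/()") || strings.HasSuffix(path, "/") {
		return "", "", false
	}
	return path, attr, true
}

func main() {
	path, attr, ok := splitAttrExpr("//ul/li/p/a/@href")
	fmt.Println(path, attr, ok)
}
```

The two returned strings can then be passed to htmlquery.FindOne and htmlquery.SelectAttr as in the snippet above.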

Difference in usage

More of a question.
To get an attribute in HTML:

href := htmlquery.SelectAttr(tag, "href")

To get it in XML:

debug := tag.SelectAttr("debug")

Why the difference?

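Using strings.TrimSpace is not recommended

strings.TrimSpace does some frustrating things, such as stripping newlines, spaces, and NBSP characters. Everything it removes is content. It would be better to output the raw text directly and leave any post-processing to the user.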
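
The effect is easy to reproduce with the standard library alone. strings.TrimSpace removes every leading and trailing character for which unicode.IsSpace is true, and that set includes the no-break space (U+00A0). The sketch below uses a hypothetical helper, trimmedRunes, and a sample string modeled on the "&nbsp;&nbsp;2017-05-11" text from the first issue on this page.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// trimmedRunes returns the trimmed text and the number of runes that
// strings.TrimSpace removed from raw.
func trimmedRunes(raw string) (string, int) {
	trimmed := strings.TrimSpace(raw)
	return trimmed, utf8.RuneCountInString(raw) - utf8.RuneCountInString(trimmed)
}

func main() {
	// Two no-break spaces, the date, and a trailing newline.
	raw := "\u00a0\u00a02017-05-11\n"
	trimmed, removed := trimmedRunes(raw)
	fmt.Printf("%q (%d runes removed)\n", trimmed, removed)
}
```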

//@href doesn't work

Hello!
//*[@id="whatsnew"]/div/div[2]/strong/a//@href doesn't work as expected.
My code:

package main

import (
	"fmt"

	"github.com/antchfx/xquery/html"
)

func main() {
	root, err := htmlquery.LoadURL("https://www.apkmirror.com/apk/niantic-inc/pokemon-go")
	if err != nil {
		panic(err)
	}
	urlToAPK := htmlquery.InnerText(htmlquery.FindOne(root, "//*[@id=\"whatsnew\"]/div/div[2]/strong/a//@href"))
	fmt.Println(urlToAPK)
}

and I got Pokémon GO 0.55.0 instead of the link.
Is it my fault or yours?

Not able to install xquery

go get github.com/antchfx/xquery results in
can't load package: package github.com/antchfx/xquery: no buildable Go source files in /Users/i335366/go/src/github.com/antchfx/xquery, please help.

Does not work on http://www.lostfilm.tv/

When I do

package main

import (
	"fmt"
	"log"
	"net/http"

	"github.com/antchfx/xquery/html"
	"golang.org/x/net/html"
)

func main() {
	resp, err := http.Get("http://www.lostfilm.tv/")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	root, err := html.Parse(resp.Body)
	if err != nil {
		log.Fatal(err)
	}

	var xpath string
	xpath = `//title`

	node := htmlquery.FindOne(root, xpath)
	fmt.Println(htmlquery.InnerText(node))
}

I get:

panic: unknown HTML node type: 5
goroutine 1 [running]:
panic(0x609180, 0xc042121620)
C:/Go/src/runtime/panic.go:500 +0x1af
github.com/antchfx/xquery/html.(*htmlNodeNavigator).NodeType(0xc04211d700, 0x0)
D:/Projects/gopath/src/github.com/antchfx/xquery/html/query.go:120 +0x130
github.com/antchfx/gxpath/internal/build.axisPredicate.func1(0x769000, 0xc04211d700, 0x40f700)
D:/Projects/gopath/src/github.com/antchfx/gxpath/internal/build/build.go:45 +0x4e
github.com/antchfx/gxpath/internal/query.(*DescendantQuery).Select.func1(0x63cc80, 0xc0421df350)
D:/Projects/gopath/src/github.com/antchfx/gxpath/internal/query/query.go:236 +0xa9
github.com/antchfx/gxpath/internal/query.(*DescendantQuery).Select(0xc0421df320, 0x763760, 0xc04211d6c0, 0x0, 0xc04211d6c0)
D:/Projects/gopath/src/github.com/antchfx/gxpath/internal/query/query.go:243 +0x3d
github.com/antchfx/gxpath.(*NodeIterator).MoveNext(0xc04211d6c0, 0xc04211d640)
D:/Projects/gopath/src/github.com/antchfx/gxpath/select.go:22 +0x50
github.com/antchfx/xquery/html.FindOne(0xc042010150, 0x65de4e, 0x7, 0x0)
D:/Projects/gopath/src/github.com/antchfx/xquery/html/query.go:33 +0xb4
main.main()
D:/Projects/gopath/src/Test2/main.go:30 +0x2d8

XML file missing the XML declaration

I found some XML files that are missing this declaration: <?xml version="1.0" encoding="UTF-8"?>. For these files, the xmlquery package's parser returns an error about invalid XML.

In most cases, the xmlquery package should be able to handle this: if an XML file is missing the declaration, <?xml version="1.0" ?> should be assumed as the default.
