ChrisTrenkamp commented on August 20, 2024

Been busy and still have no time to work on this, but finally came up with some realistic pseudocode that outlines some fixes to what I've complained about:

package main

import (
    "os"

    "github.com/ChrisTrenkamp/xsel"
)

type FooStruct struct {
    Foo string `xml:"string"`
}

func main() {
    // Errors are ignored throughout to keep the outline short.
    file, _ := os.Open("...")
    xReader, _ := xsel.XmlReader(file)
    store, _ := xsel.InMemoryStore(xReader)

    // Select returns a channel, so results can be consumed as they arrive.
    for result := range xsel.Select("//foo", store) {
        f := &FooStruct{}
        _ = xsel.UnmarshalXml(f, result)
    }

    //Each of these returns an error as well, but it should get the point across
    var arrResult []xsel.Result = xsel.All(xsel.Select("//foo", store))
    var strResult string = xsel.AsString(xsel.Select("//foo", store))
    var numResult float64 = xsel.AsNum(xsel.Select("//foo", store))
    _, _, _ = arrResult, strResult, numResult
}

First things first, it'll be a complete rewrite. Only the test cases will carry over. Let's call it xsel; it's easier to type than goxpath and a bit catchier (pretty sure I would fail Marketing 101).

The first interface is the NodeReader (xReader), which will be equivalent to Java's StAX interface in that it reads the next node only when it's told to and does not keep it in memory. A default implementation will be supplied for reading XML files (xsel.XmlReader), and drop-in replacements for JSON or other formats can be implemented.

The second interface is the Storage. It will be responsible for appending and removing nodes from the tree. A default in-memory storage will be supplied, but drop-in replacements using a database-backed storage like BoltDB can be created.
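As a rough illustration of what such a storage layer might look like, the sketch below defines a hypothetical Storage interface and a map-backed in-memory implementation. Every name here (Storage, NodeID, memStore, Root) is an assumption made for the example; the thread does not specify the real method set.

```go
package main

import "fmt"

// NodeID identifies a stored node. Hypothetical.
type NodeID int

// Root is the ID of the implicit document node.
const Root NodeID = 0

// Storage is a hypothetical interface for appending and removing tree nodes.
type Storage interface {
	Append(parent NodeID, name string) NodeID
	Remove(id NodeID)
	Children(parent NodeID) []NodeID
	Name(id NodeID) string
}

type memNode struct {
	name     string
	parent   NodeID
	children []NodeID
}

// memStore keeps the whole tree in a map. A disk-backed store could
// implement the same interface on top of a key-value database.
type memStore struct {
	nodes  map[NodeID]*memNode
	nextID NodeID
}

func newMemStore() *memStore {
	return &memStore{
		nodes:  map[NodeID]*memNode{Root: &memNode{name: "#document", parent: -1}},
		nextID: 1,
	}
}

// Append adds a child under parent. The parent is assumed to exist.
func (s *memStore) Append(parent NodeID, name string) NodeID {
	id := s.nextID
	s.nextID++
	s.nodes[id] = &memNode{name: name, parent: parent}
	p := s.nodes[parent]
	p.children = append(p.children, id)
	return id
}

// Remove unlinks a node from its parent and deletes its whole subtree.
func (s *memStore) Remove(id NodeID) {
	n, ok := s.nodes[id]
	if !ok || id == Root {
		return
	}
	if p, ok := s.nodes[n.parent]; ok {
		for i, c := range p.children {
			if c == id {
				p.children = append(p.children[:i], p.children[i+1:]...)
				break
			}
		}
	}
	s.drop(id)
}

func (s *memStore) drop(id NodeID) {
	for _, c := range s.nodes[id].children {
		s.drop(c)
	}
	delete(s.nodes, id)
}

func (s *memStore) Children(parent NodeID) []NodeID {
	if p, ok := s.nodes[parent]; ok {
		return p.children
	}
	return nil
}

// Name returns the node's name, or "" if the node does not exist.
func (s *memStore) Name(id NodeID) string {
	if n, ok := s.nodes[id]; ok {
		return n.name
	}
	return ""
}

func main() {
	var s Storage = newMemStore()
	a := s.Append(Root, "a")
	s.Append(a, "b")
	s.Append(Root, "c")
	s.Remove(a)
	for _, id := range s.Children(Root) {
		fmt.Println(s.Name(id))
	}
}
```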

When submitting an XPath query (xsel.Select), it'll return a channel of structs with two fields: an error and a value. Combined with the StAX-like pull reader and a disk-backed storage, this will allow for the memory efficiency and streaming behaviour that goxpath has struggled to achieve.

A common situation I kept running into was unmarshalling the XML into a struct, which, to be honest, is probably what 95% of people using XPath really need. There will be a default implementation for reading it into xml-tagged structs.
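For reference, this is what reading a fragment into an xml-tagged struct looks like with the standard library's encoding/xml alone. The Foo type and decodeFoo helper are made up for the example; the proposed xsel.UnmarshalXml would presumably do something similar for each query result.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Foo maps a <foo> element onto struct fields via xml tags.
type Foo struct {
	XMLName xml.Name `xml:"foo"`
	ID      int      `xml:"id,attr"`
	Title   string   `xml:"title"`
}

// decodeFoo unmarshals a single XML fragment into a Foo.
func decodeFoo(fragment string) (Foo, error) {
	var f Foo
	err := xml.Unmarshal([]byte(fragment), &f)
	return f, err
}

func main() {
	f, err := decodeFoo(`<foo id="7"><title>hello</title></foo>`)
	fmt.Println(f.ID, f.Title, err)
}
```

Because XMLName is tagged with "foo", a fragment whose root element has a different name is rejected with an error instead of silently producing an empty struct.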

The last three function calls are simply helpers around the XPath query.

from goxpath.

suntong commented on August 20, 2024

subscribe.


ChrisTrenkamp commented on August 20, 2024

Better 4 years late than never, I suppose:

https://github.com/ChrisTrenkamp/xsel

Here's a recap on what I wanted to accomplish:

Trees are not streamed

A streaming API was incredibly difficult to create, and difficult and awkward to use. This goal has been trashed.

Trees are format-agnostic, but the core library depends specifically on XML elements

xsel's core XPath logic is disconnected from any direct XML dependencies. XML tropes such as namespaces and processing instructions are defined in the package, but are not directly tied to any XML libraries and are not required.

Trees cannot be altered

xsel has two general interfaces: nodes, which define the data points themselves, and cursors, which define the parent-child relationships and position numbers. It mandates the order in which namespaces and attributes are created, and it mandates how position numbers are created, but nothing else beyond that. It should be possible to define your own implementation for creating modifiable tree structures.

The public API is ugly

The following is an example of defining a custom function in xsel. It should be easier to read and follow.

package main

import (
	"bytes"
	"fmt"

	"github.com/ChrisTrenkamp/xsel/exec"
	"github.com/ChrisTrenkamp/xsel/grammar"
	"github.com/ChrisTrenkamp/xsel/node"
	"github.com/ChrisTrenkamp/xsel/parser"
	"github.com/ChrisTrenkamp/xsel/store"
)

func main() {
	xml := `
<root>
	<a>This is an element.</a>
	<!-- This is a comment. -->
</root>
`

	isComment := func(context exec.Context, args ...exec.Result) (exec.Result, error) {
		nodeSet, isNodeSet := context.Result().(exec.NodeSet)

		if !isNodeSet || len(nodeSet) == 0 {
			return exec.Bool(false), nil
		}

		_, isComment := nodeSet[0].Node().(node.Comment)
		return exec.Bool(isComment), nil
	}

	contextSettings := func(c *exec.ContextSettings) {
		c.FunctionLibrary[exec.Name("", "is-comment")] = isComment
	}

	xpath := grammar.MustBuild(`//node()[is-comment()]`)
	parser := parser.ReadXml(bytes.NewBufferString(xml))
	cursor, _ := store.CreateInMemory(parser)
	result, _ := exec.Exec(cursor, &xpath, contextSettings)

	fmt.Println(result) // This is a comment.
}

There are also other improvements, such as separating the node information from the parent-child/position relationships, which makes the API cleaner and easier to extend. It also uses a parser generator. Hand-writing my own lexer/parser was a fun learning experience, but it was also a mess and a nightmare to maintain.

