cl-html5-parser: HTML5 parser for Common Lisp

Abstract

cl-html5-parser is an HTML5 parser for Common Lisp with the following features:

  • It is a port of the Python library html5lib.

  • It passes all relevant tests from html5lib.

  • It is not tied to a specific DOM implementation.

Requirements

  • SBCL or ECL.

  • CL-PPCRE and FLEXI-STREAMS.

It might work with CLISP, ABCL and Clozure CL, but many of the tests do not pass there.

Usage

Parsing

Parsing functions are in the package HTML5-PARSER.

Function: parse-html5 source &key encoding strictp ⇒ document, errors

Parses an HTML document from source. Source can be a string, a pathname, or a stream. Encoding detection is not supported when parsing from a stream, so the encoding must be supplied via the encoding keyword parameter.

When strictp is true, parsing stops on the first error.

Returns two values. The primary value is the document node. The secondary value is a list of errors found during parsing. The format of this list is subject to change.
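
As a rough sketch of how the two return values might be consumed, the following wraps parse-html5 in a small helper. The helper name parse-and-report is hypothetical, and the ASDF system name "cl-html5-parser" is an assumption about how the library is installed. Only parse-html5 and its two return values come from the description above.

```lisp
;; Assumption: the library is installed under the ASDF system name
;; "cl-html5-parser" and ASDF can locate it.
(require "asdf")
(asdf:load-system "cl-html5-parser")

;; Hypothetical helper, not part of the library.
(defun parse-and-report (source)
  "Parse SOURCE, print how many errors were found, and return both values."
  (multiple-value-bind (document errors)
      (html5-parser:parse-html5 source)
    ;; The format of the error list is subject to change,
    ;; so only its length is used here.
    (format t "~D parse error(s)~%" (length errors))
    (values document errors)))

(parse-and-report "<!DOCTYPE html><title>Demo</title><p>Hello")
```

Relying only on the length of the error list keeps the caller independent of the list's internal format.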

Function: parse-html5-fragment source &key container encoding strictp as-xmls ⇒ document-fragment, errors

Parses a fragment of HTML. Container names the context element in which the fragment is parsed and defaults to "div". Returns a document-fragment node. For the other parameters see PARSE-HTML5.

Example

(html5-parser:node-to-xmls (html5-parser:parse-html5-fragment "Parse <i>some</i> HTML"))
==> ("Parse " ("i" NIL "some") " HTML")
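
The effect of container is easiest to see by parsing the same markup in two contexts. The sketch below is illustrative only: fragment-xmls is a hypothetical helper, and the ASDF system name is an assumption. No output is shown, because the result depends on how the parser treats the markup in each context.

```lisp
;; Assumption: the library is installed under the ASDF system name
;; "cl-html5-parser" and ASDF can locate it.
(require "asdf")
(asdf:load-system "cl-html5-parser")

;; Hypothetical helper, not part of the library.
(defun fragment-xmls (html &optional (container "div"))
  "Parse HTML as a fragment inside CONTAINER and return it as XMLS nodes."
  (html5-parser:node-to-xmls
   (html5-parser:parse-html5-fragment html :container container)))

;; Same input, two contexts: the default "div" and a table row.
(print (fragment-xmls "<td>cell</td>"))
(print (fragment-xmls "<td>cell</td>" "tr"))
```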

The DOM

Parsing HTML5 is not possible without a DOM. cl-html5-parser defines a minimal DOM implementation for this task. Functions for traversing documents are exported by the HTML5-PARSER package.

Converting to XMLS

Function: html5-parser:node-to-xmls node &optional include-namespace-p ⇒ list

Converts a node into a simple XMLS-like list structure. If node is a document fragment, a list of XMLS nodes is returned. In all other cases, a single XMLS node is returned.
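
Because the result is an ordinary list, it can be processed with standard list functions. The following sketch collects the text content of a fragment result. Both xmls-text and xmls-fragment-text are hypothetical helpers. The code assumes the node shape shown in the example above (name, attributes, then children, with text as plain strings) and that include-namespace-p is left at its default. It runs on literal data, so it does not need the library loaded.

```lisp
;; Hypothetical helper, not part of the library.
(defun xmls-text (node)
  "Return the concatenated text inside a single XMLS NODE.
NODE is either a string (text) or a list (name attributes . children)."
  (with-output-to-string (out)
    (labels ((walk (n)
               (cond ((stringp n) (write-string n out))
                     ;; Skip the name and attribute slots, visit the children.
                     ((consp n) (mapc #'walk (cddr n))))))
      (walk node))))

;; Hypothetical helper, not part of the library.
(defun xmls-fragment-text (nodes)
  "Return the concatenated text of NODES, a list of XMLS nodes
such as the one returned for a document fragment."
  (apply #'concatenate 'string (mapcar #'xmls-text nodes)))

;; Literal data copied from the fragment example above.
(xmls-fragment-text '("Parse " ("i" NIL "some") " HTML"))
;; => "Parse some HTML"
```

A fragment result and a single node need separate entry points, because a fragment is a bare list of nodes with no name or attribute slots of its own.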

License

This library is available under the GNU Lesser General Public License v3.0.
