ClojureDocs

parse

clojure.xml

Available since 1.0
  • (parse s)
  • (parse s startparse)
Parses and loads the source s, which can be a File, InputStream or
String naming a URI. Returns a tree of the xml/element struct-map,
which has the keys :tag, :attrs, and :content, and accessor fns tag,
attrs, and content. Other parsers can be supplied by passing
startparse, a fn taking a source and a ContentHandler and returning
a parser.
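
The tree shape described above can be sketched as follows. The sample document and the parse-str helper are hypothetical, introduced only for illustration; the accessor fns tag, attrs, and content are the ones named in the docstring.

```clojure
(require '[clojure.xml :as xml])
(import '[java.io ByteArrayInputStream])

;; Hypothetical sample document, used only for illustration.
(def sample "<order id='42'><item>tea</item><item>milk</item></order>")

;; Hypothetical helper: parse reads a String argument as a URI, so an
;; in-memory XML string has to be wrapped in an InputStream first.
(defn parse-str [^String s]
  (xml/parse (ByteArrayInputStream. (.getBytes s "UTF-8"))))

(def root (parse-str sample))

(xml/tag root)                     ; :order
(xml/attrs root)                   ; {:id "42"}
(map xml/tag (xml/content root))   ; (:item :item)

;; The same data is reachable through the plain keys.
(:content (first (:content root))) ; ["tea"]
```

Elements without attributes carry nil under :attrs, and text nodes appear as plain strings inside :content.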
Prior to 1.11, used startparse-sax by default. As of 1.11, uses
startparse-sax-safe, which disables XXE (XML External Entity)
processing. Pass startparse-sax to revert to prior behavior.
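
A minimal sketch of selecting the parser explicitly. The xml-stream helper and the sample document are hypothetical. Only startparse-sax is referenced, so the snippet should also load on versions before 1.11, where startparse-sax-safe does not exist.

```clojure
(require '[clojure.xml :as xml])
(import '[java.io ByteArrayInputStream])

;; Hypothetical helper: wraps an XML string in an InputStream.
(defn xml-stream [^String s]
  (ByteArrayInputStream. (.getBytes s "UTF-8")))

;; Hypothetical sample document with no DOCTYPE or external entities.
(def doc "<note lang='en'>hi</note>")

;; One-argument arity: uses the default parser for the running Clojure version.
(def default-result (xml/parse (xml-stream doc)))

;; Two-argument arity: explicitly requests the pre-1.11 parser.
(def legacy-result (xml/parse (xml-stream doc) xml/startparse-sax))

;; For a document without external entities both parsers build the same tree.
(= default-result legacy-result) ; true
```

Reverting to startparse-sax re-enables external entity processing, so it is probably best reserved for trusted input.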
2 Examples
(require '[clojure.xml :as xml]
         '[clojure.zip :as zip])

;;convenience function, first seen at nakkaya.com later in clj.zip src
(defn zip-str [s]
  (zip/xml-zip
    ;; wrap the string in a stream; a bare String would be read as a URI
    (xml/parse (java.io.ByteArrayInputStream. (.getBytes s "UTF-8")))))

;;parse from xml-strings to internal xml representation
user=> (zip-str "<a href='nakkaya.com'/>")
[{:tag :a, :attrs {:href "nakkaya.com"}, :content nil} nil]

;;root can be rendered with xml/emit-element
user=> (xml/emit-element (zip/root [{:tag :a, :attrs {:href "nakkaya.com"}, :content nil} nil]))
<a href='nakkaya.com'/>

;;emit-element prints its output eagerly rather than returning a lazy value;
;;wrap the call in with-out-str to capture the markup as a string
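
Capturing the printed markup with with-out-str might look like the sketch below. The element map is the one from the example above; the names element and rendered are hypothetical.

```clojure
(require '[clojure.xml :as xml]
         '[clojure.string :as str])

;; Same element as in the example above.
(def element {:tag :a, :attrs {:href "nakkaya.com"}, :content nil})

;; emit-element writes to *out*; with-out-str rebinds *out* to a
;; StringWriter and returns everything printed inside its body.
(def rendered (with-out-str (xml/emit-element element)))

;; emit-element ends its output with a newline, hence the trim.
(str/trim rendered) ; "<a href='nakkaya.com'/>"
```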
;; Use the "startparse" parameter to disable 
;; validation and network requests for external DTDs

(require '[clojure.xml :as xml]
         '[clojure.java.io :as io])
(import '[javax.xml.parsers SAXParserFactory])

(def conforming
  "<?xml version='1.0' encoding='utf-8'?>
     <!DOCTYPE html SYSTEM 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'>
     <html xmlns='http://www.w3.org/1999/xhtml'>
       <article>Hello</article>
     </html>")

;; startparse fn: takes a source and a ContentHandler, returns the parser
(defn non-validating [s ch]
  (.. (doto (SAXParserFactory/newInstance)
        (.setFeature
          "http://apache.org/xml/features/nonvalidating/load-external-dtd"
          false))
      (newSAXParser)
      (parse s ch)))

;; named parsed rather than xml to avoid confusion with the xml namespace alias
(def parsed
  (-> conforming (.getBytes "UTF-8") io/input-stream (xml/parse non-validating)))

;; {:tag :html,
;;  :attrs {:xmlns "http://www.w3.org/1999/xhtml"},
;;  :content [{:tag :article, :attrs nil, :content ["Hello"]}]}
See Also

clojure.core/with-out-str
Evaluates exprs in a context in which *out* is bound to a fresh StringWriter. Returns the string ...

Added by Claj