This is the code:

    import xml.sax
    from xml.sax import handler, make_parser, parse
    import os
    import collections

    class SaxParser():
        # initializer with directory path as argument
        def __init__(self, dir_path):
            self.dir_path = dir_path

        def test_each_file(self, file_path):
            # ensure full file name is shown
            rev = file_path[::-1]  # reverse string file_path to access position of "/"
            file = file_path[-rev.index("/"):]
            try:
                f = open(file_path, 'r', encoding="ISO-8859-1")  # same as "latin-1" encoding
                # see this for enabling validation:
                # https://stackoverflow.com/questions/6349513/parsing-xml-entity-with-python-xml-sax
                parser = make_parser()  # default parser is expat
                parser.setContentHandler(handler.ContentHandler())
                parser.setFeature(handler.feature_namespaces, True)
                parser.setFeature(handler.feature_validation, True)
                parser.setFeature(handler.feature_external_ges, True)
                parser.parse(f)
                f.close()
                return (file, "OK")
            except xml.sax.SAXParseException as PE:
                column = PE.getColumnNumber()
                line = PE.getLineNumber()
                msg = PE.getMessage()
                value = msg + " " + str(line) + " " + str(column)
                return (file, value)
            except ValueError:
                return (file, "ValueError. DTD uri not found.")  # that can happen

        def test_directory_sax(self, dir_path):
            tuples = []
            for ind, file in enumerate(os.listdir(dir_path), 1):
                if file.endswith('.xml'):
                    tuples.append(self.test_each_file(dir_path + file))
            # convert into dict and sort it by key (file number)
            dict_of_errors = dict(tuples)
            dict_of_errors = collections.
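For comparison, here is a minimal sketch of only the well-formedness part of such a check, using the standard library. The helper name `check_well_formed` is hypothetical, and external entity loading is switched off as an assumption to keep the example self-contained. It does not validate against a DTD, because the default expat driver is a non-validating parser.

```python
import io
import xml.sax
from xml.sax import handler


def check_well_formed(xml_bytes):
    """Return "OK", or "<message> <line> <column>" for the first parse error.

    Hypothetical helper: checks well-formedness only, not DTD validity.
    """
    parser = xml.sax.make_parser()  # expat-based reader by default
    parser.setContentHandler(handler.ContentHandler())
    parser.setFeature(handler.feature_namespaces, True)
    # Assumption: do not fetch external general entities in this sketch.
    parser.setFeature(handler.feature_external_ges, False)
    try:
        parser.parse(io.BytesIO(xml_bytes))
    except xml.sax.SAXParseException as exc:
        return "{} {} {}".format(
            exc.getMessage(), exc.getLineNumber(), exc.getColumnNumber()
        )
    return "OK"
```

Calling `check_well_formed(b"<a><b></a>")` returns an error string with the message, line, and column, in the same shape the code above builds by hand.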
After an xmlparser object has been created, various attributes of the object can be set to handler functions.
When an XML document is then fed to the parser, the handler functions are called for the character data and markup in the XML document.
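The handler mechanism described here can be sketched with the standard library's `xml.parsers.expat` module. The function name `collect_events` and the tuple layout are illustrative choices, not part of any API.

```python
import xml.parsers.expat


def collect_events(xml_text):
    """Parse xml_text and return the handler calls as a list of tuples."""
    events = []
    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = True  # deliver adjacent character data in one call

    def start_element(name, attrs):
        events.append(("start", name, attrs))

    def end_element(name):
        events.append(("end", name))

    def char_data(data):
        if data.strip():  # skip whitespace-only text
            events.append(("text", data))

    # Handlers are plain attributes set on the parser object.
    parser.StartElementHandler = start_element
    parser.EndElementHandler = end_element
    parser.CharacterDataHandler = char_data

    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return events
```

Feeding it `'<a x="1">hi<b/></a>'` yields start, text, and end events in document order.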
* 4DOM: A fully compliant DOM Level 2 implementation
* javadom: An adapter from Java DOM implementations to the standard Python DOM binding.
* pulldom: a DOM implementation that supports lazy instantiation of nodes.
I'm trying to check the validity of XML files (against DTDs, entities, processing instructions, namespaces) in Python 3.4.
* marshal: a module with several options for serializing Python objects to XML, including WDDX and XML-RPC.
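The lazy node instantiation mentioned for pulldom can be illustrated with the standard library's `xml.dom.pulldom` module, on the assumption that it behaves like the pulldom listed above. The helper name `expand_matching` is hypothetical.

```python
from xml.dom import pulldom


def expand_matching(xml_text, tag):
    """Return the serialized subtree of every <tag> element.

    Full DOM subtrees are built only for the matching elements;
    everything else is consumed as a stream of events.
    """
    results = []
    events = pulldom.parseString(xml_text)
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == tag:
            events.expandNode(node)  # build children for this node only
            results.append(node.toxml())
    return results
```

With `'<root><item>a</item><skip>x</skip><item>b</item></root>'` and tag `'item'`, only the two `item` elements are expanded into DOM subtrees.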
4Suite is a toolkit for XML and RDF application development.
It features a library of integrated tools for XML processing, implementing open technologies such as DOM, RDF, XSLT, XInclude, XPointer, XLink, XPath, XUpdate, RELAX NG, and XML/SGML Catalogs.
Layered upon this is an XML and RDF data repository and server, which supports multiple methods of data access, query, indexing, transformation, rich linking, and rule processing, and provides the data infrastructure of a full database system, including transactions, concurrency, access control, and management tools.