How do I validate an XML file in Java against an XSD that contains an include?

Solution:

You need to use an LSResourceResolver for this to work. Take a look at the sample code below.

A validation method:

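import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Source;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// note that if your XML already declares the XSD to which it has to conform, then there's no need to declare the schemaName here
void validate(String xml, String schemaName) throws Exception {

    DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
    builderFactory.setNamespaceAware(true);

    DocumentBuilder parser = builderFactory.newDocumentBuilder();

    // parse the XML into a document object (a StringReader avoids any charset guessing)
    Document document = parser.parse(new InputSource(new StringReader(xml)));

    SchemaFactory factory = SchemaFactory
            .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);

    // associate the schema factory with the resource resolver, which is responsible for resolving the imported XSDs
    factory.setResourceResolver(new ResourceResolver());

    // note that if your XML already declares the XSD to which it has to conform, then there's no need to create a validator from a Schema object
    Source schemaFile = new StreamSource(getClass().getClassLoader()
            .getResourceAsStream(schemaName));
    Schema schema = factory.newSchema(schemaFile);

    // validate() throws a SAXException if the document does not conform to the schema
    Validator validator = schema.newValidator();
    validator.validate(new DOMSource(document));
}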

The resource resolver implementation:

import java.io.InputStream;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;

public class ResourceResolver implements LSResourceResolver {

    @Override
    public LSInput resolveResource(String type, String namespaceURI,
            String publicId, String systemId, String baseURI) {

        // note: in this sample, the XSDs are expected to be in the root of the classpath
        InputStream resourceAsStream = this.getClass().getClassLoader()
                .getResourceAsStream(systemId);
        return new Input(publicId, systemId, resourceAsStream);
    }
}


 

The input implementation returned by the resource resolver:

import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import org.w3c.dom.ls.LSInput;

public class Input implements LSInput {

    private String publicId;
    private String systemId;
    private BufferedInputStream inputStream;

    public Input(String publicId, String sysId, InputStream input) {
        this.publicId = publicId;
        this.systemId = sysId;
        this.inputStream = new BufferedInputStream(input);
    }

    public String getPublicId() {
        return publicId;
    }

    public void setPublicId(String publicId) {
        this.publicId = publicId;
    }

    public String getSystemId() {
        return systemId;
    }

    public void setSystemId(String systemId) {
        this.systemId = systemId;
    }

    public BufferedInputStream getInputStream() {
        return inputStream;
    }

    public void setInputStream(BufferedInputStream inputStream) {
        this.inputStream = inputStream;
    }

    // The schema content is only exposed through getStringData();
    // the other sources return null so that the parser falls back to it.
    public String getBaseURI() {
        return null;
    }

    public InputStream getByteStream() {
        return null;
    }

    public boolean getCertifiedText() {
        return false;
    }

    public Reader getCharacterStream() {
        return null;
    }

    public String getEncoding() {
        return null;
    }

    public String getStringData() {
        synchronized (inputStream) {
            try {
                // available() is only an estimate, so read until the end of the stream
                ByteArrayOutputStream contents = new ByteArrayOutputStream();
                byte[] buffer = new byte[4096];
                int read;
                while ((read = inputStream.read(buffer)) != -1) {
                    contents.write(buffer, 0, read);
                }
                // decodes the bytes with the platform default charset
                return contents.toString();
            } catch (IOException e) {
                e.printStackTrace();
                return null;
            }
        }
    }

    // The remaining setters are intentionally no-ops in this sample.
    public void setBaseURI(String baseURI) {
    }

    public void setByteStream(InputStream byteStream) {
    }

    public void setCertifiedText(boolean certifiedText) {
    }

    public void setCharacterStream(Reader characterStream) {
    }

    public void setEncoding(String encoding) {
    }

    public void setStringData(String stringData) {
    }
}
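To try the resolver idea without putting any files on the classpath, here is a minimal self-contained sketch. It is not the code from the answer: the class name IncludeValidationDemo, the StringInput helper, and the two tiny schemas (main.xsd including types.xsd) are hypothetical and exist only for illustration. The schemas live in an in-memory map keyed by the schemaLocation value, and the resolver simply looks the requested systemId up in that map.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;
import org.xml.sax.SAXException;

class IncludeValidationDemo {

    // Hypothetical in-memory "files", keyed by the schemaLocation used to reach them.
    static final Map<String, String> SCHEMAS = new HashMap<>();
    static {
        SCHEMAS.put("main.xsd",
                "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
              + "<xs:include schemaLocation='types.xsd'/>"
              + "<xs:element name='age' type='AgeType'/>"
              + "</xs:schema>");
        SCHEMAS.put("types.xsd",
                "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
              + "<xs:simpleType name='AgeType'>"
              + "<xs:restriction base='xs:int'><xs:minInclusive value='0'/></xs:restriction>"
              + "</xs:simpleType>"
              + "</xs:schema>");
    }

    /** Minimal LSInput that serves its content as a character stream only. */
    static final class StringInput implements LSInput {
        private String publicId;
        private String systemId;
        private final String content;

        StringInput(String publicId, String systemId, String content) {
            this.publicId = publicId;
            this.systemId = systemId;
            this.content = content;
        }

        public Reader getCharacterStream() { return new StringReader(content); }
        public InputStream getByteStream() { return null; }
        public String getStringData() { return null; }
        public String getSystemId() { return systemId; }
        public String getPublicId() { return publicId; }
        public String getBaseURI() { return null; }
        public String getEncoding() { return null; }
        public boolean getCertifiedText() { return false; }
        public void setCharacterStream(Reader characterStream) { }
        public void setByteStream(InputStream byteStream) { }
        public void setStringData(String stringData) { }
        public void setSystemId(String systemId) { this.systemId = systemId; }
        public void setPublicId(String publicId) { this.publicId = publicId; }
        public void setBaseURI(String baseURI) { }
        public void setEncoding(String encoding) { }
        public void setCertifiedText(boolean certifiedText) { }
    }

    /** Compiles main.xsd; the resolver supplies types.xsd when the include is processed. */
    static Schema loadSchema() {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        LSResourceResolver resolver = (type, namespaceURI, publicId, systemId, baseURI) -> {
            String text = SCHEMAS.get(systemId);
            // Returning null tells the parser to fall back to its default lookup.
            return text == null ? null : new StringInput(publicId, systemId, text);
        };
        factory.setResourceResolver(resolver);
        try {
            return factory.newSchema(
                    new StreamSource(new StringReader(SCHEMAS.get("main.xsd")), "main.xsd"));
        } catch (SAXException e) {
            throw new IllegalStateException("Schema could not be compiled", e);
        }
    }

    /** Returns true if the XML conforms to main.xsd, false if validation reports an error. */
    static boolean isValid(String xml) {
        try {
            loadSchema().newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<age>5</age>"));
        System.out.println(isValid("<age>-1</age>"));
    }
}
```

The first document satisfies AgeType and the second violates its minInclusive facet, which is only known to the validator because the include was resolved. Serving the content through getCharacterStream() instead of getStringData() is a design choice of this sketch, not something the answer prescribes.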


The accepted answer is perfectly fine, but it does not work with Java 8 without some modifications. It would also be nice to be able to specify a base path from which the imported schemas are read.

On Java 8 I have used the following code, which lets you specify a base path for the embedded schemas other than the classpath root:

import com.sun.org.apache.xerces.internal.dom.DOMInputImpl;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;

import java.io.InputStream;
import java.util.Objects;

// Note: DOMInputImpl is a JDK-internal class. On Java 9 and later it may not be
// accessible without extra flags, in which case your own LSInput implementation is needed.
public class ResourceResolver implements LSResourceResolver {

    private String basePath;

    public ResourceResolver(String basePath) {
        this.basePath = basePath;
    }

    @Override
    public LSInput resolveResource(String type, String namespaceURI, String publicId, String systemId, String baseURI) {
        // note: the XSDs are looked up on the classpath, under basePath when one is given
        InputStream resourceAsStream = this.getClass().getClassLoader()
                .getResourceAsStream(buildPath(systemId));
        Objects.requireNonNull(resourceAsStream, String.format("Could not find the specified xsd file: %s", systemId));
        return new DOMInputImpl(publicId, systemId, baseURI, resourceAsStream, "UTF-8");
    }

    private String buildPath(String systemId) {
        return basePath == null ? systemId : String.format("%s/%s", basePath, systemId);
    }
}
    

This implementation also gives the user a meaningful message when the schema cannot be read.
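As a small illustration of that lookup and its error message, the following sketch isolates the two relevant steps. The class name BasePathLookupDemo and the method open are hypothetical; buildPath and the message text mirror the resolver above. It assumes nothing named no/such/dir/missing.xsd exists on the classpath.

```java
import java.io.InputStream;
import java.util.Objects;

class BasePathLookupDemo {

    // Same rule as the resolver above: prefix the systemId with the base path, if any.
    static String buildPath(String basePath, String systemId) {
        return basePath == null ? systemId : String.format("%s/%s", basePath, systemId);
    }

    // Looks the schema up on the classpath and fails with a descriptive message if it is absent.
    static InputStream open(String basePath, String systemId) {
        InputStream in = BasePathLookupDemo.class.getClassLoader()
                .getResourceAsStream(buildPath(basePath, systemId));
        return Objects.requireNonNull(in,
                String.format("Could not find the specified xsd file: %s", systemId));
    }

    public static void main(String[] args) {
        System.out.println(buildPath("schemas/v1", "types.xsd"));
        try {
            open("no/such/dir", "missing.xsd");
        } catch (NullPointerException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Without the requireNonNull check, a missing file would surface later as a less helpful failure inside the parser, so the explicit message makes it easier to see which include could not be found.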

I had to make some modifications to this post by AMegmondoEmber.

My main schema file had some includes from sibling folders, and the included files in turn had includes from their own local folders. I also had to track the base resource path and the relative path of the current resource. This code works for me, but note that it assumes every XSD file has a unique name. If you have XSD files with the same name but different content at different paths, it will probably cause problems.
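The sibling-folder case comes down to resolving leading "../" segments against the folder of the including schema. Here is a minimal sketch of that step in isolation; the class name RelativePathDemo and the sample paths are hypothetical, and it assumes the base path always ends with "/", as the resolver in this answer does.

```java
class RelativePathDemo {

    // Resolves leading "../" segments of relativePath against basePath.
    // Assumes basePath ends with "/" and has at least as many folders as there are "../" segments.
    static String normalize(String basePath, String relativePath) {
        while (relativePath.startsWith("../")) {
            // Drop the trailing "/" and cut back to the parent folder (keeping its "/").
            String withoutSlash = basePath.substring(0, basePath.length() - 1);
            basePath = basePath.substring(0, withoutSlash.lastIndexOf("/") + 1);
            relativePath = relativePath.substring(3);
        }
        return basePath + relativePath;
    }

    public static void main(String[] args) {
        // An include of "../common/types.xsd" from a schema in schemas/main/
        System.out.println(normalize("schemas/main/", "../common/types.xsd"));
    }
}
```

Paths produced this way use forward slashes only, which is what ClassLoader.getResourceAsStream expects.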

import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;

/**
 * The Class ResourceResolver.
 */
public class ResourceResolver implements LSResourceResolver {

    /** The logger. */
    private final Logger logger = LoggerFactory.getLogger(this.getClass());

    /** The schema base path (expected to end with "/"). */
    private final String schemaBasePath;

    /** The path map: XSD file name to the classpath folder it was found in. */
    private Map<String, String> pathMap = new HashMap<>();

    /**
     * Instantiates a new resource resolver.
     *
     * @param schemaBasePath the schema base path
     */
    public ResourceResolver(String schemaBasePath) {
        this.schemaBasePath = schemaBasePath;
        logger.warn("This LSResourceResolver implementation assumes that all XSD files have a unique name. "
                + "If you have some XSD files with same name but different content (at different paths) in your schema structure, "
                + "this resolver will fail to include the other XSD files except the first one found.");
    }

    /* (non-Javadoc)
     * @see org.w3c.dom.ls.LSResourceResolver#resolveResource(java.lang.String, java.lang.String, java.lang.String, java.lang.String, java.lang.String)
     */
    @Override
    public LSInput resolveResource(String type, String namespaceURI,
            String publicId, String systemId, String baseURI) {
        // The base resource that includes this current resource
        String baseResourceName = null;
        String baseResourcePath = null;
        // Extract the current resource name
        String currentResourceName = systemId.substring(systemId
                .lastIndexOf("/") + 1);

        // If this resource hasn't been added yet
        if (!pathMap.containsKey(currentResourceName)) {
            if (baseURI != null) {
                baseResourceName = baseURI
                        .substring(baseURI.lastIndexOf("/") + 1);
            }

            // we don't need "./" since getResourceAsStream cannot understand it
            if (systemId.startsWith("./")) {
                systemId = systemId.substring(2);
            }

            // If the baseResourcePath has already been discovered, get that
            // from pathMap
            if (pathMap.containsKey(baseResourceName)) {
                baseResourcePath = pathMap.get(baseResourceName);
            } else {
                // The baseResourcePath should be the schemaBasePath
                baseResourcePath = schemaBasePath;
            }

            // Read the resource as input stream
            String normalizedPath = getNormalizedPath(baseResourcePath, systemId);
            InputStream resourceAsStream = this.getClass().getClassLoader()
                    .getResourceAsStream(normalizedPath);

            // if the current resource is not in the same path with base
            // resource, add current resource's path to pathMap
            if (systemId.contains("/")) {
                pathMap.put(currentResourceName, normalizedPath.substring(0, normalizedPath.lastIndexOf("/") + 1));
            } else {
                // The current resource should be at the same path as the base
                // resource
                pathMap.put(systemId, baseResourcePath);
            }
            // "\\A" makes the Scanner return the whole stream as a single token
            Scanner s = new Scanner(resourceAsStream).useDelimiter("\\A");
            String s1 = s.next().replaceAll("\n", " ") // the parser cannot understand elements broken down over multiple lines
                    .replace("\t", " ") // these two whitespace replacements are only cosmetic
                    .replaceAll("\\s+", " ").replaceAll("[^\\x20-\\x7e]", ""); // some files have a special first character marking them as UTF-8; note that this drops every non-ASCII character
            InputStream is = new ByteArrayInputStream(s1.getBytes());

            return new LSInputImpl(publicId, systemId, is); // LSInputImpl is not shown; it is the same as the Input class
        }

        // If this resource has already been added, do not add the same resource again. It throws
        // "org.xml.sax.SAXParseException: sch-props-correct.2: A schema cannot contain two global components with the same name; this schema contains two occurrences of ..."
        // return null instead.
        return null;
    }

    /**
     * Gets the normalized path.
     *
     * @param basePath the base path
     * @param relativePath the relative path
     * @return the normalized path
     */
    private String getNormalizedPath(String basePath, String relativePath) {
        // Each leading "../" moves the base path up one folder
        while (relativePath.startsWith("../")) {
            basePath = basePath.substring(0, basePath.substring(0, basePath.length() - 1).lastIndexOf("/") + 1);
            relativePath = relativePath.substring(3);
        }
        return basePath + relativePath;
    }
}
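The resolver above removes every character outside printable ASCII in order to get rid of the special first character of some UTF-8 files, which would also delete any non-ASCII text inside the schema. If that character is a byte order mark, a narrower option is to strip only that one character. This is a sketch of that alternative, not the author's method; the class name BomStripDemo is hypothetical, and it assumes the schema text has already been decoded as UTF-8.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class BomStripDemo {

    // Removes a single leading byte order mark (U+FEFF), if present, and nothing else.
    static String stripBom(String text) {
        return (!text.isEmpty() && text.charAt(0) == '\uFEFF') ? text.substring(1) : text;
    }

    // Re-encodes the cleaned text so it can be handed to an LSInput implementation.
    static InputStream toStream(String schemaText) {
        return new ByteArrayInputStream(stripBom(schemaText).getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String withBom = "\uFEFF<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'/>";
        System.out.println(stripBom(withBom).startsWith("<xs:schema"));
    }
}
```

Encoding with an explicit charset also avoids depending on the platform default, which getBytes() without arguments would use.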
        
    
