Hash XML file based on UBL 2.1 in PHP

mohammadsaid1980 · May 1, 2022, 11:26am

I have an XML file based on UBL 2.1 for ZATCA, along with the SDK that developers use to test their results. I have the expected hash output and want to reproduce it in PHP. I took the same XML content and hashed it in PHP, but the result is not the same. After reading more about XML, I decompiled the SDK's class files to understand how they hash it. I found these methods and am trying to understand them:

    public static String removeUnneededTags(final File file) {
        final List<String> pathsToBeRemoved = new ArrayList<String>();
        pathsToBeRemoved.add("//*[local-name()='Invoice']//*[local-name()='UBLExtensions']");
        pathsToBeRemoved.add("//*[local-name()='Invoice']//*[local-name()='Signature']");
        pathsToBeRemoved.add("//*[local-name()='Invoice']//*[local-name()='AdditionalDocumentReference']//*[text()= 'QR']/parent::*");
        final String cleanXml = removeElmentListByXpath(file, pathsToBeRemoved);
        final String removedNewlinesAndSpaces = cleanXml.replaceAll("[\\n\\t ]", "").replaceAll("\r", "");
        return removedNewlinesAndSpaces;
    }

What I understand is that I have to remove the UBLExtensions and Signature elements and the AdditionalDocumentReference element related to the QR code, and then remove all newlines, tabs, and spaces (the [\\n\\t ] pattern, plus \r). I already removed those elements and stripped the newlines and spaces so that the XML file is a single line, but I still do not get the same hash output!
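To see what the two replaceAll calls in removeUnneededTags actually do to a string, here is a minimal sketch. The class name and the sample XML are made up for illustration; only the two regex calls are taken from the decompiled method. Note that the pattern removes every space, tab, and newline in the serialized text, including spaces inside tags and inside element content, not only the indentation between elements.

```java
// Hypothetical demo class; only the two replaceAll calls come from the decompiled SDK method.
class WhitespaceStripDemo {

    // Same two replacements as in removeUnneededTags: drop \n, \t and spaces, then drop \r.
    static String strip(String xml) {
        return xml.replaceAll("[\\n\\t ]", "").replaceAll("\r", "");
    }

    public static void main(String[] args) {
        // Made-up sample with an attribute, indentation, a CRLF and a space inside text.
        String xml = "<a b=\"1\">\n\t<c>hello world</c>\r\n</a>";
        System.out.println(strip(xml)); // prints <ab="1"><c>helloworld</c></a>
    }
}
```

If the hash is computed over a string like this, a PHP port would need to apply the same character removal to the serialized XML string; removing only the whitespace between elements would produce different bytes and therefore a different hash.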

And I found this method, but I didn't get the idea behind it:

public static String removeElmentListByXpath(final File file, final List<String> paths) {
        try {
            final String result = IOUtils.toString((Reader)new FileReader(file));
            final DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            final DocumentBuilder db = dbf.newDocumentBuilder();
            final org.w3c.dom.Document doc = db.parse(new ByteArrayInputStream(result.getBytes(StandardCharsets.UTF_8)));
            final XPathFactory xPathfactory = XPathFactory.newInstance();
            final XPath xpath = xPathfactory.newXPath();
            final TransformerFactory transformerFactory = TransformerFactory.newInstance();
            final Transformer transf = transformerFactory.newTransformer();
            transf.setOutputProperty("encoding", "UTF-8");
            transf.setOutputProperty("indent", "no");
            // The decompiler emitted uninitialized locals here (xPath, o, expr, tobeDeleted);
            // the lambda presumably uses the xpath and doc objects created above.
            paths.forEach(path -> {
                try {
                    final XPathExpression expr = xpath.compile(path);
                    // XPathConstants.NODE returns only the first node that matches the path.
                    final Node tobeDeleted = (Node)expr.evaluate(doc, XPathConstants.NODE);
                    tobeDeleted.getParentNode().removeChild(tobeDeleted);
                }
                catch (XPathExpressionException e) {
                    // Originally an invokedynamic string concatenation; its constant prefix is not recoverable.
                    Utils.log.warn(e.getMessage());
                }
            });
            // Serialize the modified DOM back to a string without the <?xml ...?> declaration.
            final DOMSource source = new DOMSource(doc);
            final StreamResult xmlOutput = new StreamResult(new StringWriter());
            transf.setOutputProperty("omit-xml-declaration", "yes");
            transf.transform(source, xmlOutput);
            return xmlOutput.getWriter().toString();
        }
        catch (Exception e2) {
            // Originally an invokedynamic string concatenation; its constant prefix is not recoverable.
            Utils.log.warn(e2.getMessage());
            return null;
        }
    }
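As far as I can tell, the method parses the file into a DOM, removes the first node matched by each XPath expression, and then serializes the DOM back to a string without the XML declaration and without indentation. Below is a minimal, self-contained sketch of that idea on a tiny made-up document. The class name, method name, and sample XML are hypothetical, and the null check is my addition (the decompiled code has none), so treat this as an illustration rather than the SDK's exact behavior.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the SDK.
class XPathRemovalDemo {

    // Removes the first node matched by each path, then serializes the DOM to a string.
    static String removeFirstMatches(String xml, String... paths) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            for (String path : paths) {
                // XPathConstants.NODE yields only the first match, or null if nothing matches.
                Node node = (Node) xpath.evaluate(path, doc, XPathConstants.NODE);
                if (node != null) { // null check added for this sketch only
                    node.getParentNode().removeChild(node);
                }
            }
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            t.setOutputProperty(OutputKeys.INDENT, "no");
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up, namespace-free stand-in for an invoice.
        String xml = "<Invoice><UBLExtensions><x/></UBLExtensions>"
                + "<AdditionalDocumentReference><ID>QR</ID></AdditionalDocumentReference>"
                + "<AdditionalDocumentReference><ID>ICV</ID></AdditionalDocumentReference>"
                + "<Total>10</Total></Invoice>";
        System.out.println(removeFirstMatches(xml,
                "//*[local-name()='Invoice']//*[local-name()='UBLExtensions']",
                "//*[local-name()='Invoice']//*[local-name()='AdditionalDocumentReference']//*[text()= 'QR']/parent::*"));
    }
}
```

The QR path selects the element whose text is 'QR' and then steps up to its parent, so the whole AdditionalDocumentReference that contains it is removed while the other AdditionalDocumentReference stays. The string that gets hashed is whatever the serializer writes, so a PHP port has to produce byte-for-byte the same serialization (namespace declarations, attribute quoting, empty-element form), not just the same logical XML.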

My PHP code to hash the content is:

$test_file = "tax_invoicenospaces.xml";
$test_file_read = file_get_contents($test_file);
  
$test_file_hash = hash_file("sha256", $test_file, false);

print("File Hash ($test_file_read): $test_file_hash");
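One more thing worth checking when comparing against the SDK output is the encoding of the digest. PHP's hash_file with the last argument set to false returns the SHA-256 digest as lowercase hex; the same digest written as Base64 looks completely different. The sketch below shows both encodings on the Java side for a fixed input. The class and method names are hypothetical, and which encoding the SDK prints is not shown in the decompiled code above, so that part is an assumption to verify.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical demo class, not part of the SDK.
class HashEncodingDemo {

    // SHA-256 over the UTF-8 bytes of the text (raw 32-byte digest).
    static byte[] sha256(String text) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(text.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Lowercase hex, the same form PHP's hash("sha256", ...) returns by default.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] digest = sha256("abc");
        // prints ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
        System.out.println(toHex(digest));
        // Base64 of the same raw digest bytes: a different-looking string for the same hash.
        System.out.println(Base64.getEncoder().encodeToString(digest));
    }
}
```

If the expected value from the SDK ends with "=" or contains "+" or "/", it is probably Base64 rather than hex, and the PHP side would need to encode the digest the same way before the two strings can match.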

If someone can explain this and help me get the exact same hash, I would appreciate it.

Thanks,
