I have an XML invoice file based on UBL 2.1 for ZATCA, along with the SDK that developers use to test their results. The SDK gives me the hash of the invoice, and I want to compute the same hash in PHP. I hashed the same XML content in PHP, but the result is not the same. After reading more about XML, I decompiled the SDK's CLASS files to understand how it computes the hash. I found the following methods, which I am still trying to understand:
    public static String removeUnneededTags(final File file) {
        final List<String> pathsToBeRemoved = new ArrayList<String>();
        pathsToBeRemoved.add("//*[local-name()='Invoice']//*[local-name()='UBLExtensions']");
        pathsToBeRemoved.add("//*[local-name()='Invoice']//*[local-name()='Signature']");
        pathsToBeRemoved.add("//*[local-name()='Invoice']//*[local-name()='AdditionalDocumentReference']//*[text()= 'QR']/parent::*");
        final String cleanXml = removeElmentListByXpath(file, pathsToBeRemoved);
        final String removedNewlinesAndSpaces = cleanXml.replaceAll("[\\n\\t ]", "").replaceAll("\r", "");
        return removedNewlinesAndSpaces;
    }
What I understand is that I have to remove the UBLExtensions element, the Signature element, and the AdditionalDocumentReference element related to the QR code, and then remove all newlines, tabs and spaces (the "[\\n\\t ]" pattern, followed by "\r").
I have already removed those elements and stripped the newlines and spaces so that the XML file is a single line. That is what I understand, but I still do not get the same hash.
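To check my reading of the two replaceAll calls in removeUnneededTags, here is a minimal Java sketch that applies them to a made-up string. The class name WhitespaceStripDemo and the sample element names are hypothetical and are not part of the SDK. It suggests that every space is removed, including spaces between an element name and its attributes and spaces inside text content, so a PHP port would need to strip those too.

```java
// Minimal sketch: applies the same two replaceAll calls as removeUnneededTags
// to a made-up XML string. Class and element names are hypothetical.
class WhitespaceStripDemo {

    // Same regexes as in the decompiled method: first drop LF, TAB and space,
    // then drop CR.
    static String strip(String xml) {
        return xml.replaceAll("[\\n\\t ]", "").replaceAll("\r", "");
    }

    public static void main(String[] args) {
        String sample = "<Note id=\"a b\">\n\t<Text>Hello World</Text>\r\n</Note>";
        System.out.println(strip(sample));
        // prints <Noteid="ab"><Text>HelloWorld</Text></Note>
    }
}
```

Note that the result is no longer well-formed XML (the space between the tag name and the attribute is gone), so it can only be produced by string replacement after serialization, not by an XML writer.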
I also found this method, but I did not understand what it does:
    public static String removeElmentListByXpath(final File file, final List<String> paths) {
        try {
            // Read the whole file and parse it into a namespace-aware DOM document.
            final String result = IOUtils.toString((Reader) new FileReader(file));
            final DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            final DocumentBuilder db = dbf.newDocumentBuilder();
            final org.w3c.dom.Document doc = db.parse(new ByteArrayInputStream(result.getBytes(StandardCharsets.UTF_8)));
            final XPathFactory xPathfactory = XPathFactory.newInstance();
            final XPath xpath = xPathfactory.newXPath();
            // The Transformer writes the DOM back out as text (UTF-8, no re-indentation).
            final TransformerFactory transformerFactory = TransformerFactory.newInstance();
            final Transformer transf = transformerFactory.newTransformer();
            transf.setOutputProperty("encoding", "UTF-8");
            transf.setOutputProperty("indent", "no");
            // The decompiler emitted uninitialized placeholders (xPath, expr, o, tobeDeleted)
            // before this lambda; they are mapped back to xpath and doc here, which
            // appears to be the intent.
            paths.forEach(path -> {
                try {
                    final XPathExpression expr = xpath.compile(path);
                    // XPathConstants.NODE yields only the first matching node.
                    final Node tobeDeleted = (Node) expr.evaluate(doc, XPathConstants.NODE);
                    tobeDeleted.getParentNode().removeChild(tobeDeleted);
                }
                catch (XPathExpressionException e) {
                    // Decompiled as invokedynamic makeConcatWithConstants; the constant
                    // message prefix is not recoverable from the decompiler output.
                    Utils.log.warn(e.getMessage());
                }
            });
            // Serialize the modified document to a string without the XML declaration.
            final DOMSource source = new DOMSource(doc);
            final StreamResult xmlOutput = new StreamResult(new StringWriter());
            transf.setOutputProperty("omit-xml-declaration", "yes");
            transf.transform(source, xmlOutput);
            return xmlOutput.getWriter().toString();
        }
        catch (Exception e2) {
            // Same decompiler artifact as above: the message prefix is not recoverable.
            Utils.log.warn(e2.getMessage());
            return null;
        }
    }
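To see what the parse-then-serialize step in removeElmentListByXpath does on its own, here is a minimal, self-contained Java sketch using the same DocumentBuilderFactory and Transformer settings, but without any XPath removal. The class name ReserializeDemo and the sample XML are hypothetical. The point is that the string that gets hashed is the Transformer's output, which may not be byte-for-byte identical to the original file (the XML declaration is dropped, and details such as how empty elements or attribute quotes are written can change), so deleting elements by hand in a text editor may not reproduce it.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

// Minimal sketch (hypothetical class name): parse XML into a DOM and write it
// back with the same output properties the decompiled method sets.
class ReserializeDemo {

    static String reserialize(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            DocumentBuilder db = dbf.newDocumentBuilder();
            Document doc = db.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            Transformer transf = TransformerFactory.newInstance().newTransformer();
            transf.setOutputProperty("encoding", "UTF-8");
            transf.setOutputProperty("indent", "no");
            transf.setOutputProperty("omit-xml-declaration", "yes");

            StringWriter out = new StringWriter();
            transf.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Wrapped so callers do not need to declare checked exceptions.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String input = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<Root attr='1'>\n  <Empty></Empty>\n</Root>";
        String output = reserialize(input);
        System.out.println(output);
        // Compare with the input to see what the serializer changed.
        System.out.println("identical to input: " + output.equals(input));
    }
}
```

Running this on the real invoice and comparing the output with the hand-edited single-line file should show whether the two inputs to the hash already differ before any hashing happens.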
My PHP code to hash the content is:
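    $test_file = "tax_invoicenospaces.xml";
    // Read the file once and hash exactly the bytes that were read.
    $test_file_read = file_get_contents($test_file);
    // Third argument false = lowercase hexadecimal output (true would return raw bytes).
    $test_file_hash = hash("sha256", $test_file_read, false);
    // Print the file name rather than the whole file content in the label.
    print("File Hash ($test_file): $test_file_hash");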
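One more thing worth ruling out is the output encoding of the digest. The SDK's hashing code is not shown above, so the following Java sketch is only an assumption about the general shape of such code, not the SDK's actual method. The class name HashEncodingDemo is hypothetical. It shows that the same SHA-256 digest looks completely different when printed as hexadecimal and when the raw bytes are Base64-encoded. In PHP, hash("sha256", $s) gives the hexadecimal form, while base64_encode(hash("sha256", $s, true)) gives the Base64 form of the raw digest.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Minimal sketch (hypothetical class name): one SHA-256 digest, two encodings.
class HashEncodingDemo {

    // SHA-256 over the UTF-8 bytes of the given text.
    static byte[] sha256(String text) {
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Lowercase hexadecimal, the same form PHP's hash("sha256", ...) returns by default.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // Base64 of the raw digest bytes (not of the hex string).
    static String toBase64(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    public static void main(String[] args) {
        byte[] digest = sha256("abc");
        System.out.println(toHex(digest));
        // prints ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
        System.out.println(toBase64(digest));
        // prints ungWv48Bz+pBQUDeXa4iI7ADYaOWF3qctBD/YfIAFa0=
    }
}
```

If the expected hash from the SDK ends in "=" or contains "+" or "/", it is probably not hexadecimal, and the PHP side would need the raw-bytes form of the digest before encoding.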
Could someone explain what the SDK is doing, so that I can get exactly the same hash in PHP?
Thanks,