Is there a multiline text processing tool?

  • File A has several HTML structures.
  • File B has this HTML structure:
    <footer class="site-footer">
      <div class="site-footer__inner container">
        {{ page.footer_top }}
        {{ page.footer_bottom }}
      </div>
    </footer>
  • File C has this HTML structure:
    <footer class="site-footer">
      <div class="site-footer__inner container">
        {{ page.footer_top }}
        {{ page.footer_bottom }}
      </div>
      <span class="globalrs_dynamic_year">{{ 'now' | date('Y') }}</span>
    </footer>

How can I automatically search file A and, if it contains the text of file B, replace that text with the text of file C? (By "automatically" I mean not by hand in Vim or Nano, but with a program that does it for me.)

sed and awk do not seem like a good fit for the job, because they work line by line by default, which makes multi-line replacements awkward.

How would you do this with C/Perl/Python/PHP/Node.js or something else?

HTML is structurally close to XML, so as long as the markup is well-formed you can parse it as XML and search, refine, etc. using XML tools.

Hi Dave !

In this case I must do the search (and possible replace) from the command line (CLI), because it is run from cron every minute.
Why?
Well, each time I change the text manually in Drupal core and then update Drupal core, the change is overwritten, so I want to reapply it every minute.

Of course, I can add and maintain some Twig code to my Drupal theme but for this particular textual change, I really want to avoid that.

I see you have Python - you can use https://docs.python.org/3/library/xml.etree.elementtree.html
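
A minimal sketch of that ElementTree approach, assuming the markup parses as well-formed XML (real Twig templates often do not, for example when they contain something like `<div{{ attributes }}>`). The function name `add_year_span` and the sample string `FOOTER_B` are hypothetical; the class names and the Twig expression are taken from files B and C in the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the footer from file B as a well-formed fragment.
FOOTER_B = """<footer class="site-footer">
  <div class="site-footer__inner container">
    {{ page.footer_top }}
    {{ page.footer_bottom }}
  </div>
</footer>"""


def add_year_span(markup):
    """Append the year <span> to the site footer unless it is already there."""
    root = ET.fromstring(markup)
    # The root itself may be the footer; otherwise look further down the tree.
    if root.tag == "footer" and root.get("class") == "site-footer":
        footer = root
    else:
        footer = root.find(".//footer[@class='site-footer']")
    if footer is None:
        return markup  # no footer found, leave the text untouched
    if footer.find("span[@class='globalrs_dynamic_year']") is not None:
        return markup  # already patched, so repeated runs change nothing
    span = ET.SubElement(footer, "span", {"class": "globalrs_dynamic_year"})
    span.text = "{{ 'now' | date('Y') }}"
    return ET.tostring(root, encoding="unicode")


patched = add_year_span(FOOTER_B)
print(patched)
```

Note that ElementTree re-serializes the whole document, so whitespace and formatting may differ from the input. If the file does not parse as XML, plain text replacement is probably the more practical route.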

This works for me.

import re

# Read file A (the file to patch), file B (the text to look for)
# and file C (the replacement text)
with open('a.html', 'r') as file:
    contents_a = file.read()

with open('b.html', 'r') as file:
    contents_b = file.read()

with open('c.html', 'r') as file:
    contents_c = file.read()

# Search for the text of file B inside file A. re.escape() makes re.search()
# treat B as literal text; without it, characters such as . ( ) | would be
# interpreted as regular-expression syntax.
match = re.search(re.escape(contents_b), contents_a)

# Check if a match was found
if match:
    print(f'Match found at position {match.start()}')
    contents_new = contents_a.replace(contents_b, contents_c)

    with open('a.html', 'w') as file:
        file.write(contents_new)

else:
    print('No match found')
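
Since this is meant to run from cron every minute, here is a hedged variant of the same idea wrapped in functions. It uses plain string methods only and writes the file only when something was actually replaced. The names `replace_block` and `patch_file` are hypothetical, and stripping surrounding whitespace from B and C is my own assumption, made so that a trailing newline in `b.html` does not prevent a match.

```python
from pathlib import Path


def replace_block(text, old, new):
    """Return (new_text, count) after replacing literal occurrences of old."""
    # Ignore leading/trailing whitespace of the blocks (an assumption of this sketch).
    old, new = old.strip(), new.strip()
    if not old:
        return text, 0  # an empty search string would match everywhere
    count = text.count(old)
    return text.replace(old, new), count


def patch_file(target, old_file, new_file):
    """Patch target in place; write only when something was replaced."""
    target = Path(target)
    text = target.read_text(encoding="utf-8")
    old = Path(old_file).read_text(encoding="utf-8")
    new = Path(new_file).read_text(encoding="utf-8")
    patched, count = replace_block(text, old, new)
    if count:
        target.write_text(patched, encoding="utf-8")
    return count
```

It could be called as `patch_file('a.html', 'b.html', 'c.html')`. With the B and C shown in the question, C does not contain B as a whole (in C the closing `</div>` is followed by the `<span>`, not by `</footer>`), so a second run finds nothing and leaves the file alone. The indentation inside the block still has to match file A exactly.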

Hi, that’s Python code, right? (I have never written Python before.)

Yes, it is Python, but it can easily be changed to another language, as long as that language supports regular expressions.

@Zensei what is the full meaning of

  • re
  • r
  • e

here?

re is a module from Python's standard library for working with regular expressions. So I guess the author named it re as short for Regular Expression.

https://docs.python.org/3/library/re.html
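
To illustrate what the module does with a pattern, here is a small sketch using the Twig expression from file C as the search text (the variable names are hypothetical). Passed to `re.search()` as-is, characters like `|` and `(` are read as regex syntax, so the search does not find the exact text; `re.escape()` or a plain `in` test treats it literally.

```python
import re

literal = "{{ 'now' | date('Y') }}"
text = "<span class=\"globalrs_dynamic_year\">{{ 'now' | date('Y') }}</span>"

# Unescaped: | means "or" and ( ) form a group, so the pattern no longer
# stands for the exact text.
unescaped = re.search(literal, text)
found_exact = unescaped is not None and unescaped.group() == literal
print("unescaped pattern matched the exact text:", found_exact)

# Escaped: every special character is matched literally.
escaped = re.search(re.escape(literal), text)
print("escaped pattern matched the exact text:", escaped.group() == literal)

# For a purely literal search, no regular expression is needed at all.
print("plain substring test:", literal in text)
```

For fixed text like the footer block, the plain substring test is usually the simplest option; regular expressions pay off when the text to find can vary.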
