Regex that sieves strings by length gives odd results

I’m using a regex with curly-brace quantifiers to sort a roughly 500 KB text list by string length. It takes the strings of increasing length and appends them to the end of the .txt file.

It works pretty well, but the file that goes in is 530 KB and the sieved file is 590 KB, and it’s 600 lines longer.

It should be exactly the same size, unless there are duplicates.

Here is the regex part of the code:

    create_function(
        '$element',
        'return 1 == preg_match("~msgid \\"(.{11,20})\\"~", $element);'
    )

I’m a noob so I didn’t loop it; instead I replicated the code over and over using

{1,10}, {11,20}, {21,30}, {31,40}, {41,50}, etc.
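For anyone who wants to try this, the copy-pasted blocks could probably be collapsed into one loop. This is only a rough sketch of the idea, not my actual script: the `$lines` sample data and the `$sorted` / `$bucket` names are made up, and it uses a closure with `array_filter` instead of `create_function` (which was deprecated in PHP 7.2 and removed in PHP 8.0).

```php
<?php
// Hypothetical sample lines; the real script reads them from the .txt file.
$lines = [
    'msgid "a fairly long string"', // 20 characters between the quotes
    'msgid "abc"',                  // 3 characters
    'msgid "twelve chars"',         // 12 characters
];

$sorted = [];

// Walk the length ranges {1,10}, {11,20}, ... {41,50} instead of copying the code.
for ($min = 1; $min <= 41; $min += 10) {
    $max = $min + 9;
    // Same pattern as above, with the range filled in for this pass.
    $pattern = '~msgid "(.{' . $min . ',' . $max . '})"~';

    // Keep only the lines whose quoted part fits the current range.
    $bucket = array_filter($lines, function ($element) use ($pattern) {
        return preg_match($pattern, $element) === 1;
    });

    // Append this bucket after the shorter ones.
    $sorted = array_merge($sorted, $bucket);
}

print_r($sorted);
```

With these three sample lines the short string comes out first, followed by the two strings from the {11,20} range in their original order.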

Do you think it’s making duplicates? Why? How can I make it stricter?

How do I tell preg_match to look for msgstr “…”…“…n”, where the last " is at the end of the line, just before the line break?

I tried ending the pattern with “$~”, but I don’t get it yet.

    preg_match("~msgid \\"(.{11,20})\\"~", $element);'

(I have figured out what the problem must be:)

My regex is looking for “blablabla”

and some of the strings look like “blablabla"blablabla"blabla"blabla”

which means that the regex returns true multiple times for the same string…

SORRY! I need help! What is the correct syntax?

    A = keyword "123"
    B = keyword "123"4"567"
    C = keyword "1234567"

    preg_match("~keyword \\"(.{1,5})\\"~", $element);'

A and B return true, but C returns false.
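To make the three cases easy to reproduce, here is a small standalone sketch. The `$samples`, `$pattern` and `$results` names are just made up for this example, and the pattern is written in single quotes so the double quotes need no escaping; it should otherwise be the same regex as above.

```php
<?php
// Same regex as in the question, written as a single-quoted PHP string.
$pattern = '~keyword "(.{1,5})"~';

// The three test strings A, B and C.
$samples = [
    'A' => 'keyword "123"',
    'B' => 'keyword "123"4"567"',
    'C' => 'keyword "1234567"',
];

$results = [];
foreach ($samples as $name => $line) {
    // preg_match returns 1 on a match, 0 on no match.
    $results[$name] = preg_match($pattern, $line) === 1;
}

var_dump($results); // A => true, B => true, C => false
```

B matches because the pattern is not anchored at the end and `.` also matches a quote character, so the match can stop at a quote in the middle of the string.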

How do I make only A return true? (I tried the obvious:)

    preg_match("~^keyword \\"(.{1,5})\\"(?!.)~", $element);'

Thanks! :)