Extract a string that has letters and numbers from text

Hey guys,
I’m trying to extract a substring that meets certain conditions from a string, and I’m not sure if it’s possible with one regular expression.
This is my string, for example:

string bla bla bla sj4i2 text text

What I need to extract is a single word made of letters and numbers; in this case it’s “sj4i2”.
It must contain both letters and numbers, but can also contain “-_.$&”.

What approach should I take here?
Thanks!

Have you started with anything yet?

This gets close. I’m not sure it gets you the whole way to where you want to go…and I’m sure there’s a more elegant way…

([a-z]+[0-9]+[-_.$&]*)

I’m thinking about grabbing all matches into an array and then checking them separately… not sure if there’s a shorter way…
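
A minimal sketch of that two-step idea, assuming Python's `re` module for illustration (the thread itself talks about preg functions; the patterns should carry over). The function name `extract_mixed` is hypothetical, and the sketch assumes lowercase letters only, as in the patterns posted here.

```python
import re

# Step 1 pattern: any run of the allowed characters (letters, digits, -_.$&)
CANDIDATE = re.compile(r"[a-z0-9_.$&-]+")

def extract_mixed(text):
    """Return the candidates that contain at least one letter and one digit."""
    candidates = CANDIDATE.findall(text)   # step 1: grab every candidate word
    return [
        c for c in candidates
        if re.search(r"[a-z]", c)          # step 2a: has a letter
        and re.search(r"[0-9]", c)         # step 2b: has a digit
    ]

print(extract_mixed("string bla bla bla sj4i2 text text"))  # ['sj4i2']
```

This trades one clever pattern for three simple ones, which may be easier to adjust if the rules change.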

@DaveMaxwell
Thank you. This only works in the order “letters, numbers, signs”; unfortunately I don’t know what order they will be in.
Sorry I wasn’t clear about that.

If they can occur in any order it would be ([a-z0-9_.$&-]+)

I tried that but that picks up bla as well…

Correct. I guess the only solution is more than one preg match?

Actually, I looked at that, and my solution doesn’t work either… it returns sj4 and i2 as separate matches…
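
For anyone following along, here is a small sketch that reproduces both results on the sample string. It assumes Python's `re` module purely for illustration; the variable names are made up for this example.

```python
import re

text = "string bla bla bla sj4i2 text text"

# Fixed order: letters, then digits, then optional signs
ordered = re.findall(r"([a-z]+[0-9]+[-_.$&]*)", text)

# Any order: a single character class, so letter-only words qualify too
any_order = re.findall(r"([a-z0-9_.$&-]+)", text)

print(ordered)    # ['sj4', 'i2']
print(any_order)  # ['string', 'bla', 'bla', 'bla', 'sj4i2', 'text', 'text']
```

Neither pattern on its own can require that a letter and a digit both appear somewhere in the same word.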

This could be a problem. To use regex to match a pattern, knowing what should be matched and what shouldn’t be matched is essential.

I’m assuming

  • the pattern will never match spaces
  • there will always be at least one letter and one numeric character
  • there may also be certain “special” characters
  • the order and number of characters will vary

Can you post representative lists of possible correct and incorrect matches? eg.

ad5f, h7gyn, wt42g should match
5dl, kg4 should not match
etc.

Everything that includes both numbers and letters is correct.
Numbers or letters on their own are not correct.
For example, not OK: 11 22 sss fff ggg 00 - & @
OK: 1sd1 2sd03-sd 49s-5
etc…

I’m showing these on separate lines so they’re easier to read

(
([0-9]+[a-z]+[a-z0-9_.$&-]*)
|([0-9]+[a-z0-9_.$&-]*[a-z]+)
|([a-z]+[0-9]+[a-z0-9_.$&-]*)
|([a-z]+[a-z0-9_.$&-]*[0-9]+)
|([a-z0-9_.$&-]*[0-9]+[a-z]+)
|([a-z0-9_.$&-]*[a-z]+[0-9]+)
)
  • must begin with 1 or more numbers, must be followed by 1 or more letters, may end with 0 or more mixed
  • must begin with 1 or more numbers, may be followed by 0 or more mixed, must end with 1 or more letters
  • must begin with 1 or more letters, must be followed by 1 or more numbers, may end with 0 or more mixed
  • must begin with 1 or more letters, may be followed by 0 or more mixed, must end with 1 or more numbers
  • may begin with 0 or more mixed, must be followed by 1 or more numbers, must end with 1 or more letters
  • may begin with 0 or more mixed, must be followed by 1 or more letters, must end with 1 or more numbers

The “may be mixed” means some sequences will match more than one sub-pattern, but without getting into fancy lookbehinds / lookaheads I think that should work.
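
For comparison, a lookahead version might look like the sketch below. This is not the pattern posted above but an alternative, written with Python's `re` module as an assumption; the names `ALLOWED` and `MIXED` are hypothetical. It also assumes lowercase letters only, and that any character outside the allowed set ends a word.

```python
import re

ALLOWED = r"[a-z0-9_.$&-]"   # letters, digits, and the permitted signs

MIXED = re.compile(
    rf"(?={ALLOWED}*[a-z])"  # lookahead: a letter somewhere in this run
    rf"(?={ALLOWED}*[0-9])"  # lookahead: a digit somewhere in this run
    rf"{ALLOWED}+"           # then consume the whole run
)

print(MIXED.findall("string bla bla bla sj4i2 text text"))  # ['sj4i2']
print(MIXED.findall("1sd1 2sd03-sd 49s-5"))                 # ['1sd1', '2sd03-sd', '49s-5']
print(MIXED.findall("11 22 sss fff ggg 00 - & @"))          # []
```

The two lookaheads check the requirements without consuming anything, so the order of letters, digits, and signs no longer matters.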

First of all, thank you!
I tested it, and for some reason it’s not working on any string I’ve tried…

Sorry, but “not working” is not helpful for determining what the problem might be.

Please post some examples of what is being matched that shouldn’t be, or what isn’t being matched that should be, along with any error message(s) you are getting and a small bit of example code, enough that others can see what’s happening.
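
As an illustration of the kind of example code that helps, here is a minimal sketch that runs the six-alternative pattern from this thread against the sample lists posted earlier. It assumes Python's `re` module rather than the preg functions, and uses `re.VERBOSE` so the pattern can stay on separate lines; `PATTERN` and `matches` are hypothetical names.

```python
import re

# The six alternatives from this thread; re.VERBOSE ignores the line breaks
PATTERN = re.compile(r"""
(
([0-9]+[a-z]+[a-z0-9_.$&-]*)
|([0-9]+[a-z0-9_.$&-]*[a-z]+)
|([a-z]+[0-9]+[a-z0-9_.$&-]*)
|([a-z]+[a-z0-9_.$&-]*[0-9]+)
|([a-z0-9_.$&-]*[0-9]+[a-z]+)
|([a-z0-9_.$&-]*[a-z]+[0-9]+)
)
""", re.VERBOSE)

def matches(text):
    """Return every full match found in text, in order."""
    return [m.group(0) for m in PATTERN.finditer(text)]

# Report input and result side by side so others can see what happens
for sample in ("string bla bla bla sj4i2 text text",
               "1sd1 2sd03-sd 49s-5",
               "11 22 sss fff ggg 00 - & @"):
    print(repr(sample), "->", matches(sample))
```

Posting the input, the actual result, and the expected result together makes it much easier for others to spot where the behaviour differs.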
