Help with regex please

Hi.
I am using universal-ctags to create a tags list of my project. Then I’m using Vim to “go to definition”. The definition type I’m attempting to go to is a CSS class.

I have created the tags file no problem. It contains entries like:
.class1 …
.class2 …
etc.

In Vim, I should be able to press “Ctrl-]” with my cursor on a classname in an HTML file and “jump” to the CSS file containing that class. This works fine for other things like PHP functions etc.

It doesn’t work for CSS classes. I even know why. It doesn’t work because Ctrl-] looks for the class name… not the class name plus preceding dot.

Now, I have Googled this issue and I’m not alone. Others have suggested I place some regex patterns/rules in the ctags config which are SUPPOSED to make ctags generate a list of class names WITHOUT preceding dots.

The person with the problem initially had these lines in his config:

--regex-scss=/^([A-Za-z0-9_-]*)*(\.[A-Za-z0-9_-]+) *[,{]/\2/c,class,classes/
--regex-scss=/^[ \t]+(\.[A-Za-z0-9_-]+) *[,{]/\1/c,class,classes/

He was advised to change these lines to:

--regex-scss=/^([A-Za-z0-9_-]*)*\.([A-Za-z0-9_-]+) *[,{]/\2/c,class,classes/
--regex-scss=/^[ \t]+\.([A-Za-z0-9_-]+) *[,{]/\1/c,class,classes/

to force dotless entries in the tags file.
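To see what that change does to the captured name, here is a minimal sketch in Python. Python's `re` module is not the regex engine ctags uses, so treat it only as an approximation; the constructs in these two patterns are basic enough that I would expect the same captures. The sample lines and the `tag_name` helper are made up for illustration.

```python
import re

# The first class pattern from each version quoted above. The only change
# is whether the dot sits inside or outside the second capture group.
DOT_INSIDE = re.compile(r"^([A-Za-z0-9_-]*)*(\.[A-Za-z0-9_-]+) *[,{]")
DOT_OUTSIDE = re.compile(r"^([A-Za-z0-9_-]*)*\.([A-Za-z0-9_-]+) *[,{]")


def tag_name(pattern, line):
    """Return the text of capture group 2, or None if the line does not match."""
    m = pattern.search(line)
    return m.group(2) if m else None


# Hypothetical stylesheet lines, not taken from a real project.
for line in [".class1 {", "div.class2,"]:
    print(repr(line), tag_name(DOT_INSIDE, line), tag_name(DOT_OUTSIDE, line))
```

If the tags file still shows dotted names after this change, one thing to check is whether these regex options are the ones actually producing those entries.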

Here’s the config he came up with (in full):

--exclude=*.min.js
--exclude=*.min.css
--exclude=*.map
--exclude=.backup
--exclude=.sass-cache
--exclude=vendors
--exclude=.git

--langdef=css
--langmap=css:.css
--langmap=css:+.styl
--langmap=css:+.less
--regex-css=/^[ \t]*\.([A-Za-z0-9_-]+)/\1/c,class,classes/
--regex-css=/^[ \t]*#([A-Za-z0-9_-]+)/\1/i,id,ids/
--regex-css=/^[ \t]*(([A-Za-z0-9_-]+[ \t\n,]+)+)\{/\1/t,tag,tags/
--regex-css=/^[ \t]*@media\s+([A-Za-z0-9_-]+)/\1/m,media,medias/

--langdef=scss
--langmap=scss:.sass
--langmap=scss:+.scss
--regex-scss=/^[ \t]*@mixin ([A-Za-z0-9_-]+)/\1/m,mixin,mixins/
--regex-scss=/^[ \t]*@function ([A-Za-z0-9_-]+)/\1/f,function,functions/
--regex-scss=/^[ \t]*\$([A-Za-z0-9_-]+)/\1/v,variable,variables/
--regex-scss=/^([A-Za-z0-9_-]*)*\.([A-Za-z0-9_-]+) *[,{]/\2/c,class,classes/
--regex-scss=/^[ \t]+\.([A-Za-z0-9_-]+) *[,{]/\1/c,class,classes/
--regex-scss=/^(.*)*\#([A-Za-z0-9_-]+) *[,{]/\2/i,id,ids/
--regex-scss=/^[ \t]*#([A-Za-z0-9_-]+)/\1/i,id,ids/
--regex-scss=/(^([A-Za-z0-9_-])*([A-Za-z0-9_-]+)) *[,|\{]/\1/t,tag,tags/
--regex-scss=/(^([^\/\/])*)[ \t]+([A-Za-z0-9_-]+)) *[,|\{]/\3/t,tag,tags/
--regex-scss=/(^(.*, *)([A-Za-z0-9_-]+)) *[,|\{]/\3/t,tag,tags/
--regex-scss=/(^[ \t]+([A-Za-z0-9_-]+)) *[,|\{]/\1/t,tag,tags/
--regex-scss=/^[ \t]*@media\s+([A-Za-z0-9_-]+)/\1/d,media,media/

Although the person says this now works for him, it doesn’t for me (and the thread is closed). Generation of the tags file still has entries WITH preceding dots and, hence, Ctrl-] fails to locate them.

Can anyone, with some regex expertise, help with this, please?

Why might it not work?

Also, the “two-line” code blocks above only mention SCSS rules. Are the entries in the “full” code block referencing CSS rules correct?

Thanks in advance for any help.

Mike

I don't know ctags, so I'm going to ignore that part and look just at the match pattern.

^([A-Za-z0-9_-]*)*\.([A-Za-z0-9_-]+) *[,{]
This pattern reads as:
From the start of the line:
Start Capture Group 1
Any number of characters in the set [A-Z, a-z, 0-9, an underscore or a hyphen], captured greedily.
End Capture Group 1, repeated as many times as possible (though in greedy mode a single pass will normally consume everything it can)
Followed by a period (outside both capture groups).
Start Capture Group 2
1 or more characters in the set [A-Z, a-z, 0-9, an underscore or a hyphen]
End Capture Group 2
Followed by any number of spaces
Followed by either a comma or an open curly brace.

The second part (which would normally be the 'replace' pattern in a regex search) is simply whatever was in Capture Group 2.
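A quick way to check that reading is to run the pattern over a few lines and see which ones match at all. This is only a sketch using Python's `re` module, which may differ from whatever engine ctags uses; the sample lines and the `captured` helper are invented for the test.

```python
import re

# The pattern under discussion, copied unchanged.
PATTERN = re.compile(r"^([A-Za-z0-9_-]*)*\.([A-Za-z0-9_-]+) *[,{]")


def captured(line):
    """Return capture group 2 (the would-be tag name), or None on no match."""
    m = PATTERN.search(line)
    return m.group(2) if m else None


# Hypothetical sample lines, chosen to probe the start-of-line anchor.
samples = [
    ".class1,",         # class at the very start of the line
    "div.class2 {",     # element name directly before the dot
    "  .class3 {",      # indented: leading spaces are not in group 1's set
    "ul li.class4 {",   # a space inside the selector, before the dot
]
for line in samples:
    print(repr(line), "->", captured(line))
```

The last two lines are rejected because nothing in the first group can consume a space, so the literal period has to follow an unbroken run of name characters starting at column one.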

Now, again, I don't know your target source, but I'm pretty sure that in standard format the class doesn't sit at the start of the line.

Are you absolutely sure there wasn't another space somewhere in the first capture group? Because it feels to me like there should have been one, say between the first * and the ).

I was wondering if it was something I didn’t know about. To me the “([]*)*” bit seems redundant and reads like:
“zero or more of these characters, zero or more times”

And I'm not seeing the reason for capturing a sub-pattern that is never used, as opposed to matching it without capturing and capturing only the part that is used. (Is this a different "flavor" of regex that doesn't have non-capturing groups?) If there is more involved, it would be helpful to mention it now rather than later.

If there was a space between the first * and the ), the pattern grabbing makes some sense, because it’s essentially grabbing all the words from before the period and throwing them away.

It would be more efficient to make it a non-capturing group, but at least the pattern would make sense.
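As a rough sketch of that idea, here is the pattern with a space added inside the first group and the group made non-capturing, again tried in Python's `re`. Both changes are guesses at what was intended, not something from the original config, and the sample lines are made up. I have not checked whether ctags accepts `(?:...)`; POSIX extended regular expressions do not define it, so the capturing form may be needed there.

```python
import re

# Hypothetical variant: "(?:[A-Za-z0-9_-]* )*" consumes any number of
# space-terminated words (or bare spaces) before the dot and discards them.
# The class name is now capture group 1, so the tag name would be \1.
SPACED = re.compile(r"^(?:[A-Za-z0-9_-]* )*\.([A-Za-z0-9_-]+) *[,{]")


def class_name(line):
    """Return the dotless class name, or None if the line does not match."""
    m = SPACED.search(line)
    return m.group(1) if m else None


# Hypothetical sample lines.
for line in ["ul li .item {", "  .item,", ".item {", "div.item {"]:
    print(repr(line), "->", class_name(line))
```

Note the trade-off: this version handles indented lines and descendant selectors, but it no longer matches `div.item {`, because every word in the leading group now has to end in a space.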

HTML and CSS experts advise against using regexes for HTML and CSS, except for very trivial tasks. Use a parser designed for such things. There are many available; every browser requires one.
