Hi,

I recently needed to parse text in JS and ran into rather odd behavior of the match() method, specifically when using the global flag together with capturing parentheses. I couldn't find any examples of this, and I'm not sure whether it should be called a JS bug or is just my misunderstanding.

This is just a silly example. It should match all alphanumeric characters between 'd' and 'tal', in this case 'igi'.

    var txt = 'digital point in digital world';
    var matches = txt.match(/d(\w+)tal/gi);
    var match_substring = ?????;

I simply don't know how to get the captured substring out of matches. If I don't use the global flag, it is accessible via matches[1], but with a global match there is no index holding the captured substring. I expected it to add a new dimension to the array of matches, so that I could access it as matches[0][0], matches[1][0] or similar, but nothing like that works.

Does anyone know why JS does this? Is there a way to get the captured substrings without walking through the array of matches and running the same regular expression again without the global flag?

Any hints are welcome. Thanks
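For reference, the behavior described in the question can be reproduced in a few lines. This is only a minimal sketch; the variable names single and all are illustrative and do not come from the original post.

```javascript
var txt = 'digital point in digital world';

// Without the g flag: index 0 is the whole match, index 1 the first capture.
var single = txt.match(/d(\w+)tal/i);
console.log(single[0]); // "digital"
console.log(single[1]); // "igi"

// With the g flag: a flat array of whole matches; captures are not included.
var all = txt.match(/d(\w+)tal/gi);
console.log(all); // ["digital", "digital"]
```

So with the global flag, match() returns only the full matched strings, which is why there is no index for the captured 'igi'.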
Hey, did you ever get an answer to this? I'm banging my head against the exact same problem. LMK! Thanks!
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>None</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<script type="text/javascript">
function init() {
  // Match all alphanumeric characters between 'd' and 'tal'.
  var nStr = "digital point in diginewtexttal world, d45567tal";
  var nResult = [];
  var nMatch = "";
  while (/d\w+tal/.test(nStr)) {
    nMatch = nStr.match(/d\w+tal/);
    // Keep only the middle group of the first match.
    nResult[nResult.length] = nMatch.toString().replace(/(d)(\w+)(tal)/, "$2");
    // Remove that match so the loop moves on to the next one.
    nStr = nStr.replace(/d\w+tal/, "").replace(/\s{2,}/, " ");
  }
  alert(nResult);
  alert(nResult.length);
}
// Feature detection: old IE has attachEvent instead of addEventListener.
window.addEventListener ? window.addEventListener('load', init, false) : window.attachEvent('onload', init);
</script>
</head>
<body>
</body>
</html>
Take a look at this link. I wrote a prototype method to retrieve submatches in JavaScript (I call it $1element):

http://www.pagecolumn.com/javascript/exec_extension.htm

You can also test submatches in the online JS regexp tester using the $1element method.
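The linked page is not reproduced in this thread, so the following is not its code. It is a minimal sketch of one common way to collect submatches, by calling exec() repeatedly on a regex with the global flag. The helper name collectSubmatches is hypothetical.

```javascript
// Hypothetical helper: returns the first capture group of every match.
function collectSubmatches(str, pattern) {
  var re = new RegExp(pattern, 'g'); // with g, exec() resumes from re.lastIndex
  var out = [];
  var m;
  while ((m = re.exec(str)) !== null) {
    out.push(m[1]);
    // Guard against zero-length matches, which would otherwise loop forever.
    if (m[0] === '') re.lastIndex++;
  }
  return out;
}

var found = collectSubmatches(
  'digital point in diginewtexttal world, d45567tal',
  'd(\\w+)tal'
);
console.log(found); // ["igi", "iginewtext", "45567"]
```

Each exec() call returns the same kind of array as a non-global match(), so index 1 holds the capture, and the loop ends when exec() returns null.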
Thanks a lot, guys.

Mike H. - yes, this is possible, but I was trying to find a way other than looping over the matches. Also, I believe it is easier and faster to do:

    var nStr = "digital point in diginewtexttal world, d45567tal";
    var nResult = [];
    var nMatch = nStr.match(/d(\w+)tal/g);
    // nMatch is null when nothing matches, so guard the length.
    for (var i = 0, l = nMatch ? nMatch.length : 0; i < l; i++) {
      if (nMatch[i]) {
        // Re-run the regex without the g flag to read the capture.
        nResult.push(nMatch[i].match(/d(\w+)tal/)[1]);
      }
    }
    alert(nResult);

unigogo - yes, this was the answer I was looking for. It's funny: I use a similar method for converting CSS style text into JS style properties, but never thought of it as a way to handle matches. Thanks for the hint! Maybe you'll find this useful too, so here it is:

    var s = "digital point in diginewtexttal world, d45567tal";
    var res = [];
    // Each pass records the capture and removes the match so the loop ends.
    // (RegExp.$0 is not a defined property, so return "" explicitly.)
    for (var exp = /d(\w+)tal/; exp.test(s); s = s.replace(exp, function () { res.push(RegExp.$1); return ""; }));
    alert(res);

But there is another problem: $1, ..., $9 for parenthesized substring matches are deprecated as of JavaScript 1.5 (here), and I haven't found the new specification yet. So if anybody knows the expected future way to get the submatches from the RegExp object, it would be great if you shared it.

Btw, unigogo, I think you are escaping too much. Why not use just:

    String.prototype.$1elements = function (vregex) {
      var elm = [];
      var str = this;
      var re = new RegExp(vregex, "g");
      // The callback receives the whole match ($0) and the first capture ($1).
      str = str.replace(re, function ($0, $1) {
        elm.push($1);
        return $0;
      });
      return elm;
    };
    var str = "border<-top<-width".$1elements("<-(\\w+)");
    alert(str);
Yes, lp1051, I should delete the last part of the article. It seems I confused the match and replace methods when dealing with HTML tags.
lp1051, look at the examples on the Mozilla site. They still use $1 and $2 in the replace method. I think the deprecated $1, $2 ... $9 in the link here should be RegExp.$1, RegExp.$2 ... RegExp.$9.
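To illustrate that distinction: $1 inside a replacement string and the capture arguments passed to a replacement function both belong to replace() itself and do not rely on the static RegExp.$1 property. A minimal sketch with illustrative variable names follows. The last part assumes an engine that implements String.prototype.matchAll (ES2020), which is not available in older browsers.

```javascript
var s = 'digital point in digital world';

// "$1" in a replacement string refers to the first capture group.
var swapped = s.replace(/d(\w+)tal/g, '[$1]');
console.log(swapped); // "[igi] point in [igi] world"

// A replacement function receives the captures as arguments,
// so the static RegExp.$1 property is not needed.
var captures = [];
s.replace(/d(\w+)tal/g, function (whole, first) {
  captures.push(first);
  return whole; // leave the string unchanged
});
console.log(captures); // ["igi", "igi"]

// Assumption: the engine supports matchAll (ES2020). Each item is a
// match array like the one exec() returns, so index 1 is the capture.
var viaMatchAll = Array.from(s.matchAll(/d(\w+)tal/g), function (m) {
  return m[1];
});
console.log(viaMatchAll); // ["igi", "igi"]
```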