InStr doesn't work for Unicode (UTF-8)

Discussion in 'C#' started by RockyMtnHi, Mar 1, 2008.

  1. #1
    I am using InStr to search for strings in pages I have read into a variable. When the page uses "charset=UTF-8", the comparison fails. Here is my code:
    iPlace = InStr(1, sContent, sSearchTerm, vbTextCompare)
    Does anyone know how to search for strings in Unicode?

    Thanks
     
    RockyMtnHi, Mar 1, 2008
  2. mintoj

    #2
    Have you tried a binary comparison instead of a text comparison?
     
    mintoj, Mar 2, 2008
  3. RockyMtnHi

    #3
    mintoj,
    Yes. Here is the complete code. Note that the RegExp option fails to detect the text, just as the InStr option does. I am trying to detect a URL on a page. A typical call is:
    DoesLinkExist("www.apageoutthere.com", "www.adomain.com")
    Also note that the UnicodeTools object runs Unicode versions of VBScript commands, but I can't get that to find the text either.

    The bottom line is that when the sDomain text actually exists in sPage, it isn't detected.

    Thanks for your input...
    --
    ==============================
    function DoesLinkExist(sPage, sDomain)
        dim xmlcontent
        sErr = ""
        set xmlcontent = CreateObject("MSXML2.ServerXMLHTTP")

        on error resume next
        xmlcontent.open "GET", "http://" & sPage, false
        xmlcontent.send ""
        status = xmlcontent.status

        if err.number <> 0 or status <> 200 then
            sErr = "PAGE NOT FOUND, Status = " & status
        else
            sContent = xmlcontent.responseText
            sSearch = sDomain

            bRegExp = false
            if bRegExp then
                'Search for text using Regular Expressions
                Set myRegExp = New RegExp
                myRegExp.IgnoreCase = True
                myRegExp.Global = True
                myRegExp.Pattern = sSearch
                ' write out msg for each match
                Set myMatches = myRegExp.Execute(sContent)
                For Each myMatch in myMatches
                    sMatches = sMatches & myMatch.Value & ", "
                Next
                if sMatches = "" then
                    sErr = "NO Regex MATCH"
                else
                    sErr = ""
                end if
            else

                ' search for text using vbscript instr
                iPlace = inStr(1, sContent, sSearch, vbTextCompare)
                if isNull(iPlace) or iPlace = 0 then
                    Set u = New UnicodeTools
                    sSearchUni = u.CStrU(sSearch)
                    sContentUni = u.CStrU(sContent)
                    iPlace = u.inStrU(1, sContentUni, sSearchUni, vbBinaryCompare)
                    if isNull(iPlace) or iPlace = 0 then
                        sErr = sSearch & " NO Unicode Instr MATCH"
                    end if
                    Set u = Nothing
                end if
            end if
        end if

        set xmlcontent = nothing
        DoesLinkExist = sErr
    end function
     
    RockyMtnHi, Mar 2, 2008
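
The thread does not establish why the match fails. One thing that could be checked is whether the text in responseText was decoded with the charset the page declares. The sketch below is an assumption, not a confirmed fix: it decodes the raw bytes from responseBody explicitly through ADODB.Stream and only then calls InStr. BytesToText and PageContains are hypothetical helper names, and "utf-8" is assumed from the charset mentioned in the first post.

```vbscript
' Hypothetical helper: turn a raw byte array into a string using an
' explicitly named charset (for example "utf-8").
Function BytesToText(vBytes, sCharset)
    Dim oStream
    Set oStream = CreateObject("ADODB.Stream")
    oStream.Type = 1              ' 1 = adTypeBinary
    oStream.Open
    oStream.Write vBytes
    oStream.Position = 0          ' rewind before switching to text mode
    oStream.Type = 2              ' 2 = adTypeText
    oStream.Charset = sCharset
    BytesToText = oStream.ReadText
    oStream.Close
    Set oStream = Nothing
End Function

' Hypothetical helper: fetch a page, decode it as UTF-8 (an assumption),
' and report whether sDomain occurs in it, ignoring case.
Function PageContains(sPage, sDomain)
    Dim oHttp, sContent
    PageContains = False
    Set oHttp = CreateObject("MSXML2.ServerXMLHTTP")
    oHttp.open "GET", "http://" & sPage, False
    oHttp.send
    If oHttp.status = 200 Then
        sContent = BytesToText(oHttp.responseBody, "utf-8")
        PageContains = (InStr(1, sContent, sDomain, vbTextCompare) > 0)
    End If
    Set oHttp = Nothing
End Function
```

A call would mirror the one in post #3, for example PageContains("www.apageoutthere.com", "www.adomain.com"). If the page declares a different charset, that name would be passed to BytesToText instead of "utf-8".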