Heylo all,

Okay, it's not a web spider or anything like that *YET*. Right now it takes in a URL/web page, makes a new instance of WebClient and downloads the data (web page) into a string.

    string url = Text1.Value;
    WebClient browser = new WebClient();
    UTF8Encoding enc = new UTF8Encoding();
    string fContents = enc.GetString(browser.DownloadData(url));
    int len = fContents.Length;
    char c;
    string linkList = "";
    for (int i = 0; i < len; i++)
    {
        c = Convert.ToChar(fContents.Substring(i, 1));
        if (c == 'a')
        {
            i++;
            c = Convert.ToChar(fContents.Substring(i, 1));
            if (c == ' ')
            {
                i++;
                c = Convert.ToChar(fContents.Substring(i, 1));
                if (c == 'h')
                {
                    i = i + 6; // move our string counter to after the quotes ref="h
                    c = Convert.ToChar(fContents.Substring(i, 1));
                    while (c != '"')
                    {
                        c = Convert.ToChar(fContents.Substring(i, 1));
                        if (c == '"')
                        {
                            break;
                        }
                        linkList = linkList + c;
                        i++;
                    }
                    linkList = linkList + "\n";
                    TextArea1.Value = linkList;
                }
            }
        }
    }

As you can see, you start off with the entire string and just go through it character by character. On some sites it works and displays each link; however, half the time it fails with this error:

    Index and length must refer to a location within the string.

The error occurs at THIS line:

    c = Convert.ToChar(fContents.Substring(i, 1)); // <----------- errors here
    if (c == '"')

Now I don't understand why this would be erroring. i is the position (the character it's on) in the string (or web page) and 1 is the length of how many characters to put into my variable c.

Thanks in advance for all help, much appreciated!!
You need to learn something about parsing. What you are doing is neither top-down nor bottom-up parsing. You need to divide the task. Usually compilers etc. will use a scanner to divide the input stream into symbols, so that the parser rules are free to concern themselves with grammar without having to deal with individual characters. A scanner will also recognize whitespace correctly. Your attempt will fail on this text: "a  h" (an "a", two spaces and an "h").
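To illustrate the scanner idea mentioned above, here is a minimal sketch in C#. The names `SimpleScanner` and `Tokenize` are hypothetical, and the token rules (runs of letters or digits become one token, every other non-whitespace character becomes its own token) are an assumption chosen for brevity, not a real HTML tokenizer.

```csharp
using System.Collections.Generic;

public static class SimpleScanner
{
    // Hypothetical scanner: splits text into word tokens and single-character
    // symbol tokens, discarding any run of whitespace between them.
    public static List<string> Tokenize(string text)
    {
        var tokens = new List<string>();
        int i = 0;
        while (i < text.Length)
        {
            if (char.IsWhiteSpace(text[i]))
            {
                i++; // whitespace separates tokens but is never a token itself
            }
            else if (char.IsLetterOrDigit(text[i]))
            {
                int start = i;
                // Consume the whole word; the length check comes first so the
                // indexer is never read past the end of the string.
                while (i < text.Length && char.IsLetterOrDigit(text[i]))
                {
                    i++;
                }
                tokens.Add(text.Substring(start, i - start));
            }
            else
            {
                tokens.Add(text[i].ToString()); // symbol such as <, =, " or >
                i++;
            }
        }
        return tokens;
    }
}
```

With a scanner like this, code looking for an "a" token followed by an "href" token would no longer care whether one space or several separate them.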
Parsing requires a lot more work than is needed for what I am trying to accomplish, thanks though! If it finds an 'a', it then looks at the next character, and if that is a space, it checks to see if the next character is an 'h'. If it is not, it will just continue on with the loop looking for the next 'a'.
A couple of points then.

You do not need to use the Substring method. It returns a string. If all you are interested in is going character by character, then just use the [index] indexer of the string. It will return the character at the position indicated by index (counted from 0, I believe).

Your code will *not* just continue. You advance the index beyond what you know is safe. If the string ends right after an "a" you'll have an indexing error. At the very least you should guard the condition with a short-circuit boolean AND, like if (i < fContents.Length && fContents[i] == 'a')
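The guarded, index-based scan suggested above could look roughly like the following sketch. The names `LinkScanner`, `ExtractLinks` and `MatchesAt` are hypothetical, and matching the literal text `a href="` is an assumption carried over from the original loop; every read through the indexer is preceded by a length check, so a page that ends mid-tag or lacks a closing quote cannot run off the end of the string.

```csharp
using System.Collections.Generic;

public static class LinkScanner
{
    // Assumed marker: the same 'a', space, 'h' ... '"' sequence the original loop looks for.
    private const string Marker = "a href=\"";

    // Returns true only if 'marker' occurs in 'text' starting at 'pos'.
    // The length check runs first, so the indexer never goes out of range.
    private static bool MatchesAt(string text, int pos, string marker)
    {
        if (pos + marker.Length > text.Length) return false;
        for (int k = 0; k < marker.Length; k++)
        {
            if (text[pos + k] != marker[k]) return false;
        }
        return true;
    }

    // Collects the quoted value that follows each occurrence of the marker.
    public static List<string> ExtractLinks(string html)
    {
        var links = new List<string>();
        for (int i = 0; i < html.Length; i++)
        {
            if (!MatchesAt(html, i, Marker)) continue;

            int valueStart = i + Marker.Length;
            int j = valueStart;
            // Short-circuit &&: html[j] is read only while j is in range.
            while (j < html.Length && html[j] != '"')
            {
                j++;
            }
            if (j < html.Length)
            {
                // Closing quote found, so the value is complete.
                links.Add(html.Substring(valueStart, j - valueStart));
            }
            i = j; // resume scanning after the value
        }
        return links;
    }
}
```

Calling `LinkScanner.ExtractLinks(fContents)` and joining the result with "\n" would give the same kind of list the original loop builds in `linkList`.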
Thanks, I did change the code to not use Substring; however, now I cannot connect to any remote website, which is kind of odd since it was not doing this before. If I somehow get it working I will see if the original error continues happening. Thanks again!

-Ferbal
Have you looked at using regular expressions? The Regex class in C# is really good and really fast at finding patterns. The syntax is a little tricky at first, but once you get it, you can pass in a large string and it will return a collection of all the matches for your pattern. It's very flexible and might work really well for what you are doing...
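As a rough sketch of the regular expression approach, the following uses `Regex.Matches` from `System.Text.RegularExpressions`. The class name `RegexLinks` and the pattern itself are assumptions for illustration: the pattern only handles double-quoted href values inside an `<a ...>` tag and is not a complete HTML link matcher.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RegexLinks
{
    // Assumed pattern: "<a", whitespace, any other attributes, then
    // href="..." with the quoted value captured in group 1.
    private static readonly Regex HrefPattern = new Regex(
        "<a\\s+[^>]*?href\\s*=\\s*\"([^\"]*)\"",
        RegexOptions.IgnoreCase);

    // Returns the captured href value of every match found in the input.
    public static List<string> ExtractLinks(string html)
    {
        var links = new List<string>();
        foreach (Match m in HrefPattern.Matches(html))
        {
            links.Add(m.Groups[1].Value);
        }
        return links;
    }
}
```

Because the pattern allows any amount of whitespace and other attributes before href, it tolerates input such as `<a  class="c" href="/y">` that a fixed character-by-character check would miss.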
Yeah, I recently stumbled upon some stuff on Regex. However, I still need to be able to connect to websites first, lol. Thanks guys!