OK, my MFC project is to create a main window with two menus (Host and About). Host leads to Options, where you specify which kinds of links you want displayed (asp, html, pdf, jpeg, etc.). Then you can go to Hostname and enter a URL. Using Windows connection methods, the program connects to the URL, gets all of its HTML, and puts it in a string. It then parses the HTML to find the kinds of links the user requested, putting each one into a set as it goes. Finally, it creates a list box and prints the URLs in it (along with a static box, but that's not my problem). (The html, jpeg, etc. variables are flags that record whether the box for that kind of link has been checked.)
My two main problems are these.
1. The first search works fine and prints out the links. But when I search for another URL right after, the list box freezes, and if you click on a newly found URL, the ones from the previous search start showing up. This confuses me, since my list box is a local variable in one method, so I wouldn't expect anything to carry over, but it does.
2. If I select no kinds of links, the search works fine. (I get the basic links: any without jpeg, gif, html, htm, pdf, txt, mailto, javascript, asp, or aspx.) If I select just one kind of link in addition to the default links, it adds every possible link to the list. It really frustrates me, because as far as I know I have everything set up to work.
(Each 'if' statement looks for one extension, and the last one says that if none of them are found, it's a link I always want to print. When I used MessageBoxes to check progress, it seemed that once I pick a choice, every link believes "mailto" is in its text, so every link gets appended to my set. I also tested that my flags were set correctly, and they were.)
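A minimal sketch of what might explain the "mailto" symptom, assuming `str` is a `std::string`: `find()` returns `std::string::npos` (a very large nonzero value) when nothing is found and `0` when the match is at the very start, so using the raw result as a condition is true for almost every link. The function names `contains`, `buggyMatch`, and `fixedMatch` below are hypothetical and only illustrate the difference; they are not part of my project.

```cpp
#include <string>

// Hypothetical helper: true only if `needle` occurs somewhere in `url`.
bool contains(const std::string& url, const std::string& needle) {
    return url.find(needle) != std::string::npos;
}

// Mirrors the shape of `str.find("mailto") && (mailto == 1)`:
// the raw find() result is used as a boolean, so "not found" (npos)
// counts as true and "found at index 0" counts as false.
bool buggyMatch(const std::string& url, const std::string& needle, int flag) {
    return url.find(needle) && (flag == 1);
}

// Same check with an explicit comparison against npos.
bool fixedMatch(const std::string& url, const std::string& needle, int flag) {
    return contains(url, needle) && (flag == 1);
}
```

With a check like `fixedMatch`, a link would only be inserted when the substring is really present and its box is checked. Note that ".htm" is also a substring of ".html" and ".asp" of ".aspx", so those pairs will always match together.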
This is the method where everything I'm dealing with happens:
void CMenuWindow::createList(string hostname, string answer){
    int loc, loc1, loc2;
    string str;
    set<string> not_need;
    CListBox *linkbox;
    linkbox = new CListBox;
    linkbox->Create( WS_CHILD | WS_VSCROLL | WS_VISIBLE | WS_BORDER, CRect( 50,200,700,400 ),
                     this, 1);
    set<string> good_urls;
    if (url[url.length()-1] != '/'){
        url = url + '/';
    }
    while ( ( (loc1=answer.find("href=\"")) != string::npos) ||
            ( (loc2=answer.find("href=\'")) != string::npos) ) {
        if (loc1 != string::npos)
            loc = loc1;
        else
            loc = loc2;
        // erase the first href=" (or href=') part in answer
        answer.erase(0,loc+6);
        // find the closing quote (either ' or ")
        if (loc1 != string::npos)
            loc = answer.find("\"");
        else
            loc = answer.find("\'");
        // extract the URL itself
        str = answer.substr(0,loc);
        // erase the URL from the answer string
        answer.erase(0,loc);
        // check the URL to make sure we are interested in it
        if ( validateURL(str) ) {
            // if valid, get rid of any "#" references in URL (starting other than top of page)
            loc = str.find("#", 0);
            if (loc != string::npos)
                str.erase(loc);
            // if valid, get rid of any "?" references in URL (passing parameters)
            loc = str.find("?", 0);
            if (loc != string::npos)
                str.erase(loc);
            // if valid, make sure we built it in the form http://hostname/filename
            loc = str.find("http://", 0);
            if (loc == string::npos) {
                loc = str.find("https://", 0);
                // if it does not start with http:// then add this information
                if (loc == string::npos) {
                    if (str.length() == 0)
                        str = "http://" + hostname;
                    else if (str.at(0) == '/')
                        str = "http://" + hostname + str;
                    else
                        str = "http://" + hostname + "/" + str;
                }
            }
            // append the / to the end of the URL as needed
            if ( missingSuffix(str) )
                str = str + "/";
            if ((str.find("mailto") && ( mailto == 1)))
                good_urls.insert(str);
            if ((str.find(".jpeg") && ( jpeg == 1)))
                good_urls.insert(str);
            if ((str.find(".gif") && ( jpeg == 1)))
                good_urls.insert(str);
            if ((str.find(".asp") && ( asp == 1)))
                good_urls.insert(str);
            if ((str.find(".aspx") && ( asp == 1)))
                good_urls.insert(str);
            if ((str.find(".txt") && ( txt == 1)))
                good_urls.insert(str);
            if ((str.find(".pdf") && ( pdf == 1)))
                good_urls.insert(str);
            if ((str.find(".htm") && ( html == 1)))
                good_urls.insert(str);
            if ((str.find(".html") && ( html == 1)))
                good_urls.insert(str);
            if ((str.find("javascript") && ( java == 1)))
                good_urls.insert(str);
            if (((str.find("javascript") == -1)) && ((str.find("mailto") == -1)) && ((str.find(".jpeg") == -1)) && ((str.find(".gif") == -1)) && ((str.find(".txt") == -1)) && ((str.find(".asp") == -1)) && ((str.find(".aspx") == -1)) && ((str.find(".pdf") == -1)) && ((str.find(".htm") == -1)) && ((str.find(".html") == -1)))
                good_urls.insert(str);
        }
    }
    set<string>::iterator check;
    string checking;
    for (check = good_urls.begin(); check != good_urls.end(); ++check){
        checking = *check;
        linkbox->AddString(checking.c_str());
    }
}
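One possible explanation for the list box problem, offered as a guess rather than a diagnosis: the `linkbox` pointer is local, but the control made by `new CListBox` and `Create` is a child window that is never destroyed, so every search may leave another list box stacked at the same rectangle with the same control ID. The sketch below shows the alternative of keeping one list box and clearing it before each search. `FakeListBox` and `MenuWindowSketch` are hypothetical stand-ins so the example compiles without MFC; in MFC the corresponding calls would be `CListBox::ResetContent()` and `CListBox::AddString()`.

```cpp
#include <set>
#include <string>
#include <vector>

// Stand-in for CListBox so the sketch builds without MFC. Only the two
// calls used here are modelled.
struct FakeListBox {
    std::vector<std::string> items;
    void ResetContent() { items.clear(); }
    void AddString(const std::string& s) { items.push_back(s); }
};

// Hypothetical window class: the list box is a member that lives as long
// as the window, instead of being allocated inside every createList() call.
class MenuWindowSketch {
public:
    void showLinks(const std::set<std::string>& good_urls) {
        linkbox_.ResetContent();  // drop entries left from the previous search
        for (const std::string& url : good_urls)
            linkbox_.AddString(url);
    }
    const std::vector<std::string>& shown() const { return linkbox_.items; }

private:
    FakeListBox linkbox_;
};
```

In the real window class this would presumably mean a `CListBox` member created once (for example when the window is created), with `createList` only resetting and refilling it.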
If you need to see anything else to pinpoint the problem, let me know.
Sorry if I'm bugging anyone, I'm just extremely frustrated with this project right now.
