That is probably just a mistake in the code where the URL is injected into the links. Maybe it thinks there is a mistake on the page, or it is trying not to get caught in a loop.
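For anyone wondering what "injected into the links" means in practice: a rewriting proxy typically prefixes every link on a page with its own hostname so that follow-up clicks also go through the proxy. Here is a rough sketch of that idea in Python; the proxy hostname and the regex are mine, not cob-web's, so take it only as an illustration of where such a coding mistake could creep in.

import re

PROXY_PREFIX = "http://proxy.example.org/"  # hypothetical proxy host, not cob-web's real URL scheme

def rewrite_links(html):
    # Prefix every absolute href with the proxy URL so later clicks stay
    # inside the proxy. Bugs here (double-rewriting already-rewritten links,
    # resolving relative links against the wrong base, etc.) are exactly the
    # kind of mistake that produces broken, self-referencing URLs.
    return re.sub(
        r'href="(https?://[^"]+)"',
        lambda m: 'href="' + PROXY_PREFIX + m.group(1) + '"',
        html,
    )

print(rewrite_links('<a href="http://www.example.com/page.html">example</a>'))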
I got an email from a guy saying that he contacted them and they told him he could "opt out" and they would block his URL. I don't think this is right, and they might get a lot of problems with gov sites and other sites like the FBI. Update from me: I guess fbi.gov has an Access Denied page already... So that raises the question: do they KNOW they'll be in deep shit if fbi.gov showed up? whitehouse.gov is still there...
I noticed the same earlier. Some of those URLs have been removed, and I was thinking it was at the request of the site owners.
What is more interesting about the links to the left, and the title above them, is that it says TOP SUB DOMAINS, and then other sites are listed as sub-domains of cob-web. So it SEEMS that Quantcast is seeing them as sub-domains!
Yeah, I'll quote myself: The above search brings up the cob-web "sub-directory" of a site that is not even mentioned in Google's index! Wonder if this guy/girl is happy about that or not? Do they get more hits? Did they get hit by a penalty?
If we're looking at the same thing (cleanclothesconnection.org?), that is the cache of a domain that is parked with a 302 redirect to what looks to be a server for an internet consulting/service company. There is no content, and the proxy follows the redirect too. It may be a site that was taken down: http://www.google.com/search?q=site:cleanclothesconnection.org The cache of those pages is pretty old. 'south pole clothes' shows as the title because that is the anchor text on some page that links to the site; see the 'try it' example in my other post... I don't think they're worried.
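If anyone wants to verify the 302 for themselves, a HEAD request that does not follow redirects shows it directly. A quick sketch (assuming the domain is still parked the same way it was when I looked):

import http.client

# Ask for the front page but do not follow the redirect, so we can see the
# 302 and where the parked domain is pointing.
conn = http.client.HTTPConnection("www.cleanclothesconnection.org", timeout=10)
conn.request("HEAD", "/")
resp = conn.getresponse()
print(resp.status, resp.reason)       # expect something like: 302 Found
print(resp.getheader("Location"))     # the server the parked domain redirects to
conn.close()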
I did a post about this after reading through the thread, and was interested to see how Google is actually a partner in these projects. Feel free to take a look at the Google Blunder. If I am missing anything, drop me a line, or blog it and send me the URL, as I have lots more reading up to do on the subject.
The last time I contacted an edu about a student utilizing their servers to do well with a college affiliate program, it was shocking to get a response that the webmaster saw no issue with it... I finally gave up. At least the taxpayers' dollars were put to good work?
This must be something new, and I don't know if the new Google downgrade was a result of them indexing that site, or if those sites showed up because of the downgrade. Chicken and egg, huh? Still wondering if Google treats the site as having tons of sub-domains, and if Google treats it as duplicate content and penalizes sites for it...
Google can and will see sites with this issue as duplicate content. There are people out there who have been slammed by Google because of the proxy site. I still have more to read up on, so hopefully I can find as much info as possible and make more sense of it all. Google is very much involved in those projects, but how their bots are acting with the nodes is another thing.
STOP! Good Lord! Have any of you looked at http://www.cob-web.org and read, at least, the front page? It's a project WITH Google (and Yahoo!, and more) through Cornell. It is 'just' a distributed "web of caching proxy servers"... Not scrapers, not hackers, not aliens with green skin and a lousy voting record trying to steal your 2 pence of AdSense. Please, for the love of G*d, read up on stuff before you start running in circles and screaming, hands in the air...
Could you give examples, please? I cannot see how this would happen without Google caching the site...
Are they indexing caching servers? Is their algo clever enough to exclude "sites" on a caching server and not penalize sites for duplicate content? I have seen links where people actually use the cob-web URL instead of the real URL. How is this going to affect how Google and other SEs index it? I haven't seen things like this before, so I am wondering if this is an experiment that just went wrong, or is it the start of something new where Google won't crawl your site anymore? All they would need to do is go to a caching server and get your pages. Just speculating... Anyway, I don't expect to see results from a caching server in the SERPs. Do you?
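On the "keeping caching servers out of the index" question: a proxy operator can, at least in principle, serve every proxied copy with a noindex signal, so search engines drop the copies even if they stumble on them. Below is a minimal toy-proxy sketch of that idea. I have no idea whether cob-web sends anything like this; the hostnames and URL scheme here are made up for illustration.

from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

class NoIndexProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        # Assumes the path carries the target URL, e.g. /http://example.com/
        target = self.path.lstrip("/")
        body = urlopen(target, timeout=10).read()
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        # The important part: tell crawlers not to index the proxied copy.
        self.send_header("X-Robots-Tag", "noindex, nofollow")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8080), NoIndexProxy).serve_forever()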
I can tell from your article that we have read some of the same material. I am not saying that proxies are not a problem; I am saying that THIS proxy is not a problem... I would be more worried about this: http://72.14.209.104/search?q=cache:theopaqueproject.com/event/nph-proxy.pl/010110A/http/www.tv.com/ But Google has been better about such things of late.

I have found one cached cob-webbed page. My best guess is that there was a configuration error that allowed it. Since it seems to be so rare, my bet is that the error was on the target server, not the proxy server.

Now, I did mention that some of these URLs have taken the anchor text as the title rather than the URL. The site: search from the OP shows some of these. I can see where they may lead to an extra entry for a keyword phrase because of the way Google is handling things. But again, without the cache it cannot be seen as duplicate content, and it is not on the same domain, so there is no penalty to the site due to the cob-webbed URL. Google's use of the anchor text as the title of the page does, however, show a sign of Google reversing direction on the 302 hijack fixes. That fact bothers me; it is a sign of that redirect bug.

I can't find one ranking for any keywords of real value, but here is an example of a cob-web URL ranking: http://www.google.com/search?q=tight+shiny+clothes I can't see the advantage of the phrase "tight shiny clothes" to that site, or why it may have been anchored with those words. It would not rank for the phrase otherwise, and merely mentioning tight shiny clothes in this thread would probably cause this thread to outrank it. It is basically an extra entry. The proxy breaks forms, and I assume it breaks AdSense, so I don't understand the advantage of making that URL rank on purpose. I don't think it would rank at all if the phrase were more competitive, since the text does not include all the words, and there is no cache, no description, no keywords, etc. for Google to go by...
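For site owners who want to check whether any of their own traffic is actually coming through a caching proxy, the standard proxy headers are the first place to look. A small sketch below; it assumes the proxy sets Via or X-Forwarded-For, which I have not confirmed for cob-web, so treat it as a starting point rather than a rule.

def looks_proxied(headers):
    # headers: incoming request headers as a dict with lower-cased keys.
    # Well-behaved proxies usually add one of these standard headers;
    # whether cob-web's nodes do is an assumption, not something I've verified.
    return any(h in headers for h in ("via", "x-forwarded-for", "forwarded"))

# Typical proxied request vs. a direct one:
print(looks_proxied({"via": "1.1 some-proxy-node", "user-agent": "Mozilla/5.0"}))  # True
print(looks_proxied({"user-agent": "Mozilla/5.0"}))                                # False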
This would be interesting. I wonder how they would deal with staleness, as you mentioned. Yes, I have asked the Beehive project to give me some feedback on the concerns mentioned here.