Google, Plagiarizing, and "The Little Guy" - Part 2
Ken's Blog
In my previous post, we talked about "gray-zone" plagiarizing and an important public commitment by Google to protect the Web against those who would profit from the labor of others.
In today's post, we will explore Google's commitment and emerge with action steps for you to take, if you are a victim of plagiarizing, to finally get that thorn in your side removed.
Google made an exciting commitment. However, Google's history is one of saying one thing and doing another, especially when it comes to the solo e-business proprietor, including a recent example that came directly to me.
Google's refusal to de-index that site was disturbing...
Disturbing at Three Levels
The first level of disappointment in Google is straightforward. Google recently launched its "Farmer/Panda" algorithm. The "100% pap content" of this paraphrasing site fooled Google's vaunted "Farmer/Panda" algorithm.
That algorithm was meant to rid search results of this type of site. It failed. How good is this new algorithm, really?
I would never ask that question based on a single anecdote. However, in recent posts, I have written about...
The "Farmer/Panda" algorithm. We call it the "Big Pap Attack." A follow-up post talked about...
The False-Positive problem and how Google solves these errors depending on who you are and how much negative publicity you may generate for them. The problems go beyond false-positives...
The False-Negative problem and how ehow.com's increasing traffic (after the Farmer release) casts doubts about the entire Farmer/Panda/Pap algorithm and its ability to detect pap.
The second "disturbing" level speaks to the paraphraser's need to avoid detection at Google. Not only did Google miss the copycat site at the algorithmic level, but "Google humans" also refused to do anything about a site that was clearly derivative when it was submitted to Google DMCA.
Careful paraphrasers know that Google will not assign humans to actually read and compare two sites carefully. Why? "Because it does not scale."
That is a fancy way of saying, "We don't want to spend any of our billions of dollars of profits on doing the right thing." Speaking of those billions...
Here is the third and highest level of disappointment...
Google has attained a near-monopoly control of search. They earn billions per quarter thanks to that service. Whether they like it or not, that domination has consequences...
Algorithmic changes can wrongly make (e.g., ehow.com) or break sites. They seem to manually whitelist certain "special" sites to avoid negative publicity.
But what about the rest of us?
We get the old "sorry it's not our fault, it's the algorithm" bogus excuse. For example, the number of sites impacted by Google's Farmer algorithm just keeps growing (approaching 1400 sites -- remember, most sites don't know about this or don't bother, so the actual number is much higher).
These e-businesses, built largely by solo proprietors, are negatively impacted due to errors in Google's new algorithm. Google makes no effort to repair the damage wrought by it.
Any company with Google's impact in its field has a moral and social obligation to treat these businesses fairly. Don't get me wrong... Google owes them no favors. They do not owe anyone a living.
But they do owe fairness to hard-working e-businesses that add value to those who use Google Search. This is true for all sites, and especially true for the majority that are also Google partners (by way of AdSense). Partners expect to be treated fairly by their partners. But...
Decision-making seems to be press-driven. Decisions to whitelist seem to be based upon "who" you are, not "if" you deserve to be treated fairly. Now that Google has admitted the existence of the whitelist, they have a moral obligation of fairness to all, not just those who get their story into the media.
Can We Trust Anything Google Says?
I recently wrote about Google's admission of a Whitelist and Blacklist... after 10 years of denying this. This came after two incidents (explained in previous posts) that forced their hand. They had to admit it. Self-preservation seems to drive Google, not a sense of fair play.
An inescapable conclusion is that Google has spent its credibility. I used to trust Google and what it said. Not anymore, not if it involves the average small e-business person. My trust level is zero. Now I parse everything they say carefully.
Don't expect them to use this whitelist to right the wrongs that so many small e-businesses suffer at the hands of Google. There's no "PR" in it for them.
"It doesn't scale."
How Does This Apply To Google and Plagiarism?
Skilled "careful paraphrasers" know all about Google, too. Not only does the Farmer algorithm miss sites filled with pap (false-negatives), but careful paraphrasers also know that Google won't put in the human effort to figure out what they are doing. Why?
Because it takes time to figure out that they are plagiarizing. It takes "human time" to analyze "gray zone" sites and then do the right thing and de-index them.
If Google were to spend a tiny fraction of their profits to create a program of human review, similar to the one I proposed to help good sites damaged by Google, plagiarists would virtually disappear.
Google does spend that type of time and money to protect major media companies at YouTube, but will they go beyond the obvious "word-for-word" copies for the average small e-business at Google Search? I hope so. Let's see.
Right Now, The "Gray Zone" Is Getting Away With It
I can't count the number of times that I've read, both in SBI! forums and other ones, about Google doing nothing about the gray zone. By "gray zone," I'm not talking about those who research a niche intensively and then write a site from scratch.
This is about those people who find sites that are successful in their niche. Why mess with success? They retain the site structure, the content, and the organization of the content (i.e., the site architecture). They merely re-word the sentences to make the content look different.
When someone's work is copied, it's like a punch in the stomach. You feel violated. Your hard work, your blood, sweat and tears, is now copied and exists elsewhere.
It's wrong at a legal level (but most small e-businesses have neither the time nor resources to pursue this avenue). It's wrong on a moral level. And it's beyond-wrong at the personal level. It's gut-wrenching.
It should also be wrong at the "Google level" of not adding value to the Web. These sites are pap (regurgitated content). They evade Google's Farmer algorithm, but any human evaluation would conclude, "this is wrong." These carefully copied sites would be de-indexed after any human review.
Those who are violated by gray-zone plagiarists have been left hoping that Google DMCA will do the right thing. So far, Google has done nothing about them.
But hope springs today for those who live with these copycats (whom they have not been able to eliminate)...
Google Goes To Washington
Google posted the other day about their testimony on combating copyright infringement. Kent Walker, Senior Vice President and General Counsel of Google, testified before the House Judiciary Subcommittee on Intellectual Property, Competition, and the Internet.
As Google noted on their blog, Mr. Walker shared "several ways Google combats infringement including our Content ID system on YouTube, our efforts to make copyright work better online, and our work to keep counterfeiters out of our ads system."
Of specific interest to us are their "efforts to make copyright work better online." Those efforts, per their post in December 2010, include the promise to "act on reliable copyright takedown requests within 24 hours."
-----SIDEBAR-----
Interestingly, they also say that they will prevent commonly
abused terms from appearing in Google's autocomplete features
(Google Instant and Google Recommend).
This is something that Google has always said is "algorithmically
determined" and canNOT be changed. Again, how much can we trust
what Google says when they consistently contradict themselves
according to the situation?
-----SIDEBAR-----
You can read the full testimony by Google here.
Key excerpts...
"Google believes strongly in protecting copyright and other intellectual property rights."
"... we remove or disable access to millions of infringing materials each year at the request of copyright owners."
"Copyright owners in 2010 called on Google to disable access to approximately 3 million allegedly infringing materials across all our products, which accounts for far less than 1% of all the materials hosted and indexed by Google. We received takedown notices by letter, fax, email, and web forms from all sorts of copyright owners including movie studios, record labels, adult entertainment vendors, and needlepoint pattern publishers, from 70 countries and in a wide variety of languages.
We maintain a growing team of employees dedicated to receiving, reviewing, and responding to DMCA notices. We check to make sure that the notices are complete and are not attempts by competitors or others to use invalid copyright claims to censor speech with which they disagree."
Wonderful words. (The bolding is mine.) Humans review. And they promise to give results within 24 hours.
Despite the historical difference between what Google says and what Google does (especially as it impacts "the little guy"), let's give them the benefit of the doubt...
So What's the Next Step?
If you are a gray-zone victim of copyright infringement... put Google's public commitment to the test.
To be fair, "copyright infringement" does not mean that you may have noticed a page here or there on the Web that is similar to one of your pages. Nor can this mean that you have a suspicion of wrongdoing.
This is for you if you have a full-blown case of wide-scale infringement of your content that is still "live" on the Web... a "no doubt about it case" of significant scale.
Do one more thing. Verify that the site is in Google's index by searching for the following at Google...
site:offending-site.com (replace "offending-site.com" with the domain name of the offending site).
If many of the offending site's offending pages are in Google's index, proceed...
If that is you, head to Google's DMCA page. This landing page asks you a series of questions and sends you to the right form according to the answers you select. For most people who read this post...
If your written content is being violated by another site, you will likely end up at this form.
Fill out the form completely and fairly. Don't exaggerate. Stick to the facts. Emotion does not matter. Cool professionalism does. Extensive, objective documentation is what makes the case. Details, details, details.
Do NOT expect Google to do the work for you. For example, if you just tell them to compare Page A (offending) vs Page B (yours), that is lazy -- heck, I would not do it if I were Google! Provide every detail possible to prove that Page A was taken from Page B.
And then do it for every other page. In other words...
Take your time and fill out that form to the very best of your abilities.
Save a copy as a PDF document for your records.
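If you have many pages to document, a small script can help you line up the evidence before you fill out the form. What follows is only a rough sketch, not an official tool or anything Google asks for. It assumes you have already saved the text of your page and of the offending page as plain text with blank lines between paragraphs. The function names and the 0.6 threshold are my own illustrative choices. It uses Python's standard difflib module to pair each of your paragraphs with its closest match on the other page.

```python
from difflib import SequenceMatcher


def split_paragraphs(text):
    """Split plain text into non-empty paragraphs with normalized whitespace."""
    return [" ".join(p.split()) for p in text.split("\n\n") if p.strip()]


def similar_paragraphs(original_text, suspect_text, threshold=0.6):
    """Pair each original paragraph with its closest suspect paragraph.

    Returns a list of (original_index, suspect_index, ratio) tuples for
    pairs whose similarity ratio is at or above the threshold.
    The threshold is an arbitrary illustrative value, not a legal standard.
    """
    originals = split_paragraphs(original_text)
    suspects = split_paragraphs(suspect_text)
    matches = []
    for i, orig in enumerate(originals):
        best_j, best_ratio = None, 0.0
        for j, susp in enumerate(suspects):
            ratio = SequenceMatcher(None, orig.lower(), susp.lower()).ratio()
            if ratio > best_ratio:
                best_j, best_ratio = j, ratio
        if best_j is not None and best_ratio >= threshold:
            matches.append((i, best_j, round(best_ratio, 2)))
    return matches
```

Keep in mind that a careful paraphraser re-words sentences, so the scores may be modest. What often makes the case is the pattern: paragraph after paragraph matching in the same order. Treat the output as a starting point for your own side-by-side write-up, not as proof by itself.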
What Do You Do After That?
Wait 24 hours. Then give them another 24 hours of leeway.
If you do not get a response within 48 hours, let Google know directly at this post by Google. Again, no emotion, blame, etc. Just the facts.
If you receive a request for more information, provide it. Then give another 48 hours.
If you receive a reply denying your request, and if no reason is given or if the reason is uselessly vague or incorrect, report it. (NOTE: This is why your case must be iron-clad. Any reasonable human who reviews your submission would agree that it violates your original content.)
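If you are tracking several submissions, the waiting periods above are easy to lose track of. Here is a minimal sketch of the arithmetic, assuming you note the time you submitted the form (or the time you sent any additional information Google requested). The function names are hypothetical, and the 24-hour figures simply restate Google's promise plus the extra day of leeway suggested above.

```python
from datetime import datetime, timedelta


def follow_up_time(last_contact, promised_hours=24, leeway_hours=24):
    """Return the earliest time at which a follow-up report is reasonable."""
    return last_contact + timedelta(hours=promised_hours + leeway_hours)


def should_follow_up(last_contact, now, responded):
    """True if Google has not responded and the 48-hour window has passed."""
    return (not responded) and now >= follow_up_time(last_contact)
```

For example, if you submitted at 9:00 on a Monday, follow_up_time returns 9:00 on Wednesday. After a request for more information, restart the clock from the moment you supplied it.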
Infringement of your intellectual property is, in Google's words, "an issue of critical importance to Google and the entire information economy."
As at any large company, what happens at the ground level may not match the stated policies of the top brass. I'm sure they'd want to know if their system is not working. So do let them know.
Oh Yes, One More Thing
We at SiteSell do want to know.
So please let us know that you made such a report to Google. Post in the comments below. Include the decision that Google took, if any. Include as much detail as you like in your post, but at a minimum include your domain name so we can contact you "off-blog" if need be.
(Hold onto your PDF file. We will ask for it. We will make a post about this if the volume justifies it, so we will want to verify the quality of cases submitted.)
Reporting to both Google's post and this post will show if Google is...
a) publishing your notice to them in their blog
b) doing anything about it.
It will also show the size of this problem.
Ultimately, if the top Google brass is sincere, they will publish your reports. If they do not, or if they do not fix a discrepancy between the top brass's words in Washington and their deeds at ground level, that will become obvious.
I am hoping that Google is sincere about protecting your intellectual property. If they are, it spells the end of the "gray zone," something that their Farmer algorithm is not doing.
High-value content sites will do better than ever. And that is WIN-WIN-WIN for searchers-publishers-Google!
It would, perhaps, mean the return of Google as "the good guy."
Please pass this on to relevant threads in blogs and forums.
Fingers crossed.
All the best,
P.S. Naturally, if there are sites out there that are flat-out copying your material word-for-word and if you have not been able to get it removed, please do report this. Normally, though, these do get dropped. The real problem is the "gray zone" because it takes more time and energy to evaluate.
P.P.S. I have not asked whether "gray zone" sites impact your traffic for two reasons...
1) most copycat sites fail dismally, making little or no impact on the original site. That is why my general advice is usually to first verify if they impact your success. Yes, take the basic steps to get rid of them. But I see too many people wasting too much time and money on copycats who will only die anyway, once they realize that their low-life activities don't "show them the money."
2) it is irrelevant. Copying content is flat-out wrong. For many, it's hard to ignore that gut-wrenching feeling even if the rip-off artist gets little or no traffic. Wrong is wrong and Google has the power to step up and make it right. I admire their words. I hope they back those up with actions.



