Scraping site

  • Hey Luminaria. No problem. I’m confused too. I just moved my blog onto a new host and wonder if that has something to do with it? If this splogger is redirecting people to my site, he must be one of the tiny handful of people who read it and should be very easy to find. ;)

  • I’ve noticed that whenever I make a post about Ron Paul, or something that relates to him, and tag the post “Ron Paul”, this random third-party website grabs it, makes a copy and sticks it on a site with a bunch of other people’s articles. It’s rather unsettling.

    * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
    http://will86aber.wordpress.com/2008/07/07/the-disaster-that-is-penndel-borough-officials/

  • Hi. I’ve received a response from Google AdSense. Here it is below:

    Hello,

    Thank you for your note. It is our policy to respond to notices of alleged
    infringement that comply with the Digital Millennium Copyright Act (the
    text of which can be found at the U.S. Copyright Office website:
    http://www.copyright.gov/) and other applicable intellectual property
    laws. In this case, this means that if we receive proper notice of
    infringement, we will forward that notice to the responsible web site
    publisher.

    To file a notice of infringement with us, you must provide a written
    communication (by fax or regular mail, not by email) that sets forth the
    items specified below. Please note that pursuant to that Act, you may be
    liable to the alleged infringer for damages (including costs and
    attorneys’ fees) if you materially misrepresent that you own an item when
    you in fact do not. Accordingly, if you are not sure whether you have the
    right to request removal from our service, we suggest that you first
    contact an attorney.

    To expedite our ability to process your request, please use the following
    format (including section numbers):

    1. Identify in sufficient detail the copyrighted work that you believe has
    been infringed upon. For example, “The copyrighted work at issue is the
    text that appears on http://www.legal.com/legal_page.html.”

    2. Identify the material that you claim is infringing upon the copyrighted
    work listed in item #1 above. You must identify each page that allegedly
    contains infringing material by providing its URL.

    3. Provide information reasonably sufficient to permit Google to contact
    you (email address is preferred).

    4. Include the following statement: “I have a good faith belief that use
    of the copyrighted materials described above on the allegedly infringing
    webpages is not authorized by the copyright owner, its agent, or the law.”

    5. Include the following statement: “I swear, under penalty of perjury,
    that the information in the notification is accurate and that I am the
    copyright owner or am authorized to act on behalf of the owner of an
    exclusive right that is allegedly infringed.”

    6. Sign the paper.

    7. Send the written communication to the following address:

    Google, Inc.
    Attn: AdSense Support, DMCA complaints
    1600 Amphitheatre Parkway
    Mountain View CA 94043

    OR Fax to:

    (650) 618-8507, Attn: AdSense Support, DMCA complaints

    Regards,

    The Google AdSense Team

  • Yet still no-one addresses the practical issues: in order to prove one’s original work has been copied elsewhere, one has to visit sites where one runs the risk of picking up malware or getting stuck in looped cycles.

    Imagine you’ve just googled a distinctively original phrase from your own blog and you find it occurs on a number of (non-WP) sites – how do you check out such sites safely?

  • It may take a bit of tech skill to do it safely. You can use a VM (virtual machine). I just saw “SteadyState” on LifeHacker (http://lifehacker.com/397786/kid+proof-your-pc-with-steadystate). Or you can use a Linux Live CD.

    I leave it to you to Google to find the details.
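
For anyone who would rather not open a suspect page in a browser at all, here is a minimal sketch of another route: download the page as plain text and search it for your phrase, so nothing on the page is rendered or run. This is an illustration only, not something suggested by the posters above. The names `contains_phrase` and `fetch_raw` and the `phrase-check/0.1` user agent are made up for the example, and fetching this way lowers the risk rather than removing it.

```python
import re
import urllib.request
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects visible text and ignores the bodies of <script>/<style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def normalize(text):
    # Collapse runs of whitespace and ignore case so line breaks don't matter.
    return re.sub(r"\s+", " ", text).strip().lower()


def contains_phrase(html, phrase):
    """True if the phrase appears in the page's visible text."""
    parser = _TextOnly()
    parser.feed(html)
    parser.close()
    return normalize(phrase) in normalize(" ".join(parser.parts))


def fetch_raw(url, limit=500_000):
    """Download at most `limit` bytes of a page as text (hypothetical helper)."""
    req = urllib.request.Request(url, headers={"User-Agent": "phrase-check/0.1"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.read(limit).decode("utf-8", errors="replace")
```

Usage would look like `contains_phrase(fetch_raw(url), "a distinctive phrase from your post")`. The download is capped and never executed, but it still tells the remote server your IP address, so the VM or Live CD advice above remains the more cautious option.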

  • katm – I feared it might be the kind of operation beyond most of us – I’m sure I’m not atypical in being as careful as I can be, with the basic prophylactics, but I certainly lack the technical skills and confidence to do as you suggest (and thanks for the suggestion, btw)….effectively, >90% of us (at a rough guess) are unable to enforce our copyright – I’ve just stopped worrying about it, frankly. If I had to learn the relevant tech skills & keep them polished & up to date, I’d probably never have the time to write another blog entry…. :-)

  • doggerelist: you could always check them out from a public computer that you KNOW has up-to-date malware protection. Ask at the library. Then use it at the end of the day, because the virus scanners, etc. are usually run after closing. Some have very good protection, some just adequate protection, some less, and you don’t want to infect a vulnerable computer. Just say you’ll be visiting new sites and you want to make sure your disk won’t get anything nasty on it, so are they protected. I work with public computers a lot, and some places will be very helpful. Everyone loves killing blogscrapers.

  • That’s a good suggestion, raincoaster….there’s no easy solution to dealing with these vermin, and every small step helps….

  • my.zestead.com

    Making a DMCA complaint is a little bit of work but basically consists of filling in the blanks. Our experience is that they are often successful.
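
To show what “filling in the blanks” can look like, here is a rough sketch that assembles items 1 to 6 of the numbered format from the Google AdSense reply quoted earlier in this thread. The function name `build_notice` and its parameters are invented for the example, and the two statements are copied from that reply. It only produces text: per the same reply, the notice still has to be printed, signed, and sent by fax or regular mail, and if you are unsure of your rights the reply suggests contacting an attorney first.

```python
# Statements copied from items 4 and 5 of the AdSense reply quoted in this thread.
GOOD_FAITH = (
    "I have a good faith belief that use of the copyrighted materials "
    "described above on the allegedly infringing webpages is not authorized "
    "by the copyright owner, its agent, or the law."
)
PERJURY = (
    "I swear, under penalty of perjury, that the information in the "
    "notification is accurate and that I am the copyright owner or am "
    "authorized to act on behalf of the owner of an exclusive right that is "
    "allegedly infringed."
)


def build_notice(original_url, infringing_urls, contact, signer_name):
    """Return the numbered sections of a notice as plain text (hypothetical helper)."""
    pages = "\n".join(f"   {url}" for url in infringing_urls)
    sections = [
        f"1. The copyrighted work at issue is the text that appears on {original_url}.",
        f"2. The following pages contain the infringing material:\n{pages}",
        f"3. Contact: {contact}",
        "4. " + GOOD_FAITH,
        "5. " + PERJURY,
        # Item 6 is a handwritten signature, so only a blank line is printed.
        f"6. Signature: ______________________ ({signer_name})",
    ]
    return "\n\n".join(sections)
```

List every copied page by its own URL in `infringing_urls`, since item 2 of the quoted format asks for each page individually.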

  • I don’t mean to beat a dead horse, but I just noticed that http://my.zestead.com/austindwilawyer/ sucked up my entire post today literally instantaneously after I posted it. That means he’s sucking our posts via RSS, right? Apparently the partial post setting for RSS is completely useless when discouraging these splog bots. I haven’t had a chance to write the DMCA complaint letter yet. I wish there was a way to BLOCK HIS DOMAIN!

  • I have one extra item – In my Incoming Links in my Dashboard – I have the following:
    Austin Dwi Lawyer linked here saying, …

    And under Blog Stats, I have an Incoming Link there and when I click on it – it takes me to the austindwilawyer site.

  • Folks, please do this – the more valid complaints they receive, the more likely it will stop.

    If you’ve already complained and they’ve copied a new post, send a new complaint for that post.

  • Another one bites the dust! I just clicked on an old my.zestead.com/austindwilawyer link in my incoming links site and saw that it’s been removed! Yeah! Another down, a trillion to go.

  • There are many ways to fight sploggers. You have to be creative. I forgot to mention that yesterday I did a search for Austin DWI Lawyer and found a legitimate site. I left a non-threatening comment on it. Also, I’ve been prefacing each of my posts with this handy notice (thanks to the Exposing Sploggers site):

    -Copyright © 2008 Exposing Sploggers. This Feed is for personal non-commercial use only. If you are not reading this material in your news aggregator, or posted on my.zestead.com/austindwilawyer, the site you are looking at is guilty of copyright infringement. Please contact (email visible only to moderators and staff) so we can take legal action immediately.

    I plan to substitute each subsequent violator’s URL. It’s kind of annoying to do this, but I think it has some effect.

  • So if I click on “remove my site” I’ve now made myself MORE vulnerable?

    SHIT! What to do now?

  • Where did you click on “remove my site?” Some of those are legit, some are not.

    What can you do now? Run a virus/spyware scan and move on.

  • The Austin DWI lawyer is finally gone from the links on my dashboard. Hurray! But there’s a new one, http://www.scienceguide.net/, that copied an entire post of mine from a couple of days ago. It looks like there is nothing on the site except posts from other blogs (mostly WordPress). All the posts there are “posted by admin” and include a tiny link at the end back to the original post. No ads on the site yet. What kind of deal is this?

  • It’s a blogscraper. Usually they’re in it for ad revenue, but not always; and they don’t always get the ads up right away. Follow the link tellyworth provided above and good luck with it!

  • Here’s my problem: a blogger, hosted by WordPress.com, is stealing entire articles from Ezine and posting them to his WordPress blog, with no resource box and no links. I have emailed Matt Mullenweg several times, as he is the whois contact for WordPress.com, but have not received an acknowledgement.

    WordPress.com will not accept my question from the support page, because I am not hosted by WordPress.com. This guy is violating copyright law, AND WordPress.com’s own terms of service, but I cannot find anyplace on WordPress.com for reporting violations of terms of service.

    Squidoo shut down a plagiarist in less than 12 hours after I reported him. Google even responded to my report in about 48 hours. WordPress is maddeningly (is that a word?) silent. Have any of you had any luck in getting attention from WordPress regarding this issue when the violator is hosted here?

    Thanks for listening to my rant.
    Tejasca

  • Emailing Matt is not going to be effective.

    Since you have a WP.com ID, you have two options:

    1) http://wordpress.com/complaints/
    2) when logged in, go to that blog. Copy the URL of a post it stole from you, then go to your grey admin bar and under Blog Info, hit Report As Spam. Put the info in there, including that link and the link to your original article. That gets staff attention right away.
