{"id":2324,"date":"2011-07-22T14:47:45","date_gmt":"2011-07-22T21:47:45","guid":{"rendered":"http:\/\/www.pdxtc.com\/wpblog\/?p=2324"},"modified":"2019-10-24T07:21:03","modified_gmt":"2019-10-24T14:21:03","slug":"google-loves-a-good-scrape","status":"publish","type":"post","link":"https:\/\/www.pdxtc.com\/wpblog\/google\/google-loves-a-good-scrape\/","title":{"rendered":"Google Loves a Good Scrape"},"content":{"rendered":"<p>If you look at the Wikipedia definition of a scraper website, it says:<\/p>\n<blockquote><p>&#8220;A scraper site is a spam website that copies all of its content from other websites&#8221;.<\/p><\/blockquote>\n<p>Well, Google has a new project that, in my opinion, is basically just a well-done Google scraper.<\/p>\n<p>Over the past dozen years or so, the advent of RSS feeds and other automated content distribution technology has led to the development of hundreds of scripts, software programs, products and plugins that all seek to locate and regenerate content in order to blog, post, or otherwise auto-magically display content that comes from other sources.<\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter\" style=\"border-style: initial; border-color: initial; border-width: 0px;\" src=\"http:\/\/imgpics.s3.amazonaws.com\/bcf79f7c118c4ec584ac0d6b04ce5eb1.jpg\" alt=\"image\" width=\"321\" height=\"200\" border=\"0\" \/><\/p>\n<p>I&#8217;ve always maintained (granted, it&#8217;s often been only to myself while submitting a reinclusion request) that a well-organized scraper site CAN have some actual value.<\/p>\n<p>I&#8217;ve also long believed that augmenting the original content with &#8220;fed&#8221; or &#8220;scraped&#8221; content on the same page can have a positive impact on your search rankings.<\/p>\n<p>Google, on the other hand, has long made it their mission to put the words &#8220;scraped content&#8221; into the same category as &#8220;paid links&#8221; (i.e. 
Evil), and they have always publicly discouraged the practice, claiming to be trying to <a href=\"http:\/\/www.google.com\/support\/forum\/p\/Webmasters\/thread?tid=75d397f7ef8e1d19&amp;hl=en\" target=\"_blank\" rel=\"noopener noreferrer\">get rid of scraped pages<\/a> from the index even if they&#8217;ve <a href=\"http:\/\/www.shoemoney.com\/2009\/03\/03\/why-does-adsense-love-spam-scraper-sites\/\" target=\"_blank\" rel=\"noopener noreferrer\">rewarded it behind the scenes<\/a> with rankings and big AdSense checks.<\/p>\n<p>Last week I found out about a new Google project, and when I saw it I was stunned &#8211; I think it&#8217;s nothing but a scraper engine!<\/p>\n<p>On the one hand, it is fun to play with and I do see the value in it. On the other hand, it seems sort of hypocritical. Why is it okay for them to do it, but if I want to do it on my site I&#8217;m a &#8220;bad guy&#8221;?<\/p>\n<p>Here&#8217;s a quick video below, and although the site is now shut down, you can read about what was called <a href=\"https:\/\/www.dailywireless.org\/internet\/a-look-back-at-google-what-do-you-love-also-known-as-wdyl\/)\" target=\"_blank\" rel=\"noopener noreferrer\">Google WDYL here<\/a>.<\/p>\n<div align=\"center\">\n<p><iframe loading=\"lazy\" src=\"http:\/\/www.youtube.com\/embed\/i26TveAvYfQ?rel=0\" frameborder=\"0\" width=\"449\" height=\"337\"><\/iframe><\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>If you look at the Wikipedia definition of a scraper website, it says &#8220;A scraper site is a spam website that copies all of its content from other websites&#8221;. Well, Google has a new project that, in my opinion, is basically just a well-done Google scraper. 
Over the past dozen years or so, the [&hellip;]<\/p>\n","protected":false},"author":76,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_et_pb_use_builder":"","_et_pb_old_content":"","_et_gb_content_width":"","footnotes":""},"categories":[26],"tags":[],"class_list":["post-2324","post","type-post","status-publish","format-standard","hentry","category-google"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v28.1 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Google Loves a Good Scrape<\/title>\n<meta name=\"description\" content=\"If you look at the Wikipedia definition of a scraper website, it says &quot;A scraper site is a spam website that copies all of its content from other\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.pdxtc.com\/wpblog\/google\/google-loves-a-good-scrape\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Google Loves a Good Scrape\" \/>\n<meta property=\"og:description\" content=\"If you look at the Wikipedia definition of a scraper website, it says &quot;A scraper site is a spam website that copies all of its content from other\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.pdxtc.com\/wpblog\/google\/google-loves-a-good-scrape\/\" \/>\n<meta property=\"og:site_name\" content=\"Scott Hendison&#039;s Old Search Commander Blog\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/SearchCommander\" \/>\n<meta property=\"article:published_time\" content=\"2011-07-22T21:47:45+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2019-10-24T14:21:03+00:00\" \/>\n<meta property=\"og:image\" content=\"http:\/\/imgpics.s3.amazonaws.com\/bcf79f7c118c4ec584ac0d6b04ce5eb1.jpg\" 
\/>\n<meta name=\"author\" content=\"Scott\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@shendison\" \/>\n<meta name=\"twitter:site\" content=\"@shendison\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Scott\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/www.pdxtc.com\\\/wpblog\\\/google\\\/google-loves-a-good-scrape\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.pdxtc.com\\\/wpblog\\\/google\\\/google-loves-a-good-scrape\\\/\"},\"author\":{\"name\":\"Scott\",\"@id\":\"https:\\\/\\\/www.pdxtc.com\\\/wpblog\\\/#\\\/schema\\\/person\\\/3142c7d28dc676725ac62cd6c9de8371\"},\"headline\":\"Google Loves a Good Scrape\",\"datePublished\":\"2011-07-22T21:47:45+00:00\",\"dateModified\":\"2019-10-24T14:21:03+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/www.pdxtc.com\\\/wpblog\\\/google\\\/google-loves-a-good-scrape\\\/\"},\"wordCount\":322,\"commentCount\":0,\"image\":{\"@id\":\"https:\\\/\\\/www.pdxtc.com\\\/wpblog\\\/google\\\/google-loves-a-good-scrape\\\/#primaryimage\"},\"thumbnailUrl\":\"http:\\\/\\\/imgpics.s3.amazonaws.com\\\/bcf79f7c118c4ec584ac0d6b04ce5eb1.jpg\",\"articleSection\":[\"Google\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/www.pdxtc.com\\\/wpblog\\\/google\\\/google-loves-a-good-scrape\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/www.pdxtc.com\\\/wpblog\\\/google\\\/google-loves-a-good-scrape\\\/\",\"url\":\"https:\\\/\\\/www.pdxtc.com\\\/wpblog\\\/google\\\/google-loves-a-good-scrape\\\/\",\"name\":\"Google Loves a Good 
Scrape\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.pdxtc.com\\\/wpblog\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/www.pdxtc.com\\\/wpblog\\\/google\\\/google-loves-a-good-scrape\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/www.pdxtc.com\\\/wpblog\\\/google\\\/google-loves-a-good-scrape\\\/#primaryimage\"},\"thumbnailUrl\":\"http:\\\/\\\/imgpics.s3.amazonaws.com\\\/bcf79f7c118c4ec584ac0d6b04ce5eb1.jpg\",\"datePublished\":\"2011-07-22T21:47:45+00:00\",\"dateModified\":\"2019-10-24T14:21:03+00:00\",\"author\":{\"@id\":\"https:\\\/\\\/www.pdxtc.com\\\/wpblog\\\/#\\\/schema\\\/person\\\/3142c7d28dc676725ac62cd6c9de8371\"},\"description\":\"If you look at the Wikipedia definition of a scraper website, it says \\\"A scraper site is a spam website that copies all of its content from other\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/www.pdxtc.com\\\/wpblog\\\/google\\\/google-loves-a-good-scrape\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/www.pdxtc.com\\\/wpblog\\\/google\\\/google-loves-a-good-scrape\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/www.pdxtc.com\\\/wpblog\\\/google\\\/google-loves-a-good-scrape\\\/#primaryimage\",\"url\":\"http:\\\/\\\/imgpics.s3.amazonaws.com\\\/bcf79f7c118c4ec584ac0d6b04ce5eb1.jpg\",\"contentUrl\":\"http:\\\/\\\/imgpics.s3.amazonaws.com\\\/bcf79f7c118c4ec584ac0d6b04ce5eb1.jpg\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/www.pdxtc.com\\\/wpblog\\\/google\\\/google-loves-a-good-scrape\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/www.pdxtc.com\\\/wpblog\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Google Loves a Good Scrape\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/www.pdxtc.com\\\/wpblog\\\/#website\",\"url\":\"https:\\\/\\\/www.pdxtc.com\\\/wpblog\\\/\",\"name\":\"Scott Hendison&#039;s Old 
Search Commander Blog\",\"description\":\"\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/www.pdxtc.com\\\/wpblog\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/www.pdxtc.com\\\/wpblog\\\/#\\\/schema\\\/person\\\/3142c7d28dc676725ac62cd6c9de8371\",\"name\":\"Scott\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/ba275e23c0aad37526141e715b54cd3eeac27b071e4395b2b39e801ca68355d6?s=96&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/ba275e23c0aad37526141e715b54cd3eeac27b071e4395b2b39e801ca68355d6?s=96&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/ba275e23c0aad37526141e715b54cd3eeac27b071e4395b2b39e801ca68355d6?s=96&r=g\",\"caption\":\"Scott\"},\"sameAs\":[\"https:\\\/\\\/x.com\\\/shendison\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Google Loves a Good Scrape","description":"If you look at the Wikipedia definition of a scraper website, it says \"A scraper site is a spam website that copies all of its content from other","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.pdxtc.com\/wpblog\/google\/google-loves-a-good-scrape\/","og_locale":"en_US","og_type":"article","og_title":"Google Loves a Good Scrape","og_description":"If you look at the Wikipedia definition of a scraper website, it says \"A scraper site is a spam website that copies all of its content from other","og_url":"https:\/\/www.pdxtc.com\/wpblog\/google\/google-loves-a-good-scrape\/","og_site_name":"Scott Hendison&#039;s Old Search Commander Blog","article_publisher":"https:\/\/www.facebook.com\/SearchCommander","article_published_time":"2011-07-22T21:47:45+00:00","article_modified_time":"2019-10-24T14:21:03+00:00","og_image":[{"url":"http:\/\/imgpics.s3.amazonaws.com\/bcf79f7c118c4ec584ac0d6b04ce5eb1.jpg","type":"","width":"","height":""}],"author":"Scott","twitter_card":"summary_large_image","twitter_creator":"@shendison","twitter_site":"@shendison","twitter_misc":{"Written by":"Scott","Est. 
reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.pdxtc.com\/wpblog\/google\/google-loves-a-good-scrape\/#article","isPartOf":{"@id":"https:\/\/www.pdxtc.com\/wpblog\/google\/google-loves-a-good-scrape\/"},"author":{"name":"Scott","@id":"https:\/\/www.pdxtc.com\/wpblog\/#\/schema\/person\/3142c7d28dc676725ac62cd6c9de8371"},"headline":"Google Loves a Good Scrape","datePublished":"2011-07-22T21:47:45+00:00","dateModified":"2019-10-24T14:21:03+00:00","mainEntityOfPage":{"@id":"https:\/\/www.pdxtc.com\/wpblog\/google\/google-loves-a-good-scrape\/"},"wordCount":322,"commentCount":0,"image":{"@id":"https:\/\/www.pdxtc.com\/wpblog\/google\/google-loves-a-good-scrape\/#primaryimage"},"thumbnailUrl":"http:\/\/imgpics.s3.amazonaws.com\/bcf79f7c118c4ec584ac0d6b04ce5eb1.jpg","articleSection":["Google"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.pdxtc.com\/wpblog\/google\/google-loves-a-good-scrape\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.pdxtc.com\/wpblog\/google\/google-loves-a-good-scrape\/","url":"https:\/\/www.pdxtc.com\/wpblog\/google\/google-loves-a-good-scrape\/","name":"Google Loves a Good Scrape","isPartOf":{"@id":"https:\/\/www.pdxtc.com\/wpblog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.pdxtc.com\/wpblog\/google\/google-loves-a-good-scrape\/#primaryimage"},"image":{"@id":"https:\/\/www.pdxtc.com\/wpblog\/google\/google-loves-a-good-scrape\/#primaryimage"},"thumbnailUrl":"http:\/\/imgpics.s3.amazonaws.com\/bcf79f7c118c4ec584ac0d6b04ce5eb1.jpg","datePublished":"2011-07-22T21:47:45+00:00","dateModified":"2019-10-24T14:21:03+00:00","author":{"@id":"https:\/\/www.pdxtc.com\/wpblog\/#\/schema\/person\/3142c7d28dc676725ac62cd6c9de8371"},"description":"If you look at the Wikipedia definition of a scraper website, it says \"A scraper site is a spam website that copies all of its content from 
other","breadcrumb":{"@id":"https:\/\/www.pdxtc.com\/wpblog\/google\/google-loves-a-good-scrape\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.pdxtc.com\/wpblog\/google\/google-loves-a-good-scrape\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.pdxtc.com\/wpblog\/google\/google-loves-a-good-scrape\/#primaryimage","url":"http:\/\/imgpics.s3.amazonaws.com\/bcf79f7c118c4ec584ac0d6b04ce5eb1.jpg","contentUrl":"http:\/\/imgpics.s3.amazonaws.com\/bcf79f7c118c4ec584ac0d6b04ce5eb1.jpg"},{"@type":"BreadcrumbList","@id":"https:\/\/www.pdxtc.com\/wpblog\/google\/google-loves-a-good-scrape\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.pdxtc.com\/wpblog\/"},{"@type":"ListItem","position":2,"name":"Google Loves a Good Scrape"}]},{"@type":"WebSite","@id":"https:\/\/www.pdxtc.com\/wpblog\/#website","url":"https:\/\/www.pdxtc.com\/wpblog\/","name":"Scott Hendison&#039;s Old Search Commander 
Blog","description":"","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.pdxtc.com\/wpblog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/www.pdxtc.com\/wpblog\/#\/schema\/person\/3142c7d28dc676725ac62cd6c9de8371","name":"Scott","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/ba275e23c0aad37526141e715b54cd3eeac27b071e4395b2b39e801ca68355d6?s=96&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/ba275e23c0aad37526141e715b54cd3eeac27b071e4395b2b39e801ca68355d6?s=96&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/ba275e23c0aad37526141e715b54cd3eeac27b071e4395b2b39e801ca68355d6?s=96&r=g","caption":"Scott"},"sameAs":["https:\/\/x.com\/shendison"]}]}},"_links":{"self":[{"href":"https:\/\/www.pdxtc.com\/wpblog\/wp-json\/wp\/v2\/posts\/2324","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.pdxtc.com\/wpblog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.pdxtc.com\/wpblog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.pdxtc.com\/wpblog\/wp-json\/wp\/v2\/users\/76"}],"replies":[{"embeddable":true,"href":"https:\/\/www.pdxtc.com\/wpblog\/wp-json\/wp\/v2\/comments?post=2324"}],"version-history":[{"count":3,"href":"https:\/\/www.pdxtc.com\/wpblog\/wp-json\/wp\/v2\/posts\/2324\/revisions"}],"predecessor-version":[{"id":4401,"href":"https:\/\/www.pdxtc.com\/wpblog\/wp-json\/wp\/v2\/posts\/2324\/revisions\/4401"}],"wp:attachment":[{"href":"https:\/\/www.pdxtc.com\/wpblog\/wp-json\/wp\/v2\/media?parent=2324"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.pdxtc.com\/wpblog\/wp-json\/wp\/v2\/categories?post=2324"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.pdxtc.com\/wpblog\/wp-json\/wp\/v2\/tags?post=2324"}],"cu
ries":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}