Google has a tracking system called “Government Requests”. It records the scope and type of content that countries ask Google to remove from its services and, in some sense, from our internet (seemingly quite often, too). A few days ago Google released data from January through June of this year. I commend Google for going to such lengths and having the sheer courage to do so in the interest of transparency. You’re doing it right.
In many cases the requests are probably legitimate and imply that something illegal has taken place or will take place, or that the data violates someone’s inalienable rights. However, I have a hard time believing *most* requests fall under that umbrella, especially when generalized reasons such as “Other” or “Privacy and Security” are so common.
The data isn’t user, company, entity, or content centric. Since it’s broken into six-month chunks, you can’t really pin a request to a specific moment in time or incident. If you could, it might help explain a few of these requests. Google doesn’t actually comply with all requests. I’m foggy on the specific details, but from what I gather the content needs to violate the ToS or local law. Anyhow, Google knows what the content is, and I’m sure they have an assortment of people scrutinizing whether to remove it or not.
The reasons, products, and countries vary across the board. The tables below cover two six-month chunks, or one year’s worth of data.
mysql> select reason, sum(num_items_requested) num_items_requested, count(distinct country) unique_countries, group_concat(distinct country_code order by country_code) countries from RequestByProductAndReason group by reason order by sum(num_items_requested) desc;
+----------------------+---------------------+------------------+-------------------------------------------------------+
| reason | num_items_requested | unique_countries | countries |
+----------------------+---------------------+------------------+-------------------------------------------------------+
| Other | 97153 | 17 | AR,AU,BR,CA,DE,ES,GB,IN,IT,JP,KR,LY,NO,RU,TR,TW,US |
| Privacy and Security | 33161 | 17 | AR,AU,BR,CA,DE,ES,FR,GB,IN,IT,JP,KR,LY,NL,PL,TR,US |
| Copyright | 11664 | 7 | BR,DE,GB,IT,TR,TW,US |
| Defamation | 6875 | 18 | AR,AU,BR,CA,CH,DE,ES,FR,GB,IN,IT,JP,KR,NL,PL,TR,TW,US |
| Government Criticism | 572 | 7 | BR,DE,IN,IT,TH,TR,US |
| Hate Speech | 508 | 8 | AU,BR,DE,FR,GB,IN,IT,US |
| Pornography | 187 | 9 | AU,BR,DE,ES,FR,IN,IT,NL,TR |
| National Security | 154 | 3 | GB,IN,US |
| Impersonation | 141 | 6 | BR,FR,IN,KR,TR,US |
| Violence | 65 | 5 | BR,DE,GB,IT,US |
| Electoral Law | 36 | 2 | BR,TW |
+----------------------+---------------------+------------------+-------------------------------------------------------+
mysql> select product, sum(num_items_requested) num_items_requested, count(distinct country) unique_countries, group_concat(distinct country_code order by country_code) countries from RequestByProductAndReason group by product order by sum(num_items_requested) desc;
+------------------------------------------+---------------------+------------------+-------------------------------------------------------+
| product | num_items_requested | unique_countries | countries |
+------------------------------------------+---------------------+------------------+-------------------------------------------------------+
| Google AdWords | 95862 | 10 | AR,BR,DE,ES,GB,KR,NO,TR,TW,US |
| Web Search | 36373 | 17 | AR,AU,BR,CA,CH,DE,ES,FR,GB,IN,IT,JP,KR,PL,RU,TR,US |
| Picasa Web Albums | 11585 | 4 | BR,ES,IN,US |
| YouTube | 3205 | 18 | AU,BR,CA,DE,ES,FR,GB,IN,IT,JP,KR,LY,NL,PL,TH,TR,TW,US |
| Google Groups | 1671 | 4 | BR,FR,IT,US |
| orkut | 1036 | 3 | BR,IN,KR |
| Blogger | 495 | 16 | AR,BR,CA,CH,DE,ES,FR,GB,IN,IT,KR,NL,PL,TR,TW,US |
| Google Images | 160 | 6 | AR,BR,DE,GB,IN,US |
| Gmail | 61 | 8 | BR,CA,DE,ES,FR,IT,TW,US |
| Google Videos | 22 | 3 | GB,IT,US |
| Android Market | 10 | 1 | KR |
| Google Earth, Google Maps, and Panoramio | 9 | 6 | DE,FR,GB,IN,TR,US |
| Google Places | 8 | 1 | BR |
| Google Docs | 5 | 1 | FR |
| Google Profiles | 4 | 2 | IN,KR |
| Google Sites | 3 | 3 | BR,DE,TW |
| Street View | 3 | 2 | DE,GB |
| Google App Engine | 1 | 1 | KR |
| Google Books | 1 | 1 | US |
| Textcube | 1 | 1 | KR |
| Web Search: Autocomplete | 1 | 1 | IT |
+------------------------------------------+---------------------+------------------+-------------------------------------------------------+
mysql> select country, sum(num_items_requested) num_items_requested from RequestByProductAndReason group by country order by sum(num_items_requested) desc;
+----------------+---------------------+
| country | num_items_requested |
+----------------+---------------------+
| United Kingdom | 93851 |
| South Korea | 32798 |
| Brazil | 13052 |
| Germany | 4337 |
| United States | 2178 |
| Norway | 1814 |
| India | 640 |
| France | 366 |
| Turkey | 288 |
| Thailand | 268 |
| Italy | 211 |
| Libya | 203 |
| Argentina | 132 |
| Taiwan | 115 |
| Poland | 72 |
| Spain | 63 |
| Japan | 38 |
| Canada | 36 |
| Switzerland | 18 |
| Netherlands | 16 |
| Australia | 10 |
| Russia | 10 |
+----------------+---------------------+
I was particularly interested in …
- the “Government Criticism” -> “YouTube” request by the United States.
- the “Privacy and Security” -> “Google Profiles” request by South Korea.
- the “Other” -> “Android Market” request by South Korea.
- why Norway and Russia always use the reason “Other”.
- why Thailand always uses the reason “Government Criticism”.
- why Thailand and Libya only seek out “YouTube” content.
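The "always uses one reason" and "only one product" patterns in the list above can be pulled out with a query instead of by eyeballing the tables. Here is a minimal sketch, assuming the same column names as RequestByProductAndReason. It uses Python's built-in sqlite3 rather than MySQL so it runs on its own, the helper name `single_valued` is hypothetical, and the rows are made up for illustration (they are not Google's figures).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE RequestByProductAndReason ("
    "country TEXT, product TEXT, reason TEXT, num_items_requested INTEGER)"
)

# Illustrative rows only; these are NOT figures from the Google data set.
rows = [
    ("Norway", "Google AdWords", "Other", 5),
    ("Norway", "Web Search", "Other", 2),
    ("Thailand", "YouTube", "Government Criticism", 7),
    ("Brazil", "orkut", "Defamation", 3),
    ("Brazil", "YouTube", "Electoral Law", 4),
]
conn.executemany(
    "INSERT INTO RequestByProductAndReason VALUES (?, ?, ?, ?)", rows
)


def single_valued(conn, column):
    """Return {country: value} for countries that only ever use one value of `column`."""
    # The column name is interpolated into the SQL, so restrict it to known names.
    if column not in ("reason", "product"):
        raise ValueError("column must be 'reason' or 'product'")
    sql = (
        f"SELECT country, MIN({column}) FROM RequestByProductAndReason "
        f"GROUP BY country HAVING COUNT(DISTINCT {column}) = 1"
    )
    return dict(conn.execute(sql).fetchall())


single_reason = single_valued(conn, "reason")
single_product = single_valued(conn, "product")
print(single_reason)
print(single_product)
```

The `GROUP BY ... HAVING COUNT(DISTINCT ...) = 1` part is plain SQL and should carry over to MySQL unchanged.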
Moving on to a similar data set. This one shows the percentage of requests where content was fully or partially removed. It covers roughly two years’ worth of data.
The following result set shows the average success rate across countries for each six-month chunk. In the last year or so, the average country saw about 65 percent of its requests fully or partially honored. If you take out the outliers hovering at or below a 10 percent success rate, the numbers are much higher.
mysql> select date_ending, avg(percent_removal_complied) avg_percent_removed from RemovalRequests group by date_ending order by date_ending desc;
+-------------+---------------------+
| date_ending | avg_percent_removed |
+-------------+---------------------+
| 2011-06-30 | 64.1667 |
| 2010-12-31 | 67.3235 |
| 2010-06-30 | 68.2500 |
| 2009-12-31 | 52.3721 |
+-------------+---------------------+
mysql> select date_ending, avg(percent_removal_complied) avg_percent_removed from RemovalRequests where percent_removal_complied>10 group by date_ending order by date_ending desc;
+-------------+---------------------+
| date_ending | avg_percent_removed |
+-------------+---------------------+
| 2011-06-30 | 77.0000 |
| 2010-12-31 | 84.7778 |
| 2010-06-30 | 87.7500 |
| 2009-12-31 | 72.6452 |
+-------------+---------------------+
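One caveat about the two result sets above: avg(percent_removal_complied) is an unweighted mean over countries, so a country with a handful of requests counts as much as one with hundreds. Below is a rough sketch of how much that can matter. It assumes the percentage applies per request (so it is weighted by num_removal_requests), uses sqlite3 in place of MySQL, and runs on made-up rows rather than Google's figures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE RemovalRequests ("
    "date_ending TEXT, country TEXT, "
    "num_removal_requests INTEGER, percent_removal_complied INTEGER)"
)

# Illustrative rows only; country names and numbers are hypothetical.
rows = [
    ("2011-06-30", "A", 1000, 90),  # many requests, mostly honored
    ("2011-06-30", "B", 10, 0),     # few requests, none honored
]
conn.executemany("INSERT INTO RemovalRequests VALUES (?, ?, ?, ?)", rows)

unweighted, weighted = conn.execute(
    "SELECT AVG(percent_removal_complied), "
    "       SUM(percent_removal_complied * num_removal_requests) * 1.0 "
    "       / SUM(num_removal_requests) "
    "FROM RemovalRequests WHERE date_ending = '2011-06-30'"
).fetchone()

print(f"unweighted mean: {unweighted:.2f}")
print(f"request-weighted mean: {weighted:.2f}")
```

With these two rows the unweighted mean is 45, while the request-weighted figure is close to 89, so the two views can tell very different stories.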
This last table shows countries with a removal success rate of 25% or less. In theory, that would imply they need to work on their censorship logic. Admittedly, though, Google doesn’t report the exact number when it’s less than 10, which makes it difficult, and inconclusive at best, to say whether they have bad censorship logic or simply had the misfortune of being rejected a couple of times.
mysql> select date_ending, country, if(num_removal_requests=10,'<10',num_removal_requests) num_removal_requests, if(num_items_requested=10,'<10',num_items_requested) num_items_requested, percent_removal_complied from RemovalRequests where percent_removal_complied<=25 and num_removal_requests>1 order by date_ending desc;
+-------------+-------------------+----------------------+---------------------+--------------------------+
| date_ending | country | num_removal_requests | num_items_requested | percent_removal_complied |
+-------------+-------------------+----------------------+---------------------+--------------------------+
| 2011-06-30 | Taiwan | 69 | 115 | 12 |
| 2011-06-30 | Colombia | <10 | <10 | 0 |
| 2011-06-30 | Indonesia | <10 | <10 | 0 |
| 2011-06-30 | Ireland | <10 | <10 | 0 |
| 2011-06-30 | Israel | <10 | <10 | 25 |
| 2011-06-30 | Libya | <10 | <10 | 0 |
| 2011-06-30 | Malaysia | <10 | <10 | 0 |
| 2011-06-30 | Pakistan | <10 | <10 | 0 |
| 2010-12-31 | Belgium | <10 | <10 | 0 |
| 2010-12-31 | India | 67 | 282 | 22 |
| 2010-12-31 | Malta | <10 | <10 | 0 |
| 2010-12-31 | Mexico | <10 | <10 | 0 |
| 2010-12-31 | Norway | <10 | <10 | 0 |
| 2010-12-31 | Pakistan | <10 | <10 | 0 |
| 2010-12-31 | Singapore | <10 | <10 | 0 |
| 2010-12-31 | Taiwan | <10 | <10 | 25 |
| 2010-12-31 | Vietnam | <10 | <10 | 0 |
| 2010-06-30 | Taiwan | 11 | 12 | 0 |
| 2010-06-30 | Sweden | <10 | <10 | 0 |
| 2010-06-30 | Cyprus | <10 | <10 | 0 |
| 2010-06-30 | Kazakhstan | <10 | <10 | 0 |
| 2010-06-30 | Macedonia [FYROM] | <10 | <10 | 0 |
| 2010-06-30 | Mexico | <10 | <10 | 0 |
| 2010-06-30 | Russia | <10 | <10 | 0 |
| 2009-12-31 | Belgium | <10 | 0 | 0 |
| 2009-12-31 | Colombia | <10 | 0 | 0 |
| 2009-12-31 | Israel | <10 | 0 | 20 |
| 2009-12-31 | Cambodia | <10 | 0 | 0 |
| 2009-12-31 | Lithuania | <10 | 0 | 0 |
| 2009-12-31 | Macedonia [FYROM] | <10 | 0 | 0 |
| 2009-12-31 | Malaysia | <10 | 0 | 0 |
| 2009-12-31 | New Zealand | <10 | 0 | 0 |
| 2009-12-31 | Peru | <10 | 0 | 0 |
| 2009-12-31 | Pakistan | <10 | 0 | 0 |
| 2009-12-31 | Sweden | <10 | 0 | 0 |
| 2009-12-31 | Armenia | <10 | 0 | 0 |
+-------------+-------------------+----------------------+---------------------+--------------------------+
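Since most rows in the table above have a suppressed count, one way to cut the noise is to keep only countries with an actual reported number of requests. This is a small sketch under one assumption: that a suppressed “<10” count is stored as 10, which is what the if(...=10,'<10',...) expressions in the query suggest. A genuine count of exactly 10 would be dropped as well. It uses sqlite3 in place of MySQL, and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE RemovalRequests ("
    "date_ending TEXT, country TEXT, "
    "num_removal_requests INTEGER, percent_removal_complied INTEGER)"
)

# Assumption: a stored value of 10 means Google reported "<10".
SUPPRESSED = 10

# Illustrative rows only; country names and numbers are hypothetical.
rows = [
    ("2011-06-30", "A", 69, 12),          # real count, low compliance
    ("2011-06-30", "B", SUPPRESSED, 0),   # suppressed count, too small to judge
    ("2011-06-30", "C", 40, 80),          # real count, high compliance
]
conn.executemany("INSERT INTO RemovalRequests VALUES (?, ?, ?, ?)", rows)

low_compliance = conn.execute(
    "SELECT country, num_removal_requests, percent_removal_complied "
    "FROM RemovalRequests "
    "WHERE percent_removal_complied <= 25 AND num_removal_requests > ? "
    "ORDER BY country",
    (SUPPRESSED,),
).fetchall()

print(low_compliance)
```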

This information definitely sparks thought. Arguably, although there are plenty of organizations, working groups, etc. influenced by governments, companies still own the internet. What boat we'll be in 10-25 years from now is anyone's guess. It just goes to show that company and government policy is extremely important and cannot be taken lightly.
I would encourage everyone to take a peek at the following links and data for additional context.
Here's the Google blog entry.
Here's the transparency report homepage.
Here's the full breakdown of the United States (as seen in the above screen shot).