For the benefit of the wider web community, this page gives details of inappropriate access by robots, scripts, and other forms of suspicious access detected at the following websites: www.lemoda.net, www.sljfaq.org, and kanji.sljfaq.org.
|Robot name||Site accessed||Log file||Form of bad access|
|Linguee Bot (http://www.linguee.com/bot; email@example.com)||www.lemoda.net||linguee-bot-access.txt||
Ignored or did not read robots.txt
This robot identifies itself as an Internet Explorer browser in its user agent field and ignores (does not attempt to read) robots.txt. It requests pages without any delay between requests and does not request compressed content, resulting in larger-than-necessary use of bandwidth.
|Multiple Korean-language Android phones||kanji.sljfaq.org||kanji-logs-memory-cgi-ko-kr-unique-ip.txt||Bandwidth drain||
A series of IP addresses, all with user agent strings of the form
Mozilla/5.0 (Linux; U; Android 2.2; ko-kr; SHW-M180L Build/FROYO) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1
and apparently belonging to Korean-language Android mobile phones, attempted to download PNG images from kanji.sljfaq.org/kanjivg/. A total of 20738 accesses from 4814 unique IP addresses were made between 5th January and 14th February 2011. The requests continued even when an empty response or a redirect was sent, suggesting that this was an attempt to hog bandwidth rather than to obtain a collection of images.
The attached extract from the Apache log file contains one line for each unique IP address, but omits repeated addresses.
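An extract of this kind, with one line per unique IP address, could be produced along the following lines. This is only a minimal sketch, not the script actually used for the attached file: it assumes the usual Apache log layout in which the client address is the first whitespace-separated field, and the function name `summarize_accesses` and the `ko-kr` marker argument are illustrative choices.

```python
from collections import Counter


def summarize_accesses(log_lines, agent_marker="ko-kr"):
    """Keep the first log line seen for each IP and count hits per IP.

    Only lines containing agent_marker (here, part of the user agent
    string) are considered. The IP address is assumed to be the first
    whitespace-separated field of each line.
    """
    first_line_per_ip = {}
    hits_per_ip = Counter()
    for line in log_lines:
        if agent_marker not in line:
            continue
        parts = line.split(None, 1)
        if not parts:
            continue  # skip blank lines
        ip = parts[0]
        hits_per_ip[ip] += 1
        # setdefault keeps the first line and ignores later repeats
        first_line_per_ip.setdefault(ip, line.rstrip("\n"))
    return first_line_per_ip, hits_per_ip
```

Calling `summarize_accesses(open("access.log"))` (the file name is a placeholder) returns the lines to write to the extract together with a per-address count, from which the totals of accesses and unique addresses follow as `sum(hits.values())` and `len(hits)`.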
|18.104.22.168||www.lemoda.net||22.214.171.124.txt||Search for vulnerabilities||
This robot looks for badly-programmed PHP files (there are none on this site).
|Trend Micro (126.96.36.199, 188.8.131.52, 184.108.40.206, 220.127.116.11, 18.104.22.168, 22.214.171.124)||www.lemoda.net||trend-micro-attacks.txt||Bandwidth drain||
A robot misidentifying itself as Internet Explorer version 6.0,
from multiple IP addresses belonging to Trend Micro, repeatedly
downloads the same pages. It ignores
|MEDIA RAG CORPORATION (126.96.36.199)||www.sljfaq.org||188.8.131.52.txt||Bandwidth drain||
A site sent a series of repeated automated requests to the English to katakana converter.
The "Yeti" robot, with the user agent string
Yeti/1.0 (NHN Corp.; http://help.naver.com/robots/)
is a badly-programmed robot. It fires off multiple requests for pages at an alarming rate. Unlike almost every other web robot, the incompetently programmed Yeti robot does not use any gzip compression, resulting in a huge waste of bandwidth as it downloads pages in uncompressed format. The user agent string contains a URL for a help page, but that page is only provided in Korean, even though the crawler downloads pages in English.
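For contrast, the behaviour expected of a robot (reading robots.txt, pausing between requests, and asking for gzip-compressed content) can be sketched with the Python standard library alone. This is an illustrative outline under stated assumptions, not a description of how any of the robots above work: the user agent `ExampleBot` and its URL are hypothetical, and the one-second default delay is an arbitrary choice.

```python
import gzip
import time
import urllib.request
import urllib.robotparser

# Hypothetical robot name and contact URL, for illustration only.
USER_AGENT = "ExampleBot/0.1 (+http://example.com/bot)"


def load_rules(robots_txt):
    """Parse the text of a robots.txt file into a rule set."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def crawl_delay(parser, default=1.0):
    """Use the site's Crawl-delay if given, else an assumed default."""
    delay = parser.crawl_delay(USER_AGENT)
    return float(delay) if delay is not None else default


def build_request(url):
    """Identify the robot honestly and ask for compressed content."""
    headers = {"User-Agent": USER_AGENT, "Accept-Encoding": "gzip"}
    return urllib.request.Request(url, headers=headers)


def decode_body(body, content_encoding):
    """Decompress the body if the server sent it gzip-compressed."""
    if content_encoding and content_encoding.lower() == "gzip":
        return gzip.decompress(body)
    return body


def fetch_all(parser, urls):
    """Fetch only permitted URLs, pausing between requests."""
    delay = crawl_delay(parser)
    pages = {}
    for url in urls:
        if not parser.can_fetch(USER_AGENT, url):
            continue  # respect Disallow rules
        with urllib.request.urlopen(build_request(url)) as response:
            encoding = response.headers.get("Content-Encoding")
            pages[url] = decode_body(response.read(), encoding)
        time.sleep(delay)
    return pages
```

The rule parsing is kept separate from the fetching so that the robots.txt handling can be checked without touching the network. A real robot would also fetch robots.txt itself first, for instance with `RobotFileParser.set_url()` and `read()`.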