Data Sources

JMdict, JMnedict, and KANJIDIC2

Japanese word, name, and kanji entries come from the JMdict, JMnedict, and KANJIDIC2 dictionary files. These files are the property of the Electronic Dictionary Research and Development Group, and are used in conformance with the Group's licence.

CC-CEDICT

Chinese word entries come from the CC-CEDICT dictionary file. The file is published under the Creative Commons Attribution-ShareAlike 3.0 (CC BY-SA 3.0) license.

Unihan

Chinese character entries come from the Unihan database. The database is published under the Unicode Data Files and Software License.

Tatoeba

This site uses the collection of sentences provided by Tatoeba. The data is available for download from the Tatoeba website.

The sentences are published under a Creative Commons Attribution 2.0 France (CC BY 2.0 FR) license. Some of the sentences provided by Tatoeba are also available under a CC0 1.0 Universal license.

JLPT levels

This site uses the JLPT data provided by Jonathan Waller on his JLPT resources site. The data is provided under a Creative Commons Attribution (CC BY) license; the terms are described on his page about data sharing.

KanjiVG

The visualizations of the stroke order on the individual kanji pages are created using the data from the KanjiVG project. KanjiVG is released under the Creative Commons Attribution-Share Alike 3.0 license.

Kanji radicals

The information about kanji radicals is taken from the Wikipedia page "List of kanji radicals by stroke count".

Wiktionary

Some of the dictionary entries contain Wiktionary data as well.

HSK

The HSK 3.0 data comes from the official PDF file, which was OCRed and published as a plain text file.