Database Contents

All data in the ngramfinder's searchable databases comes from the Google Books 2012 database.

English 5-grams: Contains the top 2 million 5-grams from the English-language 'literature' database

English all: Contains 12 million ngrams.
-the top 1 million 1-grams
-the top 2 million 2-grams
-the top 5 million 3-grams
-the top 2 million 4-grams
-the top 2 million 5-grams

All English data is from the English-language literature database, which consists of both UK and US literature.

French 5-grams: Contains the top 7 million 5-grams from the general French database

French all: Contains 14 million ngrams.
-the top 1 million 1-grams
-the top 2 million 2-grams
-the top 5 million 3-grams
-the top 3 million 4-grams
-the top 3 million 5-grams

German 5-grams: Contains the top 6 million 5-grams from the general German database

German all: Contains 6 million ngrams.
-the top 1 million 1-grams
-the top 1 million 2-grams
-no 3-grams so far
-the top 2 million 4-grams
-the top 2 million 5-grams

Spanish 5-grams: Contains the top 5 million 5-grams from the general Spanish database

Spanish all: Contains 9 million ngrams.
-the top 1 million 1-grams
-the top 2 million 2-grams
-no 3-grams so far
-the top 3 million 4-grams
-the top 3 million 5-grams

Italian 5-grams: Contains the top 5 million 5-grams from the general Italian database

Italian all: Contains 9 million ngrams.
-the top 1 million 1-grams
-the top 2 million 2-grams
-no 3-grams so far
-the top 3 million 4-grams
-the top 3 million 5-grams
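
As a quick consistency check, the stated size of each "all" database can be recomputed from its per-order counts. The sketch below is only illustrative: the names ALL_DATABASES, STATED_TOTALS and total_millions are made up for this example and are not part of the ngramfinder. The figures are copied from the lists above, in millions, with 0 standing for "no 3-grams so far".

```python
# Composition of each "all" database, in millions of n-grams, keyed by n.
# Hypothetical structure for illustration; figures copied from the lists above.
ALL_DATABASES = {
    "English": {1: 1, 2: 2, 3: 5, 4: 2, 5: 2},
    "French": {1: 1, 2: 2, 3: 5, 4: 3, 5: 3},
    "German": {1: 1, 2: 1, 3: 0, 4: 2, 5: 2},   # no 3-grams so far
    "Spanish": {1: 1, 2: 2, 3: 0, 4: 3, 5: 3},  # no 3-grams so far
    "Italian": {1: 1, 2: 2, 3: 0, 4: 3, 5: 3},  # no 3-grams so far
}

# Totals as stated in the "... all" lines, in millions.
STATED_TOTALS = {
    "English": 12,
    "French": 14,
    "German": 6,
    "Spanish": 9,
    "Italian": 9,
}


def total_millions(language):
    """Sum the per-order counts (in millions) for one 'all' database."""
    return sum(ALL_DATABASES[language].values())


if __name__ == "__main__":
    for language, stated in STATED_TOTALS.items():
        computed = total_millions(language)
        status = "ok" if computed == stated else "MISMATCH"
        print(f"{language} all: {computed} million ({status})")
```

With the figures listed above, every computed sum matches its stated total.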