N-GRAMS

Updated 196 days ago
  • ID: 7925827/126
The n-grams data shows the frequency of the most frequent 2-, 3-, 4-, and 5-word strings from the 14-billion-word iWeb corpus. If you choose the "wordID" format (right, below), you will have the top 100 million 2-grams (two-word sequences), 100 million 3-grams, 100 million 4-grams, and 100 million 5-grams from the corpus. That's a total of 400 million rows of data. The other format is the "word format" (left, below), which gives you 50 million rows of data for each of the 2-grams, 3-grams, 4-grams, and 5-grams. If you purchase the data, you have access to both formats, so you can use whichever best meets your needs... 155 million n-grams. Only lists based on a large, recent, balanced corpus of English
Interest Score
1
HIT Score
0.00
Domain
ngrams.info

Actual
www.ngrams.info

IP
209.90.108.238

Status
OK

Category
Company, Other