: Based on a balanced corpus of written and spoken Italian, with an emphasis on statistically representative coverage across genres of literature and speech. Pros:
If the list contains only surface forms and you need lemmas, run a lemmatizer (e.g., spaCy's Italian model or Stanza) over the forms and sum the counts for each lemma.
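The aggregation step can be sketched as follows. This is a minimal illustration, not a reference implementation: `aggregate_by_lemma`, `toy_lemmatize`, the `TOY_LEMMAS` lookup table, and the counts are all hypothetical, and the toy lookup merely stands in for whatever real lemmatizer you choose.

```python
from collections import Counter
from typing import Callable, Dict, Mapping


def aggregate_by_lemma(
    form_counts: Mapping[str, int],
    lemmatize: Callable[[str], str],
) -> Counter:
    """Sum surface-form counts under the lemma returned by `lemmatize`."""
    lemma_counts: Counter = Counter()
    for form, count in form_counts.items():
        # Lowercase first so "Case" and "case" land on the same lemma.
        lemma_counts[lemmatize(form.lower())] += count
    return lemma_counts


# Hypothetical toy lookup standing in for a real lemmatizer.
TOY_LEMMAS: Dict[str, str] = {
    "sono": "essere",
    "è": "essere",
    "era": "essere",
    "case": "casa",
}


def toy_lemmatize(form: str) -> str:
    """Return the lemma from the toy table, or the form itself if unknown."""
    return TOY_LEMMAS.get(form, form)


# Made-up counts, for illustration only.
form_counts = {"sono": 120, "è": 300, "era": 80, "casa": 50, "case": 20}

lemma_counts = aggregate_by_lemma(form_counts, toy_lemmatize)
for lemma, count in lemma_counts.most_common():
    print(lemma, count)
```

To use a real lemmatizer, pass any function that maps a form to its lemma in place of `toy_lemmatize`. Note that lemmatizing isolated forms discards context, so ambiguous forms (for example "era", which can be a verb form or a noun) may be assigned to the wrong lemma; where the underlying corpus is available, lemmatizing the running text before counting is likely to be more accurate.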