The Development of NLP Models

Authors

  • Keming Zhang

DOI:

https://doi.org/10.54691/enn7qh67

Keywords:

Natural Language Processing (NLP); Word Embeddings; Recurrent Neural Networks (RNN); Long Short-Term Memory (LSTM); Transformer; BERT; Large Language Models (LLMs); Fine-Tuning.

Abstract

This article reviews the development of Natural Language Processing (NLP) models, from early statistical approaches to modern large language models (LLMs). Beginning with probability-based n-gram models, it outlines their limitations with respect to data sparsity and long-range dependencies. It then introduces neural network-based models, including word embeddings, logistic regression, multi-layer perceptrons, and Word2Vec, followed by sequential models such as RNNs and LSTMs that capture temporal dependencies. The shift to pre-trained models, marked by Word2Vec and the Transformer architecture, enabled scalable transfer learning and laid the foundation for state-of-the-art models such as BERT, GPT, and their derivatives. Applications in text classification are illustrated through experiments with RoBERTa, hybrid BERT-LightGBM models, and fine-tuning techniques such as LoRA. Finally, the article discusses the broader ecosystem of large models, including chat models, multimodal models, and agent frameworks that integrate planning, memory, and tool use. The review emphasizes both theoretical principles and practical workflows, highlighting the necessity of iterative learning, coding practice, and ecosystem familiarity for effectively leveraging NLP technologies.
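The probability-based n-gram approach the abstract opens with, and its data-sparsity limitation, can be sketched as follows. This is a minimal illustrative example, not taken from the article itself; the toy corpus and function names are the editor's own.

```python
from collections import defaultdict

def train_bigram_model(corpus):
    """Estimate bigram probabilities P(w_i | w_{i-1}) by maximum likelihood counts."""
    unigram_counts = defaultdict(int)
    bigram_counts = defaultdict(int)
    for sentence in corpus:
        # Pad each sentence with start/end markers so boundaries are modeled too.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            unigram_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1

    def prob(prev, cur):
        # MLE estimate: count(prev, cur) / count(prev); zero if prev is unseen.
        if unigram_counts[prev] == 0:
            return 0.0
        return bigram_counts[(prev, cur)] / unigram_counts[prev]

    return prob

corpus = ["the cat sat", "the dog sat", "the cat ran"]
p = train_bigram_model(corpus)
print(p("the", "cat"))   # 2/3: "cat" follows "the" in two of three sentences
print(p("the", "bird"))  # 0.0: unseen bigram, the data-sparsity problem the review notes
```

Any word pair absent from the training data receives probability zero, which is why practical n-gram systems require smoothing and why the review treats sparsity as a core limitation of the approach.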



Published

22-09-2025

Section

Articles