Series: Natural Language Processing for All
Prepare your text data for advanced analysis with our primer on text pre-processing for Natural Language Processing. Text pre-processing is a crucial step in any NLP pipeline, ensuring that your data is clean, normalized, and ready for modeling. This workshop will introduce pre-processing techniques for text data from sources such as web scraping and online datasets. We will also look at the tools available for categorizing, organizing, and tagging text.
With a practical demonstration, we will explore handling various text formats, dealing with noise, and transforming text into a format suitable for machine learning algorithms. Whether you are interested in an NLP task or just making sense of a data dump, join us for this session to pick up the tools and knowledge you need to prepare your text data effectively!
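As a rough illustration of the kind of steps described above (removing noise, normalizing, and turning text into something an algorithm can count), here is a minimal sketch that uses only the Python standard library. It is not the workshop's notebook. The function names `clean_text`, `tokenize`, and `bag_of_words`, the regular expressions, and the sample string are all assumptions made for this example.

```python
import html
import re
import unicodedata
from collections import Counter

# Hypothetical patterns chosen for this sketch; real data usually needs tuning.
TAG_RE = re.compile(r"<[^>]+>")               # HTML/XML tags left over from scraping
URL_RE = re.compile(r"https?://\S+")          # bare web links
TOKEN_RE = re.compile(r"[a-z0-9]+(?:'[a-z]+)?")  # lowercase words, digits, simple contractions


def clean_text(raw: str) -> str:
    """Remove common scraping noise and normalize case, Unicode form, and whitespace."""
    text = TAG_RE.sub(" ", raw)                 # drop markup tags
    text = html.unescape(text)                  # turn entities such as &amp; back into characters
    text = URL_RE.sub(" ", text)                # drop links
    text = unicodedata.normalize("NFKC", text)  # unify look-alike Unicode characters
    text = text.lower()
    return re.sub(r"\s+", " ", text).strip()    # collapse runs of whitespace


def tokenize(text: str) -> list[str]:
    """Split already-cleaned text into simple word tokens."""
    return TOKEN_RE.findall(text)


def bag_of_words(raw: str) -> Counter:
    """Count tokens: one simple numeric representation a model can consume."""
    return Counter(tokenize(clean_text(raw)))


if __name__ == "__main__":
    sample = "<p>Text &amp; data: visit https://example.com NOW!</p>"
    print(clean_text(sample))    # text & data: visit now!
    print(bag_of_words(sample))
```

Libraries such as spaCy or scikit-learn offer more robust tokenizers and vectorizers. The sketch only shows the overall shape of a clean, tokenize, count pipeline.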
Join us for an engaging and accessible introduction to Natural Language Processing (NLP) and its practical applications for everyday tasks! In "NLP for All," we will explore the fundamental concepts behind NLP, from understanding how computers interpret human language to improving search queries, using regular expressions, finding datasets, and building pipelines for working with language. Whether you're curious about chatbots, voice assistants, or automated text transcription and analysis, this series will demystify popular technologies and show you how they work.
What We Will Cover:
- Foundations of NLP: Gain a solid grasp of NLP concepts and terminology without needing a technical background.
- Real-World Applications: Explore practical uses of NLP in various contexts, such as improving search and information retrieval, generating and evaluating automatic transcriptions, and working with popular libraries such as spaCy, PyTorch and scikit-learn.
- Hands-On Experience: We will illustrate NLP concepts in action with a well-documented code notebook that works through practical examples. We will also explore online sources for NLP tools and datasets, such as Hugging Face.
Pre-requisites:
- A Google account to run Google Colab (where we will do most of our programming exercises)
- Basic knowledge of Python. You can brush up on Python fundamentals with Software Carpentry's Introduction to Python (section 1).
SERIES: Natural Language Processing for All
When: Thursdays, 3-4 pm, Sept. 5 - Oct. 24, 2024
Where: Weaver Science-Engineering Library, Rm 212 and on Zoom
Instructors: Megh Krishnaswamy
YouTube: Video links will be posted here.
09/12 Regular Expressions for NLP - YouTube
09/19 NLP with Transformers - YouTube
10/10 Text pre-processing for NLP - YouTube