Part-of-Speech Tagging

POS tagging assigns a grammatical category (noun, verb, adjective, etc.) to each word in a sentence. It is a fundamental NLP task whose output provides syntactic information for parsing, information extraction, and sentiment analysis.

Common POS Tag Sets

Penn Treebank Tags

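| Tag | Description | Example |
|-----|-------------|---------|
| NN | Noun, singular or mass | dog, city |
| NNS | Noun, plural | dogs, cities |
| NNP | Proper noun, singular | John, London |
| VB | Verb, base form | run, eat |
| VBD | Verb, past tense | ran, ate |
| VBG | Verb, gerund or present participle | running, eating |
| JJ | Adjective | big, red |
| RB | Adverb | quickly, very |
| DT | Determiner | the, a |
| IN | Preposition or subordinating conjunction | in, on, at |
| PRP | Personal pronoun | I, he, she |
| CC | Coordinating conjunction | and, but, or |
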
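import spacy

# Requires the model: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")
doc = nlp("The quick brown fox jumps over the lazy dog")

# pos_ is the coarse Universal POS tag; tag_ is the fine-grained Penn Treebank tag
for token in doc:
    print(f"{token.text:12} {token.pos_:8} {token.tag_:5} {spacy.explain(token.tag_)}")
# Typical output (exact tags and glosses can vary with the model version):
# The          DET      DT    determiner
# quick        ADJ      JJ    adjective (English), other noun-modifier (Chinese)
# brown        ADJ      JJ    adjective (English), other noun-modifier (Chinese)
# fox          NOUN     NN    noun, singular or mass
# jumps        VERB     VBZ   verb, 3rd person singular present
# over         ADP      IN    conjunction, subordinating or preposition
# the          DET      DT    determiner
# lazy         ADJ      JJ    adjective (English), other noun-modifier (Chinese)
# dog          NOUN     NN    noun, singular or mass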

Hidden Markov Model (HMM) Tagger

HMM POS Tagging

$$\hat{t} = \arg\max_t P(t \mid w) = \arg\max_t P(w \mid t) \times P(t)$$
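
The decision rule above can be unpacked one step further. The derivation below is a sketch under the standard bigram HMM assumptions (each word depends only on its own tag, and each tag depends only on the previous tag). The indexed symbols $w_i$, $t_i$, the sentence length $n$, and the start marker $t_0$ are notation introduced here for illustration; they do not appear in the original formula.

```latex
% Bayes' rule; P(w) is constant across candidate tag sequences, so it drops out
\hat{t} = \arg\max_t P(t \mid w)
        = \arg\max_t \frac{P(w \mid t) \times P(t)}{P(w)}
        = \arg\max_t P(w \mid t) \times P(t)

% Independence assumptions of a bigram HMM
P(w \mid t) \approx \prod_{i=1}^{n} P(w_i \mid t_i)
\qquad
P(t) \approx \prod_{i=1}^{n} P(t_i \mid t_{i-1})

% Resulting objective: emission times transition at every position
\hat{t} = \arg\max_{t_1, \dots, t_n} \prod_{i=1}^{n} P(w_i \mid t_i) \times P(t_i \mid t_{i-1})
```

Here $P(w_i \mid t_i)$ is the emission probability and $P(t_i \mid t_{i-1})$ is the transition probability, with $t_0$ standing for the start of the sentence.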
# HMM transition example
# P(VB | DT) = C(DT, VB) / C(DT)
# P(NN | DT) = C(DT, NN) / C(DT)

# Emission probability
# P("dog" | NN) = C("dog" as NN) / C(NN)
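
The count-based formulas above can be sketched in a few lines of Python. This is a minimal illustration, not a full trainer: the three-sentence corpus and the helper names `transition_prob` and `emission_prob` are made up for the example, and no smoothing is applied, so unseen events get probability 0.

```python
from collections import Counter

# Toy tagged corpus (hypothetical data, for illustration only)
tagged_sents = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
    [("a", "DT"), ("dog", "NN"), ("sleeps", "VBZ")],
]

tag_counts = Counter()       # C(t)
bigram_counts = Counter()    # C(t_prev, t)
emission_counts = Counter()  # C(word tagged as t)

for sent in tagged_sents:
    tags = [tag for _, tag in sent]
    for word, tag in sent:
        tag_counts[tag] += 1
        emission_counts[(word, tag)] += 1
    # Count adjacent tag pairs within each sentence
    for prev, curr in zip(tags, tags[1:]):
        bigram_counts[(prev, curr)] += 1

def transition_prob(prev, curr):
    """P(curr | prev) = C(prev, curr) / C(prev), unsmoothed."""
    return bigram_counts[(prev, curr)] / tag_counts[prev]

def emission_prob(word, tag):
    """P(word | tag) = C(word as tag) / C(tag), unsmoothed."""
    return emission_counts[(word, tag)] / tag_counts[tag]

print(transition_prob("DT", "NN"))           # 1.0
print(f"{emission_prob('dog', 'NN'):.3f}")   # 0.667
```

In practice the counts would come from a large annotated corpus, and some form of smoothing is usually added so that unseen words and tag pairs do not zero out an entire sequence.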

Viterbi Algorithm

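The Viterbi algorithm uses dynamic programming to find the most likely tag sequence. For a sentence of n words and a tag set of size T, it runs in O(n × T²) time instead of scoring all T^n possible sequences.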

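def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (probability, tag sequence) for the most likely path.

    start_p[s], trans_p[prev][s], and emit_p[s][word] hold probabilities;
    missing transitions and emissions are treated as 0.
    """
    V = [{}]   # V[t][state]: probability of the best path ending in state at step t
    path = {}  # path[state]: best tag sequence found so far that ends in state

    # Initialize with the start and emission probabilities of the first word
    for state in states:
        V[0][state] = start_p[state] * emit_p[state].get(observations[0], 0)
        path[state] = [state]

    # Run Viterbi: extend the best path into each state, one word at a time
    for t in range(1, len(observations)):
        V.append({})
        newpath = {}
        for state in states:
            # Pick the predecessor that maximizes the path probability
            prob, prev_state = max(
                (V[t-1][prev] * trans_p[prev].get(state, 0) *
                 emit_p[state].get(observations[t], 0), prev)
                for prev in states
            )
            V[t][state] = prob
            newpath[state] = path[prev_state] + [state]
        path = newpath

    # Find best final state
    prob, state = max((V[len(observations)-1][s], s) for s in states)
    return prob, path[state]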
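
The function above multiplies raw probabilities, which can underflow to 0.0 on long sentences. One common workaround is to add log probabilities instead. The sketch below is a variant written for illustration, not the lesson's own implementation: the name `viterbi_log`, the backpointer table, and the toy model are all assumptions made for the example.

```python
import math

def viterbi_log(observations, states, start_p, trans_p, emit_p):
    """Most likely state sequence, computed with log probabilities."""
    def log(p):
        # log(0) is treated as negative infinity
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log probability of the best path ending in state s at step t
    best = [{s: log(start_p.get(s, 0)) + log(emit_p[s].get(observations[0], 0))
             for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev = max(states,
                       key=lambda p: best[t - 1][p] + log(trans_p[p].get(s, 0)))
            best[t][s] = (best[t - 1][prev] + log(trans_p[prev].get(s, 0))
                          + log(emit_p[s].get(observations[t], 0)))
            back[t][s] = prev

    # Follow backpointers from the best final state
    last = max(states, key=lambda s: best[-1][s])
    tags = [last]
    for t in range(len(observations) - 1, 0, -1):
        tags.append(back[t][tags[-1]])
    tags.reverse()
    return best[-1][last], tags

# Toy model (hypothetical probabilities, for illustration only)
states = ["DT", "NN", "VB"]
start_p = {"DT": 0.8, "NN": 0.1, "VB": 0.1}
trans_p = {
    "DT": {"NN": 0.9, "VB": 0.1},
    "NN": {"VB": 0.8, "NN": 0.2},
    "VB": {"DT": 0.6, "NN": 0.4},
}
emit_p = {
    "DT": {"the": 1.0},
    "NN": {"dog": 0.6, "barks": 0.4},
    "VB": {"barks": 0.7, "dog": 0.3},
}

log_prob, tags = viterbi_log(["the", "dog", "barks"], states, start_p, trans_p, emit_p)
print(tags)  # ['DT', 'NN', 'VB']
```

Storing backpointers rather than copying whole paths at every step also keeps memory use proportional to the number of words times the number of states.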

Rule-Based Tagging

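import re

# Simple rule-based tagger: the first matching pattern wins, so order matters
rules = [
    (r'(?i)\b(is|are|was|were|be|been)\b', 'VB'),  # forms of "be" (coarse: all mapped to VB)
    (r'(?i)\b(the|a|an)\b', 'DT'),                 # articles, case-insensitive
    (r'\b\w+ing\b', 'VBG'),    # -ing ending
    (r'\b\w+ed\b', 'VBD'),     # -ed ending
    (r'\b\w+ly\b', 'RB'),      # -ly ending
    (r'\b\w+ous\b', 'JJ'),     # -ous ending
    (r'\b[A-Z][a-z]+\b', 'NNP'), # Capitalized = proper noun
]

def rule_based_tag(tokens, default='NN'):
    """Tag each token with the first rule that matches it entirely, else default."""
    return [
        (tok, next((tag for pat, tag in rules if re.fullmatch(pat, tok)), default))
        for tok in tokens
    ]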

Training a POS Tagger

import nltk
from nltk.tag import UnigramTagger, BigramTagger
from nltk.corpus import treebank

nltk.download('treebank')

train_data = treebank.tagged_sents()[:3000]
test_data = treebank.tagged_sents()[3000:]

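# Unigram tagger: assigns each word its most frequent tag in the training data
# (accuracy() is available in recent NLTK releases; older ones use evaluate())
unigram_tagger = UnigramTagger(train_data)
print(f"Unigram accuracy: {unigram_tagger.accuracy(test_data):.3f}")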

# Bigram tagger
bigram_tagger = BigramTagger(train_data, backoff=unigram_tagger)
print(f"Bigram accuracy: {bigram_tagger.accuracy(test_data):.3f}")
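
To make the backoff idea concrete, here is a pure-Python sketch of what a unigram tagger and a bigram tagger with backoff do conceptually. It loosely mirrors the NLTK setup above but is not NLTK's implementation: the function names, the tiny training set, and the `default` fallback tag are assumptions made for the example.

```python
from collections import Counter, defaultdict

def train_unigram(tagged_sents):
    """Map each word to its most frequent tag in the training data."""
    counts = defaultdict(Counter)
    for sent in tagged_sents:
        for word, tag in sent:
            counts[word][tag] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}

def train_bigram(tagged_sents):
    """Map (previous tag, word) to the most frequent tag in that context."""
    counts = defaultdict(Counter)
    for sent in tagged_sents:
        prev = None  # None marks the start of a sentence
        for word, tag in sent:
            counts[(prev, word)][tag] += 1
            prev = tag
    return {ctx: c.most_common(1)[0][0] for ctx, c in counts.items()}

def tag(words, bigram_model, unigram_model, default="NN"):
    """Try the bigram context first, then the unigram model, then the default."""
    tagged, prev = [], None
    for word in words:
        t = bigram_model.get((prev, word)) or unigram_model.get(word, default)
        tagged.append((word, t))
        prev = t
    return tagged

# Toy training data (hypothetical, for illustration only)
train = [
    [("they", "PRP"), ("run", "VBP")],
    [("they", "PRP"), ("run", "VBP")],
    [("the", "DT"), ("run", "NN")],
]
uni = train_unigram(train)
bi = train_bigram(train)

print(uni["run"])                    # VBP
print(tag(["the", "run"], bi, uni))  # [('the', 'DT'), ('run', 'NN')]
print(tag(["a", "run"], bi, uni))    # [('a', 'NN'), ('run', 'VBP')]
```

The unigram model alone always tags "run" as VBP, while the bigram context (DT, "run") recovers the noun reading. When the context is unseen, as after the unknown word "a", the tagger falls back to the unigram choice.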

POS Tagging Applications

| Application | How POS Helps |
|-------------|---------------|
| Lemmatization | POS-based lemmatization improves accuracy |
| Named Entity Recognition | Nouns are likely entity candidates |
| Sentiment Analysis | Adjectives often indicate sentiment |
| Information Extraction | Verbs indicate actions/relations |
| Machine Translation | Word order varies by POS across languages |
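
As a small illustration of the lemmatization row, the sketch below shows why the POS tag matters: the same surface form can map to different lemmas. The lookup table `LEMMAS` and the function `lemmatize` are hypothetical and cover only a few hand-picked words; a real lemmatizer relies on a full lexicon and morphological rules.

```python
# Hypothetical lookup keyed by (word form, coarse POS), for illustration only
LEMMAS = {
    ("saw", "VERB"): "see",
    ("saw", "NOUN"): "saw",
    ("leaves", "VERB"): "leave",
    ("leaves", "NOUN"): "leaf",
}

def lemmatize(word, pos):
    """Return the lemma for (word, pos), falling back to the lowercased word."""
    return LEMMAS.get((word.lower(), pos), word.lower())

print(lemmatize("saw", "VERB"))     # see
print(lemmatize("saw", "NOUN"))     # saw
print(lemmatize("leaves", "NOUN"))  # leaf
```

Without the tag, a lemmatizer has to guess between "see" and "saw", which is where the accuracy gain from POS information comes from.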
⭐
