Adjectives are words that modify nouns. A Dutch adjective generally occurs in two forms: an undeclined form and a declined form ending in -e. I found a good description of the rules at Wikibooks.

Adjectives are also modified to form comparatives and superlatives. For example:
goed - beter - best.
Such special cases may be added as rules like:
beter=>goed
best=>goed
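As a rough illustration, rules in this form could be read into a lookup table that is consulted before any suffix-based logic. This is only a sketch: the names RULES_TEXT, parse_rules and lemmatize_irregular are hypothetical and not part of the lemmatizer itself, and the one-rule-per-line layout is assumed from the two examples above.

```python
# Hypothetical rule text: one "form=>lemma" pair per line, as in the examples above.
RULES_TEXT = """
beter=>goed
best=>goed
"""

def parse_rules(text):
    """Turn 'form=>lemma' lines into a dict mapping form -> lemma."""
    rules = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        form, lemma = line.split("=>", 1)
        rules[form.strip()] = lemma.strip()
    return rules

def lemmatize_irregular(word, rules):
    """Return the lemma for an irregular form, or None if no rule matches."""
    return rules.get(word.lower())

rules = parse_rules(RULES_TEXT)
print(lemmatize_irregular("beter", rules))  # goed
print(lemmatize_irregular("best", rules))   # goed
```

A word that has no rule, such as goed itself, returns None, so the caller can fall back to the regular rules.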
Ordinal numbers may also be considered as adjectives, so the lemmatizer should propose the cardinal number of any ordinal number. For example:
drie - derde.
In most cases, the cardinal number is found by deleting the ending -de or -ste from the ordinal number.
For example:
twee - tweede
twintig - twintigste
Since this rule may only be applied to ordinal numbers, the lemmatizer can maintain a list of cardinal numbers to check the result against. This list does not need to be long: it only has to cover the cardinal numbers on which the lemma might end.
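The ordinal rule might be sketched as follows. This is a minimal sketch under my own assumptions, not the actual lemmatizer code: the names CARDINAL_ENDINGS, IRREGULAR_ORDINALS and cardinal_of are hypothetical, the list of endings is illustrative rather than complete, and compound ordinals that end in an irregular form are not handled.

```python
# Illustrative (not exhaustive) list of cardinals on which a lemma might end.
CARDINAL_ENDINGS = (
    "twee", "vier", "vijf", "zes", "zeven", "acht", "negen", "tien",
    "elf", "twaalf", "twintig", "dertig", "veertig", "vijftig",
    "zestig", "zeventig", "tachtig", "negentig", "honderd", "duizend",
)

# Irregular ordinals handled as explicit rules (assumed set, like beter=>goed).
IRREGULAR_ORDINALS = {"eerste": "een", "derde": "drie"}

def cardinal_of(word):
    """Propose the cardinal for an ordinal, or None if the word is not one."""
    word = word.lower()
    if word in IRREGULAR_ORDINALS:
        return IRREGULAR_ORDINALS[word]
    for suffix in ("ste", "de"):
        if word.endswith(suffix):
            stem = word[: -len(suffix)]
            # Accept the stem only if it ends in a known cardinal number.
            if stem.endswith(CARDINAL_ENDINGS):
                return stem
    return None

print(cardinal_of("tweede"))      # twee
print(cardinal_of("twintigste"))  # twintig
print(cardinal_of("derde"))       # drie
print(cardinal_of("beste"))       # None
```

Checking the stem against the list is what keeps a word like beste from being treated as an ordinal; because only the end of the stem is checked, a compound such as eenentwintigste is covered by the single entry twintig.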
Next, I will add support for Dutch adverbs and nouns in the lemmatizer.