Difference Between IOB Accuracy And Precision
I'm doing some work with NLTK on named entity recognition and chunkers. I retrained a classifier using nltk/chunk/named_entity.py and got the following measures: ChunkPa
Solution 1:
There is a very detailed explanation of the difference between precision and accuracy on Wikipedia (see https://en.wikipedia.org/wiki/Accuracy_and_precision); in brief:
accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
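To make the two formulas concrete, here is a minimal sketch with made-up confusion-matrix counts (the numbers are illustrative, not from NLTK):

```python
# Hypothetical confusion-matrix counts for a binary classifier.
tp, tn, fp, fn = 8, 5, 2, 1

# Accuracy looks at everything the classifier got right, over all cases.
accuracy = (tp + tn) / (tp + tn + fp + fn)  # 13 / 16

# Precision only looks at the positive guesses: how many of them were right?
precision = tp / (tp + fp)  # 8 / 10

print(accuracy)   # 0.8125
print(precision)  # 0.8
```

Note that precision ignores `tn` and `fn` entirely, which is why the two numbers can diverge sharply when the classes are imbalanced.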
Back to NLTK: there is a class called ChunkScore that computes the accuracy, precision and recall of your system. And here's the funny part: NLTK calculates the tp, fp, tn, fn for accuracy and for precision at different granularities.
For accuracy, NLTK counts the total number of tokens (NOT CHUNKS!!) whose POS tags and IOB tags are guessed correctly, then divides by the total number of tokens in the gold sentence:
accuracy = num_tokens_correct / total_num_tokens_from_gold
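The token-level comparison can be sketched like this, using made-up (token, tag) pairs; this mimics the bookkeeping described above, not NLTK's actual internals:

```python
# Gold and guessed (token, IOB tag) pairs for a made-up sentence fragment.
gold    = [('The', 'B-NP'), ('cat', 'I-NP'), ('sat', 'O'), ('on', 'O')]
guessed = [('The', 'B-NP'), ('cat', 'I-NP'), ('sat', 'B-NP'), ('on', 'O')]

# A token counts as correct only if the whole (token, tag) pair matches.
num_correct = sum(1 for g, t in zip(gold, guessed) if g == t)
accuracy = num_correct / len(gold)
print(accuracy)  # 3 correct tokens out of 4 -> 0.75
```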
For precision and recall, NLTK counts:

- True Positives: the number of chunks (NOT TOKENS!!!) that are guessed correctly.
- False Positives: the number of chunks (NOT TOKENS!!!) that are guessed but are wrong.
- False Negatives: the number of chunks (NOT TOKENS!!!) in the gold standard that the system fails to guess.
And then calculates the precision and recall as such:
precision = tp / (tp + fp)
recall = tp / (tp + fn)
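The chunk-level bookkeeping can be sketched with plain Python sets, where each chunk is a (start, end, label) span (the spans here are made up for illustration):

```python
# Chunks represented as (start, end, label) spans; values are illustrative.
gold_chunks    = {(0, 2, 'NP'), (4, 6, 'NP'), (6, 9, 'NP')}
guessed_chunks = {(0, 2, 'NP'), (4, 6, 'NP'), (7, 9, 'NP')}

tp = len(gold_chunks & guessed_chunks)   # chunks guessed exactly right
fp = len(guessed_chunks - gold_chunks)   # guessed, but not in the gold
fn = len(gold_chunks - guessed_chunks)   # in the gold, but missed

precision = tp / (tp + fp)  # 2 / 3
recall    = tp / (tp + fn)  # 2 / 3
```

A chunk only counts as a true positive if its span and label match exactly, which is why a single misplaced boundary produces both a false positive and a false negative.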
To prove the above points, try this script:
from nltk.chunk import *
from nltk.chunk.util import *
from nltk.chunk.regexp import *
from nltk import Tree
from nltk.tag import pos_tag
# Let's say we give it a rule that says anything with a [DT NN] is an NP
chunk_rule = ChunkRule("<DT>?<NN.*>", "DT+NN* or NN* chunk")
chunk_parser = RegexpChunkParser([chunk_rule], chunk_node='NP')
# Let's say our test sentence is:
# "The cat sat on the mat the big dog chewed."
gold = tagstr2tree("[ The/DT cat/NN ] sat/VBD on/IN [ the/DT mat/NN ] [ the/DT big/JJ dog/NN ] chewed/VBD ./.")
# We POS tag the sentence and then chunk with our rule-based chunker.
test = pos_tag('The cat sat on the mat the big dog chewed .'.split())
chunked = chunk_parser.parse(test)
# Then we calculate the score.
chunkscore = ChunkScore()
chunkscore.score(gold, chunked)
chunkscore._updateMeasures()
# Our rule-based chunker says these are chunks.
chunkscore.guessed()
# Total number of tokens from test sentence, i.e.
# The/DT , cat/NN , sat/VBD , on/IN , the/DT , mat/NN ,
# the/DT , big/JJ , dog/NN , chewed/VBD , ./.
total = chunkscore._tags_total
# Number of tokens that are guessed correctly, i.e.
# The/DT , cat/NN , on/IN , the/DT , mat/NN , chewed/VBD , ./.
correct = chunkscore._tags_correct
print "Is correct/total == accuracy ?", chunkscore.accuracy() == (correct / total)
print correct, '/', total, '=', chunkscore.accuracy()
print "##############"

print "Correct chunk(s):"  # i.e. True Positives.
correct_chunks = set(chunkscore.correct()).intersection(set(chunkscore.guessed()))
##print correct_chunks
print "Number of correct chunks = tp =", len(correct_chunks)
assert len(correct_chunks) == chunkscore._tp_num
print

print "Missed chunk(s):"  # i.e. False Negatives.
##print chunkscore.missed()
print "Number of missed chunks = fn =", len(chunkscore.missed())
assert len(chunkscore.missed()) == chunkscore._fn_num
print

print "Wrongly guessed chunk(s):"  # i.e. False Positives.
wrong_chunks = set(chunkscore.guessed()).difference(set(chunkscore.correct()))
##print wrong_chunks
print "Number of wrong chunks = fp =", len(wrong_chunks)
print chunkscore._fp_num
assert len(wrong_chunks) == chunkscore._fp_num
print

print "Recall =", "tp/(tp+fn) =", len(correct_chunks), '/', len(correct_chunks) + len(chunkscore.missed()), '=', chunkscore.recall()
print "Precision =", "tp/(tp+fp) =", len(correct_chunks), '/', len(correct_chunks) + len(wrong_chunks), '=', chunkscore.precision()