Classifier
Text Analysis currently offers a Naive Bayes Classifier for text classification.
To load the Naive Bayes Classifier, use the following command -
using TextAnalysis: NaiveBayesClassifier, fit!, predict
Basic Usage
Its usage can be done in the following 3 steps.
1- Create an instance of the Naive Bayes Classifier model -
model = NaiveBayesClassifier(dict, classes)
It takes two arguments-
classes
: An array of possible classes that the concerned data could belong to.dict
:(Optional Argument) An Array of possible tokens (words). This is automatically updated if a new token is detected in the Step 2) or 3)
2- Fitting the model weights on input -
fit!(model, str, class)
3- Predicting for the input case -
predict(model, str)
Example
julia> m = NaiveBayesClassifier([:legal, :financial])
NaiveBayesClassifier{Symbol}(String[], Symbol[:legal, :financial], Array{Int64}(0,2))
julia> fit!(m, "this is financial doc", :financial)
NaiveBayesClassifier{Symbol}(["financial", "this", "is", "doc"], Symbol[:legal, :financial], [1 2; 1 2; 1 2; 1 2])
julia> fit!(m, "this is legal doc", :legal)
NaiveBayesClassifier{Symbol}(["financial", "this", "is", "doc", "legal"], Symbol[:legal, :financial], [1 2; 2 2; … ; 2 2; 2 1])
julia> predict(m, "this should be predicted as a legal document")
Dict{Symbol,Float64} with 2 entries:
:legal => 0.666667
:financial => 0.333333