oliverguhr/german-sentiment-bert


Indexed: 2025-05-30

German Sentiment Classification with Bert

This model was trained for sentiment classification of German-language texts. To achieve the best results, all model inputs need to be preprocessed with the same procedure that was applied during training. To simplify usage of the model, we provide a Python package that bundles the code needed for preprocessing and inference.
The model uses Google's BERT architecture and was trained on 1.834 million German-language samples. The training data contains texts from various domains, such as Twitter, Facebook, and movie, app, and hotel reviews.
You can find more information about the dataset and the training process in the paper.

Using the Python package

To get started, install the package from PyPI:
pip install germansentiment

from germansentiment import SentimentModel

# Load the pretrained model; the package bundles preprocessing and inference.
model = SentimentModel()

texts = [
    "Mit keinem guten Ergebniss", "Das ist gar nicht mal so gut",
    "Total awesome!", "nicht so schlecht wie erwartet",
    "Der Test verlief positiv.", "Sie fährt ein grünes Auto.",
]

# Predict one sentiment label per input text.
result = model.predict_sentiment(texts)
print(result)

The code above will output the following list:
["negative","negative","positive","positive","neutral", "neutral"]
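As the example output suggests, the labels come back in the same order as the input texts, so they can be paired up again for reporting. The following is a minimal sketch: `pair_and_count` is a hypothetical helper that is not part of the `germansentiment` package, and the labels are hard-coded from the output above so the snippet runs without downloading the model.

```python
from collections import Counter


def pair_and_count(texts, labels):
    """Pair each text with its label and count how often each label occurs.

    Hypothetical helper for illustration; not part of germansentiment.
    """
    if len(texts) != len(labels):
        raise ValueError("texts and labels must have the same length")
    pairs = list(zip(texts, labels))
    counts = Counter(labels)
    return pairs, counts


texts = [
    "Mit keinem guten Ergebniss", "Das ist gar nicht mal so gut",
    "Total awesome!", "nicht so schlecht wie erwartet",
    "Der Test verlief positiv.", "Sie fährt ein grünes Auto.",
]
# Labels copied from the example output above (assumption: in a real run
# these would come from model.predict_sentiment(texts)).
labels = ["negative", "negative", "positive", "positive", "neutral", "neutral"]

pairs, counts = pair_and_count(texts, labels)
for text, label in pairs:
    print(f"{label:>8}  {text}")
print(dict(counts))
```

In a real pipeline, `labels` would be replaced by the return value of `model.predict_sentiment(texts)`.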

Output class probabilities

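from germansentiment import SentimentModel

model = SentimentModel()

# With output_probabilities=True, the call returns the predicted classes
# together with the per-class probabilities for each input text.
classes, probabilities = model.predict_sentiment(["das ist super"], output_probabilities=True)
print(classes, probabilities)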

['positive'] [[['positive', 0.9761366844177246], ['negative', 0.023540444672107697], ['neutral', 0.00032294404809363186]]]
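The probabilities arrive as one list of `[label, probability]` pairs per input text, which can be awkward to query directly. The sketch below converts that structure into dictionaries and applies a confidence cut-off. Both `to_dicts` and `top_class` are hypothetical helpers, the `min_confidence` threshold is an assumption made for illustration, and the sample data is copied from the output above so the snippet runs without the model.

```python
def to_dicts(probabilities):
    """Convert [[label, prob], ...] rows into {label: prob} dicts (hypothetical helper)."""
    return [{label: prob for label, prob in row} for row in probabilities]


def top_class(row, min_confidence=0.5):
    """Return (label, prob) for the most likely class in one row.

    If the best probability is below min_confidence (an illustrative
    threshold, not something the package defines), the label is
    replaced by "uncertain".
    """
    label, prob = max(row, key=lambda pair: pair[1])
    if prob < min_confidence:
        return "uncertain", prob
    return label, prob


# Sample data copied from the example output above.
probabilities = [[
    ["positive", 0.9761366844177246],
    ["negative", 0.023540444672107697],
    ["neutral", 0.00032294404809363186],
]]

as_dicts = to_dicts(probabilities)
print(as_dicts[0]["negative"])
print(top_class(probabilities[0]))
print(top_class(probabilities[0], min_confidence=0.99))
```

A stricter threshold can be useful when downstream decisions should only be taken on confident predictions; a suitable value depends on the application.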

Model and Data

If you are interested in the code and data that were used to train this model, please have a look at this repository and our paper. The table below lists the F1 scores that this model achieves on different datasets. Since we trained this model with a newer version of the transformers library, the results are slightly better than those reported in the paper.

| Dataset | F1 micro score |
| --- | --- |
| holidaycheck | 0.9568 |
| scare | 0.9418 |
| filmstarts | 0.9021 |
| germeval | 0.7536 |
| PotTS | 0.6780 |
| emotions | 0.9649 |
| sb10k | 0.7376 |
| Leipzig Wikipedia Corpus 2016 | 0.9967 |
| all | 0.9639 |
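For readers unfamiliar with the metric, micro-averaged F1 pools true positives, false positives, and false negatives over all classes before computing precision and recall. When every sample has exactly one label, it equals plain accuracy. The sketch below illustrates the computation on invented toy labels; it is not the evaluation script used for the paper, and `micro_f1` is a hypothetical helper.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 for single-label classification (illustrative sketch)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    # Pool the counts over all classes before computing precision and recall.
    for label in labels:
        for truth, pred in zip(y_true, y_pred):
            if pred == label and truth == label:
                tp += 1
            elif pred == label and truth != label:
                fp += 1
            elif pred != label and truth == label:
                fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels invented for illustration; not taken from any of the datasets above.
y_true = ["positive", "negative", "neutral", "neutral", "positive"]
y_pred = ["positive", "negative", "neutral", "positive", "positive"]

score = micro_f1(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(round(score, 4), round(accuracy, 4))
```

Here four of the five predictions are correct, so both the micro F1 and the accuracy come out at roughly 0.8.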
