Pre-trained BERT on Twitter US Political Election 2020
Pre-trained weights for Knowledge Enhanced Masked Language Model for Stance Detection, NAACL 2021.
The model is initialized with the weights of BERT-base (uncased), i.e. bert-base-uncased.
Training Data
This model is pre-trained on over 5 million English tweets about the 2020 US Presidential Election.
Training Objective
This model is initialized with BERT-base and trained with the standard masked language modeling (MLM) objective.
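As a rough illustration of what the MLM objective asks the model to do, the sketch below applies BERT-style masking to a token list in plain Python. The 15% selection rate and the 80/10/10 split (mask, random token, unchanged) are taken from the original BERT recipe and are assumed here, since this card does not state the exact settings used for this model. The function name `mask_tokens` and the toy vocabulary are hypothetical, and real pipelines also skip special tokens such as [CLS] and [SEP], which this sketch ignores.

```python
import random

MASK_TOKEN = "[MASK]"


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=None):
    """Return (masked_tokens, labels) using BERT-style masking.

    labels[i] is the original token at positions the model must predict,
    and None everywhere else.
    """
    rng = random.Random(seed)
    masked = list(tokens)
    labels = [None] * len(tokens)
    for i, token in enumerate(tokens):
        if rng.random() >= mask_prob:
            continue  # position not selected for prediction
        labels[i] = token
        roll = rng.random()
        if roll < 0.8:
            masked[i] = MASK_TOKEN  # 80%: replace with the mask token
        elif roll < 0.9:
            masked[i] = rng.choice(vocab)  # 10%: replace with a random token
        # remaining 10%: keep the original token unchanged
    return masked, labels


# Toy example; the vocabulary is just the tokens of the sentence itself.
tokens = "trump is the president of usa".split()
vocab = sorted(set(tokens))
masked, labels = mask_tokens(tokens, vocab, seed=0)
print(masked)
print(labels)
```

During pre-training the loss is computed only at positions where `labels` is not None, so the model learns to recover the original tokens from their context.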
Usage
This pre-trained language model can be fine-tuned on any downstream task (e.g. classification).
Please see the official repository for more detail.
from transformers import BertTokenizer, BertForMaskedLM, pipeline
import torch
# Choose GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Select model path here
pretrained_LM_path = "kornosk/bert-political-election2020-twitter-mlm"
# Load model
tokenizer = BertTokenizer.from_pretrained(pretrained_LM_path)
model = BertForMaskedLM.from_pretrained(pretrained_LM_path)
# Fill mask
example = "Trump is the [MASK] of USA"
fill_mask = pipeline('fill-mask', model=model, tokenizer=tokenizer)
# Use the following line instead if the one above does not work.
# Newer versions of Hugging Face Transformers accept the model name as a string instead.
fill_mask = pipeline('fill-mask', model=pretrained_LM_path, tokenizer=tokenizer)
outputs = fill_mask(example)
print(outputs)
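# Run a forward pass. With BertForMaskedLM the output holds the MLM logits;
# pass output_hidden_states=True to the model call to also get hidden-state embeddings.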
inputs = tokenizer(example, return_tensors="pt")
outputs = model(**inputs)
print(outputs)
# OR you can use this model to train on your downstream task!
# Please consider citing our paper if you feel this is useful :)
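For the downstream route mentioned above, one possible starting point is to load the same checkpoint with a sequence classification head. This is a minimal sketch, not the authors' fine-tuning setup: the three stance labels and the helper names `build_label_maps` and `load_classifier` are hypothetical, and the classification layer is newly initialized, so it needs training on labeled data before its predictions mean anything.

```python
def build_label_maps(labels):
    """Map label names to integer ids and back (hypothetical helper)."""
    label2id = {label: i for i, label in enumerate(labels)}
    id2label = {i: label for label, i in label2id.items()}
    return label2id, id2label


def load_classifier(model_path, labels):
    """Load tokenizer and BERT encoder weights with a fresh classification head."""
    # Imported lazily so build_label_maps works without transformers installed.
    from transformers import BertForSequenceClassification, BertTokenizer

    label2id, id2label = build_label_maps(labels)
    tokenizer = BertTokenizer.from_pretrained(model_path)
    model = BertForSequenceClassification.from_pretrained(
        model_path,
        num_labels=len(labels),  # size of the new, untrained output layer
        id2label=id2label,
        label2id=label2id,
    )
    return tokenizer, model


# Hypothetical label set for a stance detection task.
STANCE_LABELS = ["against", "neutral", "favor"]
```

Calling `load_classifier("kornosk/bert-political-election2020-twitter-mlm", STANCE_LABELS)` downloads the weights and returns a tokenizer and model that can be passed to a standard PyTorch training loop or to the Transformers `Trainer`.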
Reference
Citation
@inproceedings{kawintiranon2021knowledge,
title={Knowledge Enhanced Masked Language Model for Stance Detection},
author={Kawintiranon, Kornraphop and Singh, Lisa},
booktitle={Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
year={2021},
publisher={Association for Computational Linguistics},
url={https://www.aclweb.org/anthology/2021.naacl-main.376}
}