SetFit with thenlper/gte-base

This is a SetFit model for text classification. It uses thenlper/gte-base as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
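
The first step can be pictured as building sentence pairs from the labelled examples: pairs that share a label get a target similarity of 1.0, and pairs with different labels get 0.0. The following is a minimal sketch of that idea in plain Python, not SetFit's actual sampler; the function name `make_contrastive_pairs` and the toy data are hypothetical.

```python
from itertools import combinations

def make_contrastive_pairs(texts, labels):
    """Return (text_a, text_b, target) triples for contrastive fine-tuning.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    Illustrative only: SetFit samples pairs rather than enumerating all of them.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        pairs.append((text_a, text_b, 1.0 if label_a == label_b else 0.0))
    return pairs

# Toy data (hypothetical), shaped like the label examples in this card.
texts = [
    "Firm A has acquired Firm B.",
    "Firm D and Firm E announced their merger.",
    "Firm C invested in a new technology initiative.",
    "Firm G entered a partnership with Organization H.",
]
labels = [True, True, False, False]

pairs = make_contrastive_pairs(texts, labels)
for text_a, text_b, target in pairs:
    print(f"{target:.1f}  {text_a!r} <-> {text_b!r}")
```

The embedding model is then tuned so that the similarity of each pair moves toward its target, and the classification head is fitted on the resulting embeddings.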

Model Details

Model Description

  • Model Type: SetFit
  • Sentence Transformer body: thenlper/gte-base
  • Classification head: a LogisticRegression instance
  • Maximum Sequence Length: 512 tokens
  • Number of Classes: 2 (True and False)

Model Sources

  • Paper: Efficient Few-Shot Learning Without Prompts (https://arxiv.org/abs/2209.11055)

Model Labels

Label Examples
True
  • 'Tech Giant A has acquired Startup B in a groundbreaking deal valued at $500 million, aiming to enhance its product offerings.'
  • 'Industry leaders D and E announced their merger today, combining resources to capture more market share in the competitive landscape.'
  • 'International Firm I and Domestic Firm J have finalized an acquisition deal that is expected to reshape the industry.'
False
  • 'In an unexpected move, Company C has invested heavily in a new technology initiative, signaling its commitment to innovation.'
  • 'A recent survey indicates that Company F is planning to introduce new features next quarter, boosting user engagement.'
  • 'Company G has entered a partnership with Organization H, focusing on joint product development and marketing strategies.'

Evaluation

Metrics

Label   Accuracy
all     0.93
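
Accuracy here is the fraction of evaluation examples whose predicted label matches the reference label. A minimal sketch with made-up labels follows; the evaluation set itself is not part of this card, and the function name `accuracy` is only illustrative.

```python
def accuracy(predictions, references):
    """Fraction of positions where the prediction equals the reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty evaluation set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)

# Hypothetical labels, for illustration only.
references = [True, True, False, False, True]
predictions = [True, False, False, False, True]
print(accuracy(predictions, references))  # 4 of 5 predictions match
```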

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("amplyfi/merger-and-acquisition")
# Run inference
preds = model("The government has announced new regulations on corporate mergers and acquisitions, affecting multiple industries.")
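
For more than one text, it can be convenient to pair each input with its prediction. The sketch below assumes the loaded model exposes a `predict` method that accepts a list of strings and returns one prediction per input, as `SetFitModel` does. The helper name `classify_batch` and the stand-in model used to exercise it are hypothetical.

```python
def classify_batch(model, texts):
    """Return (text, prediction) pairs for a list of input texts.

    `model` is expected to be a loaded SetFitModel, or any object with a
    compatible `predict(list_of_str)` method.
    """
    predictions = model.predict(texts)
    return list(zip(texts, predictions))


class KeywordStub:
    """Hypothetical stand-in so the helper can be tried without downloading the model."""

    def predict(self, texts):
        return ["acquired" in t.lower() or "merger" in t.lower() for t in texts]


results = classify_batch(
    KeywordStub(),
    [
        "Tech Giant A has acquired Startup B.",
        "Company F is planning to introduce new features next quarter.",
    ],
)
for text, pred in results:
    print(pred, "-", text)
```

With the real model, pass the `model` object returned by `SetFitModel.from_pretrained(...)` in place of the stub.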

Training Details

Training Set Metrics

Training set   Min   Median    Max
Word count     9     14.4496   25

Label   Training Sample Count
False   243
True    213
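
Word-count statistics of this kind can be reproduced from the raw training texts. The sketch below assumes words are obtained by splitting on single spaces and uses three hypothetical texts, since the training set is not included in this card. It also reports the mean: a median of integer word counts is always a whole or half number, so the reported 14.4496 looks more consistent with a mean over the 456 samples (6589 / 456 ≈ 14.4496) than with a true median.

```python
from statistics import mean, median

def word_count_stats(texts):
    """Return min, mean, median and max of naive space-separated word counts."""
    counts = [len(text.split(" ")) for text in texts]
    return {
        "min": min(counts),
        "mean": mean(counts),
        "median": median(counts),
        "max": max(counts),
    }

# Hypothetical texts, for illustration only.
texts = [
    "Firm A has acquired Firm B.",
    "Firm D and Firm E announced their merger today.",
    "Firm C invested heavily in a new technology initiative this year.",
]
stats = word_count_stats(texts)
print(stats)
```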

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (3, 3)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 5
  • body_learning_rate: (2e-05, 2e-05)
  • head_learning_rate: 2e-05
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
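
With `loss: CosineSimilarityLoss`, each training pair is scored by the cosine similarity of its two embeddings, and the loss is the squared difference between that score and the pair's target, averaged over the batch. The following plain-Python sketch illustrates that computation with hypothetical 3-dimensional vectors; the real loss in Sentence Transformers operates on batched tensors produced by the embedding model, and the function names here are illustrative.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cosine_similarity_loss(pairs):
    """Mean squared error between cosine similarity and the target per pair."""
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in pairs]
    return sum(errors) / len(errors)

# Hypothetical embeddings: one same-label pair and one different-label pair.
pairs = [
    ([1.0, 0.0, 0.0], [1.0, 0.0, 0.0], 1.0),  # identical vectors, target 1.0
    ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], 0.0),  # orthogonal vectors, target 0.0
]
well_separated = cosine_similarity_loss(pairs)
print(well_separated)  # both pairs already match their targets

# A different-label pair whose embeddings are still aligned is penalised.
misaligned = cosine_similarity_loss([([1.0, 0.0, 0.0], [1.0, 0.0, 0.0], 0.0)])
print(misaligned)
```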

Training Results

Epoch    Step   Training Loss   Validation Loss
0.0035   1      0.1828          -
0.1754   50     0.2934          -
0.3509   100    0.0797          -
0.5263   150    0.0108          -
0.7018   200    0.0013          -
0.8772   250    0.0007          -
1.0526   300    0.0003          -
1.2281   350    0.0002          -
1.4035   400    0.0002          -
1.5789   450    0.0002          -
1.7544   500    0.0002          -
1.9298   550    0.0002          -
2.1053   600    0.0001          -
2.2807   650    0.0001          -
2.4561   700    0.0001          -
2.6316   750    0.0001          -
2.8070   800    0.0001          -
2.9825   850    0.0001          -
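
The Epoch and Step columns are linked by the number of optimisation steps per epoch. Assuming SetFit generates `num_iterations` positive and `num_iterations` negative pairs per training sample when `num_iterations` is set (an assumption about the library's pair sampler, not something stated in this card), the 456 training samples, `num_iterations: 5` and a batch size of 16 give 285 steps per epoch. That figure agrees with the table, for example step 50 at epoch 0.1754.

```python
import math

# Values taken from this card.
n_samples = 243 + 213      # False + True training samples
num_iterations = 5
batch_size = 16
num_epochs = 3

# Assumption: num_iterations positive and num_iterations negative pairs per sample.
pairs_per_epoch = 2 * num_iterations * n_samples
steps_per_epoch = math.ceil(pairs_per_epoch / batch_size)
total_steps = steps_per_epoch * num_epochs

print(pairs_per_epoch, steps_per_epoch, total_steps)

# Fractional epoch at a given logged step, as in the Epoch column.
for step in (1, 50, 850):
    print(step, round(step / steps_per_epoch, 4))
```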

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.1.0
  • Sentence Transformers: 3.3.1
  • Transformers: 4.42.2
  • PyTorch: 2.5.1+cu124
  • Datasets: 3.1.0
  • Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}