PII Data Detection

Mon, 15 Jan 2024 00:00:00 +0000

Overview

This competition involved detecting personally identifiable information (PII) in student essays using named entity recognition (NER). The task was to identify and classify PII tokens across thousands of documents while keeping both precision and recall high.
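Precision and recall are often combined into a single F-beta score. The sketch below shows how that trade-off works at the token level. The write-up does not name the metric used, so the function name `f_beta` and the default `beta=5.0` (a recall-weighted setting) are assumptions for illustration only.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 5.0) -> float:
    """F-beta from token-level counts.

    tp: PII tokens correctly flagged
    fp: non-PII tokens wrongly flagged
    fn: PII tokens that were missed
    beta > 1 weights recall more heavily than precision (assumed setting).
    """
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Same number of mistakes, different kinds:
over_flagging = f_beta(tp=8, fp=2, fn=0)   # two false alarms, nothing missed
under_flagging = f_beta(tp=8, fp=0, fn=2)  # no false alarms, two PII tokens missed
```

With a recall-weighted beta, the run that misses PII tokens scores noticeably lower than the run that raises the same number of false alarms, which matches the intuition that leaking PII is the costlier error.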

Approach

Synthetic data generation to augment limited training data with realistic PII patterns
DeBERTa-based NER models fine-tuned for token-level PII classification
ONNX optimization for inference speed without sacrificing accuracy
Ensemble strategy combining multiple model checkpoints
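As a rough illustration of the ensemble step, the sketch below averages per-token class probabilities from several checkpoints and takes the most likely label for each token. This is one common way to combine checkpoints, not necessarily the scheme used here. The label set `LABELS`, the function names, and the probability values are all hypothetical.

```python
from typing import List

# Hypothetical BIO-style label set; the real one is not given in the write-up.
LABELS = ["O", "B-NAME", "I-NAME"]

# Shape convention: probs[token_index][class_index]
TokenProbs = List[List[float]]


def average_probs(model_probs: List[TokenProbs]) -> TokenProbs:
    """Average class probabilities across checkpoints, token by token."""
    n_models = len(model_probs)
    n_tokens = len(model_probs[0])
    n_classes = len(model_probs[0][0])
    return [
        [
            sum(model_probs[m][t][c] for m in range(n_models)) / n_models
            for c in range(n_classes)
        ]
        for t in range(n_tokens)
    ]


def ensemble_predict(model_probs: List[TokenProbs], labels: List[str]) -> List[str]:
    """Pick the highest-probability label per token after averaging."""
    averaged = average_probs(model_probs)
    return [labels[max(range(len(row)), key=row.__getitem__)] for row in averaged]


# Made-up outputs from two checkpoints for a three-token sentence.
ckpt_a = [[0.9, 0.1, 0.0], [0.4, 0.6, 0.0], [0.2, 0.1, 0.7]]
ckpt_b = [[0.8, 0.1, 0.1], [0.3, 0.6, 0.1], [0.6, 0.1, 0.3]]

predictions = ensemble_predict([ckpt_a, ckpt_b], LABELS)
```

In this toy input, `ckpt_b` alone would label the third token `O`, but the averaged probabilities favor `I-NAME`, so the ensemble recovers a PII token that one checkpoint would have missed.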

Result

121/2048 🥉

Public leaderboard: 0.970
Private leaderboard: 0.956

LLM Science Exam

Sun, 15 Oct 2023 00:00:00 +0000

Overview

This competition involved answering difficult science questions using large language models. The task was to select the correct answer to multiple-choice questions spanning physics, chemistry, biology, and other domains.

Result

194/2664 🥉
