Named Entity Recognition

Comprehensive Named Entity Recognition Dataset with labeled entities including persons, organizations, locations, dates, and other entity types. Includes Python scripts for NER models, spaCy, NLTK, Transformers, BIO format, interactive demo, and complete documentation. Perfect for named entity recognition, information extraction, NLP, and machine learning projects.


This project provides a Named Entity Recognition dataset for building NER systems, information extraction pipelines, and other NLP and machine learning applications. Entities are labeled as persons, organizations, locations, dates, and other types, using the BIO format. The package also includes Python example scripts for NER models with spaCy, NLTK, and Transformers, an entity visualization script, an interactive demo website, a README.md, and an MIT License. It is aimed at NLP researchers, data scientists, students, and developers working on named entity recognition and information extraction.

Dataset Overview

Complete Named Entity Recognition Dataset with labeled entities including persons, organizations, locations, dates, and other entity types in BIO format for NER systems, information extraction, and NLP applications.

  • Labeled entities - Comprehensive labeled entities for NER tasks
  • Multiple entity types - PERSON, ORG, LOC, DATE, MONEY, PERCENT
  • BIO format - Standard BIO tagging format for sequence labeling
  • NER ready - Preprocessed data ready for NER models
  • Information extraction - Perfect for information extraction tasks
  • Multiple formats - JSON, CSV, BIO formats supported
  • Multiple Python scripts included for spaCy, NLTK, Transformers
  • Perfect for named entity recognition, information extraction, NLP & machine learning applications
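
To make the BIO scheme above concrete, here is a minimal sketch of how entity spans map to BIO tags. The function name `spans_to_bio`, the example sentence, and the token-index span layout are assumptions made for illustration; they are not taken from the dataset's own scripts.

```python
def spans_to_bio(tokens, spans):
    """Convert entity spans to BIO tags.

    tokens: list of token strings.
    spans: list of (start, end, label) token indices, end exclusive.
    """
    tags = ["O"] * len(tokens)           # every token starts outside any entity
    for start, end, label in spans:
        tags[start] = f"B-{label}"       # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"       # remaining tokens of the same entity
    return tags


# Illustrative sentence (not from the dataset).
tokens = ["Tim", "Cook", "leads", "Apple", "in", "California", "."]
spans = [(0, 2, "PERSON"), (3, 4, "ORG"), (5, 6, "LOC")]
bio_tags = spans_to_bio(tokens, spans)

for token, tag in zip(tokens, bio_tags):
    print(f"{token}\t{tag}")
```

The `B-` prefix marks the first token of an entity, `I-` marks its continuation, and `O` marks tokens outside any entity.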

Dataset Structure & Files

Well-organized project structure with labeled entities, Python scripts for NER models, spaCy, NLTK, Transformers, and interactive demo.

  • train.json - Training dataset with labeled entities
  • test.json - Test dataset with labeled entities
  • train.csv - CSV format training dataset
  • test.csv - CSV format test dataset
  • train_bio.txt - BIO format training dataset
  • scripts/load_dataset.py - Dataset loading script
  • scripts/train_model.py - NER model training script
  • scripts/visualize_ner.py - Entity visualization script
  • scripts/batch_process.py - Batch processing script
  • scripts/api_server.py - Flask API server
  • scripts/advanced_stats.py - Statistics and analytics
  • scripts/export_data.py - Multi-format export
  • scripts/evaluate_model.py - Model evaluation
  • index.html - Interactive demo website
  • demo.html - Interactive NER demo
  • README.md - Comprehensive project documentation
  • requirements.txt - Python dependencies (spacy, nltk, transformers)
  • LICENSE - MIT License file
  • .gitignore - Git ignore configuration
  • Consistent directory structure with train/test split
  • Easy to load with load_dataset.py script
  • Organized structure (dataset, scripts)
  • Label-based organization by entity type
  • Visualization with entity highlighting support
  • Complete preprocessing pipeline ready
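
As a rough sketch of loading a BIO file such as train_bio.txt, the snippet below assumes a CoNLL-style layout: one whitespace-separated "token tag" pair per line, with a blank line between sentences. That layout is an assumption, so check the README for the actual format. The helper name `read_bio` is hypothetical and is not the API of scripts/load_dataset.py.

```python
def read_bio(lines):
    """Parse an iterable of 'token tag' lines into (tokens, tags) sentence pairs."""
    sentences, tokens, tags = [], [], []
    for line in lines:
        line = line.strip()
        if not line:                      # blank line closes the current sentence
            if tokens:
                sentences.append((tokens, tags))
                tokens, tags = [], []
            continue
        token, tag = line.rsplit(maxsplit=1)  # the tag is the last field
        tokens.append(token)
        tags.append(tag)
    if tokens:                            # file may not end with a blank line
        sentences.append((tokens, tags))
    return sentences


# Inline sample standing in for the real file contents.
sample = """\
Tim B-PERSON
Cook I-PERSON
joined O
Apple B-ORG

Paris B-LOC
is O
lovely O
"""
sentences = read_bio(sample.splitlines())
print(len(sentences), "sentences")
```

With a real file, the same function can be given an open file handle, for example `read_bio(open("train_bio.txt", encoding="utf-8"))`, provided the file follows the assumed layout.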

Named Entity Recognition & Processing

Complete NER pipeline with support for spaCy, NLTK, Transformers, information extraction, and advanced NLP features.

  • spaCy Models - Use spaCy for named entity recognition tasks
  • NLTK Models - Use NLTK for entity extraction and tagging
  • Transformers - Leverage Hugging Face Transformers library
  • Information Extraction - Extract entities from text documents
  • BIO Format - Standard format for NER datasets
  • Text Processing - Process and tokenize text documents
  • Entity Tagging - Tag entities with BIO labels
  • Entity Extraction - Extract entities from text
  • Batch Processing - Process multiple documents efficiently
  • Model Training - Train NER models from dataset
  • Model Evaluation - Evaluate model performance on test set
  • Error Handling - Comprehensive error checking and informative messages
  • ML Ready - Preprocessed data for machine learning
  • Visualization Tools - Display entities and labels
  • Multiple Models - Support for spaCy, NLTK, and transformer models
  • Data Export - Export entities and labels
  • Performance Optimized - Efficient batch operations and memory management
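
The entity extraction step listed above can be illustrated by decoding BIO tags back into entity spans. This is a library-free sketch under stated assumptions: the name `bio_to_entities` is hypothetical, and the lenient handling of a stray `I-` tag (treating it as the start of a new entity) is one possible design choice, not necessarily what the bundled scripts do.

```python
def bio_to_entities(tokens, tags):
    """Group BIO-tagged tokens into (text, label, start, end) tuples, end exclusive."""
    entities = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):   # sentinel "O" flushes a trailing entity
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            entities.append((" ".join(tokens[start:i]), label, start, i))
            start, label = None, None
        # A B- tag always opens an entity; a stray I- tag is treated the same way.
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, label = i, tag[2:]
    return entities


tokens = ["Tim", "Cook", "joined", "Apple", "today"]
tags = ["B-PERSON", "I-PERSON", "O", "B-ORG", "O"]
entities = bio_to_entities(tokens, tags)
print(entities)
```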

Data Formats & Compatibility

Dataset available in standard formats (JSON, CSV, BIO) for maximum compatibility with NLP libraries and ML frameworks.

  • JSON format - Standard JSON format for entities and labels
  • CSV format - CSV format for easy data manipulation
  • BIO format - Standard BIO tagging format for compatibility
  • NumPy array compatible - Easy conversion to numpy arrays for ML
  • Pandas ready - Direct loading with pandas DataFrame
  • Transformers compatible - Ready for Hugging Face Transformers
  • TensorFlow/PyTorch ready - Can be converted for deep learning models
  • Easy to import and process - Simple data loading functions
  • Broad ML library support - JSON and CSV can be read by most ML toolkits
  • Jupyter Notebook ready - Suited to interactive NLP analysis
  • NLP library ready - Works with spaCy, NLTK, and Transformers
  • API integration ready - JSON format for NER results
  • Data validation support - Easy to validate data quality and format
  • Information extraction ready - Real-time entity extraction from text
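
To show how the JSON and CSV formats can relate, here is a small sketch that flattens JSON records into CSV rows using only the standard library. The record schema (a `text` field plus an `entities` list with `text`, `label`, `start`, and `end` character offsets) is an assumption for illustration; the actual fields in train.json may differ.

```python
import csv
import io
import json

# Hypothetical record layout; check train.json for the real field names.
raw = """
[
  {"text": "Tim Cook leads Apple.",
   "entities": [
     {"text": "Tim Cook", "label": "PERSON", "start": 0, "end": 8},
     {"text": "Apple", "label": "ORG", "start": 15, "end": 20}
   ]}
]
"""

FIELDS = ["sentence_id", "entity", "label", "start", "end"]


def records_to_rows(records):
    """Emit one flat row per entity, keyed by the index of its sentence."""
    rows = []
    for sentence_id, record in enumerate(records):
        for ent in record["entities"]:
            rows.append({
                "sentence_id": sentence_id,
                "entity": ent["text"],
                "label": ent["label"],
                "start": ent["start"],
                "end": ent["end"],
            })
    return rows


records = json.loads(raw)
rows = records_to_rows(records)

buffer = io.StringIO()                   # in-memory stand-in for a CSV file
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

The resulting rows can also be passed to `pandas.DataFrame(rows)` if pandas is installed.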

Analysis & Visualization

Comprehensive NER visualization tools with interactive viewer and analysis capabilities.

  • Interactive Entity Viewer - Entity display with highlighting
  • Multiple Entity Display - View entities with different types
  • Entity gallery - Browse through entities by type
  • Entity highlighting - Display entities highlighted in text
  • Entity comparison - Compare multiple entities side-by-side
  • Entity results visualization - Display entity results with confidence scores
  • Text visualization - Show text documents and entity spans
  • Entity-based filtering - Filter entities by type
  • Entity metadata display - Show entity type and position information
  • Entity quality metrics - Display and highlight entity quality metrics
  • Dataset statistics - Summary statistics for the NER dataset
  • Interactive entity viewer - Browse, search, and navigate entities
  • Entity distribution charts - Visualize entity type frequencies
  • NER accuracy distribution - Show accuracy metrics
  • Entity preview grid - Grid view of entities by type
  • Export functionality - Download entities and labels
  • Responsive design - Works on desktop, tablet, and mobile devices
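
Entity highlighting and type distribution, two of the features listed above, can be approximated in plain text as follows. This sketch assumes character-offset spans that do not overlap; the bracket notation and the function name `highlight` are illustrative choices and are unrelated to how scripts/visualize_ner.py or the demo site render entities.

```python
from collections import Counter


def highlight(text, entities):
    """Wrap each (start, end, label) character span as [text|LABEL].

    Assumes the spans do not overlap.
    """
    out, cursor = [], 0
    for start, end, label in sorted(entities):
        out.append(text[cursor:start])               # plain text before the entity
        out.append(f"[{text[start:end]}|{label}]")   # marked-up entity
        cursor = end
    out.append(text[cursor:])                        # trailing plain text
    return "".join(out)


text = "Tim Cook leads Apple."
entities = [(15, 20, "ORG"), (0, 8, "PERSON")]

marked = highlight(text, entities)
type_counts = Counter(label for _, _, label in entities)  # entity type frequencies

print(marked)
print(type_counts.most_common())
```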

Compatible Frameworks

Works with widely used NLP and deep learning frameworks.

  • Scikit-learn ML library - Classification, clustering, preprocessing
  • Transformer Models - BERT, RoBERTa, and other transformer models
  • Deep Learning - TensorFlow, PyTorch, Keras compatibility
  • Hugging Face Transformers - Transformers library support
  • NLP Processing - spaCy, NLTK for text processing
  • NumPy numerical computing - Array operations for embeddings
  • Text processing - Tokenization, encoding, and preprocessing
  • matplotlib visualization - Static visualization and plots
  • Natural Language Processing - Text analysis and processing
  • Flask REST API - Web API server for NER services
  • Jupyter Notebook support - Interactive NLP analysis
  • Google Colab ready - Works in cloud-based notebooks
  • VS Code integration - Python extension support
  • PyCharm compatible - Full IDE support
  • NER models - Custom models for named entity recognition
  • NLP tools - Information extraction and entity tagging support
  • Transfer learning ready - Pre-trained transformer models
  • Real-time processing - Real-time entity extraction support
  • REST APIs - HTTP API for NER services

What You Get

Complete package with all files needed for professional NER systems, information extraction, NLP, and machine learning projects.

  • Labeled entities - Entities with types (PERSON, ORG, LOC, DATE, MONEY, PERCENT)
  • Training data - Training dataset with labeled entities
  • Test data - Test dataset with labeled entities
  • Python NER scripts - Complete NER system
  • scripts/train_model.py - NER model training with support for BIO format
  • scripts/visualize_ner.py - Entity visualization script
  • scripts/batch_process.py - Batch processing script
  • scripts/api_server.py - Flask API server
  • Organized directory structure - Separate folders for dataset, scripts
  • index.html - Interactive demo website
  • demo.html - Interactive NER demo
  • Multiple data formats - JSON, CSV, BIO formats supported
  • Complete documentation - README.md, PROJECT_STATUS.md, QUICKSTART.md
  • Documentation files - Comprehensive guides and project information
  • requirements.txt - All Python dependencies listed and versioned (spacy, nltk, transformers)
  • LICENSE - MIT License (free for commercial and non-commercial use)
  • Ready-to-use code examples - Copy and run scripts immediately
  • Data-based organization - Separate files for train, test datasets
  • BIO format organization - Data organized in BIO format
  • NER pipeline - Ready-to-use NER functions
  • Visualization tools - Interactive entity viewer
  • ML ready - Preprocessed data for model training

Interactive Demo Website

Beautiful demo website with NER explorer, entity gallery, and comprehensive guide.

  • Modern animated design - Smooth transitions and visual effects
  • Interactive NER Explorer - Browse and view entities
  • Entity Gallery - Display entities with text and labels
  • Entity Viewer - Browse, search, and navigate entities
  • Entity Metrics - Visual representation of entity results
  • Filter by type - Filter entities by entity type
  • Entity visualization - Display entities with text and entity highlights
  • Entity distribution - Entity type-based breakdown
  • Dataset statistics display - Total entities, types, accuracy
  • Interactive entity display - Click to view full entity details
  • Step-by-step usage guide - Comprehensive instructions
  • Dark theme with gradients - Modern, professional appearance
  • Fully responsive layout - Mobile, tablet, and desktop support
  • Data export options - Download entities and labels
  • Python scripts download - Access to all NER scripts
  • Interactive filters - Filter by entity type, confidence
  • Entity detail view - Individual entity display with metadata
  • Statistics summary - Quick overview of dataset metrics
  • No backend required - Pure HTML, CSS, JavaScript
  • Cross-browser compatible - Works on Chrome, Firefox, Safari, Edge

Python Scripts Included

Professional Python scripts for NER, information extraction, preprocessing, visualization, and advanced NLP features.

  • scripts/train_model.py - Comprehensive NER model training script
  • scripts/visualize_ner.py - Entity visualization script
  • scripts/batch_process.py - Batch processing script
  • scripts/api_server.py - Flask API server for NER
  • scripts/advanced_stats.py - Statistics and analytics
  • scripts/export_data.py - Multi-format export (CSV, TSV, XML, BIO, CoNLL, JSONL)
  • scripts/evaluate_model.py - Model evaluation with metrics
  • scripts/load_dataset.py - Dataset loading utilities
  • Text processing functions - Process and tokenize text documents
  • Entity tagging functions - Tag entities with BIO labels
  • Entity extraction functions - Extract entities from text
  • spaCy model functions - Use spaCy for NER
  • NLTK model functions - Use NLTK for entity extraction
  • Transformers functions - Leverage Hugging Face Transformers library
  • Information extraction functions - Extract entities from documents
  • BIO format functions - Process BIO format data
  • Batch processing support - Process multiple documents efficiently
  • Model evaluation functions - Evaluate model performance
  • Dataset verification - Data format checking, validation, and quality assessment
  • Export functionality - Export entities and labels
  • Error handling - Comprehensive error checking and informative messages
  • Code comments and documentation - Well-documented code for learning
  • Complete code examples - Ready-to-run scripts with examples
  • Modular design - Reusable functions for different NER tasks
  • Best practices - Follows Python coding standards (PEP 8)
  • Real-time NER - Real-time entity extraction support
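
For the model evaluation item above, a common way to score NER output is entity-level precision, recall, and F1 with exact span matching. The sketch below shows that calculation under the assumption that entities are compared as (start, end, label) tuples; it is not a description of what scripts/evaluate_model.py computes.

```python
def entity_prf(gold, predicted):
    """Entity-level precision, recall, and F1 using exact (start, end, label) matches."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Illustrative spans: one exact match, one span with the wrong label, one missed entity.
gold = [(0, 2, "PERSON"), (3, 4, "ORG"), (5, 6, "LOC")]
predicted = [(0, 2, "PERSON"), (3, 4, "LOC")]

precision, recall, f1 = entity_prf(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

When scoring a whole test set, a sentence identifier would need to be added to each tuple so that spans from different sentences do not collide.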

Dataset Features

Comprehensive Named Entity Recognition Dataset with labeled entities in BIO format for NER, information extraction, and NLP applications.

  • Multiple Entity Types - PERSON, ORG, LOC, DATE, MONEY, PERCENT
  • Various Entities - Different entity types for comprehensive training
  • Entity Formats - Various entity formats for real-world applications
  • BIO Format - Standard BIO tagging format for compatibility
  • Data Formats - JSON, CSV, BIO formats supported
  • Organized Structure - Separate files for train, test datasets
  • Multiple Data Types - Training and test datasets
  • Entity Organization - Entities organized by type
  • High-quality Data - Clean, validated, and consistent entity labels
  • Complete Dataset - Text with corresponding entity labels
  • Ready for machine learning - Preprocessed data for model training
  • NER Ready - Pre-labeled data for NER tasks
  • NER utilities - Pre-built NER functions
  • Easy to extend dataset - Add more entities or text
  • Organized project structure - Clear directory organization
  • Entity-based annotations - Structured entity and label information
  • Entity metadata - Entity type and position information
  • NLP standards - Follows NER best practices
  • Sample data included - Sample entities and labels
  • Production ready - Tested and verified NER system
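
Claims such as clean and consistent labels can be spot-checked with a simple BIO consistency test. The sketch below flags tags outside the six entity types listed above and `I-` tags that do not continue an entity of the same type. The function name `find_bio_errors` and the error messages are hypothetical.

```python
ALLOWED_TYPES = {"PERSON", "ORG", "LOC", "DATE", "MONEY", "PERCENT"}


def find_bio_errors(tags, allowed_types=ALLOWED_TYPES):
    """Return (index, tag, reason) tuples for tags that break the BIO scheme."""
    errors = []
    previous = "O"
    for i, tag in enumerate(tags):
        if tag == "O":
            previous = tag
            continue
        prefix, _, label = tag.partition("-")
        if prefix not in ("B", "I") or label not in allowed_types:
            errors.append((i, tag, "unknown tag"))
        elif prefix == "I" and previous not in (f"B-{label}", f"I-{label}"):
            # An I- tag must directly follow B- or I- of the same entity type.
            errors.append((i, tag, "I- tag without matching B-"))
        previous = tag
    return errors


errors = find_bio_errors(["B-PERSON", "I-PERSON", "O", "I-ORG", "B-CITY"])
print(errors)
```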

Credits & Acknowledgments

This dataset is provided for educational and research purposes. Core technologies and libraries are credited below.

  • Python 3.8+ - Programming language (PSF License)
  • Scikit-learn - Machine learning library (BSD License)
  • XGBoost - Gradient boosting framework (Apache 2.0)
  • NumPy - Numerical computing (BSD License)
  • pandas - Data manipulation (BSD License)
  • matplotlib - Data visualization (PSF-based license)
  • RSK World - Dataset creator and provider
  • GitHub Repository - Source code and releases
  • Author: Molla Samser | Designer: Rima Khatun
  • MIT License - Free for learning & research

Support & Contact

For commercial use, custom datasets, or integration help, please contact us.

  • Email: help@rskworld.in
  • Phone: +91 93305 39277
  • Website: RSKWORLD.in
  • Location: Nutanhat, Mongolkote, West Bengal, India
  • Author: Molla Samser
  • Designer & Tester: Rima Khatun
  • GitHub: Coming Soon
  • Named Entity Recognition Dataset Documentation
  • Technical Support Available
  • Custom Dataset Requests Welcome

About RSK World

Founded by Molla Samser, with Designer & Tester Rima Khatun, RSK World is your one-stop destination for free programming resources, source code, and development tools.

Founder: Molla Samser
Designer & Tester: Rima Khatun

Contact Info

Nutanhat, Mongolkote
Purba Burdwan, West Bengal
India, 713147

+91 93305 39277

hello@rskworld.in
support@rskworld.in

© 2026 RSK World. All rights reserved.

Content used for educational purposes only.
