# NLP Text Analysis Bot

- **Developer**: RSK World
- **Website**: https://rskworld.in
- **Email**: help@rskworld.in
- **Phone**: +91 93305 39277
- **Year**: 2026

## Overview

NLP Text Analysis Bot is a Natural Language Processing chatbot for text analysis. It covers sentiment detection, entity recognition, semantic understanding, and text preprocessing, and exposes them through a web interface, a JSON API, and a Python module.

## Features

### Core Features
- **Text Preprocessing**: Cleaning, tokenization, stopword removal, and lemmatization
- **Sentiment Analysis**: Multi-method sentiment detection using VADER and transformer models
- **Entity Recognition**: Named entity extraction using spaCy
- **Semantic Understanding**: Keyword extraction, topic identification, and phrase analysis
- **NLP Pipeline**: Complete end-to-end text analysis workflow
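
The preprocessing steps listed above (cleaning, tokenization, stopword removal) can be illustrated with a standard-library-only sketch. This is not the code in `text_preprocessing.py`: the function name `preprocess` and the tiny `STOPWORDS` set are assumptions made for illustration, standing in for NLTK's stopword list. Lemmatization is left out because it needs a lexicon such as WordNet.

```python
import re

# Hypothetical, deliberately tiny stopword set; the project uses NLTK's list.
STOPWORDS = {"a", "an", "and", "the", "is", "are", "of", "to", "in", "it"}


def preprocess(text: str) -> dict:
    """Clean, tokenize, and filter a text using only the standard library."""
    # Cleaning: drop URLs, lowercase, and collapse whitespace.
    cleaned = re.sub(r"https?://\S+", " ", text).lower()
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    # Tokenization: keep runs of letters, digits, and apostrophes.
    tokens = re.findall(r"[a-z0-9']+", cleaned)
    # Stopword removal.
    filtered = [t for t in tokens if t not in STOPWORDS]
    return {"cleaned": cleaned, "tokens": tokens, "filtered": filtered}


result = preprocess("The bot is live at https://rskworld.in and it analyzes TEXT!")
print(result["filtered"])
```

Returning the intermediate stages in one dictionary mirrors the "Preprocessing Details" idea of exposing both the cleaned text and the tokens, though the real module's keys may differ.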

### Advanced Features
- **Text Summarization**: Extractive and abstractive summarization using transformer models
- **Language Detection**: Automatic language detection with confidence scores
- **Text Classification**: Zero-shot text classification into multiple categories
- **Emotion Detection**: Advanced emotion analysis beyond basic sentiment (joy, sadness, anger, fear, etc.)
- **Readability Analysis**: Multiple readability metrics (Flesch, SMOG, Coleman-Liau, etc.)
- **Advanced Keyword Extraction**: TF-IDF-based keyword extraction with n-grams
- **Part-of-Speech Tagging**: Complete POS analysis with distribution statistics
- **Text Similarity**: Calculate similarity between texts using multiple methods
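
As one concrete instance of the readability metrics mentioned above, the following sketch computes the Flesch Reading Ease score by hand. The project lists `textstat` for this; the code below is only an approximation under stated assumptions: `count_syllables` is a crude vowel-group heuristic and both function names are hypothetical, so scores will not match `textstat` exactly.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: count groups of consecutive vowels, minimum one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


easy = flesch_reading_ease("The cat sat on the mat. It was warm.")
hard = flesch_reading_ease(
    "Comprehensive readability evaluation necessitates sophisticated methodology."
)
print(round(easy, 1), round(hard, 1))
```

Higher scores indicate easier text, so short sentences of one-syllable words score far above a single sentence of long words.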

## Technologies

- **NLTK**: Natural Language Toolkit for text processing
- **spaCy**: Advanced NLP library for entity recognition and semantic analysis
- **Python**: Core programming language
- **Transformers**: Hugging Face transformers for advanced NLP tasks
- **Flask**: Web framework for API and interface
- **scikit-learn**: Machine learning library for TF-IDF and similarity calculations
- **langdetect**: Language detection library
- **textstat**: Readability analysis library

## Installation

1. **Clone or download the project**

2. **Install Python dependencies:**
```bash
pip install -r requirements.txt
```

3. **Download spaCy English model:**
```bash
python -m spacy download en_core_web_sm
```

4. **Download NLTK data (automatically handled, but can be done manually):**
```python
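import nltk

# Resources used for tokenization, stopwords, lemmatization, VADER, and POS tagging
nltk.download('punkt')
nltk.download('stopwords')
nltk.download('wordnet')
nltk.download('vader_lexicon')
nltk.download('averaged_perceptron_tagger')
# Recent NLTK releases (3.9 and later) look up these names in place of
# 'punkt' and 'averaged_perceptron_tagger'; downloading both sets is harmless.
nltk.download('punkt_tab')
nltk.download('averaged_perceptron_tagger_eng')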
```

## Usage

### Running the Web Application

1. **Start the Flask server:**
```bash
python app.py
```

2. **Open your browser and navigate to:**
```
http://localhost:5000
```

3. **Enter text in the input field and click "Analyze Text"**

### Using the API

**Endpoint:** `POST /api/analyze`

**Request:**
```json
{
  "text": "Your text to analyze here"
}
```

**Response:**
```json
{
  "original_text": "...",
  "preprocessing": {...},
  "sentiment": {...},
  "entities": {...},
  "semantic": {...},
  "summary": {...}
}
```
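
The request and response shapes above can be exercised from Python with the standard library alone. This is a minimal client sketch, assuming the server runs at `http://localhost:5000` as in the previous section and accepts a `Content-Type: application/json` body; the helper names `build_analyze_request` and `analyze` and the 60-second timeout are illustrative choices, not part of the project.

```python
import json
import urllib.request


def build_analyze_request(
    text: str, base_url: str = "http://localhost:5000"
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /api/analyze endpoint."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/analyze",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(text: str, base_url: str = "http://localhost:5000") -> dict:
    """Send the request and decode the JSON response; needs a running server."""
    request = build_analyze_request(text, base_url)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_analyze_request("I love this product!")
print(request.get_method(), request.full_url)
```

With `python app.py` running, `analyze("I love this product!")["sentiment"]` would return the sentiment section of the response. The generous timeout is a guess that allows for transformer models loading on the first call.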

### Using as a Python Module

```python
from nlp_pipeline import NLPPipeline

# Initialize pipeline
pipeline = NLPPipeline()

# Analyze text
results = pipeline.analyze("Your text here")

# Access results
print(results['sentiment'])
print(results['entities'])
print(results['semantic'])
```
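
When several texts need analysis, a small wrapper around `analyze` keeps one failing document from stopping the whole batch. The sketch below assumes only what the example above shows, namely that the pipeline object has an `analyze(text)` method returning a dictionary. The `analyze_many` helper and the `EchoPipeline` stand-in (including its output shape) are hypothetical and exist so the example runs without the real models.

```python
from typing import Any, Dict, Iterable, List


def analyze_many(pipeline: Any, texts: Iterable[str]) -> List[Dict[str, Any]]:
    """Run pipeline.analyze on each non-empty text, recording failures per item."""
    results = []
    for text in texts:
        if not text or not text.strip():
            continue  # skip blank inputs instead of sending them to the pipeline
        try:
            analysis = pipeline.analyze(text)
            results.append({"text": text, "analysis": analysis, "error": None})
        except Exception as exc:  # keep going if one document fails
            results.append({"text": text, "analysis": None, "error": str(exc)})
    return results


class EchoPipeline:
    """Hypothetical stand-in so the sketch runs without the real models."""

    def analyze(self, text: str) -> dict:
        return {"sentiment": {"label": "neutral"}, "length": len(text)}


batch = analyze_many(EchoPipeline(), ["First document.", "   ", "Second one."])
print(len(batch))
```

In real use, `EchoPipeline()` would be replaced by `NLPPipeline()` from `nlp_pipeline`.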

## Project Structure

```
nlp-text-analysis-bot/
├── app.py                      # Flask web application
├── nlp_pipeline.py             # Main NLP pipeline orchestrator
├── text_preprocessing.py       # Text cleaning and preprocessing
├── sentiment_analysis.py       # Sentiment analysis module
├── entity_recognition.py       # Named entity recognition
├── semantic_understanding.py   # Semantic analysis module
├── templates/
│   └── index.html              # Web interface
├── requirements.txt            # Python dependencies
└── README.md                   # This file
```

## API Endpoints

- `GET /` - Web interface
- `POST /api/analyze` - Complete text analysis endpoint (all features)
- `POST /api/similarity` - Text similarity comparison endpoint
- `GET /api/health` - Health check endpoint

### API Usage Examples

**Text Analysis:**
```
POST /api/analyze

{
  "text": "Your text to analyze here"
}
```

**Text Similarity:**
```
POST /api/similarity

{
  "text1": "First text",
  "text2": "Second text"
}
```
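
This README does not say which similarity methods the endpoint applies. As an illustration of what comparing `text1` and `text2` can involve, the sketch below computes two common measures with the standard library: cosine similarity over raw term counts and Jaccard similarity over vocabularies. Both function names are hypothetical, and the project itself lists scikit-learn for TF-IDF-based similarity, so its scores will differ.

```python
import math
import re
from collections import Counter


def _tokens(text: str) -> list:
    """Lowercase word tokens made of letters, digits, and apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine_similarity(text1: str, text2: str) -> float:
    """Cosine similarity between raw term-frequency vectors."""
    a, b = Counter(_tokens(text1)), Counter(_tokens(text2))
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard_similarity(text1: str, text2: str) -> float:
    """Size of the shared vocabulary divided by the combined vocabulary."""
    a, b = set(_tokens(text1)), set(_tokens(text2))
    return len(a & b) / len(a | b) if a | b else 0.0


first, second = "the cat sat on the mat", "the cat lay on the rug"
cosine = cosine_similarity(first, second)
jaccard = jaccard_similarity(first, second)
print(round(cosine, 3), round(jaccard, 3))
```

Cosine similarity rewards repeated shared words, while Jaccard only counts whether a word appears at all, which is one reason an API might report more than one score.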

## Example Analysis Output

The comprehensive analysis provides:
- **Text Statistics**: Word count, sentence count, vocabulary richness
- **Language Detection**: Detected language with confidence scores
- **Sentiment Scores**: Overall sentiment with detailed breakdown
- **Emotion Detection**: Primary emotion and emotion distribution
- **Named Entities**: People, organizations, locations, etc.
- **Text Classification**: Category classification with confidence
- **Text Summarization**: Extractive or abstractive summary
- **Readability Metrics**: Multiple readability scores and grade levels
- **Advanced Keywords**: TF-IDF keywords, bigrams, and trigrams
- **POS Analysis**: Part-of-speech distribution and statistics
- **Keywords & Topics**: Main themes and important terms
- **Preprocessing Details**: Cleaned text and tokenization results
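
The "Text Statistics" item can be made concrete with a short sketch. It assumes that "vocabulary richness" means the type-token ratio (unique words divided by total words), which this README does not define, and the function name and result keys are hypothetical rather than the project's actual output format.

```python
import re


def text_statistics(text: str) -> dict:
    """Word count, sentence count, and type-token ratio for a text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    # Split on sentence-ending punctuation and ignore empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    unique_words = set(words)
    return {
        "word_count": len(words),
        "sentence_count": len(sentences),
        "unique_words": len(unique_words),
        # Type-token ratio: unique words divided by total words.
        "vocabulary_richness": len(unique_words) / len(words) if words else 0.0,
    }


stats = text_statistics("NLP is fun. NLP is useful! Is it hard?")
print(stats)
```

The type-token ratio falls as texts get longer, so it is best compared between texts of similar length.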

## Requirements

- Python 3.8+
- 4 GB or more of RAM recommended (for transformer models)
- Internet connection (for downloading models on first run)

## License

This project is provided by RSK World for educational and development purposes.

## Support

For support, contact:
- **Website**: https://rskworld.in
- **Email**: help@rskworld.in
- **Phone**: +91 93305 39277

---

**Developed by RSK World - 2026**
